Michael Cooley's Genetic Genealogy Blog
17 July 2022

More Headway Made on the R1a-YP4248 Y-DNA Project

The R1a-YP4248 project is named for the YP4248 marker, a single-point mutation at position 7662815 of the Y chromosome. Simply, a molecule of guanine (G) became a thymine (T) molecule at the birth of one of our shared paternal ancestors more than a thousand years ago. This genetic modification, like the others described here, passed down from the date of its inception to all male lineage descendants. Each and every tester in the project has a copy of it in the trillions of cells that make up our bodies. In addition to the genetics, this article looks at the genealogy, history, and the geographical locations at which our ancestors likely lived. A summary of the surnames is provided and, to a degree, their etymologies. Great value is found wherever these disciplines are found to intersect.

For starters, the R1a-YP4248 Y-DNA Project comprises the following surnames: Cochran, Cooley, Coombs, Cummings, Gray, Hackett, Hawley, Hazelet, Higdon, Mann, Rankin, Semple, Stearrett, Story, Tyrie, and Whitfield, all of whom descend from a man born about a thousand years ago. These names are deeply connected, most likely within two or more groups of Vikings that arrived in the British Isles a millennium ago. This is deep genealogy, but not so deep that specific issues cannot be resolved, and not so deep that ancestors of about 200 and more years ago cannot be identified.

The Pennines separate east from west

Current genetic results reveal the presence of at least two distinct genetic groups below YP4248. They're designated as subclades R1a-YP4253 and R1a-YP5007. Others may exist, but it's impossible to make predictions about that as the detection of new subclades is wholly dependent on what testers might show up in the future. In point of fact, there are large numbers of Scandinavians with STR haplotypes quite close to our modal haplotype, yet their SNP markers reside outside the R1a-YP4248 haplogroup and are not included in this study. But the fact that these testers are almost exclusively Scandinavian is one piece of evidence that we're almost certainly descended from Vikings, supported by the high concentration of the markers in the region. I suspect, however, that there are sibling subclades of YP5007 and YP4253 whose descendants are likely still in Scandinavia.

My thinking about the origins of YP4248 and its subclades changed somewhat with the suggestion by a Scandinavian researcher that the ancestors of the two subclades stepped ashore at separate places and at different times. Looking at the geographical location for our paternal ancestors and at certain attributes of the genetics, the comment increasingly makes sense. The English origins of the YP4253 subclade triangulate to the northwestern region of England, somewhat corresponding with the arrival of Vikings at the Wirral Peninsula, just south of Liverpool and the River Mersey. Subclade YP5007 clearly has Scottish origins and might have descended from a Viking who landed in Northumberland. Having only a rudimentary understanding of the geography of Great Britain, I was pleased that linguist and collaborator John Cleary informed me that east and west in the north of England are divided by the Pennine Mountains and that passage in those days wasn't a simple matter. This map clearly supports the two-immigrant hypothesis.


Definitions

Before I continue, here are definitions and explanations for the terms and concepts referenced in the above paragraph. For the benefit of those who might not want to join, I'm turning on techie mode by sidebarring the following paragraphs with a blue border. But please pick up afterwards as there's more fascinating history below.

First, remember that only men have a Y chromosome. It passes straight from father to sons until the lineage becomes extinct. Because it holds the male sex gene (SRY), anyone who inherits it (it comes only through the dad) will develop male genitalia. Furthermore, 100 percent of the chromosome is inherited.

Although men inherit their father's Y-DNA, an occasional mutation will sneak in. These mutations rarely affect the small number of genes on the Y and are harmless. Two types of mutations are of particular interest to Y-DNA studies. STRs (Short Tandem Repeats) are shown on the results page at FTDNA.com. The numbers represent the number of times a string of genetic letters repeats in a specific region of the Y. For example, every project member but two has 13 repeats of AGAT at the first listed position (DYS393). A haplotype is the specific set of markers found in a tester's sample. These are represented horizontally across the page. A modal haplotype represents the most common values, vertically, for each position among a group of testers. This is how STR haplotypes are represented on my server, the top grey line being the modal.
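For those who like to see the mechanics, here is a minimal Python sketch of how a modal haplotype can be derived: take the most common repeat count in each column. The function name and the repeat counts are made up for illustration and are not project data.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common repeat count at each STR position.

    `haplotypes` maps a kit label to a list of repeat counts, one per
    marker, with every kit listing its markers in the same order.
    """
    columns = zip(*haplotypes.values())  # read the table vertically
    return [Counter(column).most_common(1)[0][0] for column in columns]


# Hypothetical repeat counts for three markers (illustration only).
sample = {
    "kit A": [13, 25, 16],
    "kit B": [13, 24, 16],
    "kit C": [13, 25, 15],
    "kit D": [12, 25, 16],
}

modal = modal_haplotype(sample)
print(modal)
```

Each kit's row is its haplotype; the printed list plays the role of the grey modal line.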



But these repeats can regularly go up and down through the generations, rendering STRs unreliable for phylogenetic trees (such as the SNP tree, below). Instead, STRs are good for grouping similar results, as found in the FTDNA's tables, and can distinguish between surnames that have differing genetic inheritances. On the other hand, trees can be constructed from SNPs (Single Nucleotide Polymorphisms). SNPs are single genetic letters (out of the 57 million found on the Y chromosome) that mutate from one value to another. For example, our R1a-YP4248 haplogroup is defined by the YP4248 SNP, a simple mutation from a G to a T at position 7662815 of the Y chromosome. It occurred at the birth of a man at least a thousand years ago and every male lineage descendant has it. Now, some SNPs live in squirrelly regions of the Y, within STRs, for example, and are not taken seriously. As with STRs, you never know whether they will quickly mutate to another value. It's in the vast regions of the Y (in a microscopic sense!) that SNPs are highly stable. Some such SNPs are known to have passed through male-lineage descendants for more than 300,000 years. This stability, from inception to the expiration of the lineage, is their most remarkable feature. The result is a vast archive of any male lineage's genetic history. This study is about excavating that archive and determining when, where, and with what birth the markers first emerged. It behooves every genealogist to understand them.

Because of Y-SNPs' strict inheritance pattern and because they emerge at different points in a lineage, they can be arranged into trees. The nodes are known as haplogroups and each, in parent-child fashion, has subclades. For example, here's a barebones haplogroup tree for a Duncan group I'm working with. I've chosen it only because it's smaller and a bit more concise. The testers' kit numbers are at the bottom. I call this a Quick Tree.


D U N C A N





In this example, FGC3214 is a subclade of A1147. In turn, FGC3214 has three defined subclades. And note that the older a haplogroup is, the more descendants it will have, just as a 5th great-grandfather will generally have more living descendants than your father. It's this that allows us to make the trees. As we'll see, however, such trees can contain a great deal more data.
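As a rough sketch of how such a tree can be assembled, the Python below nests each tester's haplogroup path, oldest first, into a tree. A1147 and FGC3214 come from the Duncan example; the three "subclade-n" labels are placeholders of mine, since the names of the three subclades aren't given here, and the function names are likewise only for illustration.

```python
def build_tree(paths):
    """Nest haplogroup paths (oldest first) into a dict-of-dicts tree."""
    tree = {}
    for path in paths:
        node = tree
        for haplogroup in path:
            # Reuse the branch if it exists, otherwise start a new one.
            node = node.setdefault(haplogroup, {})
    return tree


def count_tips(node):
    """Count the tips beneath a node; a childless node counts as one tip."""
    if not node:
        return 1
    return sum(count_tips(child) for child in node.values())


# One path per tester; the last three names are placeholders.
paths = [
    ["A1147", "FGC3214", "subclade-1"],
    ["A1147", "FGC3214", "subclade-2"],
    ["A1147", "FGC3214", "subclade-3"],
]

tree = build_tree(paths)
print(tree)
print(count_tips(tree["A1147"]))
```

Because every path starts at the same oldest haplogroup, the older node necessarily sits above all the tips, which is the parent-child relationship the Quick Tree draws.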


The full YP4248 SNP tree is far more detailed and larger than that above. The reader might want to bring it up in a separate tab or window in order to follow along through the next two sections.

The R1a-YP4248 haplogroup, which includes eleven individual SNP markers, is shown at top, dead center. It has two known subclades, YP4253, at the left, and further along the much bigger YP5007 subclade. In turn, YP4253 has two subclades, A12124 and YP4491, the latter being the one to which I belong. I'll talk more about it later, but first YP5007, which has a fascinating historical context.


The YP5007 Subclade

To date, five subclades of YP5007 can be found. The first shown on the tree doesn't have a subclade name yet. (A subclade needs at least two testers in order to form a genetic family.) The one tester (and others will be found) is descended from Thomas Hazelet (1600-). The other four are associated with the surnames (going from left to right) Semple (Sempill), Storry, Rankin, and Cochran. All these names are found in Renfrewshire, Scotland. But some genetic context before I proceed.

The major R1a haplogroup is East European in origin, whereas R1b is the dominant haplogroup in the West. An R1a population eventually made its way into Scandinavia and became dominant there. Scandinavian men were and are, of course, of various haplogroups and were represented among the Vikings. R1a's presence in the British Isles is small, generally attributed to the Vikings, and is found largely in the northern regions of Great Britain, in Ireland, and Iceland. But keep in mind that all the names in our project are also found in the largely R1b populations of the Isles. Therefore, R1a is found minimally in surname projects. So, when you meet a Cochran, he's most likely to be of an R1b haplogroup.

I've decided to make the summation on the YP5007 surnames easy on myself, and simply quote from House of Names.1 (Nothing scholarly here.)


SEMPILL

The surname Sempill was first found in Renfrewshire (Gaelic: Siorrachd Rinn Friù), a historic county of Scotland, today encompassing the Council Areas of Renfrew, East Renfrewshire, and Inverclyde, in the Strathclyde region of southwestern Scotland, where they held a family seat from early times and their first records appeared on the early census rolls taken by the early Kings of Scotland to determine the rate of taxation of their subjects.

STORRY

The surname Storry was first found in Northumberland where they were said to be descended from an ancient line of Viking settlers of knightly degree and with episcopal rank. Roughly translated from the Viking records the name means "dweller by large and rough water". They moved north into Renfrewshire, Scotland and acquired considerable estates. In Scotland, William Storie was a charter witness in Dundee in 1281. Walter Stori was canon of Aberdeen in 1320 and Adam Story was one of 'burgenses rure manentes' in Aberdeen, 1317. A few years later, in England, the Yorkshire Poll Tax Rolls of 1379 revealed Thomas Storre; Johannes Storre; and Roger Storre as all holding lands there at that time. "Storey is still among the most familiar of Yorkshire names, but it has become, of necessity, mixed with Storer, which also is well established in that county."

RANKIN

The age-old Hebrides islands and the west coast of Scotland are the ancestral home of the Rankin family. Their name comes from the personal name Randolph, with the addition of the diminutive suffix -kin. The surname Rankin was first found in Ayrshire (Gaelic: Siorrachd Inbhir Àir), formerly a county in the southwestern Strathclyde region of Scotland, that today makes up the Council Areas of South, East, and North Ayrshire. "There is a tradition of descent from one John, son of a knight called Jacob de Rankine, burgomaster of Ghent, who married a daughter of the head of the house of Keith, and became progenitor of the Rankines. This tradition is difficult to prove but was nevertheless authored by M. H. Rankin, Esq.

COCHRANE

The Strathclyde-Briton people of ancient Scotland were the first to use the name Cochran. The Cochran family lived in Renfrewshire, where they took on the name of the lands of Cochrane in the parish of Paisley, near Glasgow. This place name is of uncertain derivation, perhaps stemming from the Welsh word "coch," meaning "red." The surname Cochran was first found in Renfrewshire ... where the first record of the name was Waldeve de Coueran, who was witness to a charter issued by Dugal, son of Syfyn, to Walter Stewart, fifth Earl of Menteith, regarding several lands in Kintyre. William de Coughran of Lanark swore an oath of allegiance to King Edward I of England during his short conquest of Scotland in 1296. Walter Cochrane was the first record of the more popular spelling used today in 1262. His son William Cochrane, the second chief of the Clan, also rendered homage to King Edward I in 1296.

Some R1a Cochrans claim descent from this same Waldeve de Coueran (1215-1275). The documentary proof, however, has yet to be found. Still, it makes sense both geographically and genetically. And that they had familial connections to the Rankins and Sempills is also significant. But the lineage broke when the Cochran estates, and name, were inherited by Alexander Blair through his marriage to Elizabeth Cochrane, heir and daughter of William Cochrane (1548-1603). This Cochrane/Blair lineage is still extant in the form of Iain Alexander Douglas Blair Cochrane, 15th Earl of Dundonald. The earl has two sons and a daughter, so the legacy will continue into the future. The Blairs are of a different haplotype altogether so they can be ruled out as paternal ancestors to our group. Yet Cochrans might well have descended from a cadet branch of the original Lords Cochran. We can be reasonably certain that this issue is unlikely to be resolved through the written record. On the other hand, the genetic record is still available, and more SNP testing will get us closer. After all, we already know that the well-connected Sempills, Rankins, and Cochrans are (and were) connected through the Y chromosome. More said on that below.

Although I've just started my research on Hazelet (Haslett), everything I've seen to date points to English origins for the name. Clearly, the Hazelet ancestor came from the Scots, as does the entire YP5007 subclade. The first mystery is to determine how the name came into the lineage. Was it adopted? Did the name have other origins? Were they a collateral branch of a more familiar Scottish name?

Not only were these families connected in Renfrewshire, we find the names associated in Ulster. Glenn Hazelet, an admin along with myself at the Haslett DNA Project, has provided the following historical background.

The Faggs Manor Presbyterian Church was organized in 1739 as the (New) Londonderry Congregation of Fagg's Manor. The name was changed to Fagg's Manor Presbyterian Church in 1793. The significance of this is that both Hazlets and Cochrans were elders in the church so both families lived in the area.

Listed in the church records are:
James Rankin d. 1789, age 25
James Cochran: 1740 (first Cochran)
William Hazlet and Stephen Cochran listed as elders in 1786.



The YP4253 Subclade

Too few people have tested for this clade, but there might be a number of reasons for that. Names such as Cooley and Hackett lack the romance of the famous Scottish and Irish clans. Descendants may simply not be that curious because they've never seen their name listed among the great families. To borrow Gertrude Stein's remark about Oakland, California, "There's no there there." And it's entirely possible there are relatively few descendants to test. A later arrival to Britain might account for part of that. In other words, most carriers of YP4253 might still reside in Scandinavia. Indeed, if we study the "cousin" haplogroups, we find a great many Scandinavian names. Yet the jury is still out.

These names appear to triangulate in the northwest of England; hence my assertion of Wirral Viking descent, described above. And the Y-DNA for our testers is undeniably Scandinavian. So far, all data is pointing to the region around Liverpool and Wirral. But Viking or not, most etymological sources consistently give Old English origins for the YP4253 surnames. Judge for yourselves.


PREFIX/SUFFIX | OLD ENGLISH  | OLD NORSE      | MEANING
-field        | feld         | akr            | field
-ton (-don)   | tūn          | tún            | enclosure, village, town
-ley          | lēah, lēaġe  | n/a            | clearing
coo-          |              | kýr            | cow
haw-          | hægberie     | n/a            | hawthorn berry
hig-          | híeg         | hey            | hay
whit-         | hwīt         | hveiti (wheat) | white, (wheat)

The Old Norse meanings for tún include dwelling, farmstead, abode, enclosure, courtyard, (enclosed) field, hayfield, homefield, home meadow, (poetic) dwellings and precincts. And the Old Norse for wheat field is hveitiakr, i.e., whitacre. Interestingly, Derbyshire records show a marriage between William Cowley (c1714-) and Mary Whitacre. The couple had one son, John Cowley (c1738-). I find no further record for them in Derbyshire after about 1750. (Our John Cooley was born in England about 1738 and first appears in Virginia records in 1755.) Hackett is a little more difficult to break down. The online material I find suggests the name came from the Old English word haki meaning hooked. In any event, we have something like this,2


Cooley       | cow field
Hackett      | hooked (?)
Hawley       | haw field
Higdon       | hay enclosure
Whit(field)  | white/wheat (field)

As I mentioned above, triangulation is the tool for the pursuit, whether it's triangulating on geography, era, surnames, or genetics. Of course, all these surnames can be found across England. Genetics, then, is our biggest clue and the northwest of England is a relative hotbed for R1a haplogroups. So far, however, the DNA pickings are slim for YP4253, as are the known lineages. We've yet to find a Cooley tester with a known Cooley lineage back to England. And note that the Whitfield testers are so close an STR match to the descendants of John Cooley (c1738-1811) that we believe they derived from an NPE among the Cooleys in England, and not too many generations prior to their arrival in Virginia. But this is yet to be proved. Yet note that Derbyshire has villages named Cowley and Whitfield.

There are only two lineages we can discuss with certainty. This is far from proving the whole crew resided in the area, but it's a beginning. The lineages trace back as follows,


Hackett  | Derbyshire, 1747
Hawley   | Cheshire, 1790

That alone is certainly not enough evidence with which to draw any real conclusions. But we have several pieces coming together: name origins, genetics, history, geography, and the few lineages we can piece together.

I've pulled the entire project together, combining both the genealogical and SNP lineages, as a R1a-YP4248 hybrid tree. I've found that it helps to visualize how these two methods overlay one another.


Timeline Report for Haplogroup YP4248

I'm turning techie mode back on and would not feel slighted in the least should a reader or more choose to skip to the conclusion. But I'm equally happy to have company as I dive a bit deeper into this thoroughly interesting world.

As stated above, new SNP markers are present at the birth of specific men. In fact, the mutations randomly occur at the creation of the sperm, a sperm cell that had one in tens of millions of chances of entering one of mother's hundreds of eggs. Each, sperm and egg, will have characteristics that allow for a unique individual, even the child's sex, determined by the presence of a Y or an X chromosome in the sperm.

Once a new SNP is discovered, it's arbitrarily given a name, such as YP4248, the 4248th SNP named in the YP series at YFull.com. Whether we look at the Quick Tree or the full SNP tree, the descendants of any one haplogroup share a Most Recent Common Ancestor (MRCA), a specific man whether we know his name or not. Any advanced SNP tester can follow their SNP trail straight up the tree to whichever haplogroup they care about. In this example, all testers are shown going back to YP4248.

But I've already listed your SNPs for each of you. You need only your kit number to follow along. Keep in mind that each SNP represents the very man in which the SNP first appeared. And remember, all new SNPs are passed down through the male lineage to all succeeding generations beginning from their inception, and also that these SNP creations are, on average, spaced by about four generations. For example, the first list directly represents twelve individuals, who make up about one-quarter of a lineal descent of perhaps 48 men. Although we know the order of the haplogroups, the order of the members of the haplogroups will not be determined until the right tester comes along. (These entries, by the way, can be copied and pasted.)


Known Descendant SNPs of R1a-YP4248

KIT | SNPs | SNP LINEAGE BELOW YP4248
#MI17444 | 12 | YP4253 YP4254 YP4255 YP4257 A12124 FTB6689 FTB8950 FTB6983 FTB7334 FTB7682 FTB7956 FTB6655
#323704 | 8 | YP4253 YP4254 YP4255 YP4257 A12124 A12126 A12127 A12128
#57597 | 11 | YP4253 YP4254 YP4255 YP4257 YP4491 YP4492 YP4493 YP4494 A29946 A7498 A7497*
#N3690 | 9 | YP4253 YP4254 YP4255 YP4257 YP4491 YP4492 YP4493 YP4494 A7411
#558118 | 11 | YP4253 YP4254 YP4255 YP4257 YP4491 YP4492 YP4493 YP4494 Y76109 FTA84182 FTA84863
#651501 | 10 | YP4253 YP4254 YP4255 YP4257 YP4491 YP4492 YP4493 YP4494 FT140057 A17721
#573208 | 10 | YP4253 YP4254 YP4255 YP4257 YP4491 YP4492 YP4493 YP4494 FTA58904* FTA56664
#520597 | 10 | YP4253 YP4254 YP4255 YP4257 YP4491 YP4492 YP4493 YP4494 A14496 A14495
#910648 | 9 | YP4253 YP4254 YP4255 YP4257 YP4491 YP4492 YP4493 YP4494 FT144702*
#951881 | 13 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 FTC56584 FTC57682 FTC64925 FTC52730 FTC53593 FTC64953 FTC64954
#962459 | 10 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 BY27664 FTB60783 FTB61216 FTB59852
#343609 | 15 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 BY27664 BY27665 Y70911 Y71660 Y72774 Y75160 BY134046 Y79767 BY149694
#935042 | 15 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 BY27664 BY27665 Y96328 Y98299 FTC70073 3617477=CA MF222177 6361862=CA FTC68597
#B161280 | 14 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 BY27664 BY27665 Y96328 Y98299 BY226047 BY65753 BY69379 BY74965
#915713 | 17 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 BY30798 FT335052 FT335764 FT336598 FT336903 FT337379 FT337526 FT338291 FT338904 FT341599* FT334720
#183830 | 10 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 BY30798 BY30796 BY30797 Y128826
#IN114058 | 13 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 BY30798 BY30796 BY30797 Y108621 FT106926 Y112515 FTC22147
#IN57019 | 14 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 BY30798 BY30796 BY30797 Y108621 FT106926 Y112515 FT168230* FT170856
#IN96910 | 13 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 BY30798 BY30796 BY30797 Y108621 FT106926 Y112515 FT106724
#N23144 | 13 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 BY30798 BY30796 BY30797 Y108621 FT106926 Y112515 FT106724
#552162 | 14 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 FTA31692 FT34159 FTA35194 FTA86160 FTA31313 FTA31466 FTA29885 FTA29998
#149142 | 15 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 FTA31692 FT34159 FTA35194 FTA86160 FTA74193 FT290693 FTA75212 FTA75293 FTA76438
#772584 | 15 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 BY116184 FT389363 MF120379 FT390045 FT387814* FT387882* FT388324
#IN89936 | 16 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 FT407422 Y61144 Y61720 Y69015 Y76518 FT407989 FT407407 FT407449
#378638 | 20 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 FT407422 Y61144 Y61720 Y69015 Y76518 Y64331 Y80033 FTA26782 FTA26374 FTA26964 FTA26988 FTA26723
#378637 | 21 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 FT407422 Y61144 Y61720 Y69015 Y76518 Y64331 Y80033 FTA26782 FTA26374 Y67035 Y70195 FTA50409 FTA54389
#B5637 | 16 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 YP5242 YP5246 YP5247 FTB49459 FTB49825 FTB55341 FTB55589 FTB55647
#775878 | 14 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 YP5242 YP5246 YP5247 15960214=GT 2863654=CG 4115247=GA
#391280 | 17 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 YP5242 YP5246 YP5247 Y65968 FT123924* Y72633 FT123352 FT123647 BY80739
#943658 | 14 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 YP5242 YP5246 YP5247 FTA20728 BY165580 FTA52411
#855341 | 12 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 YP5242 YP5246 YP5247 FTA22447
#171069 | 19 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 YP5242 YP5246 YP5247 FTA22447 FTB28865 FTB29764 FTB29862 FTB29960 FT238436 FTB31291 F17858
#184037 | 17 | YP5007 Y37656 YP5008 YP5009 YP5010 YP5011 YP5244 YP5245 YP5242 YP5246 YP5247 FTA22447 FTA22181 FTA21889 FTA21949 FTA21965 FGC41633

Alternating colors distinguish between the haplogroups, which align with the timeline. The order of the SNPs within the groups, however, is not known. "Lead" SNPs, those for which the haplogroups are named, are arbitrarily chosen among the others.

The arithmetic calculations are straightforward.

Total SNPs: 447
Number Kits: 33
Avg SNPS per: 14
TMRCA: 550-774 CE

The average SNP mutation count of 14 for YP4248 is arrived at by dividing the total number of SNPs by the number of testers. Because the rate of mutation can be different for every tester, the total per lineage can vary considerably. Indeed, only five testers in the project have the "predicted" 14 SNPs. And I estimate that new markers appear about every four generations. I typically use 100 years to define that period. Others use 84 years. In other words, we merely need to multiply the average (14) by both factors to get a range. Moreover, and this is a very important point, the result needs to be subtracted from "the present," which is scientifically defined as the year 1950. And I can personally justify that figure because I happened to be born that year. All the inherited mutations I have were present on that date. It's not surprising that so many Y-DNA testers' births bracket that year. Hence the validity of the starting point of the proposed timeline.
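The arithmetic in the paragraph above can be sketched in a few lines of Python. The function name and its parameters are only for illustration; the 100- and 84-year factors, the 1950 "present," and the rounding of the average to a whole SNP all follow the description above.

```python
def tmrca_range(total_snps, kits, years_per_snp=(100, 84), present=1950):
    """Estimate a TMRCA range (in years CE) by simple SNP counting.

    The average SNP count is rounded to a whole number, multiplied by
    each years-per-SNP factor, and subtracted from the "present."
    """
    average = round(total_snps / kits)
    return tuple(present - average * years for years in years_per_snp)


# Project totals from the summary above: 447 SNPs across 33 kits.
earliest, latest = tmrca_range(447, 33)
print(f"TMRCA: {earliest}-{latest} CE")
```

With the project totals, the average rounds to 14, giving 1950 - 1400 = 550 and 1950 - 1176 = 774, the range shown in the summary.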

Let's use the same method for the Cochrans, all descended from the YP5242 haplogroup, a two-step subclade below YP5007.

1950 - ((32 SNPs / 7 kits) * 100 years) = c1493 CE

It's reasonable to estimate, with some confidence, that the seven men had a common ancestor living sometime in the 1500s in Scotland. Most of the group, however, presently trace their lineages only back to Ireland in the mid-1700s and later. Finding Cochran testers with lineages that can be documented into Scotland should tell us more.

And this brings us to the subject of personal variants (PVs), those SNPs present in an individual's SNP stream that have yet to be matched. Simply, once a matching tester comes along, a new subclade has been discovered. In other words, one tester is an individual. Two make a family, with shared genetic markers.

Keep in mind that there are 57 million genetic letters on the Y chromosome. A tester having, say, four PVs is a minuscule difference in the context of the whole. (Of course, there can be thousands of differences between very remote haplogroups.) And even without matches, this small handful of markers can tell us something. Obviously, a kit having zero PVs tells us that further testing by the individual will tell us nothing more. He's done. (But distant relatives might still yield new data.) Of course, that has not yet been the case with our Hackett tester, the very first advanced SNP tester in the YP4248 haplogroup. Hackett was found to have 18 PVs when he tested eight years ago. And those markers are still present in the SNP tree (with a 19th added for reasons I've forgotten). Here is a breakdown of his SNP lineage, the last line being his presently defined PVs. This is how haplogroups are discovered.


YP4248 FT106799 FT106962 Y16296 Y37657 YP3936 YP3938 YP4249 YP4250 YP4251 YP4256
YP4253 YP4254 YP4255 YP4257
A12124

PVs: A12126 A12127 A12128

YP4253, fifteen SNPs at that time, was defined when the first Cooley tester came along. (The fifteen were the present four SNPs of YP4253 combined with the eleven YP4248 SNPs.) And YP4248 was carved out of that once the first Cochran tested. The current state of Hackett's "side of the family" occurred when our Hawley tester matched with Hackett's previous PV, A12124. Using SNP counting, Hackett's remaining three SNPs might have emerged since about 1700, give or take by an unknown factor. (An average of one tester doesn't tell us much.) We've come a long way, but tests from a 5th cousin or so of each man can further break things down.
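The last step, where a new match lifts A12124 out of Hackett's PVs, can be sketched as follows. This is an illustration only: the function name is hypothetical, and the new tester's own private SNPs are left out because they don't affect the split.

```python
def split_personal_variants(personal_variants, new_tester_snps):
    """Split PVs into those a new tester shares and those still private.

    Shared PVs stop being personal: they now define a haplogroup.
    """
    shared = [snp for snp in personal_variants if snp in new_tester_snps]
    remaining = [snp for snp in personal_variants if snp not in new_tester_snps]
    return shared, remaining


# Hackett's PVs before the match, per the breakdown above.
hackett_pvs = ["A12124", "A12126", "A12127", "A12128"]

# The matching tester's SNPs below YP4248 (private SNPs omitted).
new_tester = {"YP4253", "YP4254", "YP4255", "YP4257", "A12124"}

shared, remaining = split_personal_variants(hackett_pvs, new_tester)
print("New haplogroup:", shared)
print("Remaining PVs:", remaining)
```

The shared marker becomes the A12124 haplogroup and the three left over are the PVs listed in the breakdown.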

But what of someone with ten PVs, such as the case with kit #915713? At first blush, we would expect a thousand-year lineage back to BY30798. By averaging the SNPs with his "clan mates" we arrive at something more like the 1400s. That's much better but not good enough for genealogy. We need to find a tester that matches on several of his PVs.

The object here, then, is to further break down PVs and move them up into the tree. The more PVs two or more individuals have, the more distantly related they are to one another. Looking at the Cochrans again, especially the last two listed kits, #171069 and #184037, we discover they're separated by 12 SNPs, an average of 6. That could indicate that their MRCA lived in about 1350, but the zero PVs for the third FTA22447 clan mate modifies that to 1650. We're getting there but the testers can take their lineages back to only the 19th century. Better resolution can be obtained, but they'll need to find a "cousin" with an MRCA who lived in the 1700s. In time, they might find a tester that knows the lineage back two hundred years earlier.
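The separation between kits #171069 and #184037 can be counted mechanically. The sketch below is mine, not a published tool: it walks the two SNP lists from their shared haplogroup, FTA22447, and counts what is left on each side once they diverge. The SNP names are the tails of the two kits' lists as I read them from the table above, and the 100-years-per-SNP factor is the one used earlier.

```python
def snps_since_split(path_a, path_b):
    """Count the SNPs on each lineage after the last shared haplogroup."""
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    return len(path_a) - shared, len(path_b) - shared


# Tails of the two lineages, starting at the shared FTA22447.
kit_171069 = ["FTA22447", "FTB28865", "FTB29764", "FTB29862",
              "FTB29960", "FT238436", "FTB31291", "F17858"]
kit_184037 = ["FTA22447", "FTA22181", "FTA21889", "FTA21949",
              "FTA21965", "FGC41633"]

split = snps_since_split(kit_171069, kit_184037)
average = sum(split) / len(split)
estimate = 1950 - average * 100  # 100 years per SNP, counted back from 1950
print(split, average, estimate)
```

The counts are 7 and 5, twelve in all, for an average of 6 and an MRCA estimate of about 1350.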

To reiterate, we started with a very large distance between most of the testers in YP5007, but matching SNPs found only old haplogroups. Closer related testers will get us closer to the era in which the names of the MRCAs are a bit more recent and more likely to fall within known lineages. We've come a long way, but have a long way to go. We can short cut some of this by testing folks in the realm of 5th cousins or so to our current testers. (In fact, I'm a 4th or 5th cousin to most of the Cooley testers.)

One additional point about the Cooleys and our cousin Higdon. We share the four SNPs of the YP4491 haplogroup. Using the SNP count method, it's estimated that the MRCA was born in about 1750, and that's really close to John Cooley's birth year. However, the Higdon EKA (earliest known ancestor) was born in 1657, a century earlier. We first need a second Higdon tester to confirm the SNP lineage, but Higdon's Y-111 (STRs should never be too heavily relied on) shows a significant genetic distance, reaffirming the idea that the YP4491 haplogroup might well have been fully formed in the early 1600s. If this is in fact the case, the Cooleys and Whitfields are apparently pretty much played out SNP-wise. We'll need to cast a wide net to find other Cooleys, Higdons, and others to better sort out the upper reaches of the lineage.


Conclusion

I'm certainly aware that I've introduced several concepts here, some new to many of you. But you don't need to understand it (that's why I'm here!). Still, I enjoy providing information to those who want to dig deeper and come to a working knowledge as to the hows and whys of it. And, of course, there's much more to discuss, including additional haplogrouping concepts, the lynchpin to understanding the whole process. Yes, it can get complex, but it's born from an extraordinarily simple fact: men pass 100% of their Y chromosome to their sons. There's only one pathway up through the male lineage, not the multitudes found in autosomal testing.

I've also attempted to illustrate how important other disciplines are to gaining the most from surname genetic genealogy, and not just genealogy and genetics but history and geography. Indeed, archaeology, population genetics, and paleogenetics contribute in a big way to the bigger picture, as demonstrated by this remarkable two-minute video by Eske Willerslev of the Natural History Museum of Denmark.

And I hope I've demonstrated how the lineages can intersect with haplogroups and provide distinct Y-DNA prints for your ancestors. Knowing those SNPs provides highly practical benefits for the genealogical efforts for you and for others not yet tested. To better explain this, here's one last important concept not yet realized in our project: anchor SNPs. For this, I need to turn to the Pettit-Mellowes Y-DNA Project.


Pettit Anchor SNPs




BY198412 is "anchored" to George Pettit, and F7174 is anchored to George's distant cousin, Samuel Pettit, Jr. In this case, we know their names, and we also know the dates and locations of the SNPs' arrival. This can be determined because their respective descendants are positive for those SNPs and none of the MRCAs' brothers' descendants have them. Simply, no one above or alongside the MRCAs (the brothers, cousins, etc.) had them. Therefore, the SNP mutations first occurred at the birth of George and Sam. Now that that's been established, anyone (or her brother) can prove descent by testing for the appropriate SNP and at a much lower cost ($18-$39) than the advanced NextGen sequencing (NGS), which looks at millions of positions and runs into the hundreds of dollars. In other words, someone testing positive for BY198412 is a descendant of George and someone positive for F7174 descends from Sam, Jr. Two caveats, however. First, a prospective tester needs to have a really good reason to believe he will be positive. Arbitrarily testing multiple SNPs can cost a needless bundle. And, secondly, a positive test should be followed up by at least one upstream SNP test. After all, DNA mutations are random. Any tester across the entire world-wide SNP tree could have randomly acquired a SNP normally found elsewhere. Better yet is to test 37 or more STRs. Now, it's true that STRs cannot make a tree but they can demonstrate just which ballpark the tester is playing in.

In short, testing helps you, me, your relatives (distant and close), and any researcher, genetic, historical, or genealogical who may be interested in what we've come up with. And, indeed, years ago Y-DNA testing disproved a disgracefully bad Cooley genealogy from the 1930s, one that generated unfounded myths among several unrelated Cooley families. (That was my first success.)

Finally, these reports prove, unlike your autosomal DNA, that your Y-DNA results hold no unique personal genetic data. All but the tiniest sliver of your Y is shared by thousands of people with roots going back more than a thousand years. Despite what you might know about DNA in general, most of it doesn't apply to the repeatedly cloned Y chromosome. And there's no health data. The SNPs we look at were once referred to as Junk DNA. It's the small percentage of genes, certainly not junk, that hold such indicators. And those 70 genes living in the Y are generally involved only in the regulation of male hormones. In short, there are no significant privacy concerns regarding Y-DNA. What you have in your Y is an archive of your paternal genetic lineage. I'm pleased you've allowed me to excavate, examine, and report on that archive.

As always, I'm more than happy to answer any questions about any of this. It's my "job."



1. Sempill, https://www.houseofnames.com/sempill-family-crest; Storry, https://www.houseofnames.com/storry-family-crest; Rankin, https://www.houseofnames.com/rankin-family-crest; Cochran, https://www.houseofnames.com/cochran-family-crest

2. https://old-engli.sh/dictionary.php, https://en.wiktionary.org/wiki/; https://www.vikingsofbjornstad.com/Old_Norse_Dictionary_E2N.shtm; https://www.etymonline.com/