Michael Cooley's Genetic Genealogy Blog
10 February 2017

Whitfield and the Cooley DNA Project

Sometimes, things aren't crystal clear.


For some years, strong Y-DNA matches have been noted between the Cooleys of Stokes County, North Carolina, and a Whitfield clan. The connection has long been a mystery, but a descendant of William Whitfield (1751-c1835), FTDNA kit #520597, has stepped forward and taken the Big Y. He is a 100% match to the Cooley CF01 Y-STR modal. So far (there's more to come), and as suggested by the STRs, the Big Y results show nothing to distinguish him, genetically, from the other testers in the group. But read on.

The Big Y extracts genetic data from ten million points along the Y chromosome. Because the Y includes the male sex gene, only men have it, and because it doesn't have a female counterpart (nothing with which to recombine), it is passed down as a virtual clone from father to son in every generation, although the occasional mutation creeps in.

We now have four Big Y testers in group CF01. The first to test, a Hackett (kit #323704), has STR¹ values close enough to be included. The SNP² mutations, however, are sufficient to suggest the Cooley/Hackett MRCA (Most Recent Common Ancestor) may have lived 800 years ago. Two others, kits N3690 and 57597 (that's me!), share John Cooley (c1738-1811) of Stokes County, NC as the MRCA.

That two of John's descendants, each through a different son, have Big Y-tested is significant. It makes sense they'd share genetic artifacts passed down from John, and that's precisely the power in such tandem testing. My Y is a ninth-generation clone of John's; the other Big Y tester is seven generations removed from him. Because we match one another, we know we have John's Y chromosome, as well as the same surname. In the same manner that we can triangulate the surname, we can triangulate the DNA. What we both have, John had. And that includes four mutations: YP4491, YP4492, YP4493, and YP4494, known collectively as haplogroup YP4491. No one else shared these SNPs (not Hackett, not the more distantly situated Cochrans) until Whitfield came along. The question du jour is, of course, why are the Cooleys and Whitfields Y-genetically identical?
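The "what we both have, John had" reasoning is a set intersection, and a minimal sketch of it follows. The four YP449x names are taken from the paragraph above; the `PRIVATE-*` labels are hypothetical placeholders for mutations unique to one line, and the sketch assumes no SNP arose independently in both lines.

```python
# SNP names YP4491..YP4494 come from the article. The "PRIVATE-*" labels
# are hypothetical placeholders for mutations found in only one line.
SHARED = {"YP4491", "YP4492", "YP4493", "YP4494"}

descendant_a = SHARED | {"PRIVATE-A1"}  # tester descended through one son
descendant_b = SHARED | {"PRIVATE-B1"}  # tester descended through another son


def inferred_ancestral_snps(*testers):
    """Return the SNPs carried by every tester (set intersection).

    Assumes a SNP seen in all descendant lines was inherited from their
    common ancestor rather than arising separately in each line.
    """
    return set.intersection(*testers)


john = inferred_ancestral_snps(descendant_a, descendant_b)
print(sorted(john))
```

Anything left outside the intersection is a candidate private SNP for that one line, which is why testing a second descendant matters.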

The Pennsylvania Tell-Tale Branch

There's a third CF01 group that needs to be brought into the picture, those CF01 Cooleys who were almost certainly not descended from John. I refer to this group as the "Pennsylvania-born" or the Pennsylvania branch of the CF01 Cooleys. This group includes three descendants of William Henry Cooley (1797-1877) and a descendant of James Cooley (1809-c1872). Both men were born in Pennsylvania and lived near one another in Ohio and Illinois, until James moved his family to Jack County, Texas and became a virtual neighbor to a rather infamous family of Cooleys descended from the North Carolina branch.³ William and James were young enough to have been John's grandsons, but we can place none of John's sons in Pennsylvania. But more than that, the descendants of both men share STRs not found in the Stokes County Cooleys. The diagram in article 10 illustrates that the values for markers DYS576 and DYS464b may be the values that were ancestral to John. (See also the last diagram in article 22.) If that's true (that they have the ancestral STRs), they are of a collateral branch having an unknown degree of cousinship to John. And considering the Whitfields are genetically identical to the Stokes County Cooleys, with no suggestion of a significant cousinship, we can come up with a roughly sketched genealogy:



I'm not making a declarative statement, but this suggests that, somewhere along the line, a Cooley became a Whitfield. There are presently no genealogical clues as to how that happened, and there are (presently) virtually no genetic differences to fall back on, but let's look closer at the SNPs.

Mutation Rate

If SNP mutations occurred on a regularly scheduled basis, we could count the number of SNP differences in any two lines, multiply by the anticipated interval, and determine just when the MRCA lived. But all mutations are acquired randomly. Still, that doesn't stop statisticians. Looking at the 40,000-plus known SNP mutations, it's estimated that they occur in a lineage, on average, once every 144 years. (To illustrate the point, at that rate a one hundred SNP difference would separate two testers by 14,400 years, a ten SNP difference by 1,440 years.) Perhaps in the long haul, over several thousand years, that works well, but averages can't be used to make specific arguments, especially over a relatively small number of generations, even hundreds of years. But let's use this 144-year figure for the sake of argument. With that, at least, we can play with guessing about the four SNPs that had been considered unique to the CF01 Cooleys, those of the YP4491 haplogroup.
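The multiplication described above can be written out in a few lines. This is only the article's for-the-sake-of-argument arithmetic, not a dating method; the function name is made up for the illustration.

```python
YEARS_PER_SNP = 144  # average interval quoted in the article


def years_of_separation(snp_differences, years_per_snp=YEARS_PER_SNP):
    """Multiply a count of SNP differences by the assumed average interval."""
    return snp_differences * years_per_snp


for diff in (100, 10):
    print(f"{diff} SNP differences -> {years_of_separation(diff):,} years")
```

The result is only as good as the assumption that mutations arrive on schedule, which the following sections call into question.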

We'll start by imagining that the last of the four mutations occurred with John's own birth c1738. Using the static SNP timetable, the first event would have occurred about 400 years earlier, mid-14th century. That's a lot of time for separate lines to evolve, Cooley or otherwise. But Whitfield has all four SNPs, which provides a very tight schedule to work with. Accordingly, the MRCA would have lived within the 144 years prior to John's birth. Given that, it's possible that John Cooley and William Whitfield shared the same paternal grandfather.

(Note that the Pennsylvania branch has at least three of the YP4491 SNPs.)

Not So Fast!

However, my own Big Y results demonstrate that I not only have all the Cooley/Whitfield SNPs, I have four that (so far) are unique to my line, having emerged since John. There was insufficient time (just over 200 years) between John's birth and my own in 1950 for the four to have entered the lineage per the one-per-144-year schedule. So, what schedule did those SNPs observe? I've tested a 4th cousin, a man with whom I share a great-great-great grandfather, David Cooley (1815-1865), John's great-grandson. Although he has the YP4491 "control" SNP (no doubt he's a Cooley!), my cousin has none of the four. No new SNPs, then, had entered the lineage at the birth of John's son Edward, his son John, nor his son David. So, now I have four SNPs "born" sometime between the birth of my great-great grandfather, Greenbury Cooley (1844-1899), and my birth 106 years later. One SNP in each of four successive generations would be highly unusual, even improbable. It's more likely the four occurred together in one or two events. If that could be the case with my four personal SNPs, could it have been true for the four Cooley/Whitfield SNPs? Absolutely! And that simply means that, despite all the testing, we cannot yet come to any concrete age for the Cooley/Whitfield MRCA. Nevertheless, we have useful data. Moreover, as it turns out, Whitfield has two private SNPs (A14495 and A14496) that had emerged in his lineage since the creation of the YP4491 haplogroup, which we know was fully born by the time of John Cooley's birth c1738. This fits neatly into the "144-year plan."
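To put a rough number on how surprising four SNPs in 106 years would be under the 144-year average, here is a small sketch. The Poisson model is my assumption for the illustration, not something the testing companies report: it treats mutations as independent events arriving at a constant average rate.

```python
import math


def prob_at_least(k, expected):
    """P(X >= k) for a Poisson variable with the given mean (an assumed model)."""
    below = sum(
        math.exp(-expected) * expected**i / math.factorial(i) for i in range(k)
    )
    return 1.0 - below


YEARS_PER_SNP = 144            # average interval quoted in the article
window_years = 1950 - 1844     # Greenbury Cooley's birth to the author's birth
expected = window_years / YEARS_PER_SNP

p = prob_at_least(4, expected)
print(f"expected SNPs in window: {expected:.2f}")
print(f"chance of four or more under this model: {p:.4f}")
```

Under these assumptions the chance comes out well below one percent, which is another way of saying that an average rate is a poor guide over a handful of generations.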

Hope is Not Lost

A partial genetic solution to the mystery may yet be found. The Big Y also extracts nearly 500 STRs from the sample. Although the public display of the Cooley STR results shows two differences between my STR values and those of the other Big Y tester, there are 15 differences found among the 500. This is as opposed to 54 differences between myself and the Hackett tester. This "genetic distance" may give us a better idea as to when the Cooley/Whitfield MRCA lived: the smaller the number, the closer the relationship; the bigger the number, the more distant. FTDNA, however, does not report on these markers. Although I've written software that will extract the SNP values from the raw data file (called a BAM), I do not yet have enough data by which to recognize all 500 STRs. For that, we must turn to YFull.com. It takes the company a few weeks to report SNP data but several months for the STRs. Unless I can figure it out soon, we have quite a wait before we learn the genetic distance between the Whitfields and the Cooleys. (Keep in mind that, as with SNPs, the STR mutation rate is not predictable. We can look only at the overall trend.)
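Counting differences of the kind described above can be sketched as follows. The marker names are real Y-STR names, but the repeat values and the function are made up for the illustration; this simple count treats any mismatch as one difference and ignores how many repeats apart the two values are.

```python
def genetic_distance(a, b):
    """Count markers tested in both haplotypes whose repeat values differ."""
    shared_markers = a.keys() & b.keys()
    return sum(1 for marker in shared_markers if a[marker] != b[marker])


# Hypothetical repeat counts, for illustration only.
tester_1 = {"DYS576": 18, "DYS464b": 15, "DYS393": 13}
tester_2 = {"DYS576": 17, "DYS464b": 15, "DYS393": 13, "DYS390": 24}

print(genetic_distance(tester_1, tester_2))  # 1
```

Markers reported for only one tester are skipped rather than counted, since a missing value is not evidence of a difference.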

The Whitfields

Of course, this is genetic genealogy. The genetic side of it makes no sense without a genealogical view. To fully appreciate the data, we need to relate it to the known Whitfield genealogy, as I've done above, to the extent that I can, with the Pennsylvania branch of the CF01 Cooleys.

John Cooley first appears in Caroline County, Virginia in 1755. William Whitfield is first reliably recognized from his marriage record (14 December 1772 to Mary Toler) in nearby Goochland County, and he is later found in Person County, North Carolina, where a number of his progeny still reside. Although the migration patterns of John and William were similar, and they lived near one another in a couple of locations, there is nothing known that puts them in the same place at the same time. For now, at least, that's a dead end.

Four matching Whitfield Y-STR testers are found in the FTDNA database. Three of them are descended from William's grandson, Henry Edwin Whitfield (1819-1884). The lineage of the fourth is known only back to Sam P Whitfield (1852-1933). Using the information gathered so far, then, we have proved only Henry's Y-print. This is because it takes two distinct lineages from an individual to be reasonably assured of the results, which belonged, after all, to a long-deceased man. In other words, we may have at least three descendant testers from Henry, but only one attested descendant (Henry himself) from William. An active search is underway to locate descendants of one or more of Henry's male Whitfield cousins. Matches would demonstrate William Whitfield's bona fide Y-DNA print. Although what we have is likely to be the case anyway, the future hunt for the Cooley/Whitfield MRCA depends on establishing it as fact. There's work to be done on that front.

"It's the MRCA, Stupid"

It's really all about identifying the MRCA. He's the glue that binds collateral lineages together. By joining the Cooley and Whitfield lineages, we will have taken both lineages back a generation or more. In fact, anyone having any of the four YP4491 SNPs is going to share an MRCA with the Cooleys and Whitfields: the fewer shared, the older the relationship, and vice versa. If such a person, regardless of surname, can take their lineage back to 17th- or 18th-century England, then we've hit pay dirt. But even if the name of the MRCA is unknown, something will have been learned about the geographic origins of the Cooley/Whitfield lineage. And that's a start.

1 Short Tandem Repeats. For example, at the stated position, the sequence TTTC may be repeated 33 times, 34 times, etc.
2 Single Nucleotide Polymorphism is a single point mutation, a C, for example, miscopied as an A.
3 William Scott Cooley, former Texas ranger turned soon-to-be-dead gunslinger, was of this family.