Michael Cooley's Genetic Genealogy Blog
12 May 2019

DYS464 and the Herbert Y-DNA Project

Chris Herbert has recently taken me on to assist him with the Herbert Y-DNA Project. I've rearranged the results, as described below. The new template should help testers better determine how they relate to other Herberts. But I'm not a Herbert genealogist and recommend that you confer with Chris regarding any such questions. I can, however, offer a lot of information about genetic genealogy, particularly regarding the Y chromosome -- the focus of this study -- and Big Y testing.

I came to the project in a roundabout way. A distant Pettit cousin, a Herbert, asked for advice about his own Y testing. Looking online, I saw that Chris was advertising for help and I volunteered. That simple. In the coming months and years I'll do just as I do with the other twelve projects I maintain: make the rounds about once a month, grouping any new results that might have come in. This is perhaps the most essential job for an admin since further conversation and analysis cannot reasonably proceed without a degree of organization. Naturally, I can't have both eyes (or even one eye) in all places. If anything looks out of kilter, please feel free to bring it to my attention.

The public Herbert results are at https://www.familytreedna.com/public/Herbert?iframe=yresults. Chris had them arranged by haplogroup, which is a great start. But once I sorted through things, I ended up with four groups that descend from haplogroup I. Remarkably, I was able to quickly tell them apart by the four copies of the marker known as DYS464. But first, a brief discussion of Short Tandem Repeats (STRs).

Each number on the results page signifies how many times a specific string of genetic letters (A, C, T, and G) repeats at the given position. For example, DYS449 is composed of repeats of the nucleotide sequence TTTC. A father's gamete cells (sperm) can vary slightly from one cell to another in the allele (value) they carry, usually through the addition or deletion of one repeat of the given sequence. For example, all but one of the Cooleys in Cooley Group CF01, my dad included, have 33 repeats of TTTC at DYS449. I'm the exception. There's no doubt that the vast majority of the gametes that were swimming toward my mother's egg on that eventful day in 1949 were DYS449=33. But one of them, having 34 repeats (not 32 or 35), won the race. This graphic, from yhrd.org, shows the distribution of the DYS449 allele frequencies as found in the general population.
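To make the idea of a repeat count concrete, here is a minimal Python sketch that counts the longest unbroken run of a motif such as TTTC in a stretch of sequence. It is only a toy: the flanking letters are invented for illustration, real markers (DYS449 included) can have a more complicated structure than one unbroken run, and this is not how a testing lab actually calls alleles.

```python
def count_tandem_repeats(sequence, motif):
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Keep stepping forward one motif length while the motif repeats.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# Hypothetical flanking sequence; only the TTTC motif and the counts
# 33 and 34 come from the text above.
LEFT_FLANK, RIGHT_FLANK = "GGAC", "AAGG"
father = LEFT_FLANK + "TTTC" * 33 + RIGHT_FLANK
son = LEFT_FLANK + "TTTC" * 34 + RIGHT_FLANK

print(count_tandem_repeats(father, "TTTC"))  # 33
print(count_tandem_repeats(son, "TTTC"))     # 34
```

The one-repeat gain between the two toy sequences mirrors the 33 to 34 change described above.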

But the really essential thing to understand about STRs is that they're fast-mutating. Although a son's Y chromosome is going to be virtually identical in this regard to the father's, variations frequently show up. But even more interesting is that the repeat value can go either up or down. In other words, the value of only one STR marker doesn't necessarily say much. This is where genetic distance (the total number of allele differences between testers) comes into play. Overall, men who are related to the nth degree are estimated to have X or so differences. ("Up or down," "doesn't necessarily," "nth," "estimated," "or so," and "differences" are much needed cautionary phrases.) There's a degree of art involved in STR interpretation.
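As a rough sketch of how genetic distance can be tallied, the Python below sums the absolute difference in repeat count at every marker two testers share. That "stepwise" counting rule is an assumption made for illustration; testing companies may count some markers differently. The kit names and allele values are hypothetical.

```python
def genetic_distance(haplotype_a, haplotype_b):
    """Sum of absolute repeat-count differences over the markers both kits tested.

    Assumes a simple stepwise model: a marker that differs by two repeats
    counts as a distance of 2.
    """
    shared_markers = haplotype_a.keys() & haplotype_b.keys()
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared_markers)


# Hypothetical kits and values, for illustration only.
kit_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10}
kit_b = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 11}
kit_c = {"DYS393": 13, "DYS390": 22, "DYS19": 14, "DYS391": 10}

print(genetic_distance(kit_a, kit_b))  # 2 (two markers, one step each)
print(genetic_distance(kit_a, kit_c))  # 2 (one marker, two steps)
```

Both pairs come out at the same distance even though the underlying differences are not alike, which is one reason a single number has to be read with caution.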

The four copies of DYS464 are among the more volatile STRs. Because the marker has four copies and is easy to eyeball, it's very useful when it's found to have been reasonably stable in a lineage. Group 01 of the Strother DNA Project is essentially composed of two versions of DYS464: 15-15-17-17, found only in Struthers, and 15-15-15-17, with some variation, found among the others. This pattern has been supported through advanced Big Y testing.
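Sorting kits by their DYS464 values, as described above, might be sketched in Python like this. The kit labels are hypothetical; only the two signatures, 15-15-17-17 and 15-15-15-17, come from the text. The sketch assumes the four copies are compared in ascending order, and it groups on exact matches only, so the "some variation" mentioned above would still need a human eye.

```python
from collections import defaultdict


def group_by_dys464(kits):
    """Group kit IDs by their four DYS464 values, written as a sorted signature."""
    groups = defaultdict(list)
    for kit_id, copies in kits.items():
        # Sort so the same four values always yield the same signature.
        signature = "-".join(str(value) for value in sorted(copies))
        groups[signature].append(kit_id)
    return dict(groups)


# Hypothetical kits, for illustration only.
kits = {
    "kit-A": (15, 15, 17, 17),
    "kit-B": (15, 15, 15, 17),
    "kit-C": (17, 15, 17, 15),  # same values as kit-A, listed in another order
    "kit-D": (15, 15, 15, 17),
}

for signature, members in group_by_dys464(kits).items():
    print(signature, members)
```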

But that's certainly nothing to rely on and it's absolutely not the whole story. The four "I" subgroups might break down further with closer study. For example, I based HF03 on DYS464=11-13-15-15. But there's significant variation among its five members: kits 90604 and N20302 have a genetic distance of 4 in just the first 12 markers. These men are definitely not related within the genealogical timeframe. On the other hand, HF02 is remarkably cohesive. So is HF05, whose members are apparent descendants of Étienne Hébert (1630-1670).

For now, I'll leave HF03 as it is. But there's a far better way of classifying results -- through Single Nucleotide Polymorphisms (SNPs). SNPs are single point mutations, an A to a C, for example. As a genetic genealogist, I look for reliable SNPs -- those that are found outside STRs and other unstable regions of the Y chromosome, such as the centromere. Stable SNPs are permanent landmarks that guide us deep into the lineage. Some are known to be more than 300,000 years old.

A15905, for which two HF02 testers are positive, is such a marker. It's one of ten SNPs that now make up the I-A15905 haplogroup. Because SNPs don't pop into a lineage like clockwork, we can only guesstimate the age of the haplogroup. But we might expect that ten SNPs emerged over the course of about 800 to 1200 years. Its "parent" haplogroup is I-Y7198, from which descend Herberts, McNeills, Walkers, Egglestons, Leemings, Busbeys, Weylands, Wheatlands, Slatterys, and more. In other words, I-Y7198 is a large and diverse haplogroup seemingly of British origins. Having evolved in the course of about a thousand years, I-A15905 is not likely to consist of only Herberts. But at least one of the ten markers is certainly Herbert-specific. With more testing among the group, it's possible to isolate the Y-DNA signature for the family. And it doesn't necessarily mean taking the expensive Big Y test. SNPs can be tested for individually at Yseq.net for $18 each.
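The arithmetic behind that guesstimate can be laid out in a few lines of Python. The per-SNP interval here is simply back-calculated from the figures above (ten SNPs over roughly 800 to 1200 years); it is not an established mutation rate, and the three-SNP branch in the usage lines is hypothetical.

```python
def years_per_snp(total_years_low, total_years_high, snp_count):
    """Average interval between SNPs implied by an age range and a SNP count."""
    return total_years_low / snp_count, total_years_high / snp_count


def age_range(snp_count, interval_low, interval_high):
    """Rough age range for a block of SNPs, given a low and high interval per SNP."""
    return snp_count * interval_low, snp_count * interval_high


# Figures from the text: ten SNPs over about 800 to 1200 years.
low, high = years_per_snp(800, 1200, 10)
print(low, high)  # 80.0 120.0

# Hypothetical: a sub-branch defined by three SNPs, using the same interval.
print(age_range(3, low, high))  # (240.0, 360.0)
```

Because SNPs don't arrive like clockwork, any range produced this way is an average at best and should be read as loosely as the original estimate.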

Here's another example from the Cooley DNA Project. Group CF02 undoubtedly came out of Tring, Hertfordshire, England. We have found SNP markers that parse the group into at least four lineages: the descendants of Benjamin Cooley, of Samuell Coley, of Nehemiah Cooley, and of a group I call the Goshen (New York) Cooleys. There are good genealogical records for the Cooleys in Hertfordshire from the late 16th and early 17th centuries. It's entirely possible that DNA, when compared to the historic record, will more fully sort out the genealogy for these families.

And so it goes. Any scientific or sociological project (and genetic genealogy is both) requires data. The more data, the better the results. I congratulate the Herbert Y-DNA Project for what it has accomplished so far. But genetic genealogy is still in its infancy. There's much more to come!