GEN: Michael Cooley's Genetic Genealogy Blog
20 June 2017
Big Y STRs for Cooley Group CF02
To date, three members of Cooley group CF02 have taken the Big Y test:
two descendants of early American immigrant Benjamin Cooley (1614-1684), and
a descendant of Nehemiah Cooley (c1685-1759), both born in Tring,
Hertfordshire. There's an advantage in having two or more descendants of the
same person test their Y chromosomes: the matching DNA could only have come
from one person — the MRCA (Most Recent Common Ancestor). Matching
markers, then, provide a DNA print for the ancestor. This is true for both
Short Tandem Repeats (STRs) and Single Nucleotide Polymorphisms (SNPs).
There's recent news on both fronts.
We've known for some time that the descendants of Benjamin Cooley have
two SNPs, A12022 and A12024, that are not found in other CF02 testers.
YFull.com has recently finished their analysis of the Big Y for kit #559256,
the descendant of Nehemiah Cooley, uncovering a third mismatched SNP,
Y23835. This new discovery has been confirmed by two other non-Benjamin
test subjects, both negative for the SNP. The SNP, however, hasn't altered
the basic CF02 SNP tree structure. Instead, it has merely added an
additional marker to the A12022 SNP block:
In addition to peeking at ten million nucleotide bases, the Big Y
extracts up to 500 STRs. Whereas SNPs are single-point mutations, an "A"
becoming a "G," for example, STR mutations alter the count of a set of
nucleotide bases. In other words, 13 sets (or repeats) of AGAA might
become 12 or even 15 repeats in subsequent generations. Perhaps it's simply
because so many more elements are involved in STRs (as opposed to the single
base in a SNP mutation) that there's a larger degree of randomness. In any
case, STRs tend to be somewhat flaky and maintain consistency only to a
point. It stands to reason, then, that the more STRs that are sequenced,
the greater the number of discovered mutations. For example, the two
Benjamin Big Y testers have only four differences among the 37 STRs tested
at FTDNA, but 22 differences (per YFull's count, 29 by my count) among the
500 Big Y STRs.1
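The gap between the two counts above (22 versus 29) is what you would expect if one method counts each differing marker once and the other sums the size of each difference, as the footnote describes. Here is a minimal sketch of both tallies. The function names and the repeat values are made up for illustration; they are not the real Big Y results, and I have not confirmed that this is how YFull counts.

```python
def genetic_distance(a, b):
    """Sum of absolute per-marker differences (a two-step change counts as 2)."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))


def differing_markers(a, b):
    """Number of markers whose values differ (a two-step change counts as 1)."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical repeat counts for five markers, NOT real tester data.
tester_a = [13, 24, 14, 19, 37]
tester_b = [13, 25, 14, 17, 38]

print(genetic_distance(tester_a, tester_b))   # 0 + 1 + 0 + 2 + 1 = 4
print(differing_markers(tester_a, tester_b))  # three markers differ = 3
```

The two numbers agree only when every differing marker is off by exactly one repeat, so the summed figure can never be smaller than the simple count.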
But counting the genetic distance (GD) between any two testers isn't
really the way to go. For one thing, the comparisons become rather complex
when dealing with multiple testers. Instead, a modal haplotype is
determined: the set of values found to be most common among the testers
in the group. Each group member is then compared to that modal.
For example, among the five testers for marker DYS458 (line 13 in the
next table), the value of 19 is most common, and 19 is, therefore, the modal
value (pink column) at that position. We can readily see that kit
#559256 is a GD of 2 for DYS458.
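The modal comparison can be sketched in a few lines of Python. Only the DYS458 modal of 19 and the GD of 2 for kit #559256 come from the discussion above. The other kit names, the second marker, and the individual repeat values (including the assumption that #559256 reads 17 rather than 21) are hypothetical stand-ins.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common value at each marker position.

    Ties go to the value seen first, which is an arbitrary choice
    made for this sketch.
    """
    return [Counter(values).most_common(1)[0][0] for values in zip(*haplotypes)]


def distance_from_modal(haplotype, modal):
    """Sum of absolute per-marker differences from the modal."""
    return sum(abs(v - m) for v, m in zip(haplotype, modal))


# Columns: DYS458, plus one made-up second marker. Values are hypothetical.
testers = {
    "kit_1": [19, 13],
    "kit_2": [19, 13],
    "kit_3": [19, 14],
    "kit_4": [18, 13],
    "kit_559256": [17, 13],
}

modal = modal_haplotype(list(testers.values()))
distances = {kit: distance_from_modal(h, modal) for kit, h in testers.items()}

print(modal)                    # [19, 13]
print(distances["kit_559256"])  # 2
```

Comparing everyone to a single modal keeps the work proportional to the number of testers, where pairwise comparisons grow much faster as the group gets larger.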
Any one tester in a group, even if genealogically proved to be of the
group, can have a higher than usual GD, especially when looking at a smaller
number of STR markers. A higher marker count can provide more accurate
averages. The 500 Big Y STRs, then, can be very useful when trying to
determine an approximate relationship between two testers. This appears to
have worked out well with the three Big Y testers in CF02.
Remember that the Nehemiah tester lacks three SNPs that are found to
belong to Benjamin's descendants. This means that the MRCA for both groups
lived much earlier and that Nehemiah's lineage branched off first. Note
also that the modal is skewed toward Benjamin's own haplotype because most
of the group members are his descendants. This means that Nehemiah's GD
should be higher than that for Benjamin's descendants. And that's just what
we found: both Benjamin testers (the first two columns) have a GD of 18 from
the modal, but the Nehemiah tester has a GD of 27.
Let's take it a step further. Benjamin was born 336 years before the
present era (standardized to be 1950). The average interval between
mutations, then, for his descendants' 18 mutations is just under 19 years.
Multiply 19 by Nehemiah's GD of 27 and we have about 513 years to the MRCA, or roughly
the year 1440. This could well be an exaggerated figure, but it's a start,
and it works well with the three-SNP difference between the Benjamins and
non-Benjamins.
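The back-of-the-envelope arithmetic above can be written out as follows. The variable names are mine, and this is only the rough estimate described in the text, not a formal TMRCA calculation. It also shows how much the result moves depending on whether the interval is rounded to 19 years before multiplying.

```python
PRESENT = 1950          # standardized "present" used in the text
BENJAMIN_BIRTH = 1614
BENJAMIN_GD = 18        # GD of each Benjamin tester from the modal
NEHEMIAH_GD = 27        # GD of the Nehemiah tester from the modal

years_since_benjamin = PRESENT - BENJAMIN_BIRTH            # 336
years_per_mutation = years_since_benjamin / BENJAMIN_GD    # about 18.67

# Unrounded interval: about 504 years, or roughly the year 1446.
years_to_mrca = years_per_mutation * NEHEMIAH_GD

# Interval rounded to 19 years first: 513 years, or the year 1437.
years_to_mrca_rounded = round(years_per_mutation) * NEHEMIAH_GD

print(round(years_to_mrca), PRESENT - round(years_to_mrca))
print(years_to_mrca_rounded, PRESENT - years_to_mrca_rounded)
```

Either way the estimate lands in the mid-fifteenth century, which is as much precision as this method can support.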
CDY Redux
I'd mentioned in article 38 that a pattern is emerging with the two CDY
markers. The values of 37/38 appear to indicate a lack of the three Benjamin
SNPs. So far, that has been borne out in four testers. There are a total of
eight testers in CF02 who match this pattern. And there's circumstantial
evidence in their
genealogies that some of the lineages (except for Samuell and Nehemiah)
originated in Goshen (Orange County), New York. I hope to have all of these
tested by the end of summer.
Most FTDNA customers have tested for 37 STR markers. Indeed, I feel
there's little advantage in doing 67 markers, and I've seen no evidence that
111 markers is much better. Naturally, there's a large difference between
37 and 500 markers. Still, FTDNA chose those 37 because they tend to
display the most variety. (In other words, they've been shown to be a bit
more fickle than the others.) Let's look at them in respect to the
non-Benjamins, paying attention only to the area that shows genetic
variation, loci 13-32.
There's one huge caveat that needs to be noted at this point. Each of
these ancestors has only one tester. Any number of these mutations could
have been acquired in the generations between the birth of the ancestor and
that of the tester. Subsequent testing can alter the above table
dramatically.
But if we take this at face value, Samuell and Nehemiah were not only
distantly related to one another; each also appears to have been only
distantly related to the other six testers listed above. Of those, Thaddeus and
his father Jabez are believed to have gone west from the Goshen area.
Abraham and Thadeus were brothers. They went to Virginia but are also
thought to have been from Goshen. In any event, the GD among these six is
very low. But here's the kicker: with the exception of the CDY markers, the
STRs for the last six look very Benjamin-like.
<speculation>
What can this mean? STR mutations are sometimes associated with SNP
mutations. This is the reason FTDNA can make predictions about haplogroups
without testing any SNPs. I'm neither a chemist nor a geneticist, but I
suppose it's possible the three A12022 SNP mutations caused a disturbance
in the area where the CDY markers reside. And this could have been a
one-time event occurring not long before Benjamin was born. It would
explain several things: that Benjamin and the ancestor of these "other"
Cooleys were closely related to one another but less so to Samuell and
Nehemiah; and that the "others" were not descended from Benjamin and, in
fact, might have had an immigrant ancestor all their own.
Abraham has long been said to have been born in England. I discounted
that once we saw that his STRs, and those of his brother Thadeus, are
Benjamin-like. But what if the Goshen Cooleys were of a family that
immigrated separately, perhaps in the mid- to late-seventeenth century? And
what if their early records were lost, perhaps simply because the family
broke up early on, some going west and others south? Perhaps that's why
we've been unable to tie any of them together. That's a lot of speculation.
But if the DNA evidence holds up, we could have something like this:
</speculation>
The more data we have, the better this picture can be developed. I'd first
encourage all CF02 testers to test for the following SNPs: A12020, A12022,
A12024, and Y23835. They can be had at FTDNA for $39 each and at yseq.net
for $17.50 each. We also need more testers for Samuell and Nehemiah so that
we can more accurately determine the genetic distance between what is now
appearing to be four distinct groups: the Benjamins, the Samuells, the
Nehemiahs, and the hypothetical Goshen, NY group.
I'm always available for questions.
1 Genetic Distance (GD) is determined by summing the absolute difference
(-2 becomes +2) at each marker. Although I haven't confirmed this with
YFull, it looks to me as though they simply counted the number of markers
that differ.