As the initial release of the '4466 South Irish TMRCA Case Study' I built two workbooks. The actual release includes (336) members who have tested CTS4466+ and/or match the STR signature of those who are 4466+. In addition, I added (143) members with a single mutation from this 4466 STR signature of (336) members so you can see which new marker alleles may test 4466+ in the future.
1.0014 4466 South Irish TMRCA Case Study
excel 2010 version
1.0014 4466 South Irish TMRCA Case Study 97-03
excel 97-2003 version
(332) 4466+ and members who match the 4466+ STR signature
Includes the newly discovered 4466/5714/3974/8358/L270 SNP branch
Family histories are being updated, as well as newly discovered SNPs. Periodically STR datasets will be updated.
See QuickStart Guide - for the basics. This is a long-term project for research and analysis. You can quickly learn the analysis process without understanding the science behind it. We have an excellent group of admins with very deep skills to help you get up to speed. There is a great deal of knowledge to share and training to provide. What you see in the research workbooks is just the tip of an iceberg of work. You are welcome to join our 'TMRCA Case Study' forum at: http://groups.yahoo.com/neo/groups/R-L21_TMRCA_CaseStudies/info to ask questions. I will add you immediately to the members list. The forum is currently restricted only because so many discoveries are being made that we need to limit public access to them for now. The result of this current work will be a published paper.
A summary of the research work includes:
The identification of members who meet the 4466+ STR signature;
Identification of the members who test positive for CTS4466, CTS5714, CTS3974, CTS8358, CTS7141, F2517, Z454, Pf112, L247, L270, and L1312 (Private);
Separating those members into sub branches, then building phylogenetic trees and sub branches to view the members' close relationships;
Calculation of the entire base haplotype and sub branch TMRCA (time to the most recent common ancestor);
Viewing members' close matches by mutation count within the sub branches.
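To make the last item concrete, here is a minimal sketch of counting mutations between members and listing the closest matches. It is an illustration only, not the workbook's own formulas: the kit names, marker set, and allele values are made up, and the count assumes a simple stepwise model in which each one-repeat difference on a marker is one mutation.

```python
from itertools import combinations

# Hypothetical haplotypes: marker name -> allele (repeat count).
# Kit names and allele values are invented for illustration.
haplotypes = {
    "kit_A": {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11},
    "kit_B": {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 11},
    "kit_C": {"DYS393": 13, "DYS390": 24, "DYS19": 15, "DYS391": 10},
}

def mutation_count(a, b):
    """Sum of absolute allele differences over the markers both kits tested.

    ASSUMPTION: stepwise counting, one mutation per repeat of difference.
    """
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)

def close_matches(haps, max_mutations):
    """Return (kit, kit, mutations) for pairs within max_mutations, closest first."""
    pairs = []
    for (k1, h1), (k2, h2) in combinations(sorted(haps.items()), 2):
        d = mutation_count(h1, h2)
        if d <= max_mutations:
            pairs.append((k1, k2, d))
    return sorted(pairs, key=lambda p: p[2])

for k1, k2, d in close_matches(haplotypes, max_mutations=2):
    print(k1, k2, d)
```

In practice the comparison would be run only within a sub branch, since a low mutation count between members of different SNP branches can be coincidental.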
An iterative approach to studying the 4466 South Irish base haplotype:
When I first studied the South Irish base haplotype we worked with Dr. Nordtvedt's STR signature. There was no absolute way of knowing how many mutations the haplotype included in any direction. In fact, building any phylogenetic tree for the haplotype was a guessing game until the CTS4466+ SNP was found. I found that even a single misplaced yDNA result would skew the tree, sometimes dramatically. The TMRCA calculations generally showed that more closely related yDNA needed to be added with the correct STR formula.
Since then I have taken a conservative approach and include validated data only. I have used the list of 4466+ members to build the range for the 4466+ STR signature. The research group includes those who have tested 4466+ and those who meet the range qualifications. The phylogenetic tree is then 100% validated with quality data; however, it is missing the yDNA needed to fill in the gaps. That yDNA will come from those who haven't tested yet, whose new marker alleles will widen the 4466+ STR signature while increasing the number of yDNA results in the tree.
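The range-building step can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the procedure used in the workbooks: the haplotypes are invented, the signature is taken to be the per-marker minimum and maximum allele seen among confirmed 4466+ kits, and "a single mutation from the signature" is interpreted as exactly one marker sitting one repeat outside that range.

```python
def signature_range(confirmed):
    """Per-marker (min, max) allele range among confirmed SNP-positive kits."""
    markers = set.intersection(*(set(h) for h in confirmed))
    return {
        m: (min(h[m] for h in confirmed), max(h[m] for h in confirmed))
        for m in markers
    }

def classify(hap, sig):
    """Label a kit relative to the signature range.

    ASSUMPTION: 'single mutation' means one marker, one repeat outside the range.
    """
    outside = [
        m for m, (lo, hi) in sig.items()
        if m not in hap or not lo <= hap[m] <= hi
    ]
    if not outside:
        return "in signature"
    if len(outside) == 1 and outside[0] in hap:
        lo, hi = sig[outside[0]]
        if hap[outside[0]] in (lo - 1, hi + 1):
            return "single mutation from signature"
    return "outside"

# Hypothetical confirmed 4466+ kits (values invented for illustration).
confirmed = [
    {"DYS393": 13, "DYS390": 24, "DYS19": 14},
    {"DYS393": 13, "DYS390": 25, "DYS19": 14},
]
sig = signature_range(confirmed)

print(classify({"DYS393": 13, "DYS390": 24, "DYS19": 14}, sig))
print(classify({"DYS393": 13, "DYS390": 26, "DYS19": 14}, sig))
print(classify({"DYS393": 12, "DYS390": 27, "DYS19": 14}, sig))
```

Because the range is derived from confirmed kits only, it widens each time a kit with a new marker allele tests 4466+, which is why the one-step group is worth tracking.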
When you open the main research workbook, 1.0 4466 South Irish TMRCA Case Study release above, you'll see 336 members in the '1. yDNA tree', the '2. PT add phylogenetic graphic' phylogenetic tree version next to the member yDNA data, and '3. TMRCA worksheets' for the entire base haplotype (3. TMRCA Base Haplo 336 4466sig) and sub branches (3. TMRCA Sub L270). The TMRCA calculations need work because many more members need to be added with their up-to-date SNP results. Be patient, we'll get there in time. What you will see is your current relationship to others in your sub branches. Of course, the upper and lower sub branch order will change in time. At first this will feel uncomfortable: 'how can your relationships change when you still have the same STRs and SNPs?' It's because as each new member is added, the relationships change with respect to each other. The sub branches will become stable as soon as the combination of yDNA results starts to truly represent the tree. This can be seen with the 4466-T2-C sub branch; see '3. TMRCA Sub 6 75606-270685' in column BA that includes the 4466-T2-C label. I have built at least 12 full phylogenetic trees for the 4466 South Irish and the 4466-T2-C members have always grouped together. Other sub branches change with every new tree as more yDNA is added.
What is a TMRCA Calculation and why is it necessary?
In addition to analyzing members' STRs and SNPs by sub branch, calculating a TMRCA for the group is very useful for understanding the distance or closeness of the relationships among those in the sub branches.
Why do TMRCA Calculations seem to create such heated debates?
I believe there are two major reasons. The first is that the calculations may be complex and difficult to understand for those who do not have the math background. The second is that people who have the knowledge to discover and/or use TMRCA calculations may differ in their opinions.
The former often have the strongest feelings on the subject which has caused much controversy in the Yahoo forums where members join to work together on yDNA issues. This is very unfortunate because the negative feelings appear to color the conversation more than the substance of the calculation itself.
When members who have great knowledge about TMRCA calculations have disagreements with their peers, it's often the battle of Titans with everyone wanting to get out of their way. Principal scientists may respectfully disagree on their own versions of the TMRCA calculations but understand how key they are in their own work.
I use the TMRCA calculations of Dr. Anatole Klyosov because he includes a confidence level with a margin of error that provides critical information to me: are the members in the sub branch close enough to have a viable TMRCA calculation? If not, I need to add more yDNA results to fill in the gaps or increase the quality of the data to match the research group requirements.
Using TMRCA Calculations:
One of the principal errors in using TMRCA calculations is to simply gather a group of yDNA results and apply a TMRCA formula. In fact, you first need to build the yDNA results into a phylogenetic tree to make sure there is a relationship in the data list. You can't add or remove yDNA data that is inconvenient. Then you apply your favorite TMRCA formula.
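As a rough illustration of what "applying a TMRCA formula" to an already validated sub branch involves, the sketch below uses a simple linear estimate: average mutations per marker from the base haplotype, divided by a mutation rate. It is not Dr. Klyosov's published calculation and makes no correction for back mutations. The mutation rate, the years per generation, the haplotypes, and the counting-only margin of error are all assumptions chosen for the example.

```python
import math

# ASSUMPTION: placeholder average mutation rate per marker per generation.
ASSUMED_RATE_PER_MARKER = 0.002
# ASSUMPTION: conventional generation length in years.
ASSUMED_YEARS_PER_GENERATION = 25

def count_mutations(haplotypes, base):
    """Total stepwise mutations of all haplotypes away from the base haplotype."""
    return sum(abs(h[m] - base[m]) for h in haplotypes for m in base)

def tmrca_years(haplotypes, base,
                rate=ASSUMED_RATE_PER_MARKER,
                years_per_gen=ASSUMED_YEARS_PER_GENERATION):
    """Return (estimate, margin) in years for a sub branch.

    The margin reflects counting error only (1 / sqrt(number of mutations));
    it ignores uncertainty in the mutation rate itself.
    """
    n_mut = count_mutations(haplotypes, base)
    if n_mut == 0:
        raise ValueError("no mutations observed; the estimate is undefined")
    per_marker = n_mut / (len(haplotypes) * len(base))
    years = per_marker / rate * years_per_gen
    return years, years / math.sqrt(n_mut)

# Hypothetical base haplotype and sub branch (values invented for illustration).
base = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
sub_branch = [
    {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11},
    {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 11},
    {"DYS393": 13, "DYS390": 24, "DYS19": 15, "DYS391": 11},
    {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10},
]

years, margin = tmrca_years(sub_branch, base)
print(f"TMRCA about {years:.0f} years, plus or minus {margin:.0f}")
```

With only four mutations the margin is half the estimate, which shows why a sparse sub branch does not give a viable calculation until more yDNA results are added.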
Phylogenetic tree sub branches for STRs and SNPs:
All the yDNA STRs in the world fit into a single phylogenetic tree. Random SNPs occur which are inherited by descendants. Immediate relatives in the same generation do not inherit these SNPs. When groups split and migrate in different directions, you can track each group through the number of mutations in its STRs and its unique SNPs.
In analyzing a base haplotype you initially identify its STR signature and range, as well as its SNPs. You gather the yDNA data from members who meet your criteria; however, before building them into a phylogenetic tree you first need to separate them into SNP sub branches. Each SNP sub branch gets its own phylogenetic sub tree, with the remaining members staying in the main branch until their unique SNPs are found.
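The separation step can be sketched as a simple grouping. This is an illustration under assumptions, not the workbook procedure: the kit names and SNP results are invented, and a member is assigned to the first listed sub branch SNP they test positive for, with everyone else staying in the main branch.

```python
from collections import defaultdict

def split_by_snp(members, branch_snps):
    """Group kits into SNP sub branches.

    members: kit name -> set of SNPs the kit tested positive for.
    branch_snps: sub branch SNPs to check, most specific first (an assumption).
    Kits with none of these SNPs stay in the "main" branch.
    """
    branches = defaultdict(list)
    for kit, positives in sorted(members.items()):
        label = next((snp for snp in branch_snps if snp in positives), "main")
        branches[label].append(kit)
    return dict(branches)

# Hypothetical members (kit names and results invented for illustration).
members = {
    "kit_1": {"CTS4466", "L270"},
    "kit_2": {"CTS4466"},
    "kit_3": {"CTS4466", "L270"},
    "kit_4": set(),  # matches the STR signature but has no SNP results yet
}

print(split_by_snp(members, ["L270"]))
```

This sketch uses SNP results alone; placing an untested member in a sub branch because they meet its STR range would be an additional manual step.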
In this initial research workbook, the L270+ SNP has been found in 3 members, with a 4th member meeting the STR range requirements. You'll see this sub branch in its own phylogenetic tree on page '1. yDNA tree' on lines 4 through 7. The rest of the yDNA is found in lines 8 through 339.
I'll walk through the interpretation of the L270 sub branch including building a sub branch TMRCA worksheet, calculation of the TMRCA, interpretation of the TMRCA and adding family history to help build a common history.
Walking through the L270 sub branch analysis: See QuickStart Guide - for the basics