One baby, alone on a PCA island

Gene Expression
By Razib Khan
Apr 25, 2012 5:40 AM



A week ago I reported that according to 23andMe I'm 40% Asian, and she is 8% Asian (in the future if I say "she" without explanation, you know of whom I speak). Obviously something is off here. The situation resolved itself when I tuned my parameters and increased my sampled populations in Interpretome. By now I've already done the estimates of recombination on the chromosomes which came together to produce her, and the realized value of 8 percent instead of 20 percent "Asian" simply cannot be due to a particular set of unlikely crossing-over events. From what I can gather, it seems like ancestry painting should be viewed as a qualitative rather than a quantitative assessment.

This sounds really strange when you are given percentages, but the results are strange, and too often obviously wrong in their specific values. Here's an admixture plot which shows more realistically informative values:

I've run several admixture plots already with my daughter, and one thing that seems clear to me is that she received more than her "fair share" of East Asian ancestry from me. By this, I mean that I usually come out as about 15 percent or so East Asian, so the expectation for her is half of that, 7.5 percent. My daughter seems to consistently be more than 7.5 percent East Asian. This could be some sort of bias in the method, but it seems just as likely that it's the natural outcome of sampling variance. I don't have that much East Asian to go around, so it isn't surprising if there's a large error in my transmission. The rest checks out as you'd expect. There are a few ancestral components where her mother and I overlap in small portions (e.g., a "West Asian" one which spans Central Europe to North India), so it isn't always so easy to see where to draw the "50:50" line.

But one thing that I want to emphasize is that these plots don't show you "real" ancestral components. There's no such thing. Populations and ancestries are ultimately reducible to genetic variation. These visualizations, and the components generated by their hypotheses, reduce a set of information that humans can't read into a human-readable format. If the argument outlined in Reconstructing Indian History is correct then the "Gujarati B" ancestral element in this plot is actually a stabilized admixture between a West Eurasian component and a very diverged South Eurasian one. It is therefore just as accurate, and historically more informative, to state that my daughter is ~20 percent South Eurasian, ~70 percent West Eurasian, and ~10 percent East Eurasian.

When I read ADMIXTURE bar plots I try hard (and do not always succeed) to remember that they are telling me relative relationships with excellent precision, but they are not telling me absolute truths. By modulating the populations sampled or changing random seeds one can obtain radically different results. From this we should not conclude that reality is a fiction. Rather, our methods are incomplete and imprecise mappings upon reality. All that said, it is generally difficult to distort the rough topology of relationships in these plots so far that East Eurasians come out genetically closer to Africans than they are to West Eurasians. The details may be twisted and stretched, but the general outline of relations will remain.
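
To put the "fair share" point above in concrete terms, here is a toy simulation of how much of one parent's minority ancestry a child might inherit. It is only a sketch under loud assumptions: the function name `simulate_child_fractions`, the default of 80 independently inherited blocks, and the random placement of ancestry across blocks are all hypothetical choices for illustration, not estimates of real recombination and not anything computed from the data in this post.

```python
import random
import statistics


def simulate_child_fractions(parent_fraction, n_blocks=80, n_children=10_000, seed=0):
    """Toy model of ancestry transmission from a single parent.

    ASSUMPTIONS (illustrative only): the parent's genome is split into
    `n_blocks` independently inherited blocks, each with two copies, and
    `parent_fraction` of all block copies carry the ancestry of interest.
    Returns, for each simulated child, the share of the child's whole
    genome that carries that ancestry via this parent.
    """
    rng = random.Random(seed)

    # Lay the tagged ancestry out at random over the parent's 2 * n_blocks copies.
    n_slots = 2 * n_blocks
    n_tagged = round(parent_fraction * n_slots)
    slots = [1] * n_tagged + [0] * (n_slots - n_tagged)
    rng.shuffle(slots)
    pairs = [(slots[2 * i], slots[2 * i + 1]) for i in range(n_blocks)]

    fractions = []
    for _ in range(n_children):
        # The child receives one of the two copies of every block.
        inherited = sum(rng.choice(pair) for pair in pairs)
        # This parent supplies half of the child's genome, hence the extra / 2.
        fractions.append(inherited / n_blocks / 2)
    return fractions


fractions = simulate_child_fractions(parent_fraction=0.15)
print(f"mean child fraction: {statistics.mean(fractions):.3f}")
print(f"standard deviation:  {statistics.stdev(fractions):.3f}")
```

In this toy model the average child lands at half the parent's value, while individual children scatter around it. The width of that scatter depends heavily on the assumed block count, so it should be read as an illustration of the idea rather than as a real estimate.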

23andMe has a PCA where they project you upon the HGDP data set variation. The north-south axis is Eurasia vs. Africa, and west-east is Europe vs. East Asia. My daughter is in green. She's about halfway between her parents, somewhere in the Central/South Asian cluster. This plot seems to be much more robust to what you throw at it than ancestry painting. People are where they "should be." I suspect that's because the PCA methods require fewer markers. But frankly I wish they would give you more options in terms of what you could see. For example, both South Asians and Oceanians are rendered as linear combinations of the variance components dominated by Africans, Europeans, and East Asians. This is not optimal. Going further down the PC dimensions would almost certainly shake out dimensions that are informative for South Asians and Oceanians. But you can click on the regions, and get a PCA plot which places you within your geographic context.

This turns out to be useless for my daughter. She's shoehorned into a cluster where she's closer to South Asian populations than the Balochi samples? I presume the issue here is that she's being projected upon South Asian variation, which works for half her genome. But her European ancestry is a lot less informative here, and a lot of the variance in the plot is taken up by the inbred and distinctive Kalash. I really hope that 23andMe improves this feature over the summer; it can't be that hard. I don't think you need to recompute the PCA, but if you did, PCA doesn't take up nearly as much horsepower as hypothesis-based inference or chromosomal ancestry assignments. Naturally I wanted to give it a shot. So I took the data set which I used above in ADMIXTURE and ran it through a PCA. With the Gujarati data set from the HapMap the South Asian component was more fully fleshed out. But perhaps more importantly I discarded African, Amerindian, and Oceanian populations. Basically my daughter is Eurasian, and I wanted to flesh out Eurasian variation.
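
The general idea of computing a PCA on a chosen reference panel and then placing one individual on it can be sketched in a few lines of NumPy. This is not the pipeline used for the plots in this post, nor 23andMe's method. The data are synthetic, and the helper names `fit_pca` and `project`, the two made-up populations, and the panel sizes are all assumptions for illustration.

```python
import numpy as np


def fit_pca(genotypes, n_components=2):
    """Return the per-SNP mean and top principal axes of a samples x SNPs matrix."""
    mean = genotypes.mean(axis=0)
    centered = genotypes - mean
    # Rows of vt are the principal axes, ordered by variance explained.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(samples, mean, axes):
    """Place one sample (or a matrix of samples) on previously fitted axes."""
    return (samples - mean) @ axes.T


rng = np.random.default_rng(0)
n_snps = 500

# SYNTHETIC DATA: hypothetical allele frequencies for two reference populations.
freq_a = rng.uniform(0.05, 0.95, n_snps)
freq_b = rng.uniform(0.05, 0.95, n_snps)

# Genotypes are counts of one allele (0, 1, or 2) per SNP, 30 people per population.
pop_a = rng.binomial(2, freq_a, size=(30, n_snps)).astype(float)
pop_b = rng.binomial(2, freq_b, size=(30, n_snps)).astype(float)
reference = np.vstack([pop_a, pop_b])

# Fit the axes on the reference panel only.
mean, axes = fit_pca(reference)

# A simulated admixed individual: one chromosome copy drawn from each population.
child = (rng.binomial(1, freq_a) + rng.binomial(1, freq_b)).astype(float)

child_coords = project(child, mean, axes)
centroid_a = project(pop_a, mean, axes).mean(axis=0)
centroid_b = project(pop_b, mean, axes).mean(axis=0)

print("population A centroid:", centroid_a)
print("population B centroid:", centroid_b)
print("admixed individual:   ", child_coords)
```

Because the axes are fitted only on the reference panel, swapping which populations go into `reference` changes what the leading components describe, which is the lever being pulled when a panel is restricted to Eurasian samples. In this synthetic setup the admixed individual should fall between the two population centroids on the first component.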

Obviously the fact that my daughter is out there "alone" is a function of lack of sampling of much of West Asia. I suspect that her position on the PCA is similar to a Turkic Iranian population: mostly West Asian, but with some East Asian ancestry. Of course, position on the PCA here brings together two very distinct types of individuals: someone who is West Asian with an East Asian component, and someone who is a synthesis of South Asian and Northern European, with an East Asian component. When you are young there often comes a day when you ask your parents "Where am I from? Who are my people?" At least in the genetic sense my daughter's generation will be robbed of such mystery, or delivered from confusion, depending on how you look at it. For the past few decades it has been chic to have Native American ancestry, at least purported. Genotyping can now answer whether this ancestry rises to the level of detectability. By and large I think this is a good thing, but your mileage may vary. Unfortunately my daughter is only one type of "Indian."

