Register for an account

X

Enter your name and email address below.

Your email address is used to log in and will not be shared or sold. Read our privacy policy.

X

Website access code

Enter your access code into the form field below.

If you are a Zinio, Nook, Kindle, Apple, or Google Play subscriber, you can enter your website access code to gain subscriber access. Your website access code is located in the upper right corner of the Table of Contents page of your digital edition.

Health

Is Daniel MacArthur 'desi'?

Gene ExpressionBy Razib KhanDecember 10, 2012 3:07 PM

Newsletter

Sign up for our email newsletter for the latest science news

My initial inclination in this post was to discuss a recent ordering snafu which resulted in many of my friends being quite peeved at 23andMe. But browsing through their new 'ancestry composition' feature I thought I had to discuss it first, because of some nerd-level intrigue. Though I agree with many of Dienekes concerns about this new feature, I have to admit that at least this method doesn't give out positively misleading results. For example, I had complained earlier that 'ancestry painting' gave literally crazy results when they weren't trivial. It said I was ~60 percent European, which makes some coherent sense in their non-optimal reference population set, but then stated that my daughter was >90 percent European. Since 23andMe did confirm she was 50% identical by descent with me these results didn't make sense; some readers suggested that there was a strong bias in their algorithms to assign ambiguous genomic segments to 'European' heritage (this was a problem for East Africans too). Here's my daughter's new chromosome painting:

daugpaint.png

One aspect of 23andMe's new ancestry composition feature is that it is very Eurocentric. But, most of the customers are white, and presumably the reference populations they used (which are from customers) are also white. Though there are plenty of public domain non-white data sets they could have used, I assume they'd prefer to eat their own data dog-food in this case. But that's really a minor gripe in the grand scheme of things. This is a huge upgrade from what came before. Now, it's not telling me, as a South Asian, very much. But, it's not telling me ludicrous things anymore either! But in regards to omission I am curious to know why this new feature rates my family as only ~3% East Asian, when other analyses put us in the 10-15% range. The problem with very high values is that South Asians often have some residual 'eastern' signal, which I suspect is not real admixture, but is an artifact. Nevertheless, northeast Indians, including Bengalis, often have genuine East Asia admixture. On PCA plots my family is shifted considerably toward East Asians. The signal they are picking up probably isn't noise. Almost every apportionment of East Asian ancestry I've seen for my family yields a greater value for my mother, and that holds here. It's just that the values are implausibly low. In any case, that's not the strangest thing I saw. I was clicking around people who I had "shared" genomes with, and I stumbled upon this:

dgm.png

As you can guess from the screenshot

this is Daniel MacArthur's profile.

And according to this ~25% of chromosome 10 is South Asian! On first blush this seemed totally nonsensical to me, so I clicked around other profiles of people of similar Northern European background...and I didn't see anything equivalent. What to do? It's going to take more evidence than this to shake my prior assumptions, so I downloaded Dr. MacArthur's genotype. Then I merged it with three HapMap populations, the Utah whites (CEU), the Gujaratis (GIH), and the Chinese from Denver (CHD). The last was basically a control. I pulled out chromosome 10. I also added Dan's wife Ilana to the data set, since I believe she got typed with the same Illumina chip, and is of similar ethnic background (i.e., very white). It is important to note that only 28,000 SNPs remained in the data set. But usually 10,000 is more than sufficient on SNP data for model-based clustering with inter-continental scale variation. I did two things: 1) I ran ADMIXTURE at K = 3, unsupervised 2) I ran an MDS, which visualized the genetic variation in multiple dimensions Before I go on, I will state what I found: these methods supported the inference from 23andMe, on chromosome 10 Dr. MacArthur seems to have an affinity with South Asians (i.e., this is his 'curry chromosome'). Here are the average (median) values in tabular format, with MacArthur and his wife presented for comparison.

You probably want a distribution. Out of the non-founder CEU sample none went above 20% South Asian. Though it did surprise me that a few were that high, making it more plausible to me that MacArthur's results on chromosome 10 were a fluke:

ADMIXTURE results for chromosome 10

K 1K 2K 3

CEU0.040.020.93

GIH0.870.050.08

CHD0.010.970.01

Daniel MacArthur0.290.070.64

Ilana Fisher0.010.060.94

dis.png

And here's the MDS with the two largest dimensions:

Screenshot-from-2012-12-10-003824.png

inter.png

Again, it's evident that this chromosome 10 is shifted toward South Asians. If I had more time right now what I'd do is probably get that specific chromosomal segment, phase it, and then compare it to various South Asian populations. But I don't have time now, so I went and checked out the results from the Interpretome. I cranked up the settings to reduce the noise, and so that it would only spit out the most robust and significant results. As you can see, again chromosome 10 comes up as the one which isn't quite like the others. Is there is a plausible explanation for this? Perhaps Dr. MacArthur can call up a helpful relative? From what recall his parents are immigrants from the United Kingdom, and it isn't unheard of that white Britons do have South Asian ancestry which dates back to the 19th century. Though to be totally honest I'm rather agnostic about all this right now. This genotype has been "out" for years now, so how is it that no one has noticed this peculiarity??? Perhaps the issue is that everyone was looking at the genome wide average, and it just doesn't rise to the level of notice? What I really want to do is look at the distribution of all chromosomes and see how Daniel MacArthur's chromosome 10 then stacks up. It might be a random act of nature yet.

browndan.png

Also, I guess I should add that at ~1.5% South Asian that would be consistent with one of MacArthur's great-great-great-great grandparents being Indian. Assuming 25 year generation times that puts them in the mid-19th century. Of course, at such a low proportion the variance is going to be high, so it is quite possible that you need to push the real date of admixture one generation back, or one generation forward.

3 Free Articles Left

Want it all? Get unlimited access when you subscribe.

Subscribe

Already a subscriber? Register or Log In

Want unlimited access?

Subscribe today and save 70%

Subscribe

Already a subscriber? Register or Log In