Who are those Houston Gujus?

Gene Expression | By Razib Khan | February 15, 2011 2:38 AM

guj1.png

The figure above is a three-dimensional representation of principal components 1, 2, and 3, generated from a sample of Gujaratis from Houston and Chinese from Denver. When these two populations are pooled, the Chinese form a very homogeneous cluster: they don't vary much across the top three explanatory dimensions of genetic variance. In contrast, the Gujaratis do vary. This is not surprising. In the supplements of Reconstructing Indian population history it was notable that the Gujaratis tended to shake out into two distinct clusters in the PCAs. This is a finding you see over and over when you manipulate the HapMap Gujarati data set.

In reality, there aren't two equivalent clusters. Rather, there's one "tight" cluster, which I will label "Gujarati_B" from now on in my data set, and another cluster, "Gujarati_A," which really just consists of all the individuals who fall outside the Gujarati_B cluster. Even when compared to other South Asian populations these two distinct categories persist in the HapMap Gujaratis. Zack has already identified a major difference between the two clusters: Gujarati_A has some individuals with much more "West Eurasian" ancestry. To be more formal about this in the future, I simply assigned individuals in my merged data set to one of the two Gujarati clusters based on their position in the first two PCs.

Last night I ran ADMIXTURE from K = 2 to K = 10, with 75,000 SNPs. I also removed the Native American groups and added more European and East Asian samples from the HapMap. Below are some populations at K = 4:

guju2.png
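For readers who want to try the cluster assignment described above, here is a minimal sketch of one way it could be done. The post only says individuals were assigned by their position in the first two PCs, so the rule used here (a circle around the tight cluster, with a center and radius read off the plot by eye) is an assumption, as are the function name, the parameter names, and the made-up coordinates.

```python
import math


def assign_gujarati_clusters(pcs, core_center, core_radius):
    """Label each sample Gujarati_B (inside the tight core) or Gujarati_A.

    pcs maps a sample ID to its (PC1, PC2) coordinates. core_center and
    core_radius are hypothetical values chosen by eye from the PCA plot;
    they are not taken from the post.
    """
    labels = {}
    for sample_id, (pc1, pc2) in pcs.items():
        # Euclidean distance from the assumed center of the tight cluster.
        distance = math.hypot(pc1 - core_center[0], pc2 - core_center[1])
        labels[sample_id] = "Gujarati_B" if distance <= core_radius else "Gujarati_A"
    return labels


# Made-up coordinates, purely for illustration.
example_pcs = {
    "s1": (0.010, 0.002),
    "s2": (0.012, 0.001),
    "s3": (0.011, 0.003),
    "s4": (0.040, -0.020),
    "s5": (0.065, -0.035),
}
example_labels = assign_gujarati_clusters(
    example_pcs, core_center=(0.011, 0.002), core_radius=0.005
)
```

Defining Gujarati_A as "everything outside the core" mirrors the description in the text, where that group is a residual category rather than a cluster in its own right.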

Let's drill down to the level of individuals. Here are the Gujarati individuals, along with Sindhis, and my parents (Bengali). I've sorted by the "European" and then "South Asian" components (light blue and green respectively, while purple is modal in Papuans and red in East Asians):

Guj3.png
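The ordering used in the plot above can be sketched in a few lines. This assumes each individual's ancestry fractions have already been read from the ADMIXTURE output and that the components have been given names by hand, since ADMIXTURE itself does not label them. The descending sort direction, the function name, and the sample values are also assumptions for illustration.

```python
def sort_by_components(individuals, primary, secondary):
    """Order individuals by one ancestry component, breaking ties with another.

    individuals is a list of (sample_id, fractions) pairs, where fractions
    maps a hand-assigned component name to that individual's proportion.
    Sorting is descending on both keys, which is an assumption.
    """
    return sorted(
        individuals,
        key=lambda record: (record[1][primary], record[1][secondary]),
        reverse=True,
    )


# Made-up K = 4 fractions, purely for illustration.
example_individuals = [
    ("ind_a", {"European": 0.30, "South Asian": 0.68, "East Asian": 0.01, "Papuan": 0.01}),
    ("ind_b", {"European": 0.12, "South Asian": 0.85, "East Asian": 0.02, "Papuan": 0.01}),
    ("ind_c", {"European": 0.30, "South Asian": 0.65, "East Asian": 0.03, "Papuan": 0.02}),
    ("ind_d", {"European": 0.45, "South Asian": 0.53, "East Asian": 0.01, "Papuan": 0.01}),
]
example_order = [
    sample_id
    for sample_id, _ in sort_by_components(example_individuals, "European", "South Asian")
]
```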

The ADMIXTURE plots are in total alignment with the PCA. In the PCA, Gujarati_A individuals exhibit a spectrum of distances from the European cluster, and in the ADMIXTURE results you see the same. In contrast, Gujarati_B is relatively uniform. So what's going on? I will be posting something similar over at Sepia Mutiny soon. But my guess is that Gujarati_B are a subset of Patels. In other words, they're a genetically distinct jati. I suspect that Gujarati_A are a more diverse bunch drawn from a number of different jatis.

Does this matter? I believe it does. If Gujarati_B are a distinct ethno-social group which is a subset of Gujaratis, then they may not be as good a proxy for South Asian medical genetics as Gujarati_A. More concretely, Gujarati_B may carry rare disease alleles at relatively high frequency because they're an inbred clan. In contrast, while Gujarati_A may exhibit all the hallmarks of South Asian endogamy, if they comprise a larger number of different groups, then they'll have all sorts of different rare alleles. The ones they have in common may be more generally South Asian.
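The contrast between a spread-out Gujarati_A and a uniform Gujarati_B can also be checked numerically rather than by eye. The sketch below compares the standard deviation of one ancestry component across the two clusters. It is not the analysis behind the post; the function name and all of the numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import stdev


def component_spread(fractions, labels):
    """Return the sample standard deviation of a component within each cluster.

    fractions maps a sample ID to its proportion of one ancestry component;
    labels maps the same IDs to a cluster name. Each cluster needs at least
    two samples, because stdev is undefined for a single value.
    """
    by_cluster = defaultdict(list)
    for sample_id, value in fractions.items():
        by_cluster[labels[sample_id]].append(value)
    return {cluster: stdev(values) for cluster, values in by_cluster.items()}


# Made-up "European" fractions and cluster labels, purely for illustration.
example_fractions = {"s1": 0.10, "s2": 0.11, "s3": 0.09, "s4": 0.20, "s5": 0.35, "s6": 0.50}
example_cluster_labels = {
    "s1": "Gujarati_B", "s2": "Gujarati_B", "s3": "Gujarati_B",
    "s4": "Gujarati_A", "s5": "Gujarati_A", "s6": "Gujarati_A",
}
example_spread = component_spread(example_fractions, example_cluster_labels)
```

With real data, a much larger value for Gujarati_A than for Gujarati_B would be consistent with the pattern described in the text.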
