The Pith: Afro-Indians are mostly African, with a substantial Indian minority ancestry. The latter is disproportionately female mediated. It also seems that that ancestry is more northwest Indian, and that natural selection has been operating upon them outside of the African environment.

Along the western coast of South Asia, from Makran in southwest Pakistan, down to the Konkan coast of southwest India, there are isolated communities of Afro-Indians. They are called Siddis or Habshi. Their African origin is clear in their physical appearance, as well as in aspects of their folk customs which tie them back to Sub-Saharan Africa. Nevertheless, they have assimilated to many Indian cultural traits. They generally speak the local language, and practice Islam, Hinduism, or Roman Catholic Christianity (in that order in proportion).

How and why did the Siddis arrive in India? The earliest date for their arrival almost certainly must be bounded by the period when Indo-Islamic polities rose to prominence in the early second millennium. The cosmopolitan melange of the armies of the Muslim warlords included diverse groups of Africans, some of whom took power, and established their own self-conscious Afro-Indian dynasties, set apart from the Turkish, Afghan, Persian, and Arab inflected statelets. Were these the sources of the modern Siddi communities? The oral history of the Siddi of the western coast of South Asia suggests not. In fact the geographical concentration of these Afro-Indian tribes along the Arabian sea fringe is indicative of different historical actors: the Portuguese.

In much of Asia, out to China, the role of Africans was very different from that in the New World. They were objects purchased for elite consumption, not production. They served at court, guarded the harem, etc. Lowland Asia had no need for imported labor, as there was human stock aplenty. Whereas in much of the New World black African slaves were critical cogs in the capitalist system of production, in Asia, as in the Arab world outside of a few areas such as southern Iraq, they were signals of luxurious consumption by the high and mighty (this was in vogue at European courts for a period as well).

Two new papers published yesterday in the American Journal of Human Genetics examine the genetics of the Siddi of India with an eye toward elucidating the details of their historical ethnogenesis. Though the papers overlap to a great extent, there are subtle differences which make them complementary. Shah et al. use a far thicker set of markers, while Narang et al. look at many more populations, but because they removed SNPs which don't span all of their populations their marker set is much thinner. Let's review the papers in turn.

Indian Siddis: African Descendants with Indian Admixture:

The Siddis (Afro-Indians) are a tribal population whose members live in coastal Karnataka, Gujarat, and in some parts of Andhra Pradesh. Historical records indicate that the Portuguese brought the Siddis to India from Africa about 300–500 years ago; however, there is little information about their more precise ancestral origins. Here, we perform a genome-wide survey to understand the population history of the Siddis. Using hundreds of thousands of autosomal markers, we show that they have inherited ancestry from Africans, Indians, and possibly Europeans (Portuguese). Additionally, analyses of the uniparental (Y-chromosomal and mitochondrial DNA) markers indicate that the Siddis trace their ancestry to Bantu speakers from sub-Saharan Africa. We estimate that the admixture between the African ancestors of the Siddis and neighboring South Asian groups probably occurred in the past eight generations (∼200 years ago), consistent with historical records.

The major value-add of this paper is an estimate of the time of admixture with Indians. I'll get to that, but let's look at the phylogenetic relationships really quickly:

The PCA and admixture estimate are perfectly consistent. The Siddis are more African than not, but they are clearly admixed with the Indian populations. To obtain a more fine-grained understanding the authors also looked at uniparental lineages. Note the striking discordance between the maternal mtDNA and paternal Y ancestral estimates. And more curiously, note how much closer the autosomal estimates, a proxy for total ancestry, are to the paternal lineage quanta. I think there's a rather good explanation for what's going on: the transport of slaves from Africa was strongly male-biased. These African-born males assimilated into the native Afro-Indian community, which had a strong local Indian component in the early years via women who had married in. But once a significant Siddi community had developed it assimilated new arrivals, who were male, and who beefed up the African quanta of autosomal and Y chromosomal ancestry, but not the mtDNA. As in Argentina, the matriline of the Siddis is a shadow of the initial generations, when the boundaries between the Afro-Indians and locals were more permeable. And that initial generation is likely to have been somewhat recent, as the authors estimate that the average date of admixture was ~8 generations before the present, with a standard error of 1 generation. This comes rather close to falsifying the proposition that the Siddis derive in the main from the first generations of Indo-Islamic arrivals. Rather, the Siddis seem more likely to date to the Indian ocean trade in human beings which post-dates the arrival of the Portuguese, as suggested in their oral history. It is important to remember that Omani Arabs and others were also involved in this trade, but between the 16th and 18th centuries the Portuguese were uniquely placed to transport Africans from their East African strong-points to the fortifications on the west coast of India. The manner in which they estimated this admixture event is rather straightforward.
Geographically distinct populations have their own unique genetic variants. If two individuals from very distinct populations have a child, each passes on a single strand out of the two they carry (granting recombination's confounding of the two parental strands). That means that the offspring are going to have two homologous chromosomes which are reflective of very different ancestral histories. To give a concrete example, if someone had an Indian parent and an African parent, then each of their DNA strands would have a sequence of genetic variants strongly associated with the ancestry of the parent from which that strand was passed. That is why first generation mixed-race individuals have very high rates of heterozygosity and few runs of homozygosity; their paired strands are very unlikely to have recent common ancestry. This also implies that in a first generation population of mixed-race individuals you'll see a whole lot of linkage disequilibrium (LD). This means that markers x, y, and z, associated with population 1, are likely to be found on the same DNA strands, while markers a, b, and c, associated with population 2, are going to be found on other DNA strands. Therefore, you'll get long haplotypes, sets of distinctive markers across genes, indicative of the shared demographic history of the two parent populations.
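To put rough numbers on the heterozygosity point above, here is a minimal sketch in Python. The allele frequencies (0.9 and 0.1) are made up purely for illustration and do not come from either paper; the function names are mine. Within a single randomly mating population the expected heterozygosity at a SNP is 2p(1-p), while a first generation hybrid draws one strand from each population, so the chance of being heterozygous is the chance that the two draws differ.

```python
def within_population_heterozygosity(p):
    """Expected heterozygosity at a SNP with allele frequency p (random mating)."""
    return 2 * p * (1 - p)


def first_generation_heterozygosity(p_pop1, p_pop2):
    """Expected heterozygosity when one strand comes from each population.

    The individual is heterozygous when the allele drawn from population 1
    differs from the allele drawn from population 2.
    """
    return p_pop1 * (1 - p_pop2) + p_pop2 * (1 - p_pop1)


# Hypothetical frequencies of the same allele in two diverged populations.
p_pop1, p_pop2 = 0.9, 0.1

print("within pop 1:", within_population_heterozygosity(p_pop1))
print("within pop 2:", within_population_heterozygosity(p_pop2))
print("first generation hybrid:", first_generation_heterozygosity(p_pop1, p_pop2))
```

With these toy frequencies the hybrid's expected heterozygosity comes out at roughly 0.82, against roughly 0.18 within either parental population. When the two populations have the same frequency the two formulas agree, which is just another way of saying that the excess is driven by the frequency difference between the parents' populations.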

But I stipulated the first generation, because over time LD will decay due to genetic recombination. The schematic to the left illustrates what's going on. Recall that during meiosis the parental chromosomes segregate and assort, and haploid gametes are formed which transmit the single strands to the offspring. But this process is not always without incident. In particular, the parents' distinctive strands can break and recombine to form a new haplotype on the strand level. For example, say your mother has one strand which is maternal and another that is paternal. Through recombination she may transmit to her offspring a strand which is 2/3 maternal and 1/3 paternal in reference to her own parents. Therefore, in the first generation the hybrids have a perfect association between ancestry across single strands, but recombination will break apart these associations. First generation Afro-Indians might transmit a strand which is 25% African and 75% Indian to their offspring. Over the generations this mixing and matching will break apart the associations generated through admixture. If one assumes that this rate of recombination is constant, then the extent of linkage disequilibrium and the length of haplotype blocks can give us a sense of time since admixture. This method is relatively powerful if the admixture was recent, as over the generations the extent of LD will asymptotically approach the baseline one might expect without an admixture event. In other words, there is precision for events near in time, but relatively little for ancient ones. As noted in the paper, the Uyghur population exhibits a signature of an admixture event ~2,000 years before the present, while the African American population exhibits admixture on the order of hundreds of years. One of the authors of Shah et al. is David Reich, who was coauthor on a paper which famously (to readers of this weblog!)
posited that South Asians are an ancient admixture between "Ancestral North Indians" (ANI) and "Ancestral South Indians" (ASI). This event is too ancient for LD methods to peg a date, at least the ones they use here. The Siddi resemble New World African populations in the date of their admixture event, but their sex bias is very different. In the New World the maternal lineages are overwhelmingly African, while the paternal lineages are more European (though some African groups have Amerindian paternal lineages). I think this tells us something about the peculiarities of the Siddi community in India. Interestingly, I think that they may resemble Ashkenazi Jews and Roma in this tendency, with the paternal lineage being more associated with their cultural and physically salient characteristics, and exogenous admixture occurring through the female lineages. Finally, in the analysis of the uniparental lineages they show that there seems to be a clear association between the Bantu people of Africa and the Siddi, and that the admixture events were unidirectional insofar as the nearby Indian groups don't have African admixture. These samples were from Gujarat and Karnataka, and because the Siddi tend to be Muslim while their neighbors are likely to be Hindu, I think we should be careful not to generalize too much. An analysis of the HGDP shows non-trivial African admixture among some South Asian groups to the north and west. I would assume that this is a touch older, and dates back to West Asian groups which were somewhat admixed, but it makes sense that Pakistani Muslims are more likely to be able to assimilate another Muslim population, exotic though it may be. One of the Pakistanis I analyzed privately exhibited a clear African ancestral signal which they were not able to explain, so it may be a part of the genetic background of many South Asian Muslims, though not Hindus.
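The LD-based dating logic described above can be sketched with a toy model. This is an illustration under stated assumptions, not the procedure used in Shah et al.: I assume a single pulse of admixture, after which the admixture LD between two markers separated by a genetic distance of d Morgans decays as exp(-n * d), where n is the number of generations since admixture. The 25 years per generation is what the abstract's "eight generations (∼200 years ago)" implies. The function names and the synthetic data are mine.

```python
import math

YEARS_PER_GENERATION = 25  # implied by 8 generations being roughly 200 years


def expected_admixture_ld(distance_morgans, generations, initial_ld=1.0):
    """Admixture LD left between two markers after `generations` of recombination."""
    return initial_ld * math.exp(-generations * distance_morgans)


def estimate_generations(distances, ld_values):
    """Fit log(LD) against genetic distance; the slope is minus the generations."""
    logs = [math.log(v) for v in ld_values]
    n = len(distances)
    mean_x = sum(distances) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, logs))
    sxx = sum((x - mean_x) ** 2 for x in distances)
    return -sxy / sxx


# Synthetic, noise-free decay curves sampled from 0.5 to 20 centimorgans.
distances = [0.005 * i for i in range(1, 41)]
recent = [expected_admixture_ld(d, 8) for d in distances]   # Siddi-like date
older = [expected_admixture_ld(d, 80) for d in distances]   # ten times older

for label, curve in (("recent", recent), ("older", older)):
    gens = estimate_generations(distances, curve)
    print(label, round(gens, 2), "generations, about",
          round(gens * YEARS_PER_GENERATION), "years")
```

On noise-free data the fit recovers the input dates exactly, which is not the interesting part. The interesting part is how little signal the older curve leaves at long distances: at 10 centimorgans the recent event retains a bit under half of its starting LD, while the older one retains well under a thousandth. With real, noisy data that is why the method is sharp for recent events and blunt for ancient ones.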

So what about the second paper? Narang et al. have a wider variety of populations in an intra-Indian sense, but a smaller number of markers. While Shah et al. used ~800,000 markers, the combined set of Narang et al. is ~20,000, and they pared it down in some cases to ~3,000 ancestrally informative markers. ~20,000 is sufficient for PCA from what I've seen, but for intra-continental differences it is on the bubble for analysis of admixture between putative ancestral populations (i.e., the bar plots produced by Structure, Admixture, frappe, etc.). Additionally, while Shah et al. used Siddi samples from Karnataka and Gujarat, Narang et al. focused on Gujarati Siddis only. The biggest result seems to confirm something hinted at in Shah et al.: the Indian admixture into the Siddis exhibits a regional bias. Shah et al. concluded that using an ASI-skewed Indian sample was less effective than using an ANI-skewed sample. Narang et al. confirm this, showing that the Gujarati Siddis exhibit an admixture cline more toward northwest Indian groups than not. Some of this may be European or Middle Eastern admixture, but I suspect that the best explanation is that as a predominantly Muslim population these Siddis had interactions disproportionately with individuals of Indo-Islamic background. In particular, a disproportionate number of these were transplants from northern and northwest India (today Pakistan) who relocated to central and southern India with the collapse of the original Delhi Sultanate. These would be the elites purchasing the Siddis in the first place more often than not (though some Hindu potentates also purchased or received gifts of black slaves, their international connections were more tenuous, and their polities were often more land than sea-based). Because of the thinner marker set the authors couldn't say much more about the admixture event except that it was recent. But there was this interesting bit about functionally relevant genes:

We also wanted to see whether there were some biological processes that were selectively enriched in the admixed populations from either of the ancestors. Considering the SNPs that have an FST value ≥0.1 between the two ancestral populations, we selected 3396 of the 18,534 SNPs for functional analysis. Of these, 1218 SNPs were filtered out because their frequencies in the OG population were within 5% of the expected frequency, which is the ancestry proportionate weighted average of the allele frequencies of the two ancestral populations. The remaining SNPs were classified into two groups of 1240 and 938 SNPs on the basis of their closeness, in terms of allele frequency, to the Indian and African ancestral populations, respectively. Analysis of gene classes in these groups revealed significant enrichment of cadherins, potassium channels, membrane proteins, and solute carriers as well as protein kinases from the group close to IE and kinases and immune-related genes from the group close to African ancestry. Further functional annotation clustering (FAC) revealed significant enrichment of processes related to axonogenesis and potassium transport in genes from the group for which the frequency of SNPs is close to that of the Indian ancestral population (Table 5). However, FAC did not reveal any specific enrichment of the processes contributed by the other group.
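The filtering step in the passage above can be sketched as follows. This is my reading of the description, not the authors' code. In particular I am assuming that "within 5%" means an absolute allele frequency difference of 0.05, and the African ancestry proportion of 0.65 is a placeholder I made up, as are the allele frequencies and function names.

```python
def expected_frequency(p_african, p_indian, african_ancestry):
    """Ancestry-weighted average of the two ancestral allele frequencies."""
    return african_ancestry * p_african + (1 - african_ancestry) * p_indian


def classify_snp(p_african, p_indian, p_admixed, african_ancestry, tolerance=0.05):
    """Label a SNP by how its admixed frequency compares to expectation.

    `tolerance` is an absolute frequency difference (an assumption; the
    paper's wording could also be read as a relative difference).
    """
    expected = expected_frequency(p_african, p_indian, african_ancestry)
    if abs(p_admixed - expected) <= tolerance:
        return "as_expected"
    if abs(p_admixed - p_indian) < abs(p_admixed - p_african):
        return "closer_to_indian"
    return "closer_to_african"


# Hypothetical SNPs: (African frequency, Indian frequency, admixed frequency).
african_ancestry = 0.65  # placeholder, not an estimate from the paper
toy_snps = [(0.9, 0.1, 0.62), (0.9, 0.1, 0.30), (0.9, 0.1, 0.85)]
for p_afr, p_ind, p_mix in toy_snps:
    print(p_mix, classify_snp(p_afr, p_ind, p_mix, african_ancestry))

# Sanity check on the counts quoted from the paper.
assert 3396 - 1218 == 1240 + 938
```

The last line only confirms that the quoted numbers hang together: 3396 high-FST SNPs minus the 1218 that sit near expectation leaves 2178, which is the sum of the 1240 Indian-leaning and 938 African-leaning SNPs.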

In other words, there's a deviation from what you'd expect just from ancestry alone. Why? I suspect there was some sort of release of functional constraint due to the high pathogen load common in Africa in relation to South Asia (yes, South Asia has a low pathogen load compared to Africa!). It isn't as if the climate is that different. Here are the categories of genes which seem to be overrepresented in the Siddi population in relation to the ancestral Indian component (in other words, the proportion of "Indian" ancestry is higher at these locations than expectation):

Here's the elaboration in the discussion:

.... However, we wanted to examine whether the OG have retained any enriched biological processes from either of the ancestors. Our search for functional enrichments was directed at the AIMs that were associated with genes and whose frequency in OG was close to either of the ancestral populations. We observed a significant enrichment of processes related to ion-channel activity and cadherin genes; the genotypic spectrum in these enriched processes was close to that of the IE ancestors (Figure 7). Selection in ion-channel genes among populations of African ancestry has been a long-term global enigma. However, the fact that the population resides in an extremely saline region of the country and has shown deviations in these genes was intriguing and made it compelling to speculate that this finding is biologically relevant. This is especially interesting in the light of the fact that a recent GWAS study of hypertension and blood pressure in African Americans implicated a similar family of genes related to ion channels, cadherins, and calmodulins.

IE here means "Indo-European." Since the samples are from Gujarat, an Indo-European speaking region, one would expect this affinity, though recall that the Siddis are biased toward a more northern affinity than that. In any case, the implications of constraint and selection on these loci have long been discussed, and the Afro-Indian case serves as an interesting replication of the larger pattern.

Summary points:

1) The Siddis are relatively recent in time in their origin. Post-1500, and possibly early British.

2) Admixture with South Asians was more "female mediated." That is, Indian ancestry tends toward a maternal origin, though not exclusively so.

3) The ancestry also seems somewhat biased toward northern and western South Asian sources. Shah et al. had a Karnataka sample, which is in a Dravidian speaking region (albeit with Indo-Aryan minority populations), and they still found that in that group a North Indian ancestral population was a better fit than a South Indian one. The main caveat is that this may be due to exogenous West Asian or European ancestry against a South Indian background.

4) There seems to be some evidence of changes in the selective constraints and pressures, which have had a genome-wide impact even in ~10 or so generations.

On a final note: if the numbers quoted here are correct then I believe that the majority of the African ancestral element within the boundaries of South Asia is distributed amongst South Asian Muslims. A generous estimate of the number of culturally identified Siddis seems to be ~250,000. If 0.25% of the genome of Pakistanis is African, which I think is plausible, then that would be the equivalent of ~400,000 Siddis! I suspect that Indian Muslims, even some Bangladeshis with Middle Eastern ancestry (such as my mother), also have a non-trivial African ancestral element due to the cosmopolitanism of the Dar-ul-Islam, and the ubiquity of black slaves as consumption signals and military shock troops amongst Islamic elites.
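The back-of-the-envelope comparison above works out as follows. The Pakistani population figure of roughly 170 million is my assumption for the arithmetic and is not stated in the text; the 0.25% African ancestry and the ~250,000 Siddis are the numbers quoted above.

```python
# Assumed population of Pakistan (not from the text; a round figure for 2011).
pakistan_population = 170_000_000

# 0.25% African ancestry, written as 25 parts in 10,000 to keep integer math.
african_fraction_numerator = 25
african_fraction_denominator = 10_000

# Number of "whole African genomes" that the diffuse ancestry adds up to.
african_genome_equivalents = (
    pakistan_population * african_fraction_numerator // african_fraction_denominator
)

generous_siddi_count = 250_000  # the generous estimate quoted above

print(african_genome_equivalents)
print(african_genome_equivalents > generous_siddi_count)
```

Under that population assumption the diffuse ancestry comes to 425,000 genome-equivalents, in line with the ~400,000 figure. Since the Siddis themselves are only partly African in ancestry, comparing this against a simple head count of Siddis if anything understates the point.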
As for how much is found in the Hindu population, that will be a good gauge I think not of the intermarriage of Africans with Hindus, but the assimilation of liminal Muslim groups, in particular sects considered heterodox by India's Sunni rulers, into the Hindu caste system. Citation:

Anish M. Shah, Rakesh Tamang, Priya Moorjani, Deepa Selvi Rani, Periyasamy Govindaraj, Gururaj Kulkarni, Tanmoy Bhattacharya, Mohammed S. Mustak, L.V.K.S. Bhaskar, Alla G. Reddy, Dharmendra Gadhvi, Pramod B. Gai, Gyaneshwer Chaubey, Nick Patterson, David Reich, Chris Tyler-Smith, Lalji Singh, & Kumarasamy Thangaraj (2011). Indian Siddis: African Descendants with Indian Admixture. American Journal of Human Genetics: 10.1016/j.ajhg.2011.05.030


Ankita Narang, Pankaj Jha, Vimal Rawat, Arijit Mukhopadhayay, Debasis Dash, Indian Genome Variation Consortium, Analabha Basu, & Mitali Mukerji (2011). Recent Admixture in an Indian Population of African Ancestry. American Journal of Human Genetics: 10.1016/j.ajhg.2011.06.004

Addendum: Am I the only one a touch weirded out by the face of the black person in the first figure? It isn't as if illiterates are going to be reading the paper! Kind of funny though.

