Genetic association studies can be used to identify factors that may contribute to disparities in disease evident across different racial and ethnic populations. However, such studies may not account for potential confounding if study populations are genetically heterogeneous. Racial and ethnic classifications have been used as proxies for genetic relatedness. We investigated genetic admixture and developed a questionnaire to explore variables used in constructing racial identity in two cohorts: 50 African Americans and 40 Nigerians. Genetic ancestry was determined by genotyping 107 ancestry informative markers. Ancestry estimates calculated with maximum likelihood estimation were compared with population stratification detected with principal components analysis. Ancestry was approximately 95% west African, 4% European, and 1% Native American in the Nigerian cohort and 83% west African, 15% European, and 2% Native American in the African American cohort. Therefore, self-identification as African American agreed well with inferred west African ancestry. However, the cohorts differed significantly in mean percentage west African and European ancestries...and in the variance for individual ancestry...Among African Americans, no set of questionnaire items effectively estimated degree of west African ancestry, and self-report of a high degree of African ancestry in a three-generation family tree did not accurately predict degree of African ancestry. Our findings suggest that self-reported race and ancestry can predict ancestral clusters but do not reveal the extent of admixture. Genetic classifications of ancestry may provide a more objective and accurate method of defining homogenous populations for the investigation of specific population-disease associations
So how does this jibe with "Genetic Structure, Self-Identified Race/Ethnicity, and Confounding in Case-Control Association Studies":
We have analyzed genetic data for 326 microsatellite markers that were typed uniformly in a large multiethnic population-based sample of individuals as part of a study of the genetics of hypertension (Family Blood Pressure Program). Subjects identified themselves as belonging to one of four major racial/ethnic groups (white, African American, East Asian, and Hispanic) and were recruited from 15 different geographic locales within the United States and Taiwan. Genetic cluster analysis of the microsatellite markers produced four major clusters, which showed near-perfect correspondence with the four self-reported race/ethnicity categories. Of 3,636 subjects of varying race/ethnicity, only 5 (0.14%) showed genetic cluster membership different from their self-identified race/ethnicity. On the other hand, we detected only modest genetic differentiation between different current geographic locales within each race/ethnicity group. Thus, ancient geographic ancestry, which is highly correlated with self-identified race/ethnicity--as opposed to current residence--is the major determinant of genetic structure in the U.S. population. Implications of this genetic structure for case-control association studies are discussed.
In a word: psychology. The identity of African Americans is socially constructed, and their self-perceptions of their ancestral origins are confounded by the vicissitudes of history as well as cultural context. Unfortunately, some then tend to assume that the malleability of the social dimensions of racial identity perfectly reflects the nature of the underlying population substructure. Obviously the two are related, but not with perfect correlation; rather, genetic structure is one of the background conditions that strongly shape social realities. In the American context the nature of hypodescent is critical: since one drop of black blood rendered an individual black in the eyes of white Americans, the effect of within-group variation in the extent of European ancestry was not as explicit or notable as in other societies where grades of admixture were acknowledged. In South Africa, for example, populations of mixed white and black ancestry are distinguished from both parental populations and held a social status equidistant between the two. Though the black elite has traditionally been lighter skinned and presumably more European in ancestry than the majority of the population, one presumes that the power of phenotype as opposed to genealogy was paramount (genealogy being a sketchy affair due to the nature of the period of slavery). In a randomly mating population the association of ancestry with perceived appearance will weaken, because the latter is strongly weighted toward a few traits that might be coded by a few genes, and those genes exhibit a great deal of within-family variance. The black American population was not randomly mating (e.g., "the talented tenth" was always a self-conscious subgroup), but it was not perfectly assorted by ancestry or class either. The point is that one should take all priors into account in shaping a picture of reality.
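To make the random-mating point concrete, here is a toy simulation, not anything taken from either paper. It assumes a fixed-size population founded by two unadmixed groups, a visible trait controlled by three unlinked loci with fixed differences between the founders, and genome-wide ancestry approximated as the average of the two parents' ancestry. The function names (`simulate`, `pearson`) and every parameter value are made up for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=2000, generations=6, trait_loci=3, founder_frac=0.8, seed=1):
    """Return the ancestry/appearance correlation for each generation.

    Assumptions (all hypothetical): founders are unadmixed, the trait loci
    are unlinked and fixed for different alleles in the two founder groups,
    and a child's genome-wide ancestry is the mean of its parents' ancestry.
    """
    rng = random.Random(seed)
    pop = []
    for _ in range(n):
        origin = 1 if rng.random() < founder_frac else 0
        # (genome-wide ancestry, list of diploid genotypes at the trait loci)
        pop.append((float(origin), [(origin, origin)] * trait_loci))

    correlations = []
    for _ in range(generations + 1):
        ancestry = [a for a, _ in pop]
        # "Appearance" is simply the count of origin-1 alleles at the trait loci.
        trait = [sum(x + y for x, y in loci) for _, loci in pop]
        correlations.append(pearson(ancestry, trait))

        # Random mating: parents are drawn uniformly with replacement
        # (rare self-pairings are ignored in this sketch).
        nxt = []
        for _ in range(n):
            (a1, l1), (a2, l2) = rng.choice(pop), rng.choice(pop)
            # Each parent passes on one randomly chosen allele per locus.
            child_loci = [(rng.choice(p), rng.choice(q)) for p, q in zip(l1, l2)]
            nxt.append(((a1 + a2) / 2, child_loci))
        pop = nxt
    return correlations
```

Calling `simulate()` returns one correlation per generation, founders first. Under these assumptions the correlation starts at 1 and falls as segregation at the few trait loci decouples appearance from overall ancestry. Any assortative mating, which the real population certainly had, would slow that decay.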
As noted in the paper above, the reason to ascertain the extent of admixture and its variance within a population is eminently pragmatic: population substructure is medically useful information. The nature of that substructure may manifest on a genetic level, but it is often shaped and constrained by exogenous sociological parameters. And obviously this doesn't apply just to African Americans. Here's a figure from a paper I recently blogged about on the genetic structure of Ashkenazi Jews and Northern Europeans:
See the Ashkenazi Jews who are outside of the tight central cluster? I bet you that these individuals probably wouldn't be able to explain to you why this might be....