Most heritable surnames, like Y chromosomes, are passed from father to son. These unique cultural markers of coancestry might therefore have a genetic correlate in shared Y chromosome types among men sharing surnames, although the link could be affected by mutation, multiple foundation for names, nonpaternity, and genetic drift. Here, we demonstrate through an analysis of 1678 Y-chromosomal haplotypes within 40 British surnames a remarkably high degree of coancestry that generally increases as surnames become rarer. On average, the proportion of haplotypes lying within descent clusters is 62%, but ranges from zero to 87%. The shallow time-depth of many descent clusters within names, the lack of a detectable effect of surname derivation on diversity, and simulations of surname descent suggest that genetic drift through variation in reproductive success is important in structuring haplotype diversity. Modern patterns therefore provide little reliable information about the original founders of surnames some 700 years ago. A comparative analysis of published data on Y diversity within Irish surnames demonstrates a relative lack of surname frequency dependence of coancestry, a difference probably mediated through distinct Irish and British demographic histories including even more marked genetic drift in Ireland.
Interesting points: 1) Men with really rare surnames tend to share a common Y chromosomal lineage. This is a pretty good indicator of relatedness. Additionally, many of these lineages are extremely rare in the general population, which reduces the likelihood that independent "non-paternity events" introduced the haplotypes. In contrast, men with common surnames don't seem to have Y lineages any more closely related than those of the general population. 2) A variety of data suggest to the authors that the high frequencies of some of these lineages are due to genetic drift filtering the diversity. The rate at which drift eliminates lineages is inversely proportional to effective population size. In plain English, large populations are far less likely to lose information in the form of genetic diversity due to sampling variance from generation to generation. In real populations the sampling variance is due to a variety of factors, including reproductive skew (which is modeled as a Poisson process). This is the same insight which scientists have had a hard time explaining to the public when it comes to "mitochondrial Eve." Just because one gene lineage comes down to the present day in a straight line through mothers does not mean that other females did not exist, reproduce, and so contribute to the genetic diversity of today. The same is true here for modal haplotypes in rare surname groups. Though the Y chromosomes are related within rare surnames, I'd be curious what total genome content would tell us (uniparental lineages are subject to stronger drift because of their smaller effective population sizes). Here's a figure which shows the English data well:
Interestingly, a review of Irish data does not show such a strong relationship between surname rarity and shared Y lineages. Rather, men with some of the common surnames also share lineages. The authors speculate that this may be due to different demographic and historical factors. Perhaps, for example, the Irish were more polygynous than the English, so that extremely fecund high-status lineages could come to dominate numerous descent groups?
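To make the drift argument concrete, here is a minimal sketch of father-to-son descent in a fixed-size group of men. It is my own toy model, not the simulation used in the paper. Every founder starts with a distinct Y haplotype, and each generation every son picks a father at random. The function name `surviving_lineages`, the `skew` parameter, and the lognormal fertility weights are all assumptions made for illustration; `skew=0` gives plain Wright-Fisher sampling, while larger values let a few men father a disproportionate share of sons, a crude stand-in for polygyny or status-linked fertility.

```python
import random


def surviving_lineages(n_men, generations, skew=0.0, seed=0):
    """Count founder Y lineages left after neutral father-to-son descent.

    n_men       -- constant number of men per generation (hypothetical)
    generations -- number of generations to simulate
    skew        -- sigma of the lognormal fertility weights; 0.0 means
                   every man is equally likely to father each son
    seed        -- seed for the random number generator
    """
    rng = random.Random(seed)
    # One distinct haplotype label per founding man.
    lineages = list(range(n_men))
    for _ in range(generations):
        # Fresh fertility weight per man each generation, so fertility is
        # not inherited. With skew == 0.0 every weight is exactly 1.0.
        weights = [rng.lognormvariate(0.0, skew) for _ in lineages]
        # Each son inherits the haplotype of a father drawn by weight.
        lineages = rng.choices(lineages, weights=weights, k=n_men)
    return len(set(lineages))


if __name__ == "__main__":
    # Roughly 700 years at an assumed 25 years per generation.
    for skew in (0.0, 1.5):
        counts = [surviving_lineages(200, 28, skew, seed=s) for s in range(20)]
        print(f"skew={skew}: mean surviving lineages = {sum(counts) / len(counts):.1f}")
```

Even with equal fertility most founder lineages are gone after a few dozen generations, and turning up the skew removes them faster still. That is the qualitative pattern you would expect if reproductive variance were greater in Ireland than in Britain, though the model ignores population growth, mutation, and non-paternity.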