Recently an evolutionary geneticist told me that his colleagues who worked with mice really didn't have their stuff together. Actually, his language was a touch more colorful than that. But the gist of the argument seemed plausible enough to me. I tend to avoid reading papers using the mouse as a model organism in genetics because I recall getting confused by the pedigrees and various strain acronyms and abbreviations (nonstandard acronyms and abbreviations have also been a problem for me whenever I try to read developmental genetics). If I want to look at the genetics of a mammal besides a human being I often like to focus on dogs. The breeds of dogs actually mean something to me. There's only so many skinned mouse hides I want to stare at.
With all that said, there is a huge scientific complex devoted to the mouse. If the house of mouse is a mess, then someone needs to do some cleaning at some point. A new paper in Nature Genetics starts the process, using SNPs and variable intensity oligonucleotides (VINOs) to assess the relationships between distinct lab strains as well as wild subspecies. Ignorant as I am of the biology of the mouse, I was vaguely aware that, as with C. elegans, much of the laboratory stock derives from a very small founding population. In psychology there's the problem of the outlook and dispositions of Western university students getting extrapolated to the whole species of man, so I have wondered now and then about the same problem for some of the model organisms. If you're studying very general biological processes this shouldn't be that big of an issue, but for evolution, characterizing the nature of variation is obviously of the essence. The paper title & the abstract, Subspecific origin and haplotype diversity in the laboratory mouse:
Here we provide a genome-wide, high-resolution map of the phylogenetic origin of the genome of most extant laboratory mouse inbred strains. Our analysis is based on the genotypes of wild-caught mice from three subspecies of Mus musculus. We show that classical laboratory strains are derived from a few fancy mice with limited haplotype diversity. Their genomes are overwhelmingly Mus musculus domesticus in origin, and the remainder is mostly of Japanese origin. We generated genome-wide haplotype maps based on identity by descent from fancy mice and show that classical inbred strains have limited and non-randomly distributed genetic diversity. In contrast, wild-derived laboratory strains represent a broad sampling of diversity within M. musculus. Intersubspecific introgression is pervasive in these strains, and contamination by laboratory stocks has played a role in this process. The subspecific origin, haplotype diversity and identity by descent maps can be visualized using the Mouse Phylogeny Viewer....
There are four subspecies of Mus musculus described in the paper:
- Mus musculus castaneus (southern and southeastern Asia)
- Mus musculus domesticus (western Europe, southwestern Asia, Americas, Africa, and Oceania)
- Mus musculus musculus (eastern Europe and northern Asia)
- Mus musculus molossinus (Japan)
There were 198 mice genotyped, including 36 wild-caught mice, 62 wild-derived laboratory strains and 100 classical laboratory mice. The SNP chip was tuned to the variation present within domesticus, so there was some ascertainment bias: because SNPs from the other lineages were undersampled, the tests would underestimate the variation in those lineages. The VINO analysis compensated for this problem. Looking over the hundreds of thousands of markers, they fixed upon loci which were highly informative of subspecies origin; in other words, those alleles which were representative of a particular subspecies (castaneus, domesticus, or musculus). The figure below illustrates their results. Observe that to the right the three wild subspecies are perfectly separated into distinct colors.
According to these results, classic lab strains of mice are predominantly derived from domesticus, but with a small residual proportion of musculus which varies by lineage, as well as a trace level of castaneus. The wild-derived strains aren't that surprising: they mostly resemble the wild subspecies from which they're ostensibly descended, but there's obviously been some admixture with other lines. The "H" lines, the leftmost wild-derived lab strains above, are hybrids. Now some of you might be curious about the fact that the wild strains are pure, while the inbred lab strains exhibit some admixture. People sometimes have a hard time understanding that admixed populations can also be inbred. A simple thought experiment shows how this could be. Imagine a couple with a set of children. The father is white, the mother is black. Their children would be biracial. If the children mated and produced offspring, the grandchildren would be both inbred and racially admixed. There's no contradiction here. The markers which distinguish lineages above are informative across the subspecies, and so are a very small proportion of the markers within the genome. Inbreeding is a reflection of the much broader genomic patterns which are the outcome of particular mating systems or events.
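The thought experiment can be checked by brute force. What follows is a minimal sketch of my own, not anything from the paper. It assumes a single locus, two unrelated and non-inbred founders who each carry two distinguishable allele copies, and hypothetical population labels `pop_a` and `pop_b`. It enumerates every equally likely grandchild of a full-sib mating and reports two separate quantities: the chance that the grandchild's two alleles are copies of the same founder allele (the inbreeding coefficient), and the expected share of alleles tracing to `pop_a` (the admixture proportion).

```python
from fractions import Fraction
from itertools import product

# Four founder allele copies at one locus, each tagged with a hypothetical
# population label. The founders are assumed unrelated and not inbred.
FATHER = [("F1", "pop_a"), ("F2", "pop_a")]
MOTHER = [("M1", "pop_b"), ("M2", "pop_b")]


def children(father, mother):
    """All equally likely child genotypes: one allele from each parent."""
    return list(product(father, mother))


def sib_mating_summary():
    """Return (inbreeding coefficient, pop_a ancestry share) for a grandchild
    produced by mating two full siblings."""
    kids = children(FATHER, MOTHER)
    total = 0          # number of equally likely grandchild genotypes
    ibd = 0            # genotypes whose two alleles copy the same founder allele
    pop_a_alleles = 0  # alleles tracing back to pop_a, summed over genotypes
    for sib1, sib2 in product(kids, kids):
        # The grandchild receives one allele from each sibling.
        for a, b in product(sib1, sib2):
            total += 1
            ibd += a[0] == b[0]
            pop_a_alleles += (a[1] == "pop_a") + (b[1] == "pop_a")
    return Fraction(ibd, total), Fraction(pop_a_alleles, 2 * total)


inbreeding, ancestry_a = sib_mating_summary()
print("inbreeding coefficient:", inbreeding)
print("pop_a ancestry share:", ancestry_a)
```

Under these assumptions the grandchild has an inbreeding coefficient of 1/4 while still drawing half of its ancestry from each population. The two numbers answer different questions, which is the point of the thought experiment.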
Next let's look at how the admixture of musculus plays out against the dominant background of domesticus in the classic lab strains. To the left is a figure which shows the assignments of ancestry on chromosome 6 of all the lines above. It is rotated 90 degrees counterclockwise from figure 1. At the top you see the three wild lineages, which separate perfectly on the chromosomal level just as they did on the total genome scale. Next you see the wild-derived lab strains. Now you notice admixture across the genomes. Finally you have the classic lab strains, predominantly domesticus, but with musculus and castaneus regions also visible. Remember that each horizontal line represents a specific lineage, while points left to right correspond to positions in the genome. Observe how the musculus and castaneus ancestry is strongly correlated across many of these lineages, showing up as vertical bands. In their classic lab strain samples the ancestry from musculus and castaneus was ~5 and ~0.30 percent respectively. But musculus admixture was only found across ~47 percent of the genome, while castaneus admixture was only found across ~3 percent of the genome. If the classic lab strains derived their admixture from many different crosses in the putative "primal horde" of ur-mice, there should be signatures of these distinct events across the whole genome, rather than dollops concentrated in such a localized fashion. This evidence of admixture in only a few regions of the genome suggests that the classic lab strains derive from a very small founder population. More specifically, the authors believe that the musculus admixture of the predominantly domesticus classic lab strains is from the Japanese molossinus, which itself is a hybrid between eastern musculus and castaneus. And sure enough, when they focused on the regions which are attributable to musculus in the classic lab strains, these tended to cluster with eastern musculus lineages, rather than western ones.
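The two kinds of percentages quoted above measure different things, and a toy calculation makes the distinction concrete. This is a sketch under my own assumptions, not the paper's pipeline: the genome is cut into equal-sized windows, each strain gets one ancestry call per window, and both the function name `ancestry_summary` and the four made-up strains are hypothetical. The first number is the average share of ancestry per strain; the second is the share of windows in which at least one strain carries that ancestry.

```python
def ancestry_summary(labels, target):
    """Summarize one ancestry across strains.

    labels: one sequence per strain, holding an ancestry call for each
            equal-sized genomic window (all strains use the same windows).
    target: the ancestry call to summarize, e.g. "M".

    Returns (mean share of target ancestry per strain,
             share of windows where any strain carries target ancestry).
    """
    n_strains = len(labels)
    n_windows = len(labels[0])
    target_calls = sum(row.count(target) for row in labels)
    mean_share = target_calls / (n_strains * n_windows)
    covered = sum(
        any(row[w] == target for row in labels) for w in range(n_windows)
    )
    return mean_share, covered / n_windows


# Made-up data: 4 strains, 10 windows, "D" = domesticus, "M" = musculus.
toy_strains = [
    "DDMDDDDDDD",
    "DDMDDDDMDD",
    "DDDDDDDDDD",
    "DDMDDDDDDD",
]

mean_share, genome_share = ancestry_summary(toy_strains, "M")
print(mean_share, genome_share)
```

In the toy data the minority ancestry makes up 10 percent of the calls but touches only 20 percent of the windows, because the same window recurs in three of the four strains. That stacking of calls in shared windows is what the vertical bands in the figure look like in tabular form.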
A further check was that the identity-by-descent (IBD) measure, which is correlated with genetic similarity (though it strictly measures descent from a common ancestor), shows that the musculus regions of the classic lab strains have ~98% identity with molossinus-derived lab strains (as opposed to ~83% IBD with generic musculus). As molossinus is a hybrid between musculus and castaneus, this explains the presence of the latter in the classic lab strain ancestry, as well as the fact that the castaneus segments seem to be nested within and between the musculus regions. Let's jump to the discussion:
...One study concluded that the genome of these strains is 68% M. m. domesticus, 10% M. m. molossinus, 6% M. m. musculus, 3% M. m. castaneus and 13% of unknown origin. On the other hand, we previously concluded that 92% is of M. m. domesticus, 6% is of M. m. musculus and 1% is of M. m. castaneus origin.... ... ...The results presented here conclusively show that classical inbred strains are overwhelmingly derived from M. m. domesticus, that the non–M. m. domesticus contribution to their genomes is largely of M. m. molossinus origin and that intersubspecific introgression is common in wild-derived laboratory strains. ... In summary, our observation of residual heterozygosity among inbred mouse strains, the striking local differences in the level of genetic similarity between substrains, the identification of large deletions of different ages and prevalence of contamination emphasizes the importance of deep, unbiased and frequent genetic characterization of laboratory stocks. Our genome browser provides access to the trees and links between recombination intervals, local trees and the maps for subspecific origin and haplotype diversity. Our analysis shows that classical inbred strains are in fact mosaics of a handful of haplotypes present in the founder fancy mice population. The genetic divergence among these haplotypes varies widely both locally and across the genome. Furthermore, the contribution of subspecies other than M. m. domesticus is limited, and its distribution highlights the complex population structure in these strains. On the other hand, wild-derived laboratory strains represent a deep reservoir of genetic diversity untapped in classical strains and are in many cases analogous to the three-way intersubspecific hybrids that classical inbred strains were thought to be. Our previous work...combined with the results of the deep survey of mouse resources presented here shows that the laboratory mouse is an unparalleled model for genetic studies in mammals.
Obviously it is important to know if your "pure" wild-derived lab strain isn't so pure, especially if you're using it as some sort of reference. From what I can gather in this paper many researchers had to just take on faith the genetic relationship between the various strains which they were utilizing in their research, relying upon 20th century methods such as pedigrees. In the near future widespread genotyping might eliminate any concerns, but until then this sort of study should give researchers a better understanding of the confounding issues which might be introduced because of a false perception of genetic variation or uniformity in their stock. The ancestry estimates quoted above seem all over the place, so it was obviously time that someone did the survey reported in this paper. Finally, I'm struck by the mighty-mouse roar at the end of the paper. Unparalleled indeed! Citation:
Yang, Hyuna, Wang, Jeremy R, Didion, John P, Buus, Ryan J, Bell, Timothy A, Welsh, Catherine E, Bonhomme, Francois, Yu, Alex Hon-Tsen, Nachman, Michael W, Pialek, Jaroslav, Tucker, Priscilla, Boursot, Pierre, McMillan, Leonard, Churchill, Gary A, & de Villena, Fernando Pardo-Manuel (2011). Subspecific origin and haplotype diversity in the laboratory mouse. Nature Genetics. DOI: 10.1038/ng.847
Image credit: Wikimedia