The language families of Europe fall into a few broad categories. There are the Indo-European languages, which include the Romance, Germanic, Slavic and Celtic subgroups, along with Greek and Albanian. The Iranian languages and most of the languages of India are also Indo-European. Then there are the languages of Finland and Hungary, which are hypothesized to belong to a broader Finno-Ugric family.
Whatever the validity of this cluster, the relationship of Hungarian and Finnish to languages which are extant deep into Eurasia, beyond the Urals and into Siberia, is not disputed. The Turkic and Semitic families have a toehold in Europe via Turkish and Maltese. And finally, you have the Basque dialects. Basque has no demonstrated relationship to any other language in the world; it is a linguistic isolate. There have been attempts to connect Basque to languages in the Caucasus, but these are highly speculative conjectures.
So where did Basque come from? A common assumption is that Basque is the autochthonous speech of the Iberian peninsula, perhaps related to the pre-Latin dialects extant to the south and east of the peninsula (the Romans arrived on the scene at a time when Spain was also partially dominated by Celtic tribes). Many go further and assert that the Basques are the pure descendants of the first modern humans to arrive on the European continent, heirs of the Cro-Magnons. Even if this claim is a bit much, many would concede that the Basque populations derive from the hunter-gatherers who were extant on the continent when the Neolithic farmers arrived from the Middle East, and Indo-European speakers pushed in from the east.
In terms of historical genetics these assumptions result in the Basque population being used as a “reference” for the indigenous component of European ancestry, the component which reaches back to the Last Glacial Maximum and which expanded from the Iberian refugium after the ice retreated.
One of the reasons for the assumption of Basque antiquity & purity is the genetic peculiarities of the Basques. Foremost among them is that the Basques seem to have the highest frequency of Rh- in the world, primarily because of the high frequency of the null allele within the population (it is a recessively expressed trait). Rh- is very rare outside of Europe, but its frequency exhibits a west-east gradient even within the continent. It has been suggested that the mixing of Rh- and Rh+ blood groups reflects the mixing of hunter-gatherers and farmers after the Ice Age.
The map above illustrates the frequencies of this trait, and you can see how the Basque region is cordoned off. It’s an old map because blood group data were widely collected in the early 20th century.
Because this heritable trait was known so early, a lot of weird anthropological theories hinging on blood group genetics emerged in the early 20th century. But even as late as the mid-90s L. L. Cavalli-Sforza reported in The History and Geography of Human Genes, using classical markers, that the Basques exhibited some distinctiveness. Over the years, with the rise of Y and mtDNA phylogenetics, this distinctiveness has taken a hit.
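Since Rh- is expressed recessively, the frequency of the Rh- blood type in a population is much lower than the frequency of the underlying null allele. A minimal sketch of that arithmetic follows, assuming Hardy-Weinberg proportions (random mating, no selection), which is a textbook simplification and not something the paper asserts. The function names and the allele frequencies in the loop are hypothetical illustrations, not measured values for the Basques or anyone else.

```python
import math


def rh_negative_phenotype_freq(null_allele_freq: float) -> float:
    """Expected share of Rh- individuals, assuming Hardy-Weinberg proportions.

    Only people carrying two copies of the null allele are Rh-, so the
    phenotype frequency is q squared.
    """
    if not 0.0 <= null_allele_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return null_allele_freq ** 2


def carrier_freq(null_allele_freq: float) -> float:
    """Expected share of Rh+ individuals carrying one hidden null allele (2pq)."""
    if not 0.0 <= null_allele_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 2.0 * null_allele_freq * (1.0 - null_allele_freq)


def null_allele_freq_from_phenotype(rh_negative_freq: float) -> float:
    """Back out the null allele frequency from an observed Rh- share (sqrt)."""
    if not 0.0 <= rh_negative_freq <= 1.0:
        raise ValueError("phenotype frequency must be between 0 and 1")
    return math.sqrt(rh_negative_freq)


# Hypothetical allele frequencies, chosen only to show the shape of the curve.
for q in (0.1, 0.3, 0.5):
    print(f"q={q:.1f}  Rh-={rh_negative_phenotype_freq(q):.2f}  "
          f"carriers={carrier_freq(q):.2f}")
```

The point of the sketch is that a map of Rh- blood types understates how common the null allele is, since most copies of a recessive allele sit unseen in carriers unless the allele is very frequent.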
I think the data have a tendency to confirm expectations, or at least are often interpreted that way. But the recent story of the R1b haplogroup strongly implied that the Basques are no different from other west Europeans, and are likely the descendants of Neolithic farmers themselves! A new paper in Human Genetics, A genome-wide survey does not show the genetic distinctiveness of Basques, supports the contention that the Basques are just like other Europeans:
Basques are a cultural isolate, and, according to mainly allele frequencies of classical polymorphisms, also a genetic isolate. We investigated the differentiation of Spanish Basques from the rest of Iberian populations by means of a dense, genome-wide SNP array. We found that F_ST distances between Spanish Basques and other populations were similar to those between pairs of non-Basque populations.
The same result is found in a PCA of individuals, showing a general distinction between Iberians and other South Europeans independently of being Basques. Pathogen-mediated natural selection may be responsible for the high differentiation previously reported for Basques at very specific genes such as ABO, RH, and HLA. Thus, Basques cannot be considered a genetic outlier under a general genome scope and interpretations on their origin may have to be revised.
They use a SNP-chip to look at lots of genetic variation in different groups from Spain and France, with a particular focus on Basque-vs-non-Basque differences, as well as the European HGDP sample. They had about 30 individuals in 10 groups unique to their sample. Initially they looked at population-level Fst, but I think the PCA is really more informative. They limited it to 109 SNPs which were the most informative out of the hundreds of thousands on the chip. There is no real difference between Basques and non-Basques. One thing to remember is that it’s rather well attested that the Basque dialects were more widespread in the early historical period than they are today, so there are many Spanish-speaking residents of Navarre and French Gascons who are almost certainly descendants of Basque speakers. Nonetheless, there’s no sharp bifurcation within the total national samples of the kind you’d expect if there were a cryptic Basque & non-Basque genetic chasm.
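For readers who haven't met Fst before, here is a minimal sketch of what the statistic measures: the share of total genetic diversity that is due to differences between populations. It uses Wright's textbook definition for biallelic SNPs and two equally weighted populations, which is an assumption on my part; the paper may well have used a different estimator with sample-size corrections. The function names and the allele frequencies in the usage example are hypothetical.

```python
from typing import Sequence


def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_single_snp(p1: float, p2: float) -> float:
    """Wright's Fst at one SNP for two equally weighted populations.

    Fst = (H_T - H_S) / H_T, where H_T is the heterozygosity of the pooled
    population and H_S is the mean heterozygosity within each population.
    """
    h_total = heterozygosity((p1 + p2) / 2.0)
    if h_total == 0.0:
        return 0.0  # SNP is fixed for the same allele in both populations
    h_within = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total


def fst_many_snps(freqs1: Sequence[float], freqs2: Sequence[float]) -> float:
    """Genome-wide Fst as a ratio of sums over SNPs (not a mean of per-SNP ratios)."""
    if len(freqs1) != len(freqs2):
        raise ValueError("both populations need a frequency for every SNP")
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        h_total = heterozygosity((p1 + p2) / 2.0)
        h_within = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
        numerator += h_total - h_within
        denominator += h_total
    return numerator / denominator if denominator > 0.0 else 0.0


# Hypothetical allele frequencies for two populations at three SNPs.
pop_a = [0.40, 0.55, 0.10]
pop_b = [0.60, 0.50, 0.12]
print(f"genome-wide Fst: {fst_many_snps(pop_a, pop_b):.4f}")
```

An Fst of 0 means the two populations have identical allele frequencies, and 1 means they are fixed for different alleles. The paper's claim is comparative: Basque-vs-other values look like the values between any two non-Basque populations.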
Because of ancient DNA extraction the genetic history of Europe is in flux right now. Uniparental haplogroups which in the early aughts were presumed to be relics of the hunter-gatherer substrate may not be that at all. The new research suggesting that R1b originated in Anatolia, together with its high frequency in the Basques, also puts into doubt the idea that the Basques are pure descendants of Paleolithic Europeans.
Why did people think that the Basques were so special? Mostly because their language is special. It is non-Indo-European. As I stated above, it seems that at the time of the Roman conquest much of Spain, especially away from the coastal Mediterranean fringe, was undergoing a process of Celticization. Eventually Indo-Europeanization was completed by the Romans through the spread of Latin. But the loci of Roman cultural expansion were colonies which were concentrated along the coastal regions of the Mediterranean. The part of Iberia which faced the ocean was a marginal frontier where Latinization seems to have proceeded rather slowly and fitfully until the Western Empire collapsed. With the re-barbarization of inland and Atlantic Iberia the Basques managed to carve out a niche for themselves as forceful actors (they famously harried the troops of Charlemagne as those troops returned to Francia after their expedition in northern Iberia).
Behind mountains on the fringes of Europe and against the ocean the Basques evaded Indo-Europeanization. Likely it was simply luck and a random act of history. There are plenty of candidates for non-Indo-European languages across Europe, generally known from isolated inscriptions, but whatever the truth, it seems that in the few thousand years before Christ Indo-European dialects spread across most of the continent. Only in Iberia did the process occur late enough that we catch glimmers of it in the textual record. It may be that the Finnic people of northeast Europe are also pre-Indo-European, preserved by the peculiar ecology of their region (the other model is that the Finns are themselves newcomers who pushed along the Arctic fringe from the Urals).
But before the Indo-Europeans there were likely other waves of migrants bringing their own culture, foremost among them the Neolithic farmers. It is likely that the Minoans of Crete spoke a pre-Indo-European language, and may have been descendants of this wave of farmers from the Middle East. At this point I think it is as likely that the Basques are descendants of Neolithic settlers sweeping across the littoral fringe of Europe as that they descend from Paleolithic populations, though it is fair to note that it is unlikely that they are “pure” in either sense.
Let me finish with the authors’ conclusion:
Our analysis showed that, when a genome-wide perspective is applied, Basques are not particularly differentiated from other Iberian populations. The contradiction with previous reports that depicted Basques as genetic outliers can be resolved if we consider that the polymorphisms accounting for most of this differentiation lie in genes such as ABO, RH, and the HLA complex that are, given their involvement in host-pathogen interactions, obvious targets for natural selection in the ancestral populations even at a microgeographic scale.
This is yet another example of the sound insights in population genetics that can be achieved with a dense map of genome-wide SNPs, even if only the simplest statistical descriptor, namely, allele frequencies, is pressed into service. Future data with hundreds of thousands of SNPs typed individually in large samples will have to confirm the present findings.
There are practical reasons why blood group data were analyzed and interpreted first. But there’s now evidence that blood group distributions are not random, and may emerge as responses to disease pressures. In other words, they aren’t neutral markers which give a good sense of ancestry. This particular issue, combined with Basque genetic (at least on those loci) and linguistic uniqueness, makes it understandable why a thesis of Basque local antiquity would be attractive. But the old order must now likely give way to the new.
Note: Because of where I grew up I knew a fair number of American Basques, and generally they were very proud of their distinctive heritage. In hindsight I think it is notable that none of them identified as Latino or Hispanic, or claimed Spanish heritage. They were most definitely Basque, which was different.
Citation: Laayouni et al., A genome-wide survey does not show the genetic distinctiveness of Basques, Hum Genet DOI 10.1007/s00439-010-0798-3