
Saturday, January 30, 2016

Ancient Greeks and Romans may have imported a whole new genetic cline into Europe (or not)


Is anyone else thinking what I'm thinking? The Principal Component Analysis (PCA) below should be self-explanatory. But if you're having problems with the abbreviations and acronyms, consult the list of definitions here.


Update 15/09/2017: Modern-day Greeks & Italians vs Mycenaeans

See also...

First Neolithic genomes from Greece

The enigmatic headless Romans from York

Tuesday, January 26, 2016

Four major ancestries in mainland India


PNAS has just released a new paper on the population history of India. It's not a bad effort, but very speculative and not particularly insightful, mainly because it doesn't include any ancient DNA from South Asia. Let's be honest: nowadays, if you want a really hard-hitting paper of this sort, you need some ancient DNA. It's open access. Here's the abstract.

India, occupying the center stage of Paleolithic and Neolithic migrations, has been underrepresented in genome-wide studies of variation. Systematic analysis of genome-wide data, using multiple robust statistical methods, on (i) 367 unrelated individuals drawn from 18 mainland and 2 island (Andaman and Nicobar Islands) populations selected to represent geographic, linguistic, and ethnic diversities, and (ii) individuals from populations represented in the Human Genome Diversity Panel (HGDP), reveal four major ancestries in mainland India. This contrasts with an earlier inference of two ancestries based on limited population sampling. A distinct ancestry of the populations of Andaman archipelago was identified and found to be coancestral to Oceanic populations. Analysis of ancestral haplotype blocks revealed that extant mainland populations (i) admixed widely irrespective of ancestry, although admixtures between populations was not always symmetric, and (ii) this practice was rapidly replaced by endogamy about 70 generations ago, among upper castes and Indo-European speakers predominantly. This estimated time coincides with the historical period of formulation and adoption of sociocultural norms restricting intermarriage in large social strata. A similar replacement observed among tribal populations was temporally less uniform.

Basu et al., Genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure, PNAS, Published online before print January 25, 2016, doi: 10.1073/pnas.1513197113

See also...

The Poltavka outlier

Saturday, January 23, 2016

The enigmatic headless Romans from York


My dataset was recently enriched with six ancient individuals from Roman York, courtesy of Martiniano et al. 2016.

They were either gladiators or soldiers. Each one was decapitated. This may have been a coup de grâce or a burial rite. At least one, 3DRIF-26, was not native to Britain.

In fact, isotopic evidence suggests that he spent his childhood in a region with a hot and dry climate such as North Africa or the Levant. Moreover, his top matching population in terms of pairwise Identical-by-State (IBS) allele sharing is present-day Saudis (see here).

However, I thought it might be useful to revisit 3DRIF-26's genetic affinities after taking into account his non-trivial Sub-Saharan admixture. This can be done with qpAdm. The best ten models are listed below.

Please note that in the last model I had to use 3DRIF-26 as a mixture source for present-day Egyptians, because he has less Yoruba-related admixture than the Egyptians.

Anatolia_Neolithic 0.528
Caucasus_HG Kotias 0.379
Yoruba 0.093
chisq 2.813 tail prob 0.421347

Samaritan 0.940
Yoruba 0.060
chisq 3.706 tail prob 0.447229

Cypriot 0.915
Yoruba 0.085
chisq 4.564 tail prob 0.334981

Lebanese_Druze 0.933
Yoruba 0.067
chisq 5.961 tail prob 0.202081

BedouinB 0.998
Yoruba 0.002
chisq 6.311 tail prob 0.17709

Lebanese_Christian 0.929
Yoruba 0.071
chisq 6.660 tail prob 0.155006

Lebanese_Muslim 0.943
Yoruba 0.057
chisq 6.874 tail prob 0.142671

Druze 0.933
Yoruba 0.067
chisq 8.235 tail prob 0.0833513

Iraqi_Jew 0.924
Yoruba 0.076
chisq 8.443 tail prob 0.0766321

Egyptian (target)
Roman_outlier 0.900
Yoruba 0.100
chisq 9.262 tail prob 0.0548746
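For anyone who wants to sanity check the tail probabilities, here's a minimal Python sketch that recomputes them from the chisq values as the upper tail of a chi-square distribution. Note that the degrees of freedom aren't given in the output above. The values of 3 for the three-way model and 4 for the two-way models are an assumption, chosen because they reproduce the listed figures to about three decimal places. The function name `chi2_tail_prob` is made up for this example and isn't part of qpAdm.

```python
import math


def chi2_tail_prob(chisq: float, df: int) -> float:
    """Upper-tail probability of a chi-square statistic for integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    half = chisq / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^i / i!
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: complementary error function plus a finite correction sum.
    k = (df - 1) // 2
    terms = sum(half ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, k + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms


# (model, chisq from the list above, ASSUMED degrees of freedom)
models = [
    ("Anatolia_Neolithic + Caucasus_HG Kotias + Yoruba", 2.813, 3),
    ("Samaritan + Yoruba", 3.706, 4),
    ("Cypriot + Yoruba", 4.564, 4),
]
for label, chisq, df in models:
    print(f"{label}: tail prob ~ {chi2_tail_prob(chisq, df):.4f}")
```

A higher tail probability means the model is harder to reject, which is why the list above runs from the best fit down to the weakest.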

I'd say these results provide rather convincing evidence that 3DRIF-26's West Eurasian ancestry is derived from the Levant. Moreover, his relatively high level of Sub-Saharan admixture suggests that he came from the southern Levant or perhaps a nearby region, like the Sinai Peninsula.

Interestingly, the best models feature a couple of religious minorities (Samaritans and Lebanese Druze), an island population (Cypriots), and a fairly unique group in terms of genetic structure from Israel's Negev Desert (BedouinB). This suggests that 3DRIF-26 may have belonged to a similar religious or geographic isolate population, or, alternatively, that most of the Levant has experienced significant genetic shifts since he was alive.

The rest of the headless Romans were, in all likelihood, born and raised in or near Britain. However, two of the individuals, 3DRIF-16 and 6DRIF-3, show elevated IBS affinity to Lithuanians and Poles. At the same time, they both belong to Y-chromosome haplogroup R1b-U106 (aka M405), which is a marker generally thought to have arrived in Britain with Anglo-Saxons and Scandinavians. This might be a coincidence, but probably not.

D-stats confirm that they do show elevated Northeastern European affinity relative to the other three Romans. Only one of the Z-scores is statistically significant (|Z| > 3), but most of the others would probably also reach significance with more SNPs and higher quality sequences.
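To show where a Z-score like this comes from, here's a rough Python sketch of a D-statistic computed from ABBA and BABA site counts, with a standard error from a delete-one block jackknife. This is a simplified, unweighted jackknife written for illustration, not the procedure used to produce the results in this post. The function name `d_stat_with_z` and the block counts are invented toy values.

```python
import math


def d_stat_with_z(blocks):
    """blocks: list of (n_abba, n_baba) site counts, one pair per genomic block.

    Returns (D, jackknife standard error, Z-score).
    """
    n = len(blocks)
    if n < 2:
        raise ValueError("need at least two blocks for a jackknife")
    tot_abba = sum(a for a, _ in blocks)
    tot_baba = sum(b for _, b in blocks)
    d = (tot_abba - tot_baba) / (tot_abba + tot_baba)

    # Delete-one estimates: recompute D with each block left out in turn.
    loo = []
    for a, b in blocks:
        rest_abba, rest_baba = tot_abba - a, tot_baba - b
        loo.append((rest_abba - rest_baba) / (rest_abba + rest_baba))

    mean_loo = sum(loo) / n
    var = (n - 1) / n * sum((x - mean_loo) ** 2 for x in loo)
    se = math.sqrt(var)
    z = d / se if se > 0 else float("nan")  # undefined if blocks never vary
    return d, se, z


# Toy counts only (NOT real data): four blocks of (ABBA, BABA) sites.
toy_blocks = [(10, 5), (12, 6), (8, 4), (11, 5)]
d, se, z = d_stat_with_z(toy_blocks)
print(f"D = {d:.4f}, SE = {se:.4f}, Z = {z:.2f}")
```

With fewer SNPs there are fewer informative sites per block, so the standard error grows and the Z-score shrinks even if D itself stays the same.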


My guess is that 3DRIF-16 and 6DRIF-3 were Britons of mixed origin, with recent ancestry from Scandinavia and/or East Central Europe. Indeed, they can be modeled with qpAdm as mostly England_Roman with a minor Swedish or Polish component.

England_Roman 0.869
Swedish 0.131
chisq 1.784 tail prob 0.775339

England_Roman 0.884
Polish 0.116
chisq 1.971 tail prob 0.741124

Data source and citation...

Martiniano, R. et al. Genomic signals of migration and continuity in Britain before the Anglo-Saxons. Nat. Commun. 7:10326 doi: 10.1038/ncomms10326 (2016).

Friday, January 22, 2016

Y-HG J2 has a deep and complex history in South Asia


Open access at Nature Scientific Reports:

The global distribution of J2-M172 sub-haplogroups has been associated with Neolithic demic diffusion. Two branches of J2-M172, J2a-M410 and J2b-M102 make a considerable part of Y chromosome gene pool of the Indian subcontinent. We investigated the Neolithic contribution of demic dispersal from West to Indian paternal lineages, which majorly consists of haplogroups of Late Pleistocene ancestry. To accomplish this, we have analysed 3023 Y-chromosomes from different ethnic populations, of which 355 belonged to J2-M172. Comparison of our data with worldwide data, including Y-STRs of 1157 individuals and haplogroup frequencies of 6966 individuals, suggested a complex scenario that cannot be explained by a single wave of agricultural expansion from Near East to South Asia. Contrary to the widely accepted elite dominance model, we found a substantial presence of J2a-M410 and J2b-M102 haplogroups in both caste and tribal populations of India. Unlike demic spread in Eurasia, our results advocate a unique, complex and ancient arrival of J2a-M410 and J2b-M102 haplogroups into Indian subcontinent.

Singh et al., Dissecting the influence of Neolithic demic diffusion on Indian Y-chromosome pool through J2-M172 haplogroup, Scientific Reports 6, Article number: 19157, (2016) doi:10.1038/srep19157

Tuesday, January 19, 2016

Ancient genomes from Iron Age, Roman and Anglo-Saxon Britain (Martiniano et al. & Schiffels et al. 2016)


Open access at Nature Communications at this LINK:

The purported migrations that have formed the peoples of Britain have been the focus of generations of scholarly controversy. However, this has not benefited from direct analyses of ancient genomes. Here we report nine ancient genomes (~1 ×) of individuals from northern Britain: seven from a Roman era York cemetery, bookended by earlier Iron-Age and later Anglo-Saxon burials. Six of the Roman genomes show affinity with modern British Celtic populations, particularly Welsh, but significantly diverge from populations from Yorkshire and other eastern English samples. They also show similarity with the earlier Iron-Age genome, suggesting population continuity, but differ from the later Anglo-Saxon genome. This pattern concords with profound impact of migrations in the Anglo-Saxon period. Strikingly, one Roman skeleton shows a clear signal of exogenous origin, with affinities pointing towards the Middle East, confirming the cosmopolitan character of the Empire, even at its northernmost fringes.



Martiniano, R. et al. Genomic signals of migration and continuity in Britain before the Anglo-Saxons. Nat. Commun. 7:10326 doi: 10.1038/ncomms10326 (2016).

And another one at this LINK:

British population history has been shaped by a series of immigrations, including the early Anglo-Saxon migrations after 400 CE. It remains an open question how these events affected the genetic composition of the current British population. Here, we present whole-genome sequences from 10 individuals excavated close to Cambridge in the East of England, ranging from the late Iron Age to the middle Anglo-Saxon period. By analysing shared rare variants with hundreds of modern samples from Britain and Europe, we estimate that on average the contemporary East English population derives 38% of its ancestry from Anglo-Saxon migrations. We gain further insight with a new method, rarecoal, which infers population history and identifies fine-scale genetic ancestry from rare variants. Using rarecoal we find that the Anglo-Saxon samples are closely related to modern Dutch and Danish populations, while the Iron Age samples share ancestors with multiple Northern European populations including Britain.

Schiffels, S. et al. Iron Age and Anglo-Saxon genomes from East England reveal British migration history. Nat. Commun. 7:10408 doi: 10.1038/ncomms10408 (2016).

See also...

The enigmatic headless Romans from York

Hinxton ancient genomes roundup

Monday, January 11, 2016

The Poltavka outlier


Anyone who still thinks that Y-chromosome haplogroup R1a originated in South Asia should burn this map into their brains. It'll come in useful over the next few years as we learn from ancient DNA about the conquest of the Indian subcontinent, and indeed much of Asia, by pastoralists from the western Russian and Ukrainian steppes.


X marks the spot of the burial site of Poltavka sample I0432 from the Mathieson et al. 2015 dataset. This individual belongs to Y-chromosome haplogroup R1a-Z93(Z94+), which today accounts for well over 90% of the R1a lineages in Asia and peaks in frequency at over 60% in the northern parts of South Asia.

Moreover, the dating of his burial site, 2925-2536 calBCE, suggests that he lived not long after the Z93 and Z94 mutations came into existence. That's because Z93 doesn't appear to be much older than 5,000 years based on full Y-chromosome sequence data (see here and here, including the comments).

So I0432 could well turn out to be a crucial piece in the puzzle of the peopling of South Asia.

Interestingly, this individual was flagged as an outlier in the Poltavka sample set by Mathieson et al., hence his other moniker: the Poltavka outlier. However, this wasn't because of any ancestry from South or even Central Asia. In fact, it was because he was too western.

Principal Component Analyses (PCA) featuring a wide range of present-day and ancient samples from Europe and Asia, like the one below, show that Poltavka outlier clusters further west than most Corded Ware individuals from Germany.


In the past, using qpAdm, I modeled Poltavka outlier as 63.7% Yamnaya Samara and 36.3% German Middle Neolithic. This is probably not very far from the truth, but qpAdm offers a supervised mixture test in which the results are heavily reliant on the choice of outgroups, so I thought I'd revisit the issue with TreeMix, which allows an unsupervised analysis.

In a dataset including seven relatively high coverage Copper Age (CA), Early Bronze Age and Middle Neolithic (MN) European genomes, TreeMix picked out Poltavka outlier as the most likely sample to be admixed, showing a mixture edge of 33% from the base of the branch leading to the Iberian MN individual to that of Poltavka outlier.



This outcome is very similar to my qpAdm model, but it suggests an even more western source of admixture in Poltavka outlier. Could this admixture actually be from Iberia? I wouldn't discount this possibility, considering the presence of Bell Beaker communities, possibly of Atlantic or even Iberian origin, as far east as present-day Poland. Indeed, according to Cassidy et al. 2015, German Beakers show high affinity to MN and CA Iberians (see page 51 in the supp info here).

I double checked my TreeMix result with D-stats, and yep, when placed in a clade with Poltavka or Samara Yamnaya, Poltavka outlier shows the strongest signal of admixture from the Iberia MN individual.

At the same time, however, the signal from the Early Neolithic (EN) Iberian fails to reach significance (Z<3). This suggests that TreeMix and D-stats might be seeing the Iberia MN sample as the most attractive mixture source because of her high level of Western European hunter-gatherer (WHG) ancestry, which Poltavka outlier also has plenty of, rather than because of anything specific to Iberia.



In any case, it's clear enough that Poltavka outlier was the result of mixture between Yamnaya-related western steppe pastoralists and the descendants of Middle Neolithic Europeans with a high ratio of WHG ancestry. Where this admixture actually took place and which archaeological cultures were involved will have to be resolved with further sampling of ancient remains from Central and Eastern Europe.

However, it's already impossible to place the origin of Poltavka outlier anywhere in Asia, which suggests that both Z93 and Z94 are also from well inside the generally accepted borders of Europe.

This obviously has implications for the origins of the Indo-Iranians, because the widespread presence of these mutations in Asia gels very nicely with the idea, and indeed academic consensus, that Indo-Iranian languages expanded rapidly from the Eurasian steppe into Asia during the Bronze Age.

Considering that Poltavka outlier came from a Kurgan burial, and was therefore an individual of some social standing, he might be the direct ancestor of many millions of present-day Asians. If so, this won't be very difficult to prove in the near future as ancient DNA research revs up a few notches.

On a related note, apparently there's a paper on the way with ancient DNA results from Rakhigarhi, a Harappan site in Haryana, northern India (see here). As far as I know, the results will include Y-chromosome haplogroups of three males, but I don't think we'll see any decent genome-wide data at this stage. However, hopefully I'm wrong and the paper will come out with full ancient genomes.

Feel free to post your predictions in the comments. I'm tentatively expecting a couple of instances of J2 and maybe an L or H. Razib made basically the same prediction recently so I'm not being original. What I do know is that we won't see any R1a-Z93. The only way that might happen is if, say, someone coughed or sneezed on the Harappan remains.

Data source and reference...

Mathieson et al., Genome-wide patterns of selection in 230 ancient Eurasians, Nature, 528, 499–503 (24 December 2015), doi:10.1038/nature16152

Saturday, January 2, 2016

Spatio-temporal segment sharing analysis featuring eight ancient genomes


No one's done this yet, probably because at this stage it's still a crazy idea. But sometimes crazy ideas actually work. Here's a map:


The map is based on the spreadsheet below, which shows the total length, in centimorgans (cM), of relatively large genome-wide tracts shared by the ancient individuals. These tracts are probably for the most part Identical-by-Descent (IBD). An extended version of the table, including ~1500 present-day Eurasians, can be viewed here.


I used Beagle 3 and fastIBD for the job. The dataset included just over 300K SNPs that showed a call rate of 100% in all of the ancient samples, so as not to potentially bias their results by imputing missing markers.
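The two bookkeeping steps here, keeping only SNPs called in every sample and then adding up shared tract lengths per pair, can be sketched in Python roughly as follows. This is not the Beagle or fastIBD workflow itself. The function names, the input layouts, the `min_cm` cut-off and the sample labels are all assumptions made up for illustration, and the numbers are toy values.

```python
from collections import defaultdict


def fully_called_snps(genotypes, missing=None):
    """Keep SNP ids whose calls are present in every sample (100% call rate).

    genotypes: dict mapping SNP id -> list of calls, one per sample.
    """
    return [
        snp
        for snp, calls in genotypes.items()
        if all(call != missing for call in calls)
    ]


def total_shared_cm(segments, min_cm=0.0):
    """Sum shared tract lengths per pair of individuals.

    segments: iterable of (id_a, id_b, length_cm) tuples.
    Tracts shorter than min_cm are ignored.
    """
    totals = defaultdict(float)
    for id_a, id_b, length_cm in segments:
        if length_cm < min_cm:
            continue
        pair = tuple(sorted((id_a, id_b)))  # (a, b) and (b, a) are one pair
        totals[pair] += length_cm
    return dict(totals)


# Toy inputs only (NOT real data).
toy_genotypes = {"snp1": [0, 1, 2], "snp2": [0, None, 2], "snp3": [2, 2, 2]}
toy_segments = [
    ("ind1", "ind2", 3.0),
    ("ind2", "ind1", 2.5),
    ("ind1", "ind3", 0.5),
]
print(fully_called_snps(toy_genotypes))
print(total_shared_cm(toy_segments, min_cm=1.0))
```

Dropping every SNP with a missing call throws away a lot of markers, but it means no genotype in the ancient samples has to be filled in by imputation.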

To do this by the book, I'd need to run many more ancient individuals, at least a few from each archaeological culture of interest, sequenced at comparably high coverage and genotyped in exactly the same way. This might be possible within a year or two.

Having said that, the results from my quick and dirty test run make perfect sense. Here are a few observations:

- The Corded Ware individual from Germany shows a close relationship to the Yamnaya individual from the North Caspian region, but no relationship to the two Neolithic farmers from Central Europe, NE1 and Stuttgart, supporting the idea that the Corded Ware Culture was introduced into Central Europe by migrants from the Pontic-Caspian Steppe.

- The Srubnaya individual from the North Caspian shares a lot of cM with the Corded Ware individual, and also shows a stronger relationship to other ancient Central Europeans than to the Yamnaya individual buried only kilometers away, suggesting that the Srubnaya Culture was introduced to the Pontic-Caspian Steppe from Central Europe or surrounds.

- The closer relationship between the Yamnaya individual and the Late Bronze Age Hungarian, BR2, than between the latter and the Corded Ware individual, gels with archaeological data showing that Yamnaya groups moved into the Carpathian Basin via the Balkans.

- Weak segment sharing between the Yamnaya individual and Kotias, a Mesolithic Caucasus hunter-gatherer (CHG) from western Georgia, suggests that the Yamnaya population did not receive its CHG admixture from the southwestern Caucasus.

- Elevated segment sharing between BR2 and present-day speakers of Baltic and Slavic languages suggests that BR2, or his close relatives, contributed genealogically in a significant way to the Balto-Slavic expansions that affected most of East Central and Eastern Europe during the Iron Age and early Medieval period.

The ancient DNA data used in my experiment came from the following studies:

Gamba, C. et al. Genome flux and stasis in a five millennium transect of European prehistory. Nat. Commun. 5:5257 doi:10.1038/ncomms6257 (2014).

Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317

Jones, E. R. et al. Upper Palaeolithic genomes reveal deep roots of modern Eurasians. Nat. Commun. 6:8912 doi: 10.1038/ncomms9912 (2015).

Lazaridis et al., Ancient human genomes suggest three ancestral populations for present-day Europeans, Nature, 513, 409–413 (18 September 2014), doi:10.1038/nature13673

Mathieson et al., Genome-wide patterns of selection in 230 ancient Eurasians, Nature, 528, 499–503 (24 December 2015), doi:10.1038/nature16152

Friday, January 1, 2016

Kum6: Sardinian-like genome from Late Neolithic western Anatolia


Behind a pay wall at Current Biology:

Summary: Anatolia and the Near East have long been recognized as the epicenter of the Neolithic expansion through archaeological evidence. Recent archaeogenetic studies on Neolithic European human remains have shown that the Neolithic expansion in Europe was driven westward and northward by migration from a supposed Near Eastern origin [ 1–5 ]. However, this expansion and the establishment of numerous culture complexes in the Aegean and Balkans did not occur until 8,500 before present (BP), over 2,000 years after the initial settlements in the Neolithic core area [ 6–9 ]. We present ancient genome-wide sequence data from 6,700-year-old human remains excavated from a Neolithic context in Kumtepe, located in northwestern Anatolia near the well-known (and younger) site Troy [ 10 ]. Kumtepe is one of the settlements that emerged around 7,000 BP, after the initial expansion wave brought Neolithic practices to Europe. We show that this individual displays genetic similarities to the early European Neolithic gene pool and modern-day Sardinians, as well as a genetic affinity to modern-day populations from the Near East and the Caucasus. Furthermore, modern-day Anatolians carry signatures of several admixture events from different populations that have diluted this early Neolithic farmer component, explaining why modern-day Sardinian populations, instead of modern-day Anatolian populations, are genetically more similar to the people that drove the Neolithic expansion into Europe. Anatolia’s central geographic location appears to have served as a connecting point, allowing a complex contact network with other areas of the Near East and Europe throughout, and after, the Neolithic.

Omrak et al., Genomic Evidence Establishes Anatolia as the Source of the European Neolithic Gene Pool, Current Biology, DOI: http://dx.doi.org/10.1016/j.cub.2015.12.019