Tuesday, March 3, 2015

First look at Bell Beaker, Corded Ware and Yamnaya genomes


It's usually not a good idea to try and force people who've been dead for thousands of years into analyses based on modern genetic variation. However, that's what I've done here by running 20 of what I consider the most interesting samples from the freshly published Haak et al. 2015 paper with the Eurogenes K15 and 4A Oracle.

K15 ancestry proportions + other data

K15 4A Oracle results

My experience is that the K15 is an excellent tool for exploring ancient genomes, and I think it's done a great job here. Below are a few of my observations based on the output:

- the best two-way mixture model for the Yamnaya genomes, from the Samara region near the Russo-Kazakh border, is Samara_HG/Tabassaran, rather than Karelia_HG/Armenian as per Haak et al. (see discussion below)

- far Eastern Europeans like Volga Tatars and Finns are the most similar modern populations to these Yamnaya samples, which makes good sense considering uniparental marker data and geography (for instance, see this map posted by Richard recently in the comments)

- the unusually high Amerindian and South Asian ancestry proportions among the Yamnaya genomes are very likely the result of their extreme levels of Ancient North Eurasian (ANE) ancestry, estimated by me with the West Eurasia K8 to be around 35%

- the German Bell Beaker sample appears to be a complex mixture of populations from several different parts of Europe, including the Yamnaya horizon, so based on this data it's impossible to pinpoint the main geographic source of the Bell Beaker population expansion, if indeed there was such a source

- three out of the four German Corded Ware genomes are obviously of mixed origin, presumably between Corded Ware migrants from Eastern Europe and earlier middle Neolithic inhabitants of North-Central Europe, but still largely of Yamnaya or very similar ancestry

- Eastern European foragers Karelia_HG and Samara_HG don't show any hints of Near Eastern admixture
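
For readers unfamiliar with how an Oracle-style two-way fit is scored, here is a minimal Python sketch. It is not the actual 4A Oracle code; it assumes the fit simply minimizes the Euclidean distance between a sample's component percentages and a weighted blend of two reference populations. The function name, the reference labels and the three-component toy vectors are all made up for illustration and are not real K15 output.

```python
from itertools import combinations
from math import sqrt


def best_two_way_mix(target, refs, steps=20):
    """Return (distance, name_a, name_b, weight_a) for the closest two-way blend.

    Assumption: fit quality is plain Euclidean distance between admixture
    vectors, searched over a coarse grid of mixing weights.
    """
    best = None
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(refs.items()), 2):
        for i in range(steps + 1):
            w = i / steps  # share of ancestry assigned to reference A
            blend = [w * a + (1 - w) * b for a, b in zip(vec_a, vec_b)]
            dist = sqrt(sum((t - m) ** 2 for t, m in zip(target, blend)))
            if best is None or dist < best[0]:
                best = (dist, name_a, name_b, w)
    return best


# Invented toy percentages (NOT real K15 results), three components each.
toy_refs = {
    "ref_forager": [80.0, 20.0, 0.0],
    "ref_caucasus": [10.0, 30.0, 60.0],
    "ref_farmer": [0.0, 10.0, 90.0],
}
toy_target = [45.0, 25.0, 30.0]  # built as exactly 50/50 ref_forager + ref_caucasus

print(best_two_way_mix(toy_target, toy_refs))
```

With the toy data, the search recovers the forager/Caucasus pair at a 50/50 weight with a distance of zero. Real samples never fit that cleanly, which is why the reported distance matters as much as the pair itself.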

I'll post the K8 ancestry proportions for the same 20 ancient genomes in a couple of days. A lot of people will probably be surprised by the results of the Yamnaya samples. Not only do they show unusually high levels of ANE, but also only around 25% of Near Eastern or Early Neolithic Farmer (ENF) ancestry.

Admittedly, these results are somewhat at odds with the findings of Haak et al., who were able to fit the Yamnaya as 50/50 Karelia_HG/present-day Armenian or Iraqi Jewish. Well, this might be a statistically valid fit, but I'm simply not seeing any obvious connection between Armenians or Iraqi Jews and the Yamnaya samples.

As per above, a more sensible solution appears to be Samara_HG/Tabassaran, but based on the K8 output I'd say an even better solution would be to model the Yamnaya as a three-way mixture between Eastern European foragers, early Neolithic farmers straight from the Near East, and perhaps some sort of Central Asian population very similar to the main ANE-proxy MA-1 or Mal'ta boy. But more on that later.

Update 04/03/2015: I've also now analyzed most of the early and middle Neolithic samples from Haak et al. (see here). The results clearly suggest that a profound genetic shift took place in Germany from the middle to the late Neolithic.

Citation...

Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317

See also...

Fitting the Yamnaya with qpAdm

245 comments:

Roy King said...

What do you make of the large West Asian component among Yamnaya? Is this the same West Asian component that is absent among most of the Neolithic/Chalcolithic samples of Europe?

Davidski said...

Yes, it's one of the components lacking among the EEF.

My estimate using the K8 is that it's 70/30 Near Eastern/ANE, so that's probably why it doesn't show up in Neolithic Europe.

Roy King said...

@Davidski,
Or it could be that Neolithic Europe originates from the Levant, not Armenia/not Eastern Anatolia and that the West Asian component in, say contemporary Lebanon, descends later from the South/NE Caucasus, perhaps with Kura-Araxes or with Hurrians.

Davidski said...

It depends when ANE arrived in the Caucasus. If it's been there since the LGM then we don't need any migrations from Central Asia to get the Yamnaya people, because they can be a two way mix of EHG/northeast Caucasians.

But then again, I've read that Hurrians came to the Caucasus from Central Asia during the Bronze Age. If so, it's possible that the West Asian component didn't appear until that time, because we need that 70/30 ratio to get it.

Shaikorth said...

The West Asian is prevalent in the Volga region so increased similarity to Tatars, Chuvash, Erzya and even Kargopol Russians compared to Estonia, Finland and Lithuania (and based on what I've seen so far, Karelians and Vepsians) in Oracles is not surprising and I expect K13 will show even more. It's noteworthy that Yamnaya's Fst-distances published in the paper back this result with Erzya topping that list, unfortunately there was no Chuvash or Tatar comparison. Eastern Baltic OTOH had smallest Fst's to WHG.

The idea of West Asian component as an IE signal to some degree might get a boost from this.

Nirjhar007 said...

David You are a Genius!

Karl_K said...

@Nirjhar007

"David You are a Genius!"

Finally you are making sense!

Nirjhar007 said...

You Too Karl!

Helgenes50 said...

Thanks David for your great work.
This is really exciting

PersonaMan said...

Very interesting.

K15 average for German Bell Beaker is basically the same as in the modern era.

Mark S said...

Great work David! Are you planning to run that Spanish R1b from Els Trocs too?

The K15 data on the R1b Samara HG has already gone a long way to helping understand the origin of R1b (not having any West Med, West Asian, East Med, Red Sea)

That Spanish R1b genome would be really interesting to look at, to see if it has any trace of EHG

Graham Little said...

Yeah K15 Bell Beaker Averaged sits in well with North Germans & South Dutch. Individually with other Modern Germanic speaking nations.

Alberto said...

Thanks David!

An interesting detail being noted by others is how much WHG ancestry was left in Europe in the Late Neolithic once you moved away from typically neolithic cultures. Even the resurgence of WHG ancestry among the middle Neolithic cultures doesn't show how much WHG ancestry was moving around outside these settlements.

The Bell Beaker I0112 is a testament to this. Probably very low ANE, but also pretty low ENF. Nothing like a Sardinian or a Yamnaya sample.

Gaspar said...

Why are the far older G2a DNA finds around Karsdorf, Germany, from the Haak paper, which are in BB and CW lands, NOT incorporated into this admixture?

Romulus said...

Wow, those Amerindian levels are high. You should have thrown Mal'ta on that list too for comparison.

capra internetensis said...

Tabasarans had the second highest frequency of L23(xL51) in all the populations sampled by Myres (the very highest was another Dagestani group). Some very basal Z2103 has shown up in Dagestan as well.

Looks to me like they are carrying a good deal of very ancient Samara Yamnaya-like ancestry, paternally at least (and NE Caucasian mtDNA isn't all that far from Yamnaya either, at the higher clade level anyway).

So what do Tabasarans look like in terms of Yamnaya + something else, or EHG content?

Grey said...

"An interesting detail being noted by others is how much WHG ancestry was left in Europe in the Late Neolithic once you moved away from typically neolithic cultures."

They probably just retreated to refuge terrain the farmers couldn't use.

I expect they would have been absorbed by the farmers eventually if the farmers had had more time but the IE incursion prevented it.

And that's the big difference between European and East Asian history imo - the European farmer expansion getting stalled for a while.

Jean said...

The raised level of ANE in Yamnaya over the Karelia and Samara hunter gatherer samples may be explained by the arrival of further ANE-rich bands from Siberia over the next two thousand years.

Matt said...

On the K15, IIRC we had some predictions earlier in comments on this blog, I think, that Bell Beakers would be almost 100% North Sea, Corded like Baltic, Yamnaya 100% Eastern_Euro (minus its East Asian) etc.

While those don't seem to have been borne out by the K15 analysis, the highest component for each group seems close to what was thought (I can't actually remember where the exact predictions were posted) -

- the Yamnaya are mostly highest for East Euro
- Corded Ware mostly highest for North Sea
- Bell Beaker mostly peaks in North Sea, with a noticeable increase in Atlantic over Corded Ware and decline in Eastern Euro, and the first appearance of West Med, which surely represents a drifted form of ancestry originally unique to Spain

(on this topic, it would be interesting to see if the Middle Neolithic Spanish score *really* high in this component).

Lowest to highest in the four "North European" (North Sea, Atlantic, Baltic, Eastern Euro) components, Yamnaya->Beaker->Corded->Karelia->Samara.

Lowest to highest in the three "North European" components excluding Atlantic, Beaker->Corded->Yamnaya->Karelia->Samara.

These components aren't equally HG/Neolithic of course, and much more HG ancestry for Yamnaya, Karelia and Samara is contained in Amerindian.

Davidski: A lot of people will probably be surprised by the results of the Yamnaya samples. Not only do they show unusually high levels of ANE, but also very low levels of Near Eastern or Early Neolithic Farmer (ENF) ancestry of around 25%.

This is why I think f4 and D stats trying to isolate the pure Basal vs "Crown" Eurasian level in Yamnaya relative to various Europeans may be important, in case the ENF is just not a good fit for Yamnaya and Basal Eurasian spread independently of ENF to a degree.

Roy King said...

@Davidski,
Please do the ?4 EN Spanish Cardial Samples. Thanks!

Karl_K said...

@Grey

"And that's the big difference between European and East Asian history imo - the European farmer expansion getting stalled for a while."

Nope. That's not the big difference. But you are at least getting warmer.

Chad Rohlfsen said...

Matt,
West Med was the core neolithic marker.

Chad Rohlfsen said...

All had it as their main component. Spain or not. It's just a non ANE farmer marker.

PersonaMan said...

Interesting to note that the German BB with P312 is the only one without any West Med. Potentially says a lot about P312 IMO, especially in a period like this when, as David says, BB looks relatively mixed.

Matt said...

Chad: All had it as their main component. Spain or not. It's just a non ANE farmer marker.

Hmmm. Can't remember how much Stuttgart or Oetzi had? You got any component averages for them? You really don't think that the Spanish Neolithic might not have relatively more of it than them, and this might not mean something?

Roy King said...

@Matt,
Oetzi had 48% and Stuttgart had 47% West_Med from K15.

Chad Rohlfsen said...

CO1 46% West Med, 17% East Med
Stuttgart 47% West Med, 27% East Med
BR1 19% West Med, 0% East Med
NE1 40% West Med, 31% East Med

Just for example

Krefter said...

@PersonaMan,
"Interesting to note that the German BB with P312 is the only one without any West Med. Potentially says a lot about P312 IMO, especially in a period like this when, as David says, BB looks relatively mixed."

All the Bell Beaker females Davidski has may have had fathers who were P312. It doesn't tell us anything. This test is based on people who lived over 4,000 years after these individuals died. They won't fit perfectly in it. Sometimes West Asian is used to express high ENF, sometimes West Med is. The Bell Beaker samples have very different results from each other because this test is for modern people.

Matt said...

I'm not suggesting that populations carrying Neolithic ancestry ancestral to the Spanish Neolithic might not score in West Med if that's all that's available and a decent enough approximation.

More that the component itself may also represent a composite of Early Neolithic Europeans plus drift in Spain, and thus score much higher in actual Spanish Neolithic samples (and those at least partially derived from them such as Bell Beaker).

Chad Rohlfsen said...

Bell Beaker has less West Med than the Bronze Age Hungarians, if that tells you anything...

Although, inferring something about someone that is much older than these components can lead to odd results, just as having farmers score Atlantic, which has ANE. It is much better to use ancient genomes as references.

Matt said...

It's that the increase in West Med in Bell Beaker compared to the nearby Corded Ware is so disproportionate relative to the East Med (I think found in other early farmers) that's mainly interesting to me.

Chad: Although, inferring something about someone that is much older than these components can lead to odd results, just as having farmers score Atlantic, which has ANE. It is much better to use ancient genomes as references.

I think there would probably be a degree of error in viewing each of the K15 components as no more than the sum of its composition at K8. That these ancient samples score in one K15 component rather than another is, I think, an indication of relatedness in allele frequencies to the modern populations bearing that K15 component, rather than a mere attempt to approximate K8 with what is available in K15.

But that is a tangent that it's probably better not to eat this comment section up with.

capra internetensis said...

This set of components seems to be capturing a lot of variation in the farmer groups but not in the EHGs - which I guess is to be expected. The main difference between the foragers is that the Samaran has 2% South_Asian and 1% Oceanian, while the Karelian has neither but makes up for it with higher Amerindian. That would make sense if there is some kind of Central Asian ancestry leaking into the Samaran.

The one Yamnaya sample who wasn't from Samara oblast, but from near Kazakhstan, is quite an outlier. He has virtually no Atlantic, and by far the least South_Asian (despite being located furthest southeast), but has the highest levels of West_Asian and Eastern_Euro both. There is also another Yamnaya without Atlantic, but he is more normal, having instead the highest Baltic and the lowest East_Euro of all the Yamnaya samples.

Curiously, the only significant Northeast_African is carried by a Corded Ware guy, and Siberian by Bell Beaker people.

Arch Hades said...

The WHG, EEF, ANE triangle as it is now will fail as a good model for the Yamna IMO, because the 'Neolithic' component in them is very distinct from the Sardinianesque Neolithic component in Central, Western, and Southern European farmers. At least the Karelian hunter gatherers are basically a mixture 60/40 of WHGs and ANEs but the EEF here doesn't really work for the other major component of the Yamna.

Chad Rohlfsen said...

Also,
Refer to the paper.
Bell Beaker shared more drift with BR1, Loschbour, Motala, Samara, Pitted Ware, German MNE like Esperstedt, than either of the Spanish Neolithic groups.

Unless Bell Beaker somehow became 2/3 Corded Ware, without sharing jack for markers, it might be safer to assume that it has the same ancestor as the Yamnaya in BR1, as Early Mako is considered ancestral to these Beakers. I think Cotofeni is a safe bet. I'd love to see some Kemi Oba results. Due to its proximity to the Chalcolithic groups in Central Europe, I'd bet it isn't a whole lot different than Bell Beaker.

Chad Rohlfsen said...

Plus, look at Corded's shared drift. BR1 is down on the list.
It makes Beaker look more like BR1 with some Corded admixture in a couple of them.

Krefter said...

Modern west Eurasian K15 scores broken down by region.

https://docs.google.com/spreadsheets/d/1kbEaP9EQzOHdu3-Mfqmrnso2jmquC8sD3FC4F3qR5d8/edit?usp=sharing

Ancient West Eurasian K15 scores. I only have Bronze Age and Iron Age-Medieval, I'll add the Mesolithic, Upper Palaeolithic, and Neolithic ones later.

https://docs.google.com/spreadsheets/d/1oUCR_HX8F0pCo8kf9nSf0L-BOoH-HuosdJgEt-BN-M0/edit?usp=sharing

Krefter said...

Davidski, did you put the new results into the "Make yourself a mixture of ancient genomes" data set?

Krefter said...

Where can I download these genomes? There are tools online where I can look at any SNP in the samples I want.

Matt said...

Chad: Bell Beaker shared more drift with BR1, Loschbour, Motala, Samara, Pitted Ware, German MNE like Esperstedt, than either of the Spanish Neolithic groups.

If this is page 66 f3 shared drifts, HungaryGamba_BA, the Bronze Age sample is low down on the list of shared drift, lower than the Spain_MN or Spain_EN.

HungaryGambaHG (Hungarian plain hunter gatherer KO1) does indeed top the table though. Not sure what you mean by BR1 here?

Matt said...

Krefter - Reich Lab datasets page Haak et al Nature 2015 - http://genetics.med.harvard.edu/reich/Reich_Lab/Datasets.html

Chad Rohlfsen said...

Sorry, you're correct. I should've squinted more.

Still, Spain is below EHG, and German groups. That is telling.

Davidski said...

I'm running the early Neolithic samples with the K15 now. I'll link to a spreadsheet with the results in a couple of hours.

capra internetensis said...

OK, noob question: what does this kind of analysis tell us about the ancient genomes that we didn't already know?

Or is the point to tell us something about the modern components?

Matt said...

@ Chad, yeah, I think I did that a few times reading these figures as well -

The hierarchy for Beaker's f3 sort of goes WHG->SHG->Beaker like German Late Neolithic->German Middle Neolithic->German_Corded->EHG->Spanish Middle and Early Neolithic->German Early Neolithic

I'd interpret that to mean that the HGs tend to get very high shared drift scores (for some reason), and populations they've directly and recently contributed to also seem to get a boost. It also seems to me like these "German" Bell Beaker should have substantial ancestry from contemporary people from Germany. My impression was Beaker looked like Corded's stats plus a substantial amount of Spanish Middle Neolithic stuff specifically, but that's just a first impression, perhaps a biased one, and you've said there's not much uniparental marker overlap for them with Corded Ware.

But it's hard for me to interpret - taking just Bell Beaker's f3 stats, it's hard to see why La Brana should be so much lower than the other HGs for instance, or the Copper Age Hungarian or Iceman so much lower than other similar early and middle Neolithic like people.

Shaikorth said...

The f3 and f4 stats could be distorted by a sample's high internal IBS sharing, or if we're speaking of an ancient individual, misreads caused by low coverage or varying numbers of SNPs.

Krefter said...

Matt,
"Krefter - Reich Lab datasets page Haak et al Nature 2015 - http://genetics.med.harvard.edu/reich/Reich_Lab/Datasets.html"

I can't do anything with that. Most of us will have to wait till Felix converts it.

Romulus said...

@Krefter

The pigmentation results from Felix's analysis will be interesting.

ryukendo kendow said...

Wow, these are almost exactly as I predicted!

I predicted that East Euro would be highest, and be in a 1:1 to 2:1 ratio to West Asian. North Sea would be high too, Baltic would be depressed, no Med scoring would be present, and exotic Amerindian and South Asian scores might occur. These are borne out.

We were thrown off by the low levels of ENF, however, which caused North Sea and West Asian to switch places compared to what I expected. The actual level of ENF ancestry from the Near East is likely to be higher than the West Asian component suggests, because ~22% ENF cannot fit in a smaller fraction of West Asian. Judging from patterns in other ADMIXTURE runs, and since there are no Euro Neolithic components, the remainder is probably dumped in North Sea. Interestingly, the Tabassaran and Caucasus pops in general have elevated North Sea.
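
The "cannot fit" point is simple arithmetic if one takes Davidski's estimate from earlier in this thread that the West Asian component is roughly 70/30 Near Eastern/ANE. A minimal Python sketch, assuming that split holds and that all of the ENF had to sit inside West Asian (both are assumptions for illustration, and the function name is made up):

```python
def min_west_asian_needed(enf_pct, near_eastern_share=0.70):
    """West Asian percentage required to hold enf_pct of Near Eastern ancestry.

    Assumption: a fixed share of the West Asian component is Near Eastern
    (0.70 here, from the 70/30 estimate quoted in this thread).
    """
    return enf_pct / near_eastern_share


needed = min_west_asian_needed(22.0)
print(round(needed, 1))  # 31.4
```

Under those assumptions a sample with ~22% ENF would need a West Asian score of about 31% or more. Anything lower means part of the ENF is being assigned to other components.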

David, what kind of variation are we talking about when you say that the ANE vs WHG figures are choppy for the Yamnaya samples? Are they choppy for the EHG samples?

If ENF levels are highly homogeneous, but ANE:WHG are choppy, that is quite mysterious.

Central Asians today all carry quite a bit of ENA ancestry from ASI. If we do D stats for Yamnaya to detect signs of ENA ancestry, that might help us figure out if any ancestry arrived from farther afield, instead of just locally from the Caucasus. I'm guessing that Armenian and other Near East pops were chosen because they were a more 'extreme', more differentiated version of North Caucasian pops compared to EHG.

It's interesting that Tabassaran were chosen as the other half. Dienekes has discovered, using a South Asian centered dataset, that a component peaking in Urkarah and Stalskhoe in south Dagestan distinguishes higher castes and IEs from non-IEs such as Burusho and low castes, and is spread around West Eurasia. Tabassaran, Urkarah and Stalskhoe all speak Lezgic languages, and all three are found in a small region in S Dagestan.

What are you going to do next?

I suggest we do an admixture run with only Yamnaya, modern European, BA, EHG and Neolithic genomes, and post the results at successive K, so that the variation comes out of the dataset naturally.

Can you post fst distances between the components of the k15?

Romulus said...

It is interesting that the K15 "West Med" group that is so prevalent in all of the Neolithic European samples is completely absent in the Yamnaya. Also interesting that the West Asian component peaks in the Yamnaya but is absent in the HGs, Mal'ta, and European Neolithic samples.

It seems clear that the Near Eastern group which the R1a/b HGs interbred with was completely exclusive from European EEF. So Yamnaya are 0% EEF.

Gill said...

How much ANE is in the Karelia/Samara HGs compared to Yamnaya? They have high Amerindian...

Davidski said...

Both of the EHG have around 38% ANE, which isn't much more than the Yamnaya.

Also, their ANE appears to be different, and indeed more Amerindian like, while the Yamnaya have both the Amerindian-like ANE as well as the Hindu Kush-like ANE.

So it's obvious now that not all of the ANE in the Yamnaya comes from the EHG, but also from Central Asia.

Krefter said...

@Romulus,
"The pigmentation results from Felix's analysis will be interesting."

Yeah I bet they will. We have a big data-set so exceptions won't be the rule. Yamna is very unique genetically, unlike anyone in west Eurasia today. They probably will have a lot of unchanged old-Eurasian dark alleles from ANE and EHG.

This is kind of how I picture EHG-ANE Mesolithic people.

http://cdn02.cdn.justjared.com/wp-content/uploads/2008/02/camilla-premiere/camillla-belle-10000-bc-premiere-06.jpg

Corded ware, Unetice, and Bell Beaker might come out very dark too. Although based on bronze age Hungarians probably not.

Gill said...

What's the difference between the two ANE types? Is the Amerindian-like ANE the older one?

Mike Thomas said...

@ Davidski

Is it possible to break the data down even further into more baseline components, e.g. instead of "Yamnaya" have EHG/ANE and 'West Asian' (or however you choose to define the latter)?

Davidski said...

Gill,

I'd say the difference between the two is just drift. The Central Asian ANE blurs somewhat with the South Eurasian component, while the Amerindian ANE with WHG.

Mike,

Yes, I'll post the K8 results soon.

Krefter said...

@Davidski,

Which Yamna individual scored 38% ANE? I'd like to see how his K15 differentiates from the others.

Davidski said...

I'm pretty sure Yamnaya_I0429 was the sample with the highest ANE, but I'll double check soon.

Chad Rohlfsen said...

Krefter,
I've actually got recreations of the Samara, Karelia, Yamnaya, Sredny Stog, Catacomb, and Andronovo people.

They're nothing like that guy.

Krefter said...

Of course not in facial features because the actor isn't EHG. A reconstruction like they did for Loschbour for the Haak genomes would be cool to see.

Mike Thomas said...

Krefter
Did you look closely at the Polish Lengyel mtDNA data?

Chad Rohlfsen said...

Want a site? It has just about every bust out there. Some bullshit pics, but the busts are great.

http://istorya.ru/forum/?showtopic=5127

You'll need google translate

Seinundzeit said...

David,

This stuff is tremendous, thanks!

And that is the expected result (assuming it really is the case): the one with the highest ANE (Yamnaya_I0429) also has the highest combined South Asian (7.37%) and Amerindian (5.16%) score.

Krefter said...

Mike, no. From what I did see it's typical of Neolithic Europe.

Mike Thomas said...

Ok thanks

ryukendo kendow said...

@ David
What evidence is there exactly for the multiple ANE idea?

Krefter said...

I added a lot of Mesolithic and Neolithic genomes to the ancient K15 spreadsheet.

https://docs.google.com/spreadsheets/d/1oUCR_HX8F0pCo8kf9nSf0L-BOoH-HuosdJgEt-BN-M0/edit#gid=885233802

Here's the modern one broken down by region again.

https://docs.google.com/spreadsheets/d/1kbEaP9EQzOHdu3-Mfqmrnso2jmquC8sD3FC4F3qR5d8/edit

Davidski said...

The Central Asian-specific signals otherwise closely correlated with ANE don't show up in Scandinavian hunter-gatherers despite the fact that they're estimated to have 15-19% ANE. These signals also don't show in Karitiana Indians nor Karelia_HG, and only vaguely in Samara_HG.

Chad Rohlfsen said...

ryu,
EHG likely shares more drift with Native Americans.
Remember, the tree had Amerindians as 58% EHG vs 41% MA-1. It makes more sense that EHG is a pure "ANE", descended from MA-1 like stuff, drifted with Native Americans. I bet we will see the reverse in South Asia. If David can isolate EHG from MA-1 stuff, a K9 could then have EHG and MA-1. It would be interesting to see how that plays out.

Chad Rohlfsen said...

EHG may then be the true Yamnaya marker, for mixture into South Asia, and West Asia. Provided it isn't too high in Loschbour and EEF. But, we will see.

Krefter said...

Here are some of my ideas about the ancient K15 scores. All of this should be obvious to everyone by now.

WHG and EEF-related.
Atlantic
West Med
North Sea (only if WHG is exceptionally high)
East Med (mostly only in Early Neolithic)

EHG, ANE, and Yamna-related.
East Euro
West Asian
Baltic and North Sea (kind of, because it has a decent chunk of ANE)
Amerindian
South Asian

Atlantic drops from WHG to SHG and is nearly non-existent in EHG. East Euro drops from EHG to SHG and is very small in WHG. North Sea and Baltic flow between both of them, because they're an east-west mix.

North Sea, Atlantic, and Baltic are a similar mix to LN/BA, which is why they score so high in them. Atlantic, though, will significantly drop as ANE rises, because it is mostly WHG-EEF.

West Med and West Asian trade off against each other in LN/BA. If the sample has fairly low ANE, West Med will take the place of West Asian to represent excess ENF ancestry. This is why Bronze Age Hungarians score high in West Med and not West Asian.

Krefter said...

There are several posters who I think are wrongly assuming that CWC and BBC K15 scores give clues to which region of Europe their closest living relatives are in. Central Europeans keep coming up as most similar in K15 because....

These Europeans have east and west European ancestry and so score very evenly in North Sea/Atlantic and Baltic/East Euro. The west vs east Euro specific drift (or whatever) didn't exist in the Bronze Age IMO, so Bronze Age samples score evenly between east and west Euro centered components.

Krefter said...

Something interesting to note is that none of the Bronze Age samples score much above 0% in East Med and Red Sea, although outside of the Baltic and North Sea regions all modern Europeans score in these components.

Both could be a signal of recent near eastern ancestry in Europe.

Marnie said...

@Krefter

"Something interesting to note is that none of the Bronze Age samples score much above 0% in East Med and Red Sea, although outside of the Baltic and North Sea regions all modern Europeans score in these components."

That was apparent in Dienekes' Dodecad experiments in 2010.

Davidski said...

I now have K15 results for most of the early and middle Neolithic samples. See update above.

ryukendo kendow said...

Well, the only outlier now is BR2.

@ Krefter

ADMIXTURE is not a simultaneous equation solver, so components do not trade off in that way.

If the only thing that mattered were 3-ratios, there are millions of possible solutions using 15 components to create a particular ratio. That ADMIXTURE chooses to attribute ENF+WHG to EEF sometimes and ENF to East Med and WHG to Baltic separately sometimes, is informative.
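
To make the non-uniqueness point concrete, here is a toy Python sketch. It assumes each fine-grained component can be described as a fixed blend of WHG, ENF and ANE, which is a simplification for illustration only. The component names and their compositions are hypothetical, not taken from the K15.

```python
def collapse(components, composition):
    """Sum a fine-grained admixture vector into coarse (WHG, ENF, ANE) totals."""
    totals = [0.0, 0.0, 0.0]
    for name, share in components.items():
        for i, frac in enumerate(composition[name]):
            totals[i] += share * frac
    return [round(t, 6) for t in totals]


# Hypothetical make-up of four fine components in (WHG, ENF, ANE) terms.
composition = {
    "pure_whg": (1.0, 0.0, 0.0),
    "pure_enf": (0.0, 1.0, 0.0),
    "pure_ane": (0.0, 0.0, 1.0),
    "whg_enf_blend": (0.5, 0.5, 0.0),
}

# Two different fine-grained fits for the same imaginary sample.
fit_a = {"pure_whg": 0.3, "pure_enf": 0.3, "pure_ane": 0.4, "whg_enf_blend": 0.0}
fit_b = {"pure_whg": 0.1, "pure_enf": 0.1, "pure_ane": 0.4, "whg_enf_blend": 0.4}

print(collapse(fit_a, composition))
print(collapse(fit_b, composition))
```

Both fits collapse to the same 30/30/40 split, so the three-way ratio alone cannot say which fine-grained solution is right. Which one the algorithm actually picks therefore carries extra information.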

By ratio logic, the higher WHG, higher ANE Germanic Hinxton genomes should not get less Baltic than the lower WHG, lower ANE Celtic Hinxton genomes. But they do.

Otherwise there would be no point in running the K15 apart from the K8 at all.

And it is not true that CW scores as so central solely for that reason. Why would they score so high in North Sea, unless they shared some dimension of drift in common with populations high in this component? Why do pops in the Caucasus score in North Sea instead of Baltic?

If anything, the admixture confirms the f-stats that Corded Ware kinda resembles west Europeans more in autosomal ancestry. Here's what the oracles say about them, I picked the top 5 stats, in no particular order:

1 Icelandic+Yamnaya_I0357 @ 5.40548
2 North_German+Yamnaya_I0231 @ 5.616362
3 Danish+Yamnaya_I0357 @ 5.860548
4 Icelandic+Yamnaya_I0443 @ 6.082003
5 Danish+Yamnaya_I0443 @ 6.315309

1 Irish+Yamnaya_I0357 @ 8.571771
2 West_Scottish+Yamnaya_I0357 @ 8.695445
3 Southeast_English+Yamnaya_I0357 @ 9.350556
4 Icelandic+Yamnaya_I0357 @ 9.48351
5 Orcadian+Yamnaya_I0357 @ 9.834926

1 Estonian+Yamnaya_I0429 @ 4.185955
2 Finnish+Yamnaya_I0438 @ 4.876483
3 Finnish+Yamnaya_I0429 @ 5.169476
4 Southwest_Finnish+Yamnaya_I0438 @ 5.434564
5 Estonian+Yamnaya_I0438 @ 5.715392

1 North_Dutch+Yamnaya_I0357 @ 6.688911
2 Swedish+Yamnaya_I0357 @ 6.758269
3 Irish+Yamnaya_I0357 @ 7.034446
4 West_Norwegian+Yamnaya_I0357 @ 7.090177
5 West_Scottish+Yamnaya_I0357 @ 7.207898

Here's what they say about the Yamnaya:
1 50% Samara_HG +25% Icelandic +25% Tabassaran @ 7.610856
2 50% Samara_HG +25% Danish +25% Tabassaran @ 7.618812
3 50% Samara_HG +25% North_German +25% Tabassaran @ 7.687351
4 50% Samara_HG +25% Orcadian +25% Tabassaran @ 7.767098
5 50% Samara_HG +25% Tabassaran +25% West_Scottish @

So this NW Europe autosomal affinity receives corroboration via both formal and informal means.
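
For anyone wanting to tabulate Oracle output like the lists above, here is a small Python sketch of a line parser. The line format is inferred only from the results quoted in this comment, not from any Oracle documentation, and the function name is made up. A line that is cut off after the "@" is kept with its distance left empty rather than guessed.

```python
import re

# Pattern inferred from the quoted lines: rank, "+"-joined populations,
# then "@" and a distance (which may be missing if the line was cut off).
LINE_RE = re.compile(r"^(\d+)\s+(.+?)\s*@\s*([\d.]+)?\s*$")


def parse_oracle_line(line):
    """Split one Oracle result line into (rank, [populations], distance or None)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # not in the expected "rank pops @ distance" shape
    rank, pops, dist = match.groups()
    populations = [p.strip() for p in pops.split("+")]
    return int(rank), populations, (float(dist) if dist else None)


print(parse_oracle_line("1 Icelandic+Yamnaya_I0357 @ 5.40548"))
print(parse_oracle_line("5 50% Samara_HG +25% Tabassaran +25% West_Scottish @"))
```

Once parsed, the results can be sorted by distance or counted by population to see which references keep turning up.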

Davidski said...

It's obvious that the North Sea is the least specific of the northern European clusters, so it catches all the stuff that doesn't want to go anywhere else. I made this cluster so I know how it behaves.

And whatever makes the North Sea like this probably also influences the formal stats.

IMO it has something to do with genetic diversity and less ethnic-specific drift.

Davidski said...

Here's the paper that first introduced me to the peculiar matchiness of North-Central Europeans.

http://www.nature.com/ejhg/journal/v17/n7/full/ejhg2008266a.html

It doesn't explain why it happens, but certainly confirms it. The North Sea cluster is just another reflection of it.

Fanty said...

Yeah. I recall you made an experiment on which of the "northern European" (Atlantic, North Sea, Baltic, Eastern European) components non-Europeans best match.

And I recall that NW-Africans, NE-Africans and Arabs all matched North Sea best. By far.

It was not until regions like Iran or India that the best match switched to Eastern European.

Atlantic was some kind of second place for Africans and Arabs, but far behind North Sea.

And Baltic didn't match anyone outside of Europe at all, not even as second best choice.

ryukendo kendow said...

@ Davidski

Can you post that experiment?

Fanty said...

I think it becomes even more important now.

Because it foreshadowed the problems of trying to put people into clusters when they are completely different from the people originally used to build those clusters.

I recall that Africans came out as almost PURE (100%) North Sea. Maybe some as 99% North Sea + 1% Atlantic and stats like that.

Which suggests that some results for ancient people described with modern clusters mustn't be taken too literally.

Balaji said...

Davidski,

It looks like you are on the right track as far as the estimation of the ANE in the Yamnaya is concerned. From the Haak PCA, it looked to me that the Yamnaya had about the same ANE as the EHG, and this was estimated by Haak to be 40%, close enough to your number. If Ryukendo was right that the Near Eastern ancestors of the Yamnaya had contributed only ENF, the ANE of the Yamnaya would have been half that of the EHG, around 20%.

However please look again at the Haak ADMIXTURE figure.

http://biorxiv.org/content/biorxiv/suppl/2015/02/10/013433.DC1/013433-1.pdf

The Yamnaya distinctly have a purple component at K=6 and K=7 that is modal in Papuans and is prominent in South Asia. The Yamnaya could not have acquired this from New Guinea. They must have got it from South Asia.

Matt said...

So, going back to my earlier speculation on West Med and the Spanish Neolithic, the Els Trocs samples do peak that component, but not about the same as present day Sardinians, they're not the 60-70% West Med I thought they might be, so I was overestimating. Seems like the West Med component represents drift which was beginning in them, but not complete and which has never been totally complete because of IBD effects.

The tradeoff between early and middle Neolithic Spaniards is from East Med->Atlantic, which might fit with trading from a more Basal/ENF farmer to a more HG one, although none of these components separate those aspects totally, all probably have their own drift effects.
On that note, the similarly increasingly HG German Middle Neolithic folk get generally more North Sea and East Med than the Spanish MN (and even Baltic to a small degree), despite being similarly WHG+EN, underlining that the shift of these ancient samples into modern clusters seems to depend majorly on the ANE, WHG, ENF (or real components that correlate pretty well with them), but not totally (or the same changes in populations would result in exactly the same cluster shifts).

ryukendo kendow said...

@ Matt

The East med component seems to behave similarly to components that peak in the South Cauc, which are also represented in Euro neol from central europe apart from West Med in other people's analyses.

We may or may not be seeing something real in cases like this. I think it will take highly specific tools to find out, because the statistical realities in terms of allele freqs relationships are getting more and more complex at this level of resolution.

Matt, where did you get the fst distances of the k15 components?

@ Davidski

That might indeed affect the situation. It's difficult to know by how much.

The easiest way to find out is to simply run admixture with Europeans only with supervised clusters on ancient genomes, one in Yamnaya, one in Neol Euros, one in WHG and one in EHG. And another one on a modern Middle Eastern for pops like Sicilians and Spanish, for good measure.

We need to discern why ADMIXTURE in Haak and formal stats create such a plateau in Yamnaya ancestry in N Euros west till the Scottish, despite Yamnaya being almost 40% ANE.

That would illuminate things quite quickly.

saman sistani said...

The Spanish EN R1b boggles my mind. How can it have 0 West Asian and 0 East Euro? It is so different than the Samara R1b; they cannot have been from the same source.

Alberto said...

@saman

"The Spanish EN R1b boggles my mind, how can it have 0 West Asian and 0 East Euro,it is so different then the Samara R1b, they cannot have been from the same source."

All EEF score 0 in West Asian and East Euro, so that's not so surprising. The difference between this Spanish R1b and other Early Neolithic samples is that he scores the highest in West_med and lowest in East_med, suggesting that he is actually more "native" west European than the others (who are more Near Eastern). Or it might suggest a North West African origin, but that's less likely.

saman sistani said...

@Alberto

Thank you for your reply, so do you believe that his ancestors did not come from Siberia as he completely lacks any admixture that the Early Samara sample contains?

Alberto said...

@saman

He's the only R1b found so far at this place and time, so it's hard to tell. But obviously his ancestors did not arrive from Siberia recently. He looks mostly like any other Early Neolithic sample, with that small exception that I mentioned.

I think West_med peaks in modern Sardinians and probably Spanish, while it goes down in Eastern Mediterranean populations. So if anything, it seems that this R1b had some kind of "native" west European ancestry (mixed with a lot of EEF).

Dospaises said...

@saman

Here's something that might also help you. It only takes about 7-8 generations of mixing with a completely different group of people for the descendants of a person to completely lose the autosomal ancestry, but they can still retain the Y-DNA of their direct male ancestor. Something similar can happen in Latin America. An R1b Spaniard goes there and has children with natives. His sons have children with natives. After the 7th or 8th generation some of his descendants won't have any of his Spanish autosomal DNA.
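
A back-of-envelope check on the 7-8 generation figure: assuming every generation pairs with someone carrying none of the founder's ancestry, the expected autosomal share halves each time. The sketch below only computes that expectation. Real inheritance comes in discrete recombination chunks, so actual values scatter around these numbers and some descendants can end up with none at all, while the Y chromosome passes down the male line intact.

```python
def expected_fraction(generations):
    """Expected autosomal share from one founder after repeated outbreeding."""
    return 0.5 ** generations

# Print the expected share for generations 1 through 8.
for g in range(1, 9):
    print(g, f"{expected_fraction(g):.2%}")
```

By generation 7 the expectation is under 1%, which is below what most admixture tools can separate from noise.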

truth said...

@ Alberto

Modern Spanish have around 22% West_Med, which is about half of what Sardinians have at 48%.

Alberto said...

@truth

Yes, and once you move to the east Mediterranean (like Lebanon) it drops to about 10%.

More telling is the lower East_med, which is only 12% when most other Early Neolithic samples are above 20% and 30%.

Both things combined, it looks less Near Eastern and more "something western". But it's only one sample and I wouldn't read too much into it.

Fanty said...

"The Spanish EN R1b boggles my mind, how can it have 0 West Asian and 0 East Euro,it is so different then the Samara R1b, they cannot have been from the same source"

As Dospaises already stated, any signals of the original autosomal DNA hit zero with known measurement tools after roughly 7 generations.

But of course there are EXCEPTIONS:
1. People stick to their own kind and intermix only rarely or not at all, even in the new place. (example: Jews)

2. Massive migration (or a steady trickle of little migrants but over very long periods).
If there are too many migrants, the (gene) Pool gets too dirty to wash out the alien genes. But it turns around. All the others in that population will get amounts of the migrants genes.

And of course the migrants get them back too and never lose them.

Chris Davies said...

Anyone know what the 'West Med' % is in Mozabites, Berbers, Tuaregs, Saharawi, Fulani, Bulala, etc.?

Dospaises said...

@Fanty

Your scenarios of exceptions really are the norm. However, since the source of R1b is from somewhere east of Spain (most likely West Asia) and there are R1b groups that have a completely different autosomal makeup, we know that either Spain_EN lost the West Asian and East Euro that an ancestor had, or Yamnaya and Samara lost a lot of East Med. I'm going with the former.

PersonaMan said...

@Dospaises,

The former indeed seems more likely given that El Trocs R1b1 was one among many, while the R1b1 in Samara/Yamnaya is all there is so far.

Roy King said...

@Alberto,
Fascinating! The Spanish-EN samples are far removed from the coast and upland at 1500 m asl. They likely have an additional hunter-gatherer-forager genetic contribution apart from the initial cardial culture along the coast. Thus, no Y-G2a. The R1b sample, as you point out, is even less East-Med and more West_Med. This would suggest that West_Med is also a Mesolithic component enhanced in the non-coastal Cardial samples. Also the R1b* there must be a Mesolithic immigrant (not recent) likely from North Africa, which also has in modern populations a large West_Med component--24% among Berbers.

Fanty said...

"You scenarios of exceptions really are the norm."

Possibly.

But for that R1b it could be like this:

A handful (relatively speaking, in comparison to the target population's size) of R1b come from Russia/Siberia to the Middle East. Not enough to have an impact on the autosomal DNA.

An odd drift effect increases their number to 10-20% of the population. Higher than their original autosomal part.

Those went as farmers to Spain.

Fanty said...

Or...
Different:

It comes slowly, step by step, from Eastern Europe. In a chain of short migrations with local intermixture pauses.

And if it takes more than 7 generations from the Urals to Spain, it's all lost on the way.

Fanty said...

Add:
And of course... whatever falls from the wagon on the way doesn't drop into a black hole and disappear.
It's on the road.

So, by passing by... the others get ANEd while his lineage arrives washed clean at the final destination.

Dospaises said...

@Fanty

Whatever the route was, they were very likely a small minority, and it took them a long time to get to Spain, probably several hundred years, meaning they had to mix with the populations that they encountered on the way. It only takes 300 years for the autosomal DNA to change significantly in a population, especially if they are a small enough group and the host population they are joining, or being taken over by, is large enough. The odds are that is what happened.

Dospaises said...

@Roy King @Alberto

West_Med does peak in Sardinians as far as modern pops go. However, the Sardinian DNA has changed somewhat over time so the original population was likely even more like the Spain_EN. As far as ancient DNA West_Med peaks in NE6 from Hungary from 5300-4950 BC which was found in LBK in the European Neolithic. Oetzi is also very high in West_Med.

The La Braña and Motala Mesolithic hunter-gatherer-foragers don't have any West_Med: 0.1% and 0% respectively.

Moroccans, Spanish, Tunisians, Algerians, and Tuscans have close to the same amount of West_Med.

West_Med looks to be a Neolithic autosomal component found mostly where it gets its name from.

Chris Davies said...

'West_Med' bears some resemblance to the distribution of the HLA haplotype A*30:02-B*18:01 :-

http://en.wikipedia.org/wiki/A30-Cw5-B18-DR3-DQ2_%28HLA_Haplotype%29

Roy King said...

@Dospaises,
I think West_Med tracks the Mesolithic Castelnovian culture, which is found in North Africa and the Western Mediterranean circa 6500-6000 BCE.

AWood said...

@PersonaMan
The former indeed seems more likely given that El Trocs R1b1 was one among many, while the R1b1 in Samara/Yamnaya is all there is so far.

--Actually no. El Trocs R1b1(xM269) is only one of one. Do you follow this at all? As others have said, the remains from EEF cultures are relatively homogenous and variation disappears after a handful of generations.

PersonaMan said...

@AWood

That's exactly what I'm saying. El Trocs is one among many (of other lineages), i.e. the only one. So evidently his Y-ancestor was one of one (or at most one of a handful of R1b1) that found its way into groups dominated by other lineages and so lost all association.

My point is in complete agreement with what you are saying; you must have misunderstood what I wrote.

Grey said...

On the north_sea component thing.

Say you have two populations: a Levant like one taking the maritime route to Iberia and then up the Atlantic coast from there and a Samara one taking the overland route west through giant country to the Baltic and then down along the Atlantic coast from there then the people where the two flows met might end up being partially related to most everyone along both the two routes.

http://www.europeword.com/images/flagandmap/map_of_europe.jpg

Grey said...

Off-topic at the moment, but maybe not eventually: I wonder if goats make a suitable candidate for self-domestication if they had sedentary HGs nearby?

Dospaises said...

@Roy King

It seems to me that there are a couple of problems with that. If West_Med is especially high in the people of the Mesolithic Castelnovian culture they would have to be absent in all of the other K15 components because there is no introduction of any other component in the Spain_EN R1b. Additionally, the Mesolithic Castelnovians would have had to have been completely isolated from all other hunter-gatherers because, as I pointed out earlier, the La Braña1, Motala12, and also Loschbour are absent of West_Med and East_Med. All of the components of the Spain_EN R1b that are absent are also virtually absent in all of the other Early and Middle Neolithic samples. So Mesolithic Castelnovian would have had to have been very different from all other people which I can't see happening.

The Spain_EN R1b sample is also younger than the other Spain_EN samples. It seems to me that the different amounts are more due to chance, in that he inherited elevated levels of one component (West_Med) and lower levels of another (East_Med). It's only a 6.76% difference compared to I0413. The difference is only slightly more than what can be seen in siblings, but these people have several generations between them.

What I can agree on is that the paternal line of R1b was at one time a hunter-gatherer at least 400 years before his birth and probably took up farming just like the KO1 individual from Hungary which was I2a. Notice that the other male Spain_EN and other Middle Neolithic individuals are also I2a.


Graham Little said...

Was looking through the Euclidean scores on Bell Beaker. It seems like we have 2 groups. One that is South Dutch like, probably North French.

The Others Germanic. From Germany, Denmark to Iceland.

Grey said...

@Graham Little

"Was looking through the Euclidean scores on Bell Beaker. It seems like we have 2 groups."

I was wondering if all the arguments over the origins of BB might mean there were multiple strands (possibly with the same origin).

Krefter said...

Genetiker revealed something interesting with ancient Euro genomes looking at SNPs rs1042602 and rs1393350.

https://genetiker.wordpress.com/2015/03/04/two-pigmentation-snps-from-prehistoric-europe/

The derived version of rs1042602 is pretty much non-existent in Mesolithic and Neolithic Europeans and is at modern frequencies in Bronze Age ones. We have dozens of Yamna and Catacomb calls for this SNP from Wilde et al. 2014 and they're consistent with these new Yamna samples.

Except for one LBK_EN individual, no Neolithic or Mesolithic Europeans have derived alleles at rs1393350, while a large fraction of Bronze Age (and IR1 from Hungary) Euros do.

Roy King said...

@Krefter,
Can you check the status of the lactase persistence allele C to T(rs4988235) in Yamnaya and in the new Neolithic to Bronze Age samples?

Krefter said...

@Roy King,

You'll have to ask Davidski, Genetiker, or anyone else who has the genomes.

@Davidski,

Can you test rs1805008 and rs16891982 in the Bell Beaker, Unetice, and Corded Ware genomes?

Roy King said...

@Davidski,
Can you check the status of the lactase persistence allele C to T(rs4988235) in Yamnaya and in the new Neolithic to Bronze Age samples?

Davidski said...

I can post a list of calls for all the samples for the SNPs you guys are interested in.

Just keep in mind that some of the data only comes in Affymetrix SNPs, so maybe make a list of Affx SNPs, or both Affx and rs SNPs.

Roy King said...

@Davidski,
Thanks--LCT is in linkage disequilibrium with neighboring sites, so there should be a SNP near rs4988235 in the Affx set. Very interesting to see if the Yamnaya have lactase persistence.

Davidski said...

I had a look. rs4988235 is not in the rs SNP dataset. But yes, there might be other SNPs, especially in the Affx dataset, that are useful in this context.

Krefter said...

Davidski if rs16891982 isn't there maybe rs28777 is.

Davidski said...

It's not there. What's the Affx version?

Shaikorth said...

The AAPA 2015 abstract for "Phenotypic inference from ancient DNA" says they included over 30k SNPs with known phenotypic effects in the 390k SNP set. No doubt this includes things like disease susceptibility, pigmentation, EDAR variants etc. The info on the fully public dataset from Reich lab says it has just 354k SNPs.

Is it possible that they have removed the phenotype-related SNP collection they're going to discuss from the public set? At least no one would pre-empt the upcoming paper.

Krefter said...

@Davidski,

Does this help?

http://browser.1000genomes.org/Homo_sapiens/Variation/Explore?r=5:33951193-33952193;source=dbSNP;v=rs16891982;vdb=variation;vf=9638058

Davidski said...

Just make a list of the Affx SNP calls you want to see and I'll post them.

Krefter said...

What does an Affx SNP ID look like?

Davidski said...

Like this: Affx-13943225.

Krefter said...

Does knowing the chromosome, gene, and position help?

Romulus said...

The pigmentation analysis Genetiker did is very interesting. Looks like blonde hair/blue eyes/light skin was a European trait established in the Neolithic and the Yamnaya were a darker group.

Chad Rohlfsen said...

16903067 transcript cluster id???

Romulus said...

Its interesting that the Neolithic R1b1 shows up with no West Asian or Eastern Euro.

If you look at the results of that recent Y DNA study of 2000+ Catalonian men they had some R1b results very high on the tree positive for no subclades (and they tested for just about everything it seems). Perhaps they are descendants of this ancient R1b group who got to Spain long before the Beaker/Yamnaya?

Here are those R1b results from the Catalonia study.

R1b 26
R1b-L21* 132
R1b-M153 20
R1b-M343* 13
R1b-P311* 2
R1b-P312* 406
R1b-SRY2627 162
R1b-U106* 82
R1b-U152 164
R1b-Z195* 60
R1b-Z220* 168
R1b-Z278* 70
R1b-Z381 44


Chad Rohlfsen said...

Wow. Bell Beaker does have a decent amount of possible blonde hair and blue eyes. That wasn't expected. Corded didn't!! Only after mixing with Beaker, apparently.

Arch Hades said...

@Romulus Well, aren't the Yamna considered post-Neolithic?

Krefter said...

@Romulus,

Genetiker tested only two SNPs. Based on the Bronze Age samples' calls at those two SNPs, though, I think they looked like modern central-north Europeans.

So like these people in a documentary about Beaker folk.

https://www.youtube.com/watch?v=hmHXBXG7Loo


The Haak genomes will reveal an obvious change in pigmentation.

It's been my opinion for a long time that West Asians have the pigmentation of EEFs and other ancient Europeans. Just look at Sardinians: they have allele frequencies similar to EEF and look very Near Eastern.

There was a change that occurred somewhere in Europe between the Neolithic and the Bronze Age. This change came to dominate Indo-Europeans and Finno-Ugrics in central-north Europe. Genetics tells us those people have a lot of common ancestry from this era, so it makes sense.

Romulus said...

@Krefter

Yes, true, but it is still telling that the oldest examples of the derived alleles are found in the EEF samples and that they only occur again in EEF-admixed groups. Also that all of the HG (Samara, Karelia, Motala) and Yamnaya samples are ancestral.

I think it's a pretty safe bet that the derived state for those particular depigmentation genes originated in EEF with that West Med admixture, and I would suspect more as well.

Arch Hades said...

I wonder if this 'Amerindian' and 'South Asian' component is why in the Haak paper the Fst genetic distances between the Yamna and modern Europeans seemed pretty far off. Most modern Europeans from the far south to north were more similar to each other than to them.

Chad Rohlfsen said...

Something else interesting...
Large stone circles in Bashkiria that pre-date the erection of large gray stones in Stonehenge. Similar structures down the Urals. Also in Hungary and the Czech Republic.

https://hague6185.wordpress.com/2013/05/31/mysteries-of-stone-circles-in-russias-bashkiria/#comments

Chad Rohlfsen said...

Blonde hair isn't a 'West med' thing. All Iberians were ancestral for dark hair. No blondes...

Romulus said...

There is also the existing paper (forget the name) that stated the Yamnaya were a dark haired/eyed people. That leaves the possibility that they played a role in the spread of SLC24A5 but not of hair/eye depigmentation alleles. Given that the farmers already brought SLC24A5 it would just be reinforcement of an already existing phenotype.

Romulus said...

@Chad

NE7 is the oldest example of blonde hair and blue eyes I know of, and in K15 it shows a plurality of West Med.

NE7 K15

Population
North_Sea 6.59%
Atlantic 28.98%
Baltic -
Eastern_Euro -
West_Med 47.89%
West_Asian -
East_Med 15.44%
Red_Sea 1.11%
South_Asian -
Southeast_Asian -
Siberian -
Amerindian -
Oceanian -
Northeast_African -
Sub-Saharan -

Felix's analysis of hair/eyes

http://www.fi.id.au/2014/10/ancient-hungarian-genome-ne7-analysis.html

Chad Rohlfsen said...

Iberians had high West Med and no blonde hair. There's something else going on.

Romulus said...

@Chad

Not every sample that contains West Med must necessarily carry the derived allele. However every single sample that DID test positive for the derived state of rs1393350 (the blonde hair blue eyes allele genetiker tested for) contains West Med. No sample without West Med tested positive for the Allele.

There are obviously other genes related to the blonde hair and blue eyed traits, but given that we know the Yamnaya were dark haired and dark eyed, it's fair to make the assertion that they did not spread them.

Chad Rohlfsen said...

Bell Beakers hardly have any West Med. That P312 Beaker has none.

Chad Rohlfsen said...

There's no correlation between a component and hair color. It fluctuates.

Romulus said...

The p312 beaker isn't tested for the blonde hair/blue eye gene so it is irrelevant. The ones that were positive contained a significant amount of West Med. Why would the total % matter when we are talking about a few specific pigmentation genes that make up less than a percent of an entire genome? Why don't any of the 0% West Med genomes carry the derived Allele?

Romulus said...

Maybe it is simpler to look at it outside of an admixture context.

1. The derived allele for blonde hair and blue eyes existed in Neolithic Europeans who had no Yamnaya or EHG DNA. (Multiple Examples).

2. The derived allele for blonde hair and blue eyes did not exist in EHG or Yamnaya (Multiple Examples).

3. Other analysis of Yamnaya Pigmentation shows they were Dark Haired and Dark Eyed.

The logical conclusion from that Data is obvious.

Krefter said...

@Romulus

Blonde hair certainly did not originate in EEF. It's just the oldest examples come from EEF.

Keep in mind it was even more common in Bronze Age Siberians, who certainly won't score any West Med.

Evolution is incredibly complicated.

Yamna might turn out like Andronovo, you never know.

Chad Rohlfsen said...

This whole convo is anachronistic. Ancient people aren't modern clusters. There are modern pops with blonde hair that score next to zero west med. Pashtuns have some light hair and no west med.

Romulus said...

@Krefter

We already do know that the Yamnaya are dark haired and dark eyed... So we do know.

Krefter said...

We have results from one SNP associated with hair color. It's 25% in Finns and over 30% in French so....

I'm just trying to point out it has complicated origins like any other trait and can't be pinpointed to a single pop. It might in the future though.

There's no reason to discuss till we get results for more SNPs and/or HIrisPlex.

Romulus said...

@Chad

There are Polynesian groups who exhibit blonde hair as well, but they are irrelevant to a discussion of Europeans.

Chad Rohlfsen said...

Romulus,
These components are rather irrelevant. Atlantic has EEF and ANE, yet is used for hunters and farmers. These components post-date these genomes. They don't matter! Move on.

Chad Rohlfsen said...

Taking them literally and making inferences is a fool's errand. It won't lead anywhere. Once you learn what these components mean in terms of EEF, WHG, and ANE, you'll see. David posted their K8 values back on the triangle post, I believe.

Chad Rohlfsen said...

David,
Try this for the affx LP allele.

16903067

Krefter said...

What rs SNP is that Chad?

Chad Rohlfsen said...

Off the Affymetrix site. I had to register. I'll get back to it after I finish my assignment here. It listed a bunch of stuff under it. I can't remember if it was under LPC or LPS, something like that.

Krefter said...

Somehow Genetiker found a way to find rs SNPs.

Chad Rohlfsen said...

https://www.affymetrix.com/estore/user/login.jsp?toURL=/estore/

Register and search here. I've got to focus on my paper.

Mike Thomas said...

https://drive.google.com/file/d/0B1vtTHobiXwVLWU0ZU92ckozeHM/view?usp=sharing

Graph mode of David's K15.

Davidski said...

I've converted almost all of the Affx SNPs to rs numbers, but what I'm saying is that if you want to make sure you're not missing anything then give me a list with the Affx numbers.

Nirjhar007 said...

Thanks Mike:)

Karl_K said...

@Shaikorth

"The AAPA 2015 abstract for "Phenotypic inference from ancient DNA" says they included over 30k SNP's with known phenotypic effects in the 390k SNP set, no doubt this includes things like disease suspectibility, pigmentation, EDAR variants etc. The info on the fully public dataset from Reich lab says it has just 354k SNP's.

Is it possible that they have removed the phenotype-related SNP collection they're going to discuss from the public set? At least no one would pre-empt the upcoming paper."

Interesting observation...

Krefter said...

Thanks, Davidski.

Are you 100% sure your conversions are correct? If you find rs1805008 or rs16891982 can you post the calls? I'll keep looking to find how to convert rs to Affx.

Chad Rohlfsen said...

That's pretty nice Mike! You should make one after David does the K8, and hopefully a K9, if he can separate EHG from MA-1.

Chad Rohlfsen said...

David,
Just thinking about something here. That South Asian might be pure MA-1, in this context. Just as you said. I am pretty confident that you're going to be able to separate EHG from MA-1, as well as finding EHG in EEF and WHG. I think that if you take EHG out of MA-1, then leave the remaining labeled as MA-1, we can get to the root of ANE, which makes sense. Several of the TreeMix graphs I've seen had the root of MA-1 mixing into South and South Central Asians. I think the results will be surprising for South Asians.

We could see something like MA-1 ancestry (without extra Amerindian drift) increase for South Asians, just as making EHG unadmixed, increases "ANE/West Eurasian" in Native Americans.

Mike Thomas said...

Yeah shall do

Davidski said...

I'm running the K8 now with all the markers I can get, but it'll take a while because to get optimal results I have to run each sample separately.

I'll try and split EHG from ANE this weekend. If that doesn't work I'll come up with some sort of test based on the ancient samples that isolates the Yamnaya as a component.

Chad Rohlfsen said...

David,
If you can, pluck that South Asian out of the Samara sample. That is, if you have to merge it with the Karelian to make the cluster. I think that is MA-1 stuff making an appearance already.

Chad Rohlfsen said...

Okay. Sounds good. I may try this plink deal another way. I'll email you after work tomorrow.

Mike Thomas said...

We'll await

"I'll try and split EHG from ANE this weekend. If that doesn't work I'll come up with some sort of test based on the ancient samples that isolates the Yamnaya as a component."

So you mean you think you might still be able to figure out definitively the difference between Yamnaya ancestry and shared descent?
IMO we'd still need high quality aDNA from central Poland, etc

Srkz said...

Krefter,

Try this SNP list: https://www.dropbox.com/s/68obd07szz1v8w4/snp.txt?dl=0

It was generated by searching for Affx and rs SNPs with equal b37 positions.
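
The position-matching idea can be sketched as follows. This is only an illustration of joining two SNP lists on (chromosome, b37 position). The identifiers and coordinates are hypothetical placeholders, not real Affx or rs records, and the real list may have been built differently.

```python
def match_by_position(affx_rows, rs_rows):
    """Map Affx IDs to rs IDs that share the same (chromosome, position)."""
    rs_by_pos = {(chrom, pos): rsid for rsid, chrom, pos in rs_rows}
    return {
        affx_id: rs_by_pos[(chrom, pos)]
        for affx_id, chrom, pos in affx_rows
        if (chrom, pos) in rs_by_pos
    }

# Hypothetical rows: (identifier, chromosome, b37 position).
affx_rows = [("Affx-EXAMPLE1", "5", 1000), ("Affx-EXAMPLE2", "7", 2500)]
rs_rows = [("rsEXAMPLE_A", "5", 1000), ("rsEXAMPLE_B", "9", 42)]

mapping = match_by_position(affx_rows, rs_rows)
print(mapping)
```

An Affx ID with no rs SNP at the same position simply drops out of the mapping, which is one way a SNP can look "missing" when searching by rs number alone.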

Davidski said...

Krefter, those SNPs aren't in the dataset. You can try and find the Affx numbers for them, but I suspect they were removed on purpose.

Karl_K said...

"Krefter, those SNPs aren't in the dataset. You can try and find the Affx numbers for them, but I suspect they were removed on purpose."

On the plus side, this gives everyone more time to guess and debate what phenotype associated SNPs are found in each genome, and how they could have ended up at their current frequencies.

Karl_K said...

@Roy King

"Very interesting to see if the Yamnaya have lactase persistence."

Indeed. Best comment yet, Roy.

Srkz said...

I'm starting the Gedmatch uploads. The first is Karelian EHG, kit number M118893. Tomorrow I'm planning to upload the second EHG and some Yamnaya genomes.

Alberto said...

The results of the analysis made by Genetiker also agree that the best match for the "Armenian-like" population would be Lezgin, but even more northern (basically no ENF and more WHG) with Tajiks being the second best match.

This obviously doesn't match the f3 stats on the paper. I guess they just didn't have these populations to test them?

And it confirms a north Caspian route (and north Central Asian origin) as more likely. If the route was through the south Caspian and then across the Caucasus, they might have been more southern (or if they were from North Iran originally).

Grey said...

The depigmentation phenotypes may be the result of additive effects.

Say
population A has x
population B has y
population C has z

and then after combination
xyz -> phenotype 1
xy -> phenotype 2
xz -> phenotype 3
yz -> phenotype 4

then all of populations A, B and C would be responsible for the four phenotypes but none of them would be responsible individually.
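
The same toy model written out as a lookup, purely as an illustration of the argument. The variants x, y, z and the four phenotypes are the hypothetical labels from the lists above, not real loci.

```python
# Toy model: each phenotype requires a particular combination of variants.
PHENOTYPES = {
    frozenset("xyz"): "phenotype 1",
    frozenset("xy"): "phenotype 2",
    frozenset("xz"): "phenotype 3",
    frozenset("yz"): "phenotype 4",
}
# Which population contributed which variant.
SOURCE = {"x": "A", "y": "B", "z": "C"}

def contributors(variants):
    """Populations whose variants are needed for a given combination."""
    return sorted(SOURCE[v] for v in variants)

for variants, name in PHENOTYPES.items():
    print(name, "needs populations", contributors(variants))

# No single population's variant produces a listed phenotype on its own.
single = [frozenset(v) in PHENOTYPES for v in SOURCE]
print(any(single))
```

Every phenotype needs at least two source populations, so asking which single population "brought" it has no answer in this model.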

Srkz said...

Crap, all new paleogenomes from the Reich dataset are fully homozygous:
"We made no attempt to determine a diploid genotype at each SNP in each sample. Instead,
we used a single allele – randomly drawn from the two alleles in the individual – to represent
the individual at that site20,39."

It means no IBD segments
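
To see why the calls come out looking fully homozygous, here is a minimal sketch of the random-allele draw described in the quote. The genotypes are hypothetical, and the fixed seed is only there to make the sketch repeatable. It is an illustration of the idea, not the Reich lab pipeline.

```python
import random

def pseudo_haploid(genotype, rng):
    """Pick one of the two alleles at random and report it twice."""
    allele = rng.choice(genotype)
    return (allele, allele)

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
diploid = [("A", "G"), ("C", "C"), ("T", "G")]  # hypothetical true genotypes
called = [pseudo_haploid(g, rng) for g in diploid]

# Count heterozygous sites before and after the random draw.
het_before = sum(a != b for a, b in diploid)
het_after = sum(a != b for a, b in called)
print(het_before, het_after)
```

Every heterozygous site collapses to one allele, so segment-matching methods that rely on diploid genotypes have nothing to work with, while allele-frequency statistics are still usable.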

Nick Patterson (Broad) said...

@Srkz
We find that to get decent diploid calls we need about 20x coverage.

Most of these samples have much less. Note, though, that for F_st work using smartpca or f_3 stats using qp3Pop (ADMIXTOOLS) you get valid estimates using inbreed: YES

Nick Patterson

Chad Rohlfsen said...

Alberto,
Lezgins are like 52% or so ENF.

Alberto said...

@Chad

Yes, I was thinking more in the K7b terms of "Southern", not ENF.

Anyway, I expect that Lezgins will turn out much less ENF and more ANE once David can isolate this "Southern ANE" that defines this other population.

But yes, you are right. I should have said "less ENF" instead of "no ENF" (when thinking in the current "triangle" terms).

Karl_K said...

@Nick Patterson

I would assume that diploid information would be used for inferring phenotypic traits from these genomes, if it were available. Will those calls eventually be released?

Srkz said...

@Nick Patterson
Thank you, but I'm interested in IBD/IBS segments. Are the BAMs from http://www.ebi.ac.uk/ena/data/view/ERP009526 also haploid? Karelia EHG has good coverage.

Karl_K said...

@Srkz

It would take a very interesting program to selectively remove all original reads with conflicting genotype data, while leaving anything left to align to a reference. In my opinion.

Nick Patterson (Broad) said...

@srkz

The bams are indeed available.
We did a "capture" experiment
so the reads preferentially align
to our SNP array, but I think you
get everything, and certainly it's not haploid. Coverage is often
thin though.

Nick

Srkz said...

Yes, I understand. I'll try the BAMs.
PS thanks for your great work ))

Sergey

Chad Rohlfsen said...

It won't affect ENF. It'll just change WHG and ANE to maybe WHG, EHG, and ANE. It may remove the South Eurasian though.

Alberto said...

@Chad

We'll have to see.

I think that current ENF is based on populations like Bedouins, Saudis and Yemenite Jews, which are modern and admixed populations, and certainly have a good amount of "West Asian".

If this "Southern ANE" is something similar to West Asian, then ENF is catching a lot of it in certain populations, giving too high ENF results.

We'll see what David has in mind to do, but I think that on one side the ANE will be split in 2 components, and on the other ENF should be stripped of any overlap with the "Southern ANE" (I hope we have a name for it soon!).

I suggested that he try using EEF with the WHG chunks masked, because this is the closest we could get to the real ENF, and it's always better to use ancient samples than modern ones. Now that there are many of these EEF genomes it might be possible to do it.

The end result might be that Yamnaya will have very low ENF, just as it lacks any West_Med, East_Med or Red_Sea in K15.

And other pops will be changed too, of course. Among them, Caucasus ones.

But, again, let's wait and see.

Iosif Lazaridis (Broad) said...

It's great that our data is finally out there, and I hope it will be useful to the wider community.

Just a quick comment: There are significantly negative f3(Yamnaya; Near East, Karelia_HG) and f4(Karelia_HG, Yamnaya; Near East, Chimp) for many Near Eastern/Caucasus populations. We are clear in our paper that we don't think we have a good surrogate for the admixing population, and we mainly model Yamnaya with Armenians/Iraqi Jews because they top these statistics.

You get a negative correlation like Fig. S9.20 with different Near Eastern/Caucasus populations. Percentages vary (39% BedouinB, 47% Druze, 48% Iraqi Jew, 53% Armenian, 68% Lezgin). African admixture (which many Near Easterners like BedouinB have, see Moorjani et al. 2011) reduces these estimates, while for populations like Lezgins (who have lower Near Eastern ancestry), you need proportionally more Lezgin input into the Yamnaya to account for the same amount of dilution. We estimated (Lazaridis et al. 2014) that Lezgins are 71% Near East, so .68*.71 = 48%, which seems about right.

Overall, I think that ~50% is a good ballpark estimate, but there's only so much you can do without an actual ancient Near Eastern genome.
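
The Lezgin calculation in the comment above can be checked with a couple of lines. This is only a restatement of the quoted figures; the variable names are made up for illustration.

```python
# Figures quoted in the comment above.
lezgin_share_of_yamnaya = 0.68    # Lezgin-based mixture estimate for Yamnaya
near_east_share_of_lezgin = 0.71  # Lazaridis et al. 2014 estimate for Lezgins

# Near Eastern ancestry implied for Yamnaya via the Lezgin reference.
implied_near_east = lezgin_share_of_yamnaya * near_east_share_of_lezgin
print(f"{implied_near_east:.0%}")  # 48%
```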

Srkz said...

Updated Karelian EHG kit number is M652848

Shaikorth said...

If Yamnaya are 32% EHG when the rest is Lezgin (71% Near Eastern, 29% ANE), or 61% EHG when the rest is BedouinB (7-10% SSA, remainder Near Eastern), the share of pure Near Eastern in Yamnaya should maybe be a bit below 50%. The SSA and ANE differences between BedouinB and Lezgin actually result in a surprisingly small range of fits; there could be many reasons, like the ancient Near Eastern population itself having affinities to Africa, "Basal Eurasian" turning out to be truly ancient African mixture, etc.

Armenians and Lezgins shouldn't have recent SSA like BedouinB do (older stuff perhaps in low single digits) and they both have noticeable ANE, yet just that ANE difference results in a bigger change than BedouinB's SSA and lack of ANE - Armenian portion of Yamnaya is ever so slightly closer to the BedouinB portion in size than to the Lezgin portion.

Davidski said...

Hi Iosif,

Thanks for your comment.

My impression for now is that the 50/50 Karelia_HG/Armenian model works in the context of a deeper phylogeny, but not when one tries to break things down into more recent components like ANE and the WHG-related stuff within the modern Near Eastern component.

I suspect what might be happening is that the Yamnaya have Near Eastern ancestry that is more basal than found in most of the Near East today. As a result, some of their steppe ancestry is classified as Near Eastern when they're being modeled as partly present-day Near Eastern.

Like you say, there's only so much that can be done without an ancient Near Eastern genome. However, if possible, it would be useful to have ancient Near Eastern genomes from a couple of very different sites, like, say, Anatolia and the Lower Tigris & Euphrates, because I suspect they might show very different levels of Basal Eurasian ancestry, which might influence the modeling outcomes.

Chad Rohlfsen said...

Shaikorth,
It's possible, but not hard to imagine a couple different basal groups with shared drift differing between some UP and farmers. The 23kyo Kenyan would be interesting to examine, as it's highly diverged from modern Africans.

Matt said...

Re: Iosif Lazaridis's comment, David's ENF estimates for each one of these (except BedouinB) from the K8 are:
Druze 80%, I_J 83%, Armenian 77%, Lezgin 58%.

Multiplied by Iosif's mixture percentages, these give ENF levels in Yamnaya of 37% (Druze reference), 39% (I_J), 40% (Armenian), 39% (Lezgin).
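
A quick check of these products, as a sketch. The pairing of each K8 ENF value with the corresponding mixture percentage from Iosif's comment is taken from the thread. That the quoted figures match when truncated rather than rounded is an observation from the arithmetic, not something stated in the comment.

```python
# (ENF % from the K8, Yamnaya mixture % from Iosif's comment)
refs = {
    "Druze": (80, 47),
    "Iraqi_Jew": (83, 48),
    "Armenian": (77, 53),
    "Lezgin": (58, 68),
}

# Integer arithmetic keeps the products exact; // 100 truncates to whole percent.
implied_enf = {name: enf * mix // 100 for name, (enf, mix) in refs.items()}
print(implied_enf)
# {'Druze': 37, 'Iraqi_Jew': 39, 'Armenian': 40, 'Lezgin': 39}
```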

So, the estimates based on anti-correlation *could* be high, but they do at least interact with David's ENF estimates to produce a reasonably consistent ENF level if so (each Near Eastern reference would give slightly different predictions about relatedness to African / South Asian populations though).

I still feel like estimating ENF and the idea of ENF is difficult, especially from a modern sample like the BedouinB.

By contrast, seems like working out the Basal Eurasian level in Yamnaya (or any population) should be simplicity itself:

Remember, populations *should* only differ in their relatedness to Ust-Ishim, overall, due to either Basal Eurasian admixture or African admixture. So if you can logically exclude African admixture for a second then:

f4(EHG/WHG/Dai,EHG/WHG/Dai;Ust_Ishim,Chimp) should be 0 indicating these populations have no Basal Eurasian ancestry in any of them.

f4(LBK_EN / Hungary_EN,EHG/WHG/Dai;Ust_Ishim,Chimp) should be some negative number, indicating that LBK_EN / Hungary_EN has some level of basal Eurasian ancestry.

IIRC, a population's level of Basal Eurasian ancestry should then be proportionate to its f4(Test,EHG/WHG/Dai;Ust_Ishim,Chimp) relative to these two populations and the estimated BE in LBK_EN / Hungary_EN.

This would be noisy because UI is only one sample, but a noisy estimate is still an estimate to work from.

(And if BE only changes in relation to ENF, then this predicts ENF; if not then it does not).
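
The proportional estimate described above could be sketched as follows. The f4 function uses the usual allele-frequency definition; the function names, the toy f4 values and the 0.40 reference proportion are placeholders for illustration, not measured results. The sketch also assumes the comparison population (EHG/WHG/Dai) carries no Basal Eurasian ancestry.

```python
def f4(a, b, c, d):
    """f4(A, B; C, D): mean of (a - b) * (c - d) over SNPs.

    Each argument is a sequence of allele frequencies, one value per SNP.
    """
    if not (len(a) == len(b) == len(c) == len(d)) or len(a) == 0:
        raise ValueError("need four non-empty sequences of equal length")
    total = sum((ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d))
    return total / len(a)

def basal_eurasian_estimate(f4_test, f4_ref, be_ref):
    """Scale the reference BE proportion by the ratio of the two f4 values.

    f4_test: f4(Test, EHG/WHG/Dai; Ust_Ishim, Chimp)
    f4_ref:  f4(LBK_EN / Hungary_EN, EHG/WHG/Dai; Ust_Ishim, Chimp)
    be_ref:  assumed BE proportion in the reference farmers
    """
    if f4_ref == 0:
        raise ValueError("reference f4 must be non-zero")
    return be_ref * f4_test / f4_ref

# Placeholder numbers only: a test population with half the reference signal.
estimate = basal_eurasian_estimate(f4_test=-0.002, f4_ref=-0.004, be_ref=0.40)
print(estimate)
```

Since only the ratio of the two statistics enters, the result does not depend on the sign convention chosen for the f4 ordering.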

Alberto said...

@Davidski

"I suspect what might be happening is that the Yamnaya have Near Eastern ancestry that is more basal than found in most of the Near East today."

Well, RK had some good reasons to think that this Q component in the other population (what we are now calling "southern ANE") was Basal Eurasian.

But either way, it would be very interesting if this component could be decoupled from both ENF and ANE so that this population can be tracked much more effectively.

Krefter said...

Karelia_HG Kit # M652848 ANE K7.

ANE 37.52%
ASE 3.63%
WHG-UHG 55.15%
East_Eurasian 3.70%
West_African -
East_African -
ENF -

Chad Rohlfsen said...

That K7 probably isn't reliable. It had MA-1 at less than 50% ANE. EHG does look like it could very well be full ANE, but sharing much more drift with Native Americans and Europeans than MA-1 does.

Krefter said...

If EHG is full ANE why is it just as close to Loschbour as to MA-1? Unless what you mean by ANE is different than a pure clade since at least 24,000YBP.

Chad Rohlfsen said...

Read over SI8 again.

"Fig. S8.6 accounts for this “symmetry” between MA1 and Karelia_HG with respect to Native Americans by proposing a balancing of two processes: first, Karelia_HG share more alleles with Native Americans due to sharing additional common genetic drift (and thus Native Americans should share more alleles with Karelia_HG than with MA1), but, second, Karelia_HG has WHG-related ancestry which dilutes this affinity to Native Americans. However, this requires that the additional common drift shared by Karelia_HG and Native Americans to be nearly perfectly balanced with the dilution due to WHG ancestry, resulting in the observed symmetry, which is not very parsimonious."

Remember, EHG failed as a mix of WHG and ANE.

Davidski said...

Chad,

Where did you see MA-1 at less than 50% ANE in the K7?

Chad Rohlfsen said...

Gedmatch

ANE 49.08%
ASE 11.22%
WHG-UHG 33.58%
East_Eurasian 1.54%
West_African -
East_African 4.58%
ENF -

Chad Rohlfsen said...

It might be an issue with how Felix did it. I'm not sure.

Davidski said...

That's the MA-1 sequence that Felix uploaded. I don't use that sequence because it looks somewhat iffy.

Also, EHG are indeed closest to 40/60 ANE/WHG in the old Lazaridis et al. model.

They might not be a mixture of these components, but that's the best way to fit them in this context.

EHG are not 100% ANE, unless we redefine ANE, in which case MA-1 won't be 100% ANE.

Chad Rohlfsen said...

I'm not really concerned with the label. It could just be EHG and MA1. I think that if you can get EHG separated you'll see the difference in Motala, Karitiana, and Anzick1.
