Wednesday, March 15, 2017

Failure to replicate

Just in at bioRxiv:

We fail to replicate a genetic signal for sex bias in the steppe migration to central Europe after ~5,000 years proposed by Goldberg et al. PNAS 114(10):2657-2662. Estimation of X-chromosome steppe ancestry in the Bronze Age central European population with the qpAdm method (Haak et al. Nature 522, 207-11) does not indicate lower steppe ancestry on the X-chromosome than in the autosomes. We perform a simulation which indicates presence of estimation bias of -19.5% in the inference of X-chromosome admixture proportions using the method used by Goldberg et al., largely eliminating the observed sex bias.

Iosif Lazaridis, David Reich, Failure to Replicate a Genetic Signal for Sex Bias in the Steppe Migration into Central Europe, Posted March 14, 2017, doi:


Karl_K said...

And this is why preprints at BioRxiv are so great.

This is the kind of paper that previously could never get published where anyone would ever read it. And even still, it takes a fantastic reputation to pull it off.

I (and several others that post here) could seriously sit around all day, every day, writing papers where we failed to replicate something.

If only there was someone here who had a passion for writing grants.

capra internetensis said...

Sure beats the classic format of "series of increasingly bitchy letters to the editor". ;)

bellbeakerblogger said...

I'd call this foreshadowing. In any case, this result is more realistic than the guy with fifty wives. However big or small, we have a folk migration.

John Smith said...

I knew this was the case, as Neolithic mtDNA is so different from modern European mtDNA. The Yamna and Neolithic Siberian groups had a higher frequency of H, which is why I knew this already, although I still think the Yamna were a dead end according to current evidence, as some of them had C (which further supports my theory of steppe ancestry but with the Yamna as a dead end). I am not certain why some folks thought Neolithic populations of Central Europe (LBK etc.) had an mtDNA contribution to Europe when the mtDNA was so different. The Hamangia, who had a high frequency of H (unlike other Neolithic groups), may have been dominated by R1b and, despite having a cultural connection to the Near East, did not come from there and may have spoken Indo-European languages, which could be as old as 10 kya. According to scholars the Celtic languages are 2,600 years old; however, according to genetics they are probably at least 3,800 years old in the British Isles alone. The scholars probably underestimated the ages of all Indo-European languages. Maybe people have been using wool for longer than scholars thought.

Slumbery said...

John Smith
"According to scholars Celtic Languages are 2,600 years old however according to genetics it is probably at least 3,800 in the British isles alone."

Language is still not a genetic attribute, so according to genetics we can't say anything about the age of Celtic.
Also, Celtic had plenty of time to reach the British Isles later (and it probably did so even later than its own age), and that did not necessarily mean an easily detectable population turnover. The assumption that two populations were culturally identical and spoke the same language subgroup just because they cluster close on a PCA assembled from a few (and incomplete) DNA samples is baseless. (I am not talking about IE in general, but about telling apart the garden varieties of IE from aDNA.)

AWood said...

"Language is still not a genetic attribute, so according to genetics we can't say anything about the age of Celtic. "

I'm not a linguist, but I would still bet money that genetic calculation is more in the scientific realm, whereas linguistics is highly theoretical. I'd never bet my money on linguists who can't even agree on anything, let alone prove anything beyond a doubt.

Slumbery said...

Awood: Genetics is indeed much more of a hard science than linguistics, but that is beside the point. However scientific genetics may be, languages are still outside the core expertise of the field, since the variations of modern human languages are not genetically coded, but purely cultural phenomena. Ancient DNA study can serve only as auxiliary information. It can be useful, but you can't seriously expect it to answer questions like how old "Celtic" is.
And yes, linguistics cannot give a hard scientific answer to such a question either. There are things we will never know in the way that we know the DNA code in our mitochondria, for example.

capra internetensis said...

*Genetics* may be a hard science, but drawing linguistic conclusions from genetics is *not* genetics; it's sociolinguistics. "My data about population movements (or lack of them) comes from a hard science, therefore my speculation about the linguistic impact of these population movements (or lack of them) is hard science" - no.

Matt said...

Not read the paper yet, but as I commented at the time, using ADMIXTURE for this purpose seemed pretty questionable, and glad to see they've pulled out qpAdm for this purpose. With ADMIXTURE, different clusters might form on autosome vs X for reasons orthogonal to real ancestry. X-Fst vs autosome-Fst is similarly somewhat questionable due to confounds of general levels of within population diversity on each.

Not hugely surprising, and again I'm gonna call on researchers to test whether a bias exists in modern populations relative to a neutral outgroup.

The only finding looking solid at the moment to me is Chiang 2016, that Sardinians have relatively more X chromosome EEF ancestry (Anatolian+Loschbour) compared to their autosome relative to Tuscans, Spanish, British and Finnish, who are all neutral to each other.

But this would mean that an alternative explanation will have to be found for dearth of y - G and I2 in recent Europeans. Founder effect? That is, admixture was not initially sex biased, but *eventually* steppe chief (or at any rate, at least "Eastern European early IE adopter") y-dna replaced everyone else in Europe to a large degree?

Folker said...

Beware, because it doesn't mean there wasn't a sex bias in the steppe migration, only that the margin of error is so large that the variation falls within it. So it only means that the procedure used to detect it was not precise enough. I think it will be a huge matter of discussion for the time being, because it's a core subject for archeology. Who made it to Central Europe? Men only? Men with some women? Or men and women?

Matt said...

One comment re: their choice of outgroups in the qpAdm models:

"Our outgroups are: Mota (5), Ust_Ishim (6), Kostenki14 (7), GoyetQ116-1 (7), Vestonice16 (7), MA1(8), AfontovaGora3 (7), and Levantine Neolithic farmers (9)"

with the exception of Mota and Ust Ishim to generate outgroup stats, they've essentially abandoned using non-West Eurasian outgroups (including type example of pre-El Miron West Eurasians). Presumably they've found that non-West Eurasian outgroups aren't actually clearly adding information for modelling with West Eurasian ancients...

This is almost a complete reverse of the way the technique was first introduced, using only outgroups to West Eurasia and then carefully adding perhaps some ancient West Eurasians.

Lack of informativity about the West Eurasian ancients in non-WEE outgroups was something I found questionable about the qpAdm technique from almost the beginning (tldr version; there seems like a good deal of dispersal that the different WHG, for instance, have on East Asian related stats and that overlaps with much of the range of present day Europeans, and so putting that in the driving seat seems like it will add noise to your conclusions).

Matt said...

From the perspective of whether this tells us anything new about the autosomal ancestry in the Bronze Age, here are their autosomal fits with the population labels for the samples :

Take it (too) literally, and ignore standard errors, and you have I0059 - LN2 as steppe mixing with a population 43% HG and 57% Anatolian_Farmer (which is practically a PWC or Blatterhohle cave sample) and at the other end of the spectrum RISE577 - Unetice would fit as steppe mixing with a population 31% less HG than Anatolian Farmers were (it is an outlier but other populations are there who are steppe plus AF and negative HG in substantial quantities).

(labels from the preprint of the paper this rejects)

Correlations of sample proportions: (steppe vs HG is pretty linear and negative, while steppe vs AF is noisier and HG vs AF is noisiest).

Populations means work out: Corded Ware: 76% Steppe, 22% Anatolian, 2% HG, Non-Corded Ware: 55% Steppe, 35% Anatolian, 10% HG (Unetice: 60% Steppe, 34% Anatolian, 6% HG).

These seem quite different from the fits back in Haak 2015, where the Halberstedt_LBA, Benzigerode_LN2 and Alberstedt_LN fit as mixtures of EEF+Yamnaya with no additional WHG, while Unetice picked up a good chunk of WHG. While here it's quite the other way around. Presumably this reflects much more informative qpAdm methodology with the populations they use in it?

Ric Hern said...

Or Hamangia could have contributed to Sredny Stog? Very little, apparently, but noticeable, as seen within later Yamnaya...

Samuel Andrews said...

Farmer Moms, Pastoral Dads???

BA Central European mtDNA doesn't indicate sex bias admixture either. Look here. I've been trying to let some of you know lots of Steppe women went with their R1 men to Europe for a while.

"There's a strong presence of Steppe mtDNA in LNBA Central Europe"

Samuel Andrews said...

I do think sex bias admixture occurred in European pre history. I do think Steppe/farmer/HG admixture was sex bias. I think EHG/CHG admixture was sex bias.

I think this because 80-90% of Northern Europeans' mtDNA is EEF and CHG but only 50-60% of their ancestry is. mtDNA U5, U4 went the way of the dodo bird because of sex bias admixture.

WHG and EEF admixture was probably sex bias in many parts of Europe. Y DNA I2 dominates Middle Neolithic European Y DNA.

IMO, Sex bias admixture occurred between Baltic HGs and Corded Ware, which can explain high U5 and U4 frequencies in the Baltic.

Davidski said...

There were definitely sex biased patterns of mating and admixture in LNBA Europe between the steppe invaders and locals.

The issue here is about its extent and how accurate the various methods are at picking it up.

Ryukendo K said...

This is great, science working as it should. Do worry about its impact on Goldberg though, she's such a young researcher and her methods papers are really good.

So this is why Reich is unhappy with her results.

It is definitely false that there was *no* sex bias, as the Q statistic, utilising basic fst in the Goldberg paper, tells us that there *must* be sex bias somehow. The issue is with methods that attempt to detect proportions, such as ADMIXTURE (in Goldberg) and qpAdm(in Reich). The Q statistic is not analytically transparent.

Matt said...

@ Davidski, if you don't mind, do you know what do the authors mean by "We ran qpAdm with allsnps: YES and Mota as the basis population", esp. by "basis population"?

@ Ryukendo_K: Re: Fst, I'm not too happy that only a subset of the matrix of FstA and FstX was provided by Goldberg's paper for (AF,CE,BA,SP,HG). E.g. -

Annoyingly incomplete, as would've been good to see if various of the Q ratios which are not included (such as BA-CE, CE-SP, AF-SP, HG-SG) were even consistent with the 0.7-0.91 Q ratios which are present for all but BA-SP. We have less information about the range of the paired Q statistic in these ancient samples than would have been easy for them to compute and provide, for no apparent reason.

Simple outgroup f3 sharing statistics (and ratios of outgroup f3A vs outgroup f3X) would've been a better choice of comparison than Fst though, in my opinion.
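
For readers following the Q ratio discussion, here is a minimal sketch of how such a ratio can be computed from a pair of Fst values. It assumes the commonly used definition Q = ln(1 - 2 FstA) / ln(1 - 2 FstX), for which the expectation under equal male and female effective sizes is 0.75; whether this matches every detail of the calculation in Goldberg et al. is an assumption here, and the Fst values below are made-up placeholders, not numbers from either paper.

```python
import math

# Expected Q when males and females contribute equally (X has 3/4 the
# effective size of the autosomes). Stated here as an assumption.
NEUTRAL_EXPECTATION = 0.75


def q_ratio(fst_auto: float, fst_x: float) -> float:
    """Ratio of autosomal to X-chromosome drift from a pair of Fst values.

    Uses Q = ln(1 - 2*Fst_A) / ln(1 - 2*Fst_X).
    """
    if not (0 <= fst_auto < 0.5 and 0 < fst_x < 0.5):
        raise ValueError("Fst values must lie in [0, 0.5), with Fst_X > 0")
    return math.log(1 - 2 * fst_auto) / math.log(1 - 2 * fst_x)


# Placeholder Fst values for one population pair (illustrative only).
q = q_ratio(fst_auto=0.03, fst_x=0.04)
print(f"Q = {q:.3f} (neutral expectation {NEUTRAL_EXPECTATION})")
```

With the full FstA and FstX matrices, the same function could be applied to every population pair, which is the comparison the comment above says is missing.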

Davidski said...


I didn't know what a basis population was myself, and had to ask. It's the population at the top of the list in right pops.

So I suppose it's like the main outgroup in TreeMix; usually some African group or genome.

Guess I have to rerun some stuff that I did with the new qpAdm and see whether this new setup changes things significantly.

And as far as I know, allsnps: YES literally means running all of the available SNPs, as opposed to those that overlap between all samples.
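
A rough sketch of what this setup might look like in practice, assuming the usual ADMIXTOOLS convention of a parameter file that points at a left-pops list (target first) and a right-pops list. The helper function, the file names and the `data` prefix are hypothetical; the population labels are ones quoted elsewhere in this thread, and the parameter keys should be checked against the ADMIXTOOLS documentation before use.

```python
# Sketch only: file names and this helper are hypothetical, not from the preprint.
RIGHT_POPS = [
    "Mota", "Ust_Ishim", "Kostenki14", "GoyetQ116-1",
    "Vestonice16", "MA1", "AfontovaGora3", "Levant_N",
]
SOURCES = ["Anatolia_N", "Bichon", "Yamnaya_Samara"]


def build_qpadm_inputs(target, sources, right_pops, basis, prefix="data"):
    """Return {file name: file content} for one qpAdm run."""
    if basis not in right_pops:
        raise ValueError(f"basis population {basis!r} is not in the right pops")
    # The basis population is simply the first line of the right-pops file.
    ordered_right = [basis] + [p for p in right_pops if p != basis]
    # By qpAdm convention the first left population is the one being modelled.
    left = [target] + list(sources)
    par_lines = [
        f"genotypename: {prefix}.geno",
        f"snpname: {prefix}.snp",
        f"indivname: {prefix}.ind",
        "popleft: left.txt",
        "popright: right.txt",
        "details: YES",
        "allsnps: YES",  # use all available SNPs rather than only the overlap
    ]
    return {
        "left.txt": "\n".join(left) + "\n",
        "right.txt": "\n".join(ordered_right) + "\n",
        "qpAdm.par": "\n".join(par_lines) + "\n",
    }


files = build_qpadm_inputs("Corded_Ware_Germany", SOURCES, RIGHT_POPS, basis="Mota")
print(files["right.txt"])
```

Changing the basis population then only means passing a different `basis` argument, which moves that label to the top of the right-pops list and leaves everything else unchanged.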

Matt said...

Although RK, on those autosomal Fst scores from Goldberg, one thing I found odd was that some of them seemed to depart from the reported Fsts from Lazaridis 2016 and Mathieson 2015, e.g. in terms of differentiation of CE Neolithic from Anatolia Neolithic, WHG from other populations.

... and that implies different Q ratios -

So I wonder how these differences arise between their method of calculating Fst. Might have to reread that bit of their paper more carefully.

@ Davidski, interesting stuff, I didn't know that anything like that mattered in qpAdm either.

Davidski said...

I didn't know that anything like that mattered in qpAdm either.

It wasn't discussed until now, so I guess it does now.

Ryan said...

@Karl - I'm not sure why you're hostile to work like this. Trying to replicate others' work is important, and more results than many would like to admit turn out to be not replicable and likely wrong.

batman said...

In the traditional five-caste societies of ancient antiquity (ENE/BA/EIA) there used to be A dynasty at the helm and stem of it.

Which implies that the "royal seed" was to be distributed from the core of the biological and political society - through inheritance and inheritance only.

Which means that there was ONE and only one royal descendant, as in the present title "crown-prince" alt. "arch-duke".

The younger brothers/half-brothers of the crown-prince would form the "Upper Nobility" - to perform as Dukes/Lords within the respective districts of the kingdom ('land'). Duly, they were to marry a number of women to produce the (next) local chieftains within their district. Consequently each and every chieftain (earl, marki) would be a son-son to the old king.

Within the local communities the respective Earl had to marry women on EVERY farm in his constituency - to produce the next generations of Farmers/Peasants/Ceorls, as grand son-sons of the Old King.

As each Ceorl (Husbond) came to maturity he would become the new Farmer - and the ONLY man to reproduce on his respective farm/yard/gard. Consequently EACH and EVERY child born - on each and every farm in the Kingdom - was to be a grand grand-child of the Old King himself. Which paved the way to the old term "All-father" - as a sacred as well as a "heavenly" term.

The first 'royal lines' spreading after the Ice Age - starting some 12.000 years ago - would create "extended families" to become "aets" or "ethnicities". Thus creating genuine, 'homegrown' dynasties - as a 'natural' (historical) consequence of space and time itself.

batman said...

* One consequence was that some earls and peasants could marry up to 50 times - to produce the 50 children necessary to uphold or extend the population-numbers of their jurisdiction.

* A second consequence is that every new generation was counted - and 'changed' only by the royal take-over, when the crown-princes of the G, H, I, J and R1-dynasties, respectively, started a new generation (mutation) of their royal seeds.

* The ancient frog-leap-pattern of the y-dna seed-lines seems to be a consequence of the agnatic inheritance known from the Old Eurasian high-cultures - and their respective dynasties.

# As the Holocene Optimum started - 8.000 years ago - the median temperature of Eurasia's 55th parallel became about one degree Celsius warmer than today. All the way up to Finland, Onega and Oleni Ostrov - where the oldest highlanders (R1a) and their cold-blooded domesticates occurred some 7.500 years ago. At the brink of the start of the Volga river system and the (already) established trade-route to the Caspian Sea, the Aral and the Ural.

# Checking the cold-blooded horses (Prewalski, Fjord, Steppe) we find them to overlap with the spread of the cattle-breeding highlanders - creating what's known as "Scandinavian agriculture" vs. the "European" - where the warm-blooded horses (taipan) coincide with the spread of the large cattle and the heavily lactose-persistent "lowlanders" (R1b). We find them as far north as the western Baltic, where - in fact - the oldest known burials of oxen and horses are found. As well as the world's highest density of LP - by far.

# Looking at the spread of pottery we find the Volga-Don area as pivotal. From a 9.000 year old pottery from Elshanka we find the 7.700 yrs old Sperrings/Narva ceramics in Carelia, Finland and Estonia, where it transcends into the advanced Asbestos-ceramics in northern Fenno-Scandia and the well-known Pit-Comb-ware along the Volga, as well as the eastern Baltic. In parallel we find the oldest ceramics in NW Europe in the western Baltic, where the first known EBK is dated to 8.000 BP.

# The early comb-ceramics were paralleled by the Ertebolle/FB-ceramics, which transcended into the stroked, corded, belled and cordial Beaker-ceramics of Britain and western Europe.

Obviously there's a co-existence between the western Beakers and the spread of warm-blooded domesticates and their herding "lowlanders" of the R1b-dynasty. Which, obviously, branched off into the eastern lowlands via the Vistulan transport-zone, along Dniester-Bug and Dniepr-Don (taurian, tyssagetae) - south to Anatolia and east to Bactria, bordering the southern farmers of the y-dna G-dynasty.

# The success of the cattle- and corn-breeding cultures took off as the Holocene warm-period reached its Optimum, and the former tundra and taiga of the Younger Dryas became covered with lush grasslands - all the way up to north-western Norway, southern Finland, western Russia, the Caspian steppes, the Tarim basin, Mongolia and China. It may seem that an eastern branch of semi-nomadic herders/farmers was branching off as y-dna Q.

# Duly, we have no need to explain the extensive multiplication and migrations of cattle- and horse-breeders as "conquerors" - as the grassland-areas they fertilized and grew were outside/beside the areas needed by the fishers and gatherers - and their herds of goats, sheep and/or reindeer. Which explains the whereabouts of the I2/I1-dynasties, as well as their eastern complementaries of y-dna N and O.

# Except for the latter two, we may suspect that ALL of the mentioned 'dynasties' and consequent ethnicities were familiar with the proto-IE tongue. That would explain why and how AN I-E stem could spread from Ireland and Spain to India and Tarim, already before the spread of 'heavy' farming and cattle-breeding.

Davidski said...


I reran the tests on South Asians from here, moving Mbuti to the top of the right pops.

There weren't any major shifts in the results, although the fits did improve a little bit.

But I'm wondering now what would be the best outgroups for testing South Asians?

Unknown said...

According to some scholars, perhaps. Isn't Unetice (3,700+ YBP) the supposed source of Celtic languages/cultures according to a large block of others?

Chad Rohlfsen said...

Pick ones that create the biggest Dstats with your pleft.

Chad Rohlfsen said...

Most significant, I mean.
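
One way to read this advice, sketched under stated assumptions: for each candidate outgroup, collect the Z-scores of the D-statistics it forms with pairs of left populations, then rank candidates by their largest absolute Z. The helper name and all numbers below are invented for illustration; they are not output from any real run.

```python
def rank_outgroups(z_scores):
    """Rank candidate outgroups by their most significant D-statistic.

    z_scores maps an outgroup label to the list of Z-scores obtained from
    D-statistics involving that outgroup and pairs of left populations.
    Returns (label, max |Z|) tuples, most informative first.
    """
    ranked = [
        (pop, max(abs(z) for z in zs))
        for pop, zs in z_scores.items()
        if zs  # skip candidates with no statistics at all
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Invented placeholder Z-scores, purely to show the mechanics.
example = {
    "Ust_Ishim": [0.4, -1.1, 0.9],
    "MA1": [-6.2, 3.5, 1.0],
    "Levant_N": [2.8, -4.9, 0.3],
}
for pop, max_abs_z in rank_outgroups(example):
    print(f"{pop}: max |Z| = {max_abs_z:.1f}")
```

An outgroup whose statistics are all close to zero does not differentiate the left populations and so adds little to the model.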

Davidski said...

Just had a look at this graph that Matt put together from the data in the new Laz/Reich paper.

Many of these estimates look plain wrong.

They're in line with Corded Ware moving across and into regions where almost pure Anatolian farmers still lived, which is rather unlikely.

They should be in line with Corded Ware moving across and into regions inhabited by almost pure Hunter-Gatherers like Narva and Blatterhohle and Middle Neolithic farmers like Baalberg and Salzmunde.

Karl_K said...


I'm not hostile. I think it is great. Isn't that what I said? I just went back and checked. I said "great". So, just to clarify, in the past people couldn't publish papers like this easily, and it is great that some experts are cleaning up after less careful academics. It is great.

Rob said...

Yes, but it's likely that CWC didn't mix with locals at first, and they were simply steppe/ANF. Admixture with various local MNE & WHG occurred later, as shown in the Unetice and post-Corded Baltic BA groups.

Davidski said...

That doesn't work, because we now know what the earliest unadmixed Corded Ware migrants looked like, and they looked like Yamnaya.

The Corded Ware samples in this graph, from Estonia and Germany, are admixed, so they should show admixture from groups like Baalberge and Narva.

Samuel Andrews said...

"Although the X chromosome has a slightly different pattern of inheritence, over a small number of generations it will be identical to the autosome"

This is a question for everyone: Can X chromosomes identify sex bias admixture?
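
As a way of framing the question, here is a minimal sketch of the textbook expectation, assuming a simplified model with equal numbers of males and females in the admixed population and random mating afterwards. Under that model, if a fraction `female_frac` of the founding females and `male_frac` of the founding males come from the incoming source, mean autosomal ancestry from that source is the simple average of the two, while mean X-chromosome ancestry weights females twice as heavily because two thirds of X chromosomes sit in females. The function name and example fractions are hypothetical, and this does not model the statistical noise that the replication paper is concerned with.

```python
def expected_ancestry(female_frac, male_frac):
    """Expected source ancestry on the autosomes and on the X chromosome.

    female_frac: share of founding females drawn from the incoming source.
    male_frac:   share of founding males drawn from the incoming source.
    Returns (autosomal, x_chromosome) proportions.
    """
    for value in (female_frac, male_frac):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    autosomal = (female_frac + male_frac) / 2
    x_chromosome = (2 * female_frac + male_frac) / 3
    return autosomal, x_chromosome


# Hypothetical male-biased migration: 70% of founding males but only
# 30% of founding females come from the incoming group.
auto, x = expected_ancestry(female_frac=0.3, male_frac=0.7)
print(f"autosomes: {auto:.3f}, X chromosome: {x:.3f}")
```

With no sex bias the two values coincide, so a detectable gap between X and autosomal estimates is what the methods under discussion are trying to measure.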

Davidski said...

This is a question for everyone: Can X chromosomes identify sex bias admixture?

This is the point I was trying to make above.

mooreisbetter said...

I've always said. No one knows if these "Bronze Age migrants" were conquistadors, illegal immigrants, or refugees. Put the fantasies aside gents.

All Y chromosome differences can be attributed to populations size and mortality differences between Mesolithic and Bronze Age groups. Think about it. Digest it. Grasp it.

Davidski said...


You should try and digest and grasp the relevant archeology and anthropology, which show that these "Bronze Age migrants" arrived with a new economy that made them more successful than the locals.

You know, like Europeans in the Americas. See the parallels now?

Rob said...

However, the mobile pastoralist economy probably has roots in central-east Europe (Baden, GAC) as much as the steppe.
The American analogy is needless and wrong, because the level of difference between colonials and Amerindians was marked (literally civilizations apart). This is very different to the situation in Copper Age Europe, where interaction occurred for 2000 years before a "snap valve" event occurred c. 3000 BC, resulting in a large westward thrust, even if eastern migration also occurred.

Seinundzeit said...


To test things out, could you look into these populations:






Using these outgroups:










With Mota as the "basis population".

And, testing the Central and South Asians as mixtures between Iran_Neolithic, Levant_Neolithic, AG3/MA1, Villabruna, Jarawa, and Ami.

It might not work, but I think it could.

And if it does work, it'll be very awesome to finally see a qpAdm model of Central and South Asians involving deeply basal ancestral streams.

Of course, only once you find some time, and only once you have the inclination.

Thanks in advance.

Shaikorth said...

"This is a question for everyone: Can X chromosomes identify sex bias admixture?"

Looks like they can. Jeong et al's results about sex biased admixture in Sardinians relative to mainland Europeans (for example in figure 7) look solid. They're based on D-stats, not supervised ADMIXTURE/Structure like Goldberg et al., and the modern and ancient samples used are of higher quality than the steppe samples.

The issue appears to be that an ADMIXTURE-based approach produces questionable results. For example, are BedouinB *really* pure Neolithic-Bronze Age Levantines and at least some Druze pure Bronze Age Armenians like this supervised ADMIXTURE test from Marshall et al. suggests?

Davidski said...


Those models don't work. The overall setup doesn't appear to be very stable.

Matt said...

Davidski: "Many of these estimates look plain wrong."
There is quite a robust tendency with these outgroups for the populations here with low steppe to have a high HG:Anatolian Farmer ratio, and then populations with high steppe to have a low HG:Anatolian Farmer ratio -

Looking at the numbers for means (assuming individuals have some noise), could be just about consistent with a scenario where Corded Ware is a steppe group mixed with one of those low HG:AF Neolithic farmer groups (Lengyel+Hungarian Copper Age?) who are ~10:90 HG:AF and Unetice+other LNBA takes on more admixture from populations (in Germany / Baltic?) which are richer in HG:AF (~35:65)... But that still seems kind of an odd scenario.

Despite what I wrote upthread, maybe Lazaridis and Reich should be using a richer set of outgroups with ENA / Villabruna cluster related groups in as well (the latter notably lacking), after all. I'd have hoped they'd know either way though.

Seinundzeit said...


Interesting. Thanks for looking into it.

If one removes Jarawa and Ami from the left pops, and if one adds an Austroasiatic Indian population in their place, does anything sensible happen?

Davidski said...


Maybe Lazaridis and Reich know something that we don't, like, for instance that there was a narrow band of pure Anatolian-like or Lengyel-like farmers still alive between the steppe and Germany, and this is where the early Yamnaya-like Corded Ware were getting their farmer admix from, as opposed to from the HG-rich typical Middle Neolithic German farmers and South Baltic foragers.


Davidski said...



I think at least one of the problems might be having ancestral groups like AG3-MA1 and Villabruna in left pops and the derived Eastern_HG in the right pops. But I'm not sure.

I need to think carefully again about the outgroups that might be useful for different populations when using the new qpAdm. The strategy in this preprint seems to be too simple and cautious, but your models are too complex and risky.

Seinundzeit said...


Just one last stab, but with a somewhat different strategy.

If even this fails, it is what it is.

Right pops:

Mota (basis population)









Left pops:

Austroasiatic Indian

Test pop:

Kalash. This would be the last attempt for today. Thanks in advance.

Davidski said...


Seems to have worked, but note the number of markers and standard errors.

Seinundzeit said...



Looks pretty sensible.

Davidski said...

I'm sure it is very interesting, but since we know that the early Corded Ware in the South Baltic were practically like Yamnaya, that narrows the options with regard to who they mixed with there.

Did they mix with almost pure Anatolians or LBK-like farmers, as these qpAdm models suggest? I doubt it.

Olympus Mons said...

Genetiker seems to .... "qpAdm analysis confirms European admixture in Chinchorro DNA"

If it comes out true, he will be sticking much more than a proverbial finger. It will be a full fist into lots of people asses.

Davidski said...

Genetiker is insane. He should be in therapy, not analyzing ancient mummies.

Nirjhar007 said...


Nirjhar007 said...

Here the masterpiece :

But Dave, how does he do so well in the case of SNP calls?

Nirjhar007 said...

Davidski said...

I don't take seriously anything that Genetiker does unless it's backed up by someone more competent, so if you're referring to that R1a from Dnieper Donets, it has now been confirmed by the guy from YFull.

Sgt said...

I'm jumping into the Y-HG {X-Chr; mt etc} debate late but it may not be a simple model of dominance vs migration. During times of stress females produce more female offspring which makes evolutionary sense. See: Preconception stress and the secondary sex ratio: a prospective cohort study, Chason et al, 2012.

Shaikorth said...


The Chinchorro samples' quality was too low for the filtering procedures Raghavan et al. used for their other samples.
see here:

MaxT said...

Genetiker has Corded Ware as 70%-80% Gravettian on his admixture chart lol I can't take this guy seriously, some of his blog posts are very absurd and strange.

I thought he would change after the whole apology post he made not too long ago but nope.

"These results show that many of my beliefs about European genetic history were wrong. I thought that the Aurignacians belonged to Y haplogroup I, but the one Aurignacian sample was C1a2. I thought that the Gravettians belonged to R1, but four Gravettian samples were C1, I, IJ*, and C1a2. I thought that the Magdalenians were R1b, but two Magdalenian samples were I. These results also imply that my beliefs about Indo-European origins were wrong. I apologize for attacking others over their positions on these subjects."

Karl_K said...


"The Chinchorro samples' quality was too low for the filtering procedures"

Exactly. Extraordinary Claims Require Extraordinary Evidence.

Olympus Mons said...

so, regarding Genetiker.
"so if you're referring to that R1a from Dnieper Donets, it has now been confirmed by the guy from YFull"

the guy finds it... but then someone else that should have done the job in the first place "confirms"... and the original finder is crazy? - Ok

"These results show that many of my beliefs about European genetic history were wrong. I thought that the Aurignacians belonged to Y haplogroup I, but the one Aurignacian sample was C1a2. I thought that the Gravettians belonged to R1, but four Gravettian samples were C1, I, IJ*, and C1a2. I thought that the Magdalenians were R1b, but two Magdalenian samples were I. These results also imply that my beliefs about Indo-European origins were wrong. I apologize for attacking others over their positions on these subjects."

- The guy admits errors and apologizes for his mistakes....- Ok, completly wako, I see.

Matt said...

@ Davidski, if you do have time (and inclination) to run any more qpAdm models would be interested to see if there are any differences between the following:

a) Outgroups: Mota, Ust_Ishim, Kostenki14, GoyetQ116-1, Vestonice16, MA1, AfontovaGora3, Levant_N

b) Outgroups: Mota, Ust_Ishim, Kostenki14, GoyetQ116-1, Vestonice16, MA1, AfontovaGora3, Levant_N, Villabruna

c) Outgroups: Mota, Ust_Ishim, Kostenki14, GoyetQ116-1, Vestonice16, MA1, AfontovaGora3, Natufian, Villabruna

d) Outgroups: Mota, Ust_Ishim, Kostenki14, GoyetQ116-1, Vestonice16, MA1, AfontovaGora3, Natufian, Iran_Hotu, Villabruna

Mota as "basis population", where the ancestors are:

Anatolia_N, Bichon, Yamnaya_Samara

and the model populations are:

Corded_Ware_Germany, Unetice_EBA, Bell_Beaker_Germany and Halberstadt_LN

Pretty skeptical of the models in this paper and I want to see whether adding any of the other ancient populations they didn't use will drive a new result, or not (at least whether adding Villabruna if there's no time for any of the others).

batman said...


"Totally agree. Completly wako."

Apparently not the only one.

Moreover, people like Thomas A. Edison were deemed the same way by their contemporary peers.

Too bad science doesn't progress as a result of "major views" and "consent".

Kurti said...

might interest some people here.

Karl_K said...


Science largely does progress by small incremental changes to the major view. Of course there are occasional major revelations, which are of much more interest than the day to day work that goes almost unnoticed by the general public.

batman said...

@ MaxT

"Genetiker has Corded Ware as 70%-80% Gravettian on his admixture chart lol I can't take this guy seriously, some of his blog posts are very absurd and strange.

I thought he would change after the whole apology post he made not too long ago but nope.

"These results show that many of my beliefs about European genetic history were wrong. I thought that the Aurignacians belonged to Y haplogroup I, but the one Aurignacian sample was C1a2. I thought that the Gravettians belonged to R1, but four Gravettian samples were C1, I, IJ*, and C1a2. I thought that the Magdalenians were R1b, but two Magdalenian samples were I. These results also imply that my beliefs about Indo-European origins were wrong. I apologize for attacking others over their positions on these subjects.""

What this implies is that the entire edifice built on the hypothesis of TWO separate refugia - during the LGM - as a basis for ancient DNA analysis is wrong. Quite simply.

Which means there's (still) NO evidence for a Gravettian refugium separated from an Aurignacian/Magdalenian refugium during "the LGM" - to explain the Mesolithic/Neolithic DNA of Eurasia.

In fact there's ample evidence that the pre-LGM populations of Eurasia had genetic interchange with each other.

Moreover it's plain and clear that both mt-dna U/T and y-dna C-F survived the lean bottleneck of the LGM in various locations across the MILDER part of northern Eurasia - i.e. the Atlantic facade.

Finally the glacial/postglacial distribution of ARCHEOLOGICAL sites has proven that the final and hardest bottleneck happened during the Younger Dryas (YD). Thus we can sight a change of the dominant haplogroups before and after the LGM, as well as before and after the YD.

The ancestors to the present Eurasians were all based on the population(s) that survived the last, cataclysmic cold-snap, when 2/3 of the larger land-animals of Europe and arctic Asia died out.

Which means that the paleolithic mammoth-hunters of the Franco-Celtiberian "Magdalenians" and the East-European/Black Sea "Gravettians" were NOT points of causation, to explain the variety and distribution of y- and mt-dna in today's Eurasia. In fact we DO know that they were genetically as well as culturally connected. As were the last mammoths...

At least the signature "Genetiker" has the guts to admit that he's been misled - whether by facts, confusion or persuasion. Others shouting even louder about the same mistake obviously don't share the same civil courage. Even though they've promptly been explaining R1b to be caused by a "Celto-Iberian refugia" and R1a to be "a result" of a "Cento-Carpathian" or "Trans-Caucasian" refugia. Even though there's still no sign or evidence that ANY of them did indeed PERSIST through the cataclysmic climate of the Older and the Younger Dryas.

To find an area of PROVEN persistence (subsistence) throughout the Younger Dryas we have to look for the land where the sun sets - and the warm waves from the Caribbean tropics kept punching Eurasia's west coast with surface-water and summer-breezes above 12° Celsius.

batman said...

@ Karl K

"Science largely does progress by small incremental changes to the major view."

Sure. Indeed that was the methodology Edison followed too, for about 25 years - to find the right alloy that made the bulb shine, permanently.

Meanwhile he was deemed "crazy", "mad" and "wako" by a large majority of contemporary colleagues, as well as most of his sponsors and their 'free' press.

Btw.: Isn't the work done by this infamous Genetiker based on the same scientific principles as that of (all) other geneticists?

Fanty said...

"No one knows if these "Bronze Age migrants" were conquistadors, illegal immigrants, or refugees."

But we know what the Germanic tribes of the "great migrations" had been: conquistadors, illegal immigrants and refugees. Depending on when and where.

Somehow, they ended up ruling the regions they migrated to. But in all but 1 case (England) they got assimilated. Only in England, they managed to make their language survive.

Davidski said...


These models look pretty good. Note that I dropped AG3 from the right pops, and added Iran_Neolithic, Satsurblia and Villabruna.

I'm going to try and check Yamnaya's X chromosomes now, using the same methods as in this preprint.

mooreisbetter said...

@Davidski You have some neat scientific abilities, but your conclusory assumptions are shameful. Your knowledge of history is also quite weak. "The Bronze Age expansion was like Europeans in America?" That's a conclusion masquerading as an argument. You're better than that.

There are at least 10 different ways that Population B can have more descendants than Population A. Here are a few:

1. Population A simply starts out with fewer members.

2. Population A dies of diseases.

3. Population B has a CULTURAL attitude about having more babies, whereas Population A does not.


(Mormons versus Episcopalians in Nevada, Hispanics versus Whites in California, Palestinians versus Israelis in Israel)

If you knew anything at all, you would know that at various times in history, it was THE ELITES who did NOT HAVE KIDS.

Example: the Roman patricians had so few kids, they often had to ADOPT to have an heir. The plebs had tons of kids.

Example: modern Israel. The Palestinians have far more kids than Israelis.

This is not dominance, Davidski, it's demography.

Here is Davidski in the year 3000 A.D. "Clearly the Palestinians had an advantage that enabled them to dominate Gaza. They were the elites." Sorry, no.

4. Example four: massive immigration into Population A's homeland because Population B's original homeland was economically depressed and couldn't support a large population.

Hello, America 2017? Do you read the news at all? Have you heard about Trump and his wall proposal?

5. Example five: Population B is driven out of their homeland because of a superior force waging war on them.

The Goths fled the Huns in the east, only to make war on the populations in the west. The picked-on became the bully.

A subset of this is a genuine refugee crisis. Hello, Syria?

So, Davidski, there are just five examples how your "elite" theory is more than likely totally bogus.

I have more, but will ease up on you for tonight.

Algan mardi said...

@huijbregts @Davidski @FrankN @Matt @Alberto and @everyone
Could you take a look at my last post here?

Aram said...

Important notice about Balanovsky's recent paper.

In that paper there is a diagram with an Armenian R1b-P312. It can create a false impression that it is an early branch of P312 in Armenia. This can further lead to speculation that L51 is from West Asia.

It is wrong!
That P312 is the famous DF27 cluster in Khndzoresk.
The age of that cluster is 850 ybp. And all members of that cluster live in the same village.
In general P312 is present at low level (<1%) in many West Asian countries. This can be the legacy of Romans, Galatians, Crusaders. The latter is more probable in this specific case.
So NO there is no P312* in Armenia, and no L51* also. Even the Albanian L51* is not confirmed.

In reality nothing has changed since Myres et al. The highest level of R1b-L11* is in the British Isles. There is no evidence that R1b-L51 formed in or moved to Europe from West Asia.

Aram said...

Here is that branch on Yfull
Notice the 2,800-year-old common ancestor with Europeans.

Matt said...

@ Davidski, cheers.

With those extra outgroups and removing AG3, the fits look consistent and convergent for Halberstadt_LBA and Corded Ware in the ratio of HG:AF.

For Halberstadt_LBA itself the fits are essentially unchanged, while what happens for the Corded_Ware samples is mostly a shift that takes some of their steppe ancestry and a smaller proportion of AF and increases HG.

So the main difference in the offsets seems like this set could be better at distinguishing a steppe+minority AF combination from HG. I guess this is what Villabruna and Iran_N / CHG stats drive, and stats around Levant_N + Upper Paleolithic Siberia + other UP Europe don't quite do it.

All a bit richer in HG:AF ratio than MN usually seems to be, so might be worth testing Iberia_Chal, Germany_MN, Hungary_CA, Remedello with this setup and see what they come out with. HG:AF in these is 38:62, but then again that's only around 30:70 if we allow for Hungary_HG being 15% AF as per recent Lipson paper, so no big deal.

All this said, it shouldn't really affect their main findings for the preprint unless it's inconsistent on the X chromosome and there's a disproportionate change there.

Karl_K said...


"So, Davidski, there are just five examples how your "elite" theory is more than likely totally bogus."

None of your examples seem to explain the 'star-like' phylogeny of the Y-chromosomes of the migrants in the Bronze Age, while the mtDNA did not experience this phenomenon.

Davidski and others are saying that there was an elite culture AMONG the incoming population, but that the migrant population as a whole had a reproductive advantage over the locals (separate from elitism), based on their culture and technology.

batman said...

The 'star-like' phylogeny is hardly anything but a reflection of the ancient reproduction system, built on agnatic, five-stepped dynasties - forming 'extended families' ("aets") and effectively kingdoms and ethnicities.

Using the present, post-Christian monogamy as a standard model for the various stats and runs just won't do - neither deterministically nor stochastically.

From the facts known from the old civilizations the dominating reproduction-system was pyramidical and polygamous - rather than flat and monogamous.

bellbeakerblogger said...

Five additional mitogenomes from Southern Poland

Karl_K said...


In order to describe the 'star-like phylogeny' of Y chromosomes, you basically described a culture of elitism that is paternally inherited. Right? Dynasties and kingdoms.

But this was not always the case everywhere, because the Early Farmers of Europe did not have this pattern of inheritance. It was introduced to central and western Europe with the Bronze Age migrants.

Just like you yourself are saying.

huijbregts said...

@Algan mardi
You report that the results of your Global10/nMonte runs were very sensitive; weighting the model made them more coherent.
Your suggestion is that the weighting has improved the quality of the estimations. That is not necessarily true.
I don't doubt that penalizing the higher dimensions makes the results less sensitive, but the question is: what did you filter away, noise or relevant signal?
I invite you to have a closer look at the highest dimension of the Global10. At the negative end of dimension 10 you will find 22 African samples; at the positive end of dimension 10 you will find 3 African samples, followed by a lot of EN samples.
This definitely doesn't resemble noise.
So by penalizing the higher dimensions of the Global10 you are excluding information from the calculation of the Euclidean distance. I don't think this is a sound practice.

Algan mardi said...

First, thanks for the answer; second, congratulations on nMonte, it's fantastic!
I don't fully understand how it works. AFAIK, the convergence works around colMeans(matAdmix). I agree with you that by penalizing the higher dimensions of the Global10 we are excluding information, but the point is whether the PCs of the PCA are scaled or not. If not, maybe weighting them makes the Euclidean distance between vectors better reflect the real distance between pops. I don't know, you are the master, I just learn. Great work.

Algan mardi said...

Just semantics: maybe not excluding information, only weighting it.

Alberto said...

Those last qpAdm models with better outgroups look good. Here compared to Global 10:

Sample - qpAdm proportions (AN/Hu_HG/Yamn) - Global 10 (idem)
CWC:I0049 - 16.3/7.2/76.5 - 15/15.4/69.6
CWC:I0103 - 17.5/13.1/69.4 - 17.6/11.6/70.8
CWC:I0104 - 22.5/8.5/68.9 - 19.6/12.2/68.2
CWC:I1532 - 21.9/15.3/62.8 - 22.8/14.8/62.4
CWC:RISE00 - 21.8/13.6/64.6 - 22/21/57
Halberstadt_LBA:I0099 - 35.4/21.8/42.8 - 33.4/23.4/43.2

Not bad.
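
To put a rough number on "not bad", here is a minimal sketch that computes the mean absolute difference between the two sets of proportions for each sample. The figures are copied from the comparison above (reading "11,6" as 11.6); the helper name `mean_abs_diff` is made up for illustration and is not part of qpAdm or nMonte.

```python
# Proportions as (AN, Hu_HG, Yamnaya), copied from the comparison above.
# First tuple: qpAdm; second tuple: Global 10.
rows = {
    "CWC:I0049": ((16.3, 7.2, 76.5), (15.0, 15.4, 69.6)),
    "CWC:I0103": ((17.5, 13.1, 69.4), (17.6, 11.6, 70.8)),
    "CWC:I0104": ((22.5, 8.5, 68.9), (19.6, 12.2, 68.2)),
    "CWC:I1532": ((21.9, 15.3, 62.8), (22.8, 14.8, 62.4)),
    "CWC:RISE00": ((21.8, 13.6, 64.6), (22.0, 21.0, 57.0)),
    "Halberstadt_LBA:I0099": ((35.4, 21.8, 42.8), (33.4, 23.4, 43.2)),
}


def mean_abs_diff(a, b):
    """Average absolute difference, in percentage points, across components."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# One summary number per sample.
diffs = {name: mean_abs_diff(qpadm, g10) for name, (qpadm, g10) in rows.items()}

for name, d in sorted(diffs.items(), key=lambda item: item[1]):
    print(f"{name}: {d:.2f} points")
```

By this measure CWC:I1532 agrees most closely (0.6 points on average) and CWC:I0049 least (about 5.5 points, almost all of it in the Hu_HG versus Yamnaya split).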

Alberto said...

@Algan mardi

I think it's all debated there already. For some reason, the output from Global 10 seems to upscale the higher dimensions. I can only speculate that this might be for the purpose of being able to make plots that are visually meaningful (because if it gave the mathematically correct values, and you tried to plot PC1 vs. PC8, it will look almost like a straight line due to the much higher variance in PC1, which is not visually informative).

So weighting with the sqrt of the eigenvalues seems to get the PCA back to their mathematically correct values. Which doesn't have much of an advantage in most normal cases, but in a few ones it's clearly superior (since, after all, it's the correct ones that give correct euclidean distances between world populations, unlike the original, unweighted ones).

I've been running both side by side for a while. Most of the time one shouldn't worry about it, but overall I lean towards weighted values being better (because they avoid those bad cases, and overall seem to be the correct ones). In any case, that's not something that should affect nMonte. It should affect the input data (that is, you should weight the values of the datasheet with the correct weights posted on that thread too, not the ones first posted on anthrogenica).

Algan mardi said...

Of course it's not a problem with nMonte but with the input data. Your answer really helps and clarifies.
Thanks for your comments. Cheers.

Olympus Mons said...


Aren't those half breeds BBC in there pulling too much towards Neolithic Portugal (NPO) and Hunter Gatherer south (HGS) in the PCA? -:)

Algan mardi said...

Could you link the correct weights? I can't find them.

Simon_W said...

"Somehow, they ended up ruling the regions they migrated to. But in all but 1 case (England) they got assimilated. Only in England, they managed to make their language survive."

I've heard this claim before, but it's clearly wrong. Think of all the German and Dutch speaking lands west of the Rhine, south of the Danube and beyond the Upper Germanic-Rhaetian Limes.

Alberto said...

@Algan mardi

These are the ones that Huijbregts suggested based on the sqrt of the eigenvalues. They seem to work well, solving the obvious problems without introducing bigger ones:

1, 0.846718713, 0.452874391, 0.323627789, 0.298822531, 0.284710759, 0.271762513, 0.220536005, 0.217007202, 0.214022091

Basically you'd need to multiply the values of each PC by those above.
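
As a minimal sketch of that multiplication, assuming each sample is a plain list of ten PC values: the weights are the ones quoted above, while the function names and the two made-up comparisons are illustrative only and do not come from nMonte or the Global 10 datasheet.

```python
import math

# Weights quoted above, one per dimension (PC1..PC10).
WEIGHTS = [1, 0.846718713, 0.452874391, 0.323627789, 0.298822531,
           0.284710759, 0.271762513, 0.220536005, 0.217007202, 0.214022091]


def weight_coords(coords, weights=WEIGHTS):
    """Multiply each PC value by the weight for its dimension."""
    if len(coords) != len(weights):
        raise ValueError("expected one coordinate per weight")
    return [c * w for c, w in zip(coords, weights)]


def euclidean(a, b):
    """Plain Euclidean distance between two coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical samples: one differs from the origin only on PC1,
# the other only on PC10, by the same raw amount.
origin = [0.0] * 10
differs_on_pc1 = [0.1] + [0.0] * 9
differs_on_pc10 = [0.0] * 9 + [0.1]

raw_pc1 = euclidean(origin, differs_on_pc1)
raw_pc10 = euclidean(origin, differs_on_pc10)
weighted_pc1 = euclidean(weight_coords(origin), weight_coords(differs_on_pc1))
weighted_pc10 = euclidean(weight_coords(origin), weight_coords(differs_on_pc10))
```

Before weighting, both hypothetical samples sit at the same distance from the origin. After weighting, the PC10 difference counts for only about a fifth as much as the PC1 difference, which is the down-weighting of higher dimensions debated earlier in the thread.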

xyyman said...

keep burying head in the sand. FACTS will not go away

huijbregts said...

@ Alberto
Do I understand you correctly that the Global10 scores are tweaked to make the graphics look better?
I think Davidski would have told us so.
There must be a better explanation.

Matt said...

Re: "star-like structures", when we talk about "star-like structures" , in the words of S Yan 2014 (the paper about Chinese Neolithic "super grandfathers"), we're really just talking about "multiple lineages branching off from a single node", in opposition to a history of "bifurcations" which "indicat(es) strong expansion events". No more or less than that.

To some degree I think you do see this in all lineages - are any of the survivors today rooted in a simple history of bifurcations over time since the early Holocene? Kivisild 2017 to me seems the best map for this at the moment, at least in West Eurasia and North America with quite different timings (the R1 clades the latest). Older stuff that lacks much coverage at depth outside Western and Northern Europe might be weaker at seeing when more and less "star-like" phases happened for groups outside R1.

(Though on the note of the structure of pre-Indo-European populations, one thing I would say is that by Kivisild, the Y-DNA I2a I-L621 clade that shows high peaks in South East Europe today and is essentially the I2 survivor at high frequency seems to have begun population expansion around 7,200 years ago, around 2,500 years before the R1 subclade expansions.

OTOH Sardinian I2a M26 has expansion around 10,000 years ago, at the same time as Sardinian G2a L166.

So not all of what we might assume to be pre-Yamnaya y-dna groups expanded at exactly the same time.)

@ Alberto & Algan, yeah, Alberto has said it all.

When we've talked about "weighting" PCA, this has kind of confused matters, because all we're really talking about is the pros of using an unscaled PCA vs an eigenvector scaled PCA. That is, an unscaled PCA with each dimension * square root of that dimension's eigenvalue = eigenvector scaled PCA. They're exactly the same thing - when you tell a PCA software / algorithm to do eigenvector scaling, that's all it does.

(I believe some, like huijbregts, were dubious of the idea of using "weighting" in part because the discussion gave the impression that we were looking at adding further "weighting" to an already eigenvector scaled PCA, or weighting in a way that was different from applying eigenvector scaling.

I think also others inc. Alberto realised the correct eigenvector scaling method much faster than I did, as I was trying out some other scaling factors, which were wrong, but after comparing output from PAST3 in eigenvector scaled vs eigenvector unscaled mode for PCA and Principal Coordinates Analysis, it's evident that the above is the correct way to do eigenvector scaling).

As Alberto says, there was a finding I think during the experiments that funnily enough the unscaled data actually still picks up expected closest relatives in euclidean distances and, with nMonte, most likely ancestors.
So the unscaled PCA are not necessarily even giving very wrong conclusions. I would say this is probably because even without proper scaling, the relationships are just evident in the structure of the dimensions, whatever the weight.

But I personally agree with generally using eigenvector scaled PCA whenever possible, and whenever you look at using a PCA for methods like nMonte based on calculating distances, tree building, etc., it is best to establish first whether the PCA has been eigenvector scaled or not.
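
A minimal sketch of why the scaling matters for distance-based methods, using a toy 2-D dataset with made-up points. It assumes, as the discussion above does, that the "unscaled" output has every PC normalised to the same spread; whether the Global 10 datasheet is actually produced that way is not established here. Multiplying each PC by the square root of its eigenvalue then restores the Euclidean distances of the original data.

```python
import math


def pca_2d(points):
    """Return (scores, eigenvalues) for a PCA of 2-D points (closed form)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix (population form, divide by n).
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n
    # Eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    eigenvalues = (half_trace + root, half_trace - root)
    # Angle of the first principal axis, then rotate the data onto the axes.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    scores = [(x * c + y * s, -x * s + y * c) for x, y in centered]
    return scores, eigenvalues


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical data, not real samples.
points = [(0, 0), (2, 1), (4, 3), (6, 2), (8, 5)]
scores, eigenvalues = pca_2d(points)

# Assumed "unscaled" form: every PC normalised to unit variance.
unit = [tuple(v / math.sqrt(lam) for v, lam in zip(row, eigenvalues))
        for row in scores]

# Scaled form: each PC multiplied by the square root of its eigenvalue.
rescaled = [tuple(v * math.sqrt(lam) for v, lam in zip(row, eigenvalues))
            for row in unit]

print("original distance:", dist(points[0], points[4]))
print("unit-variance PCs:", dist(unit[0], unit[4]))
print("rescaled PCs:     ", dist(rescaled[0], rescaled[4]))
```

In this toy case the rescaled distance matches the distance between the original points, while the unit-variance version inflates the contribution of the minor axis.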

Algan mardi said...

Thank you very much, very instructive and deep reply.
In brief, we need to know first if "the PCA has been eigenvector scaled or not".

@ Alberto

Grey said...

mooreisbetter said...

"There are at least 10 different ways that Population B can have more descendants than Population A. Here are a few:"

do they all result in the same ydna/adna/mtdna pattern?

Karl_K said...


Of course most of us know how a star-like phylogeny arises from a sudden burst of success for a lineage.

The issue is only that these were seen in Bronze Age populations only in the Y haplogroup, and not in the mtDNA haplogroup.

This means that the direct male lineages had explosive expansions, while the direct female lineages, at exactly the same time and place, did not.

So, it wasn't simply a rapid expansion of the entire population that caused this.

In short, some (but definitely not all) lineages of men, over a very short period of time, had much much more successful children than the women at that same time had.

There is no other way around this fact.

Matt said...

@KarlK, OK, though would be interested in your comment on Karmin 2015's Cumulative Bayesian skyline plots of Y chromosome and mtDNA diversity by world regions (Fig 2 -

In neither case does the population history of mtDNA expansion match the Y-DNA population expansion (no shared expansion). Population growth with the Neolithic and later revolutions sees expansion in Y-DNA, after an initial regress (earlier in the Near East, later in Europe), but there seems to be no change in mtDNA effective population size. It seems from that that there is *no* "star-like" expansion of mtDNA ever, not even in the early Neolithic due to generalised population growth.

Karl_K said...


The difference in y vs mt means that there were never any major (successful) population movements of modern humans with both men and women that were 'explosive' in population growth. Most, like the out-of-Africa or Sahul situations, were slow steady growth in a new environment.

The Bronze Age was different.

Matt said...

@Karl: But my point was that the Neolithic everywhere generally shows a pattern of explosive growth in males and steady state in females. Not just Bronze Age Europe. That is "The Bronze Age was (not) different (from the Neolithic generally)". Or am I reading these plots incorrectly?

huijbregts said...

@Matt, Alberto
As far as I know eigenvector scaling implies that you simultaneously change the eigenvectors and eigenvalues in an appropriate way. That will conserve the value of the resulting PCA scores.
The weighting method of Sangarius/Eren weights the PCA SCORES before using them in nMonte. This is a peculiar weighting method, which is different from eigenvector scaling. I am very suspicious of its mathematics. But in many cases the results might be not too far from the unweighted results, because 'Sangarius' specifically affects the higher dimensions.
Now it is possible that Alberto has found a correct scaling method, that I could not imagine. Unfortunately he refuses to share the script with me, so I don't know. But I doubt it.

capra internetensis said...


The time resolution of mtDNA is 20-30 times coarser than that of Y-DNA, so I don't think you could actually tell how fast the expansion is. For instance M has something like 50 primary branches, which is probably equivalent to the branching levels of F-GHIJK-HIJK-IJK-K-K2-MPS-P, C-C1-C1b-C1b1-C1b1b, etc, over several millennia.

xyyman said...

Quote by poster:
“So you agree with xyyman that modern Europeans are not descendants of Central Asians but are instead depigmented Africans who went into Europe less than 10,000 years ago”

Genetic differentiation between upland and lowland populations shapes the Y‑chromosomal landscape of West Asia - O. Balanovsky 2017

Yamnaya subpopulations studied to date), the question arises of whether Yamnaya Y-chromosomes **also** originated from West Asia

The currently available dataset does not contradict the hypothesis that R-GG400 marks a link between the East European steppe dwellers and West Asians, though the route and **even** direction of this migration is disputable.

xyyman comment – they know the truth but would not say.

Ric Hern said...

Depigmented Africans ? Less than 10 000 years ago ? CHG, WHG and EHG have been in Europe and the Caucasus for a very long time. Much older than 10 000 years...

Ric Hern said...

Are the depigmented Northeast Asians also from Africa less than 10 000 years ago ?

Ric Hern said...

And do you know what Depigmented actually means....? Most Europeans do have pigment just not as much as Africans. If they didn't have any they wouldn't be able to tan. Albinism is a totally different thing altogether.....

