
Re: [dinosaur] Fossil theropods, dated phylogenies: topology, divergence dates, and macroevolutionary inferences

Thomas R. Holtz, Jr. <tholtz@geology.umd.edu> wrote:

The point is that traditional phylogenetics and modern ones are both hindered in the same way: there IS no direct independent empirical evidence. So saying "we have 45% confidence it is the ancestor" (for instance) is actually not something we can verify.

I have to admit I find the notion of "verifying" degrees of belief (probabilistic statements) confused. It seems to invite a sort of infinite regress: first, we obtain the probability of the parameter of interest taking on a particular value, then we estimate the probability of the estimated probability being correct, etc. Even if we had some independent evidence, we would not want to feed it into such a regress, nor would we use it just to say whether the original value of 45% was right or wrong and leave it at that. Instead, the proper Bayesian thing to do would be to update the original value. That way, we would get a new number, and its difference from the previous value would be a concise, easily interpretable summary of exactly what impact the new evidence had on our initial hypothesis.
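
To make the updating step concrete, here is a minimal Python sketch of Bayes' rule for a single yes/no hypothesis. Only the 45% starting value comes from the discussion; the function name and the two likelihood values are assumptions invented for illustration, not output from any real analysis.

```python
def bayes_update(prior, lik_if_true, lik_if_false):
    """Posterior probability of a hypothesis after seeing new evidence.

    prior        -- probability of the hypothesis before the new evidence
    lik_if_true  -- probability of the evidence if the hypothesis is true
    lik_if_false -- probability of the evidence if the hypothesis is false
    """
    numerator = prior * lik_if_true
    return numerator / (numerator + (1.0 - prior) * lik_if_false)


# Hypothetical numbers: the evidence is assumed to be twice as likely
# under "taxon X is the ancestor" as under the alternative.
prior = 0.45
posterior = bayes_update(prior, lik_if_true=0.8, lik_if_false=0.4)

# The shift summarizes the impact of the new evidence on the hypothesis.
shift = posterior - prior
print(f"prior={prior:.2f} posterior={posterior:.3f} shift={shift:+.3f}")
```

Evidence that is equally likely under both alternatives leaves the 45% untouched, which is the sense in which the new number reports what the evidence contributed rather than whether the old number was "right".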

In principle, I don't see any difference between "we have 45% confidence it is the ancestor" and "we have 45% confidence it is the sister group". If there really is none, and both statements are to be viewed as problematic because paleontologists have access to no data sources other than those aspects of morphology that are likely to be preserved in the fossil record, then one of two things follows: either inferring fossil-only phylogenies is not worth the effort (given your publication history, I'm sure that is not your actual viewpoint), or we should content ourselves with point estimates, without trying to quantify the uncertainty associated with them. I find this second position weird: surely the absence of independent evidence is an argument for putting error bars on our estimates, rather than against it!

Again, why should anyone treat that number seriously? This has always been the difficulty of testing actual ancestor-descendant relationships.

By Bayes' rule, that number tells you the probability that your hypothesis is correct to the extent that you trust your model. If your model is no good, the number will be worthless as well, of course, but this brings us back to a point made by Bapst et al. (2016): models that allow for sampled ancestors are more realistic than those that don't.

That is difficult to reconcile with what you wrote above, since many of the algorithms in question (cal3 in paleotree, SA-BDSS as implemented in BEAST and MrBayes) attempt to estimate ancestor-descendant relationships.

(Actually, they attempted to look at several different parameters, not just that.

Yes, but these parameters (node ages, branch rates, topology) are interdependent, and ignoring ancestor-descendant relationships will affect all of them. For example, if a given taxon is treated as a tip rather than a sampled ancestor, it has to sit on a branch of its own, so the same (or lower) amount of character change will be spread out over a longer period of time, resulting in a lower estimated rate of evolution and older divergence times. This, in turn, can affect a lot of downstream inferences.
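
A toy calculation shows the direction of the effect. This is only back-of-the-envelope arithmetic with made-up ages and change counts, not how cal3 or the SA-BDSS implementations estimate anything; the function and all its arguments are hypothetical.

```python
def mean_rate(n_changes, age_a, age_d, stem):
    """Mean changes per Myr on the branches linking fossil A to later taxon D.

    Ages are in Ma (larger = older). stem = 0 places A directly on D's
    lineage (sampled ancestor); stem > 0 makes A a tip that split from
    D's lineage `stem` Myr before A's own age.
    Returns (age of the A/D node, mean rate).
    """
    node_age = age_a + stem
    # Time on A's own branch plus time on the branch leading to D.
    total_time = (node_age - age_a) + (node_age - age_d)
    return node_age, n_changes / total_time


# Hypothetical case: 10 character changes, A at 100 Ma, D at 95 Ma.
anc_node, anc_rate = mean_rate(10, age_a=100.0, age_d=95.0, stem=0.0)
tip_node, tip_rate = mean_rate(10, age_a=100.0, age_d=95.0, stem=3.0)

print(f"sampled ancestor: node at {anc_node} Ma, rate {anc_rate:.2f}/Myr")
print(f"tip:              node at {tip_node} Ma, rate {tip_rate:.2f}/Myr")
```

With the same 10 changes, treating A as a tip pushes the node back from 100 to 103 Ma and stretches the elapsed time from 5 to 11 Myr, so the mean rate drops accordingly.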

At the risk of going off-topic, an excellent example of this recently came up in linguistics. Several dating analyses implied that languages traditionally thought to be ancestral to various languages spoken today, such as Latin and Old English, had been evolving separately from the true ancestors for hundreds of years by the time they first showed up in historical records. This was coupled with strong support for the Anatolian hypothesis of Indo-European origins (age of 9500 to 8000 years). However, when Chang et al. (2015) constrained the extinct languages to be directly ancestral rather than sister to their putative descendants, it was the Kurgan hypothesis (age of 6500 to 5500 years, more consistent with non-phylogenetic evidence) that emerged as the superior alternative. The analyses that did not account for sampled ancestors not only yielded the wrong topology (with tips that should not have been tips at all), but also the wrong divergence times, which in turn led to an incorrect ancestral range inference.

And ancestral STATE reconstruction is an extremely valuable new field that these algorithms are great at, but this is distinct from ancestral TAXON recognition.)

I fail to see what is so new about ancestral state reconstruction; people were already doing it back in the 1980s.


Chang W, Cathcart C, Hall D, Garrett A (2015) Ancestry-constrained phylogenetic analysis supports the Indo-European steppe hypothesis. Language 91(1): 194–244.

David Černý