I don’t want to belabor the whole thing here. But there are a few issues that do have to be dealt with:
> In principle, I don't see any difference between "we have 45% confidence it is the ancestor" and "we have 45% confidence it is the sister group".
Um, yikes! Herein is a major problem. An ancestor is a very, very, very specific thing: a member of the population from which the later clade of interest is directly derived. I.e., your grandparents are ancestors, but your granduncles and grandaunts are not.
A sister group is simply the taxon within a given analysis which is more closely related to the clade of interest than all other taxa *in that analysis*. So finding that A is the sister taxon to C in an analysis is *NOT* a statement that “A was the very closest relative to C of all organisms that ever lived.” It is simply an analysis-level topology. [see note below]
In contrast, saying A is the ancestor to C means that in the actual real world, all members of C are directly derived from the population A. That is an exceedingly difficult and problematic thing to demonstrate.
To constrain it to the issue of bird origins, we might be confident (hypothetically here: in reality, the base of Paraves remains in a real state of flux) that Archaeopteryx lithographica is closer to extant birds than to all other operational taxonomic units in our study. But we really can’t say that it was Archaeopteryx lithographica that was the ancestor. This is especially true for the pre-Pleistocene fossil record, where recovery of a given type of fossil is extremely spotty. Can we be secure that it was the Bavarian species that was the ancestor to birds, and not a species in Greenland, or western North America, or Australia, or Africa, or what have you? No, the sampling is by no means anywhere near dense enough to even know the regional species geography of early Tithonian small theropods.
(In the very youngest part of the fossil record, sampling for some taxa (medium-to-large bodied mammals) might well be good enough that we could actually be more secure in such a statement.)
Furthermore, there really isn’t even much to be gained from knowing that a given Tithonian maniraptoran was the direct ancestor of Passer and Struthio. Knowing the ancestral states present in that part of the tree might be good enough to make reasonable tests of such issues as the origin of avian flight or the dietary habits of birds in that part of the tree.
[My note from above: for this discussion, I am dealing with taxa as operational taxonomic units (entities put into the matrix). This is independent of the issue of phylogenetic taxonomic nomenclature. In that situation, by the use of node-stem triplets, we can define sister groups. But that wasn’t the sort of matter being discussed.]
Since most of the remaining discussion was predicated on not seeing a difference between sister group and ancestor, I’ll let it go. But I do want to address this bit:
> I fail to see what is so new about ancestral state reconstruction; people were already doing it back in the 1980s.
I was referring to the newer software, which does in fact allow far more sophisticated means of assessing likelihoods and so forth of a given ancestral node state than traditional parsimony ACCTRANS/DELTRANS methods. Advances in computer speed and more sophisticated algorithms have greatly improved these, and (in my opinion) it is this end of phylogenetics and not necessarily tree reconstruction per se which has seen the greater advances in the last decade.
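To make the contrast with parsimony concrete, here is a minimal sketch (not taken from any particular package) of a marginal likelihood reconstruction at a single node with two descendant tips, under a symmetric two-state Markov model. The function names, the rate q, and the branch lengths are all hypothetical, chosen only for illustration.

```python
import math

def p_same(q, t):
    """Probability that a binary character is in the same state after time t
    under a symmetric two-state Markov model with rate q per unit time."""
    return 0.5 + 0.5 * math.exp(-2.0 * q * t)

def node_posterior(tip_states, branch_lengths, q=1.0):
    """Marginal posterior of states 0 and 1 at a node with two descendant tips,
    assuming a flat prior on the node state (all inputs are hypothetical)."""
    likelihoods = []
    for node_state in (0, 1):
        lik = 1.0
        for tip, t in zip(tip_states, branch_lengths):
            same = p_same(q, t)
            # Multiply in the probability of this tip's state given the node state.
            lik *= same if tip == node_state else 1.0 - same
        likelihoods.append(lik)
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]

# One tip in state 0 on a short branch, one tip in state 1 on a long branch.
posterior = node_posterior(tip_states=(0, 1), branch_lengths=(0.1, 1.0))
print(posterior)
```

Parsimony by itself would leave this node ambiguous between 0 and 1 (one change either way, with ACCTRANS or DELTRANS only settling where that change is placed). The likelihood calculation favors the state of the tip on the shorter branch, and says by how much.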
Thomas R. Holtz, Jr.
Email: firstname.lastname@example.org Phone: 301-405-4084
Principal Lecturer, Vertebrate Paleontology
Office: Geology 4106, 8000 Regents Dr., College Park MD 20742
Dept. of Geology, University of Maryland
Faculty Director, Science & Global Change Program, College Park Scholars
Office: Centreville 1216, 4243 Valley Dr., College Park MD 20742
Mailing Address: Thomas R. Holtz, Jr.
Department of Geology
Building 237, Room 1117
8000 Regents Drive
University of Maryland
College Park, MD 20742-4211 USA
From: email@example.com [mailto:firstname.lastname@example.org] On Behalf Of David Cerný
Sent: Wednesday, July 20, 2016 1:29 PM
Subject: Re: [dinosaur] Fossil theropods, dated phylogenies: topology, divergence dates, and macroevolutionary inferences
> The point is that traditional phylogenetics and modern ones are both hindered in the same way: there IS no direct independent empirical evidence. So saying "we have 45% confidence it is the ancestor" (for instance) is actually not something we can verify.
I have to admit I find the notion of "verifying" degrees of belief (probabilistic statements) confused. It seems to invite a sort of infinite regress: first, we obtain the probability of the parameter of interest taking on a particular value, then we estimate the probability of the estimated probability being correct, etc. Even if we had some independent evidence, this is not what we'd want to do with it, nor would we use it just to say whether the original value of 45% was right or wrong and leave it at that. Instead, the proper Bayesian thing to do would be to update it. That way, we would get a new number, whose difference from the previous value would be a concise, easily interpretable summary of exactly what impact the new evidence had on our initial hypothesis.
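As a rough illustration of what such an update looks like, here is a minimal sketch of Bayes' rule applied to the 45% figure. The two likelihood values are invented purely for the example; nothing here comes from an actual analysis.

```python
def update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability of a hypothesis after new evidence (Bayes' rule)."""
    numerator = prior * likelihood_if_true
    evidence = numerator + (1.0 - prior) * likelihood_if_false
    return numerator / evidence

prior = 0.45  # "45% confidence it is the ancestor"

# Hypothetical new evidence: twice as probable if the hypothesis is true.
posterior = update(prior, likelihood_if_true=0.8, likelihood_if_false=0.4)
print(f"prior = {prior:.2f}, posterior = {posterior:.2f}")

# Evidence that is equally probable either way leaves the prior unchanged.
unchanged = update(prior, likelihood_if_true=0.5, likelihood_if_false=0.5)
```

The difference between the prior and the posterior is the summary of what the new evidence did to the hypothesis; there is no separate step of "verifying" the original 45%.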
In principle, I don't see any difference between "we have 45% confidence it is the ancestor" and "we have 45% confidence it is the sister group". If there really is none, and both statements should be viewed as problematic in light of the fact that paleontologists have access to no data sources other than those aspects of morphology that are likely to be preserved in the fossil record, that either means that inferring fossil-only phylogenies is not worth the effort (given your publication history, I'm sure that is not your actual viewpoint), or that we should content ourselves with point estimates, without trying to quantify the uncertainty that is associated with them. I find this second position weird – surely the absence of independent evidence is an argument for putting error bars on our estimates, rather than against it!
> Again, why should anyone treat that number seriously? This has always been the difficulty of testing actual ancestor-descendant relationships.
By Bayes' rule, that number tells you the probability that your hypothesis is correct to the extent that you trust your model. If your model is no good, the number will be worthless as well, of course, but this brings us back to a point made by Bapst et al. (2016): models that allow for sampled ancestors are more realistic than those that don't.
>> That is difficult to reconcile with what you wrote above, since many of the algorithms in question (cal3 in paleotree, SA-BDSS as implemented in BEAST and MrBayes) attempt to estimate
> (Actually, they attempted to look at several different parameters, not just that.
Yes, but these parameters (node ages, branch rates, topology) are interdependent, and ignoring ancestor-descendant relationships will affect all of them. For example, if a given taxon is a tip rather than a sampled ancestor, the same (or lower) amount of character change will be spread out over a longer period of time, resulting in a decreased rate of evolution and older divergence times. This, in turn, can affect a lot of downstream inferences.
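A back-of-the-envelope sketch of the rate effect, with entirely made-up ages and character counts (none of these numbers come from a real dataset):

```python
def rate(changes, older_age_ma, younger_age_ma):
    """Character changes per million years along a branch between two ages (Ma)."""
    return changes / (older_age_ma - younger_age_ma)

changes = 6             # hypothetical number of character changes
fossil_age = 150.0      # hypothetical age of the fossil taxon (Ma)
descendant_age = 140.0  # hypothetical age of its putative descendant (Ma)
split_age = 155.0       # hypothetical divergence age if the fossil is a tip

# Fossil treated as a sampled ancestor: the changes accumulate over 10 Myr.
rate_if_ancestor = rate(changes, fossil_age, descendant_age)

# Fossil treated as a tip: the lineages must have split before the fossil's
# age, so the same changes are spread over a longer (15 Myr) branch.
rate_if_tip = rate(changes, split_age, descendant_age)

print(rate_if_ancestor, rate_if_tip)
```

Under these assumptions the tip interpretation gives both the older divergence (155 Ma rather than 150 Ma) and the lower rate, which is the pattern described above.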
At the risk of going off-topic, an excellent example of this recently came up in linguistics. Several dating analyses implied that languages traditionally thought to be ancestral to various languages spoken today, such as Latin and Old English, had been evolving separately from the true ancestors for hundreds of years by the time they first showed up in historical records. This was coupled with strong support for the Anatolian hypothesis of Indo-European origins (age of 9500 to 8000 years). However, when Chang et al. (2015) constrained the extinct languages to be directly ancestral rather than sister to their putative descendants, it was the Kurgan hypothesis (age of 6500 to 5500 years, more consistent with non-phylogenetic evidence) that emerged as the superior alternative. The analyses that did not account for sampled ancestors not only yielded the wrong topology (with tips that should not have been tips at all), but also the wrong divergence times, which in turn led to an incorrect ancestral range inference.
> And ancestral STATE reconstruction is an extremely valuable new field that these algorithms are great at, but this is distinct from ancestral TAXON recognition.)
I fail to see what is so new about ancestral state reconstruction; people were already doing it back in the 1980s.
Chang W, Cathcart C, Hall D, Garrett A. 2015. Ancestry-constrained phylogenetic analysis supports the Indo-European steppe hypothesis. Language 91(1): 194–244.