
Re: "Ratite" polyphyly and paleognathous dromornithids

David Marjanović <david.marjanovic@gmx.at> wrote:

> When rates per character are unequal, however -- and they usually are --,
> model-based methods get into trouble when the model doesn't contain enough
> rate categories. Parsimony does not assume any correlation between the rates
> of any two characters and therefore performs better in those situations.

Is this Kolaczkowski & Thornton (2004) again? They showed that if the
sequence evolution follows the Jukes-Cantor model and the proportion
of heterotachous sites is between 32% and 68%, MP outperforms
common-mechanism maximum likelihood. That's an interesting finding,
but you seem to overestimate its importance. Using K2P+gamma instead
of JC69 for the simulation is enough to make the difference almost
completely disappear (Spencer et al. 2005). And, of course, that was
in 2004. Since then, mixture models have been developed in a Bayesian
framework to accommodate heterotachy by summing the likelihood for
each site over multiple sets of branch lengths (Zhou et al. 2007;
Kolaczkowski & Thornton 2008; Pagel & Meade 2008). That's far more
flexible than a discrete gamma model with rate categories (where the
ratio of each branch length to the others is constant across the
categories), better than a partitioned model (because you don't have
to divide characters into partitions a priori -- you don't even have
to specify the number of partitions prior to the analysis), and
more... well, _parsimonious_ than parsimony, because it's still far
from parameterizing every branch-length/character combination
separately. It still might lead to overparameterization, but
reversible jump Markov chain Monte Carlo can take care of that (Pagel
& Meade 2008).
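
To make "summing the likelihood for each site over multiple sets of
branch lengths" concrete, here is a minimal sketch. It is not the
implementation from any of the papers cited above: it assumes a
two-state symmetric model and a fixed four-taxon tree ((A,B),(C,D)),
and the function names, branch lengths, and weights are all
hypothetical.

```python
import math
from itertools import product

def p_change(t):
    """Two-state symmetric model: probability that the two ends of a
    branch of length t (expected changes per site) differ in state."""
    return 0.5 * (1.0 - math.exp(-2.0 * t))

def site_likelihood(pattern, branches):
    """Likelihood of one site on the tree ((A,B),(C,D)).

    pattern  -- observed states (0 or 1) for A, B, C, D
    branches -- lengths of the branches leading to A, B, C, D,
                followed by the internal branch
    """
    a, b, c, d, mid = branches

    def pr(start, end, t):
        p = p_change(t)
        return p if start != end else 1.0 - p

    total = 0.0
    # Sum over the unobserved states x (ancestor of A and B)
    # and y (ancestor of C and D); 0.5 is the stationary frequency.
    for x, y in product((0, 1), repeat=2):
        total += (0.5
                  * pr(x, pattern[0], a) * pr(x, pattern[1], b)
                  * pr(x, y, mid)
                  * pr(y, pattern[2], c) * pr(y, pattern[3], d))
    return total

def mixture_site_likelihood(pattern, branch_sets, weights):
    """Weighted sum of the site likelihood over several complete
    sets of branch lengths (the weights are assumed to sum to 1)."""
    return sum(w * site_likelihood(pattern, bl)
               for w, bl in zip(weights, branch_sets))

# Two hypothetical branch-length sets in which the ratios between
# branches differ, so one set is not a rescaled copy of the other.
branch_sets = [(0.8, 0.1, 0.1, 0.8, 0.05),
               (0.1, 0.8, 0.8, 0.1, 0.05)]
weights = [0.5, 0.5]

print(mixture_site_likelihood((0, 1, 0, 1), branch_sets, weights))
```

The point of the sketch is only that no site has to be assigned to a
branch-length set in advance; every site contributes to every set in
proportion to how well that set explains it.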

> NJ, even ME, is still phenetic. They work on similarity that has been
> corrected by a model, but it's still similarity, it's still shared character
> states (whether observed directly or recalculated from the observations by
> the use of a model). Parsimony works exclusively on shared _derived_
> character states, not shared character states in total. That's what makes it
> phylogenetic.

Nice, but if that doesn't guarantee it finds the right phylogeny more
often than a non-phylogenetic method -- and by this point, every
method except parsimony is "non-phylogenetic", as "work[ing]
exclusively on shared derived character states" means dependence on a
particular character-state assignment to the internal nodes of a fixed
topology, which is a unique trait of parsimony -- then what's being
phylogenetic good for? Is it a good idea to claim that only the
possession of this property makes a method phylogenetic, if it
actually isn't very helpful in inferring phylogenies? If methods
working on all character states perform better than methods working
exclusively on derived states, what justification is there left for
using only the latter?
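
To show what that dependence on internal-node assignments looks like
in practice, here is a minimal sketch of Fitch's small-parsimony pass
for a single character on a fixed topology. The nested-tuple tree
encoding and the function name are assumptions made for illustration
only.

```python
def fitch(tree, states):
    """Return (possible_states, changes) for a rooted binary tree.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping each leaf name to its observed state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change.
        return shared, left_cost + right_cost
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1

tree = (("A", "B"), ("C", "D"))
congruent = {"A": "0", "B": "0", "C": "1", "D": "1"}
conflicting = {"A": "0", "B": "1", "C": "0", "D": "1"}

print(fitch(tree, congruent)[1])
print(fitch(tree, conflicting)[1])
```

The count only exists relative to the state sets reconstructed at the
internal nodes of that one topology, which is the property at issue.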

> UPGMA can only be right for the wrong reasons: it can only give you a tree
> that is congruent with a phylogenetic tree if there's little enough
> homoplasy in the data. When this condition is met, the phenetic tree happens
> to be identical to the phylogenetic tree. But you can't assume _a priori_
> that there's little enough homoplasy in your dataset!

I don't see how that's different from parsimony. Parsimony, too, can
only give you a tree that is congruent with a phylogenetic tree if
there's little enough variation in rates of evolution among the taxa
in your data set. When this is true, it works just fine, but you can't
assume it to be true a priori. I wouldn't call that "being right for
the wrong reasons", I would call it "being right because the
assumptions of the method aren't violated by the data". Your
description applies quite well to some related situations, though,
such as the behavior of both parsimony and UPGMA in the Farris zone
(see below).

> Adding a model can compensate for this assumption to varying extents. When
> you do that with parsimony, it isn't called parsimony anymore...

You cannot add a model to parsimony, because parsimony itself is a
model. Or rather it's a nonparametric shortcut to several different
models (Farris 1973; Goldman 1990; Tuffley & Steel 1997), which all
exhibit some rather strange properties. Their number of parameters
grows as fast as new data are added to the analysis -- F73 and G90
achieve it by treating ancestral character states as nuisance
parameters, TS97 by giving a different set of branch-length parameters
to every single character (Huelsenbeck et al. 2008). This makes them
statistically inconsistent and, by the way, extremely

On the other hand, you can use a model to correct the data for
unobserved changes, just as with neighbor-joining, and subject the
resulting data matrix to a parsimony analysis (= to a
maximum-likelihood analysis using one of the "parsimony models").
Steel et al. (1993) described how to do it, and it's still called
parsimony, but nobody does it. Apparently it's philosophically
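
As an illustration of what "correcting for unobserved changes" means,
here is the familiar Jukes-Cantor correction of an observed pairwise
difference, of the kind applied before a distance analysis such as
neighbor-joining. This is only a sketch of the general idea; it makes
no attempt to reproduce the procedure of Steel et al. (1993), and the
function name is hypothetical.

```python
import math

def jc69_distance(p):
    """Convert an observed proportion of differing sites (p) into an
    estimated number of substitutions per site under JC69."""
    if not 0.0 <= p < 0.75:
        # At p >= 0.75 the sequences are saturated under JC69 and the
        # correction is undefined.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

observed = 0.3
print(observed, jc69_distance(observed))
```

The corrected value exceeds the observed one whenever p > 0, because
multiple hits at the same site hide some of the changes.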

> Sure: in the simplest cases, in those where there's little enough homoplasy
> in the data, all methods (phenetic or phylogenetic) will give the same tree

No, that's not what I've been talking about. The case explored by
Swofford et al. involved an extreme amount of homoplasy -- but between
two adjacent branches that evolved much faster than the remaining two
(the "Farris zone"). Parsimony grouped the long branches together
(correctly) because of their homoplasies, UPGMA grouped the short
branches together (also correctly) because of their symplesiomorphies.
Both methods were able to find the correct tree only because their
bias worked in their favor.
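
The bias is easy to see with expected site-pattern frequencies. The
sketch below is not the simulation of Swofford et al.; it assumes a
two-state symmetric model, a true tree ((A,B),(C,D)), and branch
lengths I picked arbitrarily. With enough data, four-taxon parsimony
prefers whichever split is supported by the most frequent informative
pattern, so the expected frequencies show which way it leans. For
contrast, the second call puts the two long branches on non-sister
taxa (the "Felsenstein zone").

```python
import math
from itertools import product

def p_change(t):
    """Two-state symmetric model: probability that the two ends of a
    branch of length t differ in state."""
    return 0.5 * (1.0 - math.exp(-2.0 * t))

def split_support(a, b, c, d, mid):
    """Expected frequency of the parsimony-informative patterns that
    support each split, when the true tree is ((A,B),(C,D))."""
    probs = [p_change(t) for t in (a, b, c, d, mid)]
    support = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}
    # Enumerate change / no change on each of the five branches.
    for flips in product((0, 1), repeat=5):
        weight = 1.0
        for flip, p in zip(flips, probs):
            weight *= p if flip else 1.0 - p
        fa, fb, fc, fd, fm = flips
        # Leaf states relative to the ancestor of A and B.
        A, B, C, D = fa, fb, fm ^ fc, fm ^ fd
        if A == B and C == D and A != C:
            support["AB|CD"] += weight
        elif A == C and B == D and A != B:
            support["AC|BD"] += weight
        elif A == D and B == C and A != B:
            support["AD|BC"] += weight
    return support

def favored_split(support):
    return max(support, key=support.get)

# Long branches on the sister taxa A and B (Farris zone).
farris = split_support(1.5, 1.5, 0.05, 0.05, 0.05)
# Long branches on the non-sister taxa A and C (Felsenstein zone).
felsenstein = split_support(1.5, 0.05, 1.5, 0.05, 0.05)

print(favored_split(farris), farris)
print(favored_split(felsenstein), felsenstein)
```

In both cases the long branches end up together; whether that is the
right answer depends only on whether they happen to be sisters.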

> Eh, that depends. Naturally I forgot the reference *sigh*, but I remember
> reading that BI is biased toward finding too symmetric trees. If the
> true/simulated tree has a Hennig comb at its base, BI commonly fails to find
> it and puts the OTUs of that comb into one or two small clades.

Sounds interesting, although I couldn't find the reference either.
However, even if true, it doesn't seem unsolvable; it should be
possible to counter it by biasing the MCMC proposal mechanism in the
right direction.

> Also, Bayesian posterior probabilities are inflated for unknown reasons.
> (Bootstrap values are too low for likewise unknown reasons.)

That's only true for some cases. If the model of evolution is correct,
moderately overparameterized, or slightly oversimplified, the
posterior probability of a clade corresponds extremely closely to the
probability that the clade is correct given the data (Ronquist & Deans
2010). When the model is misspecified, posterior probabilities can be
either inflated or too conservative.


Farris JS 1973 A probability model for inferring evolutionary trees.
Syst Zool 22: 250-6

Goldman N 1990 Maximum likelihood inference of phylogenetic trees with
special reference to a Poisson process model of DNA substitution and
to parsimony analyses. Syst Zool 39: 345-61

Huelsenbeck JP, Ané C, Larget B, Ronquist F 2008 A Bayesian
perspective on a non-parsimonious parsimony model. Syst Biol 57(3):

Kolaczkowski B, Thornton JW 2004 Performance of maximum parsimony and
likelihood phylogenetics when evolution is heterogeneous. Nature
431(7011): 980-4

Kolaczkowski B, Thornton JW 2008 A mixed branch length model of
heterotachy improves phylogenetic accuracy. Mol Biol Evol 25(6):

Pagel M, Meade A 2008 Modelling heterotachy in phylogenetic inference
by reversible-jump Markov chain Monte Carlo. Phil Trans R Soc B
363(1512): 3955-64

Ronquist F, Deans AR 2010 Bayesian phylogenetics and its influence on
insect systematics. Annu Rev Entomol 55: 189-206

Spencer M, Susko E, Roger AJ 2005 Likelihood, parsimony, and
heterogeneous evolution. Mol Biol Evol 22(5): 1161-4

Steel MA, Hendy MD, Penny D 1993 Parsimony can be consistent! Syst
Biol 42(4): 581-7

Tuffley C, Steel MA 1997 Links between maximum likelihood and maximum
parsimony under a simple model of site substitution. Bull Math Biol
59: 581-607

Zhou Y, Rodrigue N, Lartillot N, Philippe H 2007 Evaluation of models
handling heterotachy in phylogenetic inference. BMC Evol Biol 7: 206

David Černý