Phylogenetics was Re: "Ratite" polyphyly and paleognathous dromornithids

Sorry for the delay. I got busy in meatspace...

On 08.08.2012 at 22:00, David Černý wrote:

> David Marjanović <david.marjanovic@gmx.at> wrote:
>> When rates per character are unequal, however -- and they usually
>> are --, model-based methods get into trouble when the model doesn't
>> contain enough rate categories. Parsimony does not assume any
>> correlation between the rates of any two characters and therefore
>> performs better in those situations.
> Is this Kolaczkowski & Thornton (2004) again? They showed that if
> the sequence evolution follows the Jukes-Cantor model and the
> proportion of heterotachous sites is between 32% and 68%, MP
> outperforms common-mechanism maximum likelihood. That's an
> interesting finding, but you seem to overestimate its importance.
> Using K2P+gamma instead of JC69 for the simulation is enough to make
> the difference almost completely disappear (Spencer et al. 2005).

Oh, yes, sorry.

> And, of course, that was in 2004. Since then, mixture models have
> been developed in a Bayesian framework to accommodate heterotachy by
> summing the likelihood for each site over multiple sets of branch
> lengths (Zhou et al. 2007; Kolaczkowski & Thornton 2008; Pagel &
> Meade 2008). That's far more flexible than a discrete gamma model
> with rate categories (where the ratio of each branch length to the
> others is constant across the categories), better than a partitioned
> model (because you don't have to divide characters into partitions a
> priori -- you don't even have to specify the number of partitions
> prior to the analysis),

That's great! I have to read up on those.
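
A rough way to write out the contrast drawn above, as a sketch in notation assumed here rather than taken from any of the cited papers: x_i is site i, \tau the topology, \mathbf{b} a vector of branch lengths, r_k the relative rate of gamma category k, and w_k the weight of mixture component k.

```latex
% Discrete gamma with K equiprobable categories:
% a single set of branch lengths, rescaled as a whole by r_k,
% so the ratios between branch lengths never change.
L_i^{\Gamma} = \sum_{k=1}^{K} \frac{1}{K} \, P(x_i \mid \tau, r_k \mathbf{b})

% Branch-length mixture:
% K separate sets of branch lengths, so the ratios may differ
% between components (which is what heterotachy requires).
L_i^{\mathrm{mix}} = \sum_{k=1}^{K} w_k \, P(x_i \mid \tau, \mathbf{b}_k),
\qquad \sum_{k=1}^{K} w_k = 1
```

In both cases the sum runs over components for every site, so no site has to be assigned to a partition beforehand.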

> and more... well, _parsimonious_ than parsimony, because it's still
> far from parameterizing every branch-length/character combination
> separately. It still might lead to overparameterization, but
> reversible jump Markov chain Monte Carlo can take care of that
> (Pagel & Meade 2008).

Agreed. :-)

>> NJ, even ME, is still phenetic. They work on similarity that has
>> been corrected by a model, but it's still similarity, it's still
>> shared character states (whether observed directly or recalculated
>> from the observations by the use of a model). Parsimony works
>> exclusively on shared _derived_ character states, not shared
>> character states in total. That's what makes it phylogenetic.
> Nice, but if that doesn't guarantee it finds the right phylogeny
> more often than a non-phylogenetic method -- and by this point,
> every method except parsimony is "non-phylogenetic", as "work[ing]
> exclusively on shared derived character states" means dependence on
> a particular character-state assignment to the internal nodes of a
> fixed topology, which is a unique trait of parsimony -- then what's
> being phylogenetic good for? Is it a good idea to claim that only
> the possession of this property makes a method phylogenetic, if it
> actually isn't very helpful in inferring phylogenies? If methods
> working on all character states perform better than methods working
> exclusively on derived states, what justification is there left for
> using only the latter?

Wait, wait. I haven't said anything against maximum likelihood in this context (I've of course included it in "model-based methods", the other topic). As far as I know, ML uses the parsimony-uninformative characters to estimate the parameters of the model; that doesn't make it non-phylogenetic -- quite the opposite. It still uses synapomorphies to build the trees, right? It just doesn't take them directly from the dataset the way MP does; it calculates them from the dataset and the model.

What makes NJ (and UPGMA and WPGMA...) non-phylogenetic is that they take all the differences between any two taxa, average them into a single number (the percentage of similarity), assemble these into a distance matrix, and then work on the distance matrix. There is no attempt in there to distinguish synapomorphies from symplesiomorphies. That's why these algorithms can be, and are, used for entirely non-phylogenetic problems like the similarities between faunas at different sites.
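
To make the "single number per pair" point concrete, here is a minimal Python sketch of how such a distance matrix is assembled. The function names and the toy sequences are made up for illustration; the Jukes-Cantor correction is the standard JC69 formula, included only to show what "similarity corrected by a model" looks like.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of comparable sites at which two aligned sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x not in "-?" and y not in "-?"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(p):
    """JC69 correction: expected substitutions per site for an observed p."""
    if p >= 0.75:
        return math.inf  # saturated: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(seqs, correct=False):
    """Pairwise distances as a dict keyed by (taxon, taxon)."""
    names = sorted(seqs)
    d = {(n, n): 0.0 for n in names}
    for a, b in combinations(names, 2):
        p = p_distance(seqs[a], seqs[b])
        d[a, b] = d[b, a] = jc69_distance(p) if correct else p
    return d


# Hypothetical toy alignment, not real data.
toy = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAA",
    "C": "ACGAACGTTA",
    "D": "TCGAACGATA",
}
raw = distance_matrix(toy)
corrected = distance_matrix(toy, correct=True)
```

Once the matrix exists, the information about which characters differ, and in which direction, is gone; NJ or UPGMA only ever sees the numbers.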

That must be why people use ML and BI instead of phenetic methods these days.

>> UPGMA can only be right for the wrong reasons: it can only give you
>> a tree that is congruent with a phylogenetic tree if there's little
>> enough homoplasy in the data. When this condition is met, the
>> phenetic tree happens to be identical to the phylogenetic tree. But
>> you can't assume _a priori_ that there's little enough homoplasy in
>> your dataset!
> I don't see how that's different from parsimony. Parsimony, too, can
> only give you a tree that is congruent with a phylogenetic tree if
> there's little enough variation in rates of evolution among the taxa
> in your data set. When this is true, it works just fine, but you
> can't assume it to be true a priori.

Fair enough; that's why ML and BI were developed.

>> Adding a model can compensate for this assumption to varying
>> extents. When you do that with parsimony, it isn't called parsimony
>> anymore...
> You cannot add a model to parsimony, because parsimony itself is a
> model. Or rather it's a nonparametric shortcut to several different
> models (Farris 1973; Goldman 1990; Tuffley & Steel 1997), which all
> exhibit some rather strange properties. Their number of parameters
> grows as fast as new data are added to the analysis -- F73 and G90
> achieve it by treating ancestral character states as nuisance
> parameters, TS97 by giving a different set of branch-length
> parameters to every single character (Huelsenbeck et al. 2008). This
> makes them statistically inconsistent and, by the way, extremely
> non-parsimonious.

I'll have to read those; however, I don't see a reason to assume a priori that any two characters (that aren't correlated) would evolve at the same speed (= have the same set of branch-length parameters).
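
For scale, the parameter counts at issue can be worked out directly. This is only a sketch of the arithmetic, assuming an unrooted, fully resolved tree of n taxa (which has 2n - 3 branches) and m characters; the symbols k, n and m are chosen here for illustration.

```latex
% One shared set of branch lengths for all characters:
k_{\mathrm{shared}} = 2n - 3

% A separate set of branch lengths for every character:
k_{\mathrm{separate}} = m\,(2n - 3)

% Example with n = 10 taxa and m = 100 characters:
k_{\mathrm{shared}} = 17, \qquad k_{\mathrm{separate}} = 1700
```

So giving every character its own set means that each added character brings 2n - 3 new parameters with it.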

> On the other hand, you can use a model to correct the data for
> unobserved changes, just as with neighbor-joining, and subject the
> resulting data matrix to a parsimony analysis (= to a
> maximum-likelihood analysis using one of the "parsimony models").
> Steel et al. (1993) described how to do it, it's still called
> parsimony, nobody does it. Apparently it's philosophically
> objectionable.

Huh. Maybe it was too computation-intensive for 1993, so people forgot about it?

>> Sure: in the simplest cases, in those where there's little enough
>> homoplasy in the data, all methods (phenetic or phylogenetic) will
>> give the same tree
> No, that's not what I've been talking about. The case explored by
> Swofford et al. involved an extreme amount of homoplasy -- but
> between two adjacent branches that evolved much faster than the
> remaining two (the "Farris zone").

Oh, so long-branch repulsion would be expected, right?

> Parsimony grouped the long branches together (correctly) because of
> their homoplasies,

So its bias to long-branch attraction is strong enough to overcome long-branch repulsion. Good to know.
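
A minimal Python sketch of why parsimony behaves this way in the four-taxon case. It scores the three possible unrooted topologies with Fitch's algorithm; the matrix is a made-up toy in which A and B stand for the two long branches, and nothing here reproduces the actual simulations of Swofford et al.

```python
def fitch_quartet_steps(states, split):
    """Minimum number of changes one character needs on an unrooted
    four-taxon tree; split = ((a, b), (c, d)) gives the two pairs on
    either side of the internal branch."""
    steps = 0
    sets = []
    for x, y in split:
        s = {states[x]} & {states[y]}
        if not s:  # the pair disagrees: take the union, count one change
            s = {states[x], states[y]}
            steps += 1
        sets.append(s)
    if not (sets[0] & sets[1]):  # a change on the internal branch
        steps += 1
    return steps


def tree_length(matrix, split):
    """Sum of Fitch steps over all characters for one topology."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_quartet_steps({t: row[i] for t, row in matrix.items()}, split)
        for i in range(n_chars)
    )


SPLITS = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

# Hypothetical toy matrix: A and B share state 1 in the first two
# characters; the last two characters are autapomorphies.
toy = {"A": "1101", "B": "1110", "C": "0000", "D": "0000"}

lengths = {split: tree_length(toy, split) for split in SPLITS}
best = min(lengths, key=lengths.get)
```

The score is the same whether the shared 1s of A and B are synapomorphies or convergences; parsimony groups A and B either way, which is right in the Farris zone and wrong in the Felsenstein zone.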

> UPGMA grouped the short branches together (also correctly) because of
> their symplesiomorphies.

Were there enough of those left, or were they independent reversals?

>> Eh, that depends. Naturally I forgot the reference *sigh*, but I
>> remember reading that BI is biased toward finding too symmetric
>> trees. If the true/simulated tree has a Hennig comb at its base, BI
>> commonly fails to find it and puts the OTUs of that comb into one
>> or two small clades.
> Sounds interesting, although I couldn't find the reference either.
> However, even if true, it doesn't seem unsolvable; it should be
> possible to counter it by biasing the MCMC proposal mechanism in the
> right direction.

I suppose.

>> Also, Bayesian posterior probabilities are inflated for unknown
>> reasons. (Bootstrap values are too low for likewise unknown
>> reasons.)
> That's only true for some cases. If the model of evolution is
> correct, moderately overparameterized, or slightly oversimplified,
> the posterior probability of a clade corresponds extremely closely to
> the probability that the clade is correct given the data (Ronquist &
> Deans 2010). When the model is misspecified, posterior probabilities
> can be either inflated or too conservative.

Then apparently the former happens a lot. If you look through publications, almost every node in a Bayesian tree has a PP of 0.99 or 1.00.

Estimating models clearly isn't a trivial problem. Unfortunately, I don't know how the successor to ModelTest does it, or what happened to MrModelTest.


Thanks! I'll check them out from Monday onwards.