
Re: "Ratite" polyphyly and paleognathous dromornithids



>> And in which regions of parameter space do real examples most often
>> occur?
>
> The infamous Felsenstein zone again (see above). Neighbor-joining
> generally outperforms MP when rates per branch are unequal (Kuhner &
> Felsenstein 1994).

When rates per character are unequal, however (and they usually are), model-based methods get into trouble when the model doesn't contain enough rate categories. Parsimony does not assume any correlation between the rates of any two characters and therefore performs better in those situations.

>> It's difficult to imagine that a phenetic method would converge on
>> the right phylogeny more often than a phylogenetic method.
>
> Well, "phenetics" is an extremely loaded word. Some people use it to
> describe an approach to classification; others use it to describe a
> set of phylogenetic methods, but don't always agree on the exact
> contents of that set. They can usually agree on UPGMA, but
> neighbor-joining is where it starts to get complicated: it's a
> distance-based clustering algorithm, but it _does not_ cluster taxa
> on the basis of overall similarity. It allows different OTUs to have
> different rates of evolution, so A can be correctly linked to B even
> though it's more similar to C. It gives you an unrooted tree, just
> like parsimony and unlike UPGMA. And minimum evolution goes one step
> further by using an optimality criterion to compare multiple trees.
> It's basically parsimony for distance data: it prefers the tree with
> the least amount of change, the only difference being that parsimony
> measures the amount of change in character state transformations and
> ME in pairwise distances. Sure, it's still possible to refer to NJ
> and ME as "phenetics" and use the word as a shortcut for
> distance-based methods, but then the other claims about phenetics are
> no longer true: phenetics is _not_ based on overall similarity and
> there is certainly nothing inherently "non-phylogenetic" about it.
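
The quoted point that neighbor-joining can link A to B even though A is more similar to C can be checked on a toy case. The sketch below is only an illustration: the four-taxon distance matrix is hypothetical, built as exact path lengths on the unrooted tree ((A,B),(C,D)) with assumed branch lengths A=1, B=5, C=1, D=5 and an internal branch of 1. It compares the first pair joined by smallest raw distance (the UPGMA-style choice) with the first pair joined by the standard NJ criterion Q(i,j) = (n-2)d(i,j) - r_i - r_j, where r_i is the sum of distances from i to all other taxa.

```python
from itertools import combinations

# Hypothetical additive distances on the unrooted tree ((A,B),(C,D)).
# Assumed branch lengths: A=1, B=5, C=1, D=5, internal branch=1.
taxa = ["A", "B", "C", "D"]
dist = {
    frozenset("AB"): 6, frozenset("AC"): 3, frozenset("AD"): 7,
    frozenset("BC"): 7, frozenset("BD"): 11, frozenset("CD"): 6,
}

def d(i, j):
    """Distance between two taxa, independent of argument order."""
    return dist[frozenset((i, j))]

def closest_pair():
    """Pair with the smallest raw distance (what UPGMA joins first)."""
    return min(combinations(taxa, 2), key=lambda p: d(*p))

def nj_pair():
    """Pair minimizing the NJ criterion Q(i,j) = (n-2)*d(i,j) - r_i - r_j."""
    n = len(taxa)
    r = {t: sum(d(t, u) for u in taxa if u != t) for t in taxa}
    return min(combinations(taxa, 2),
               key=lambda p: (n - 2) * d(*p) - r[p[0]] - r[p[1]])

print(closest_pair())  # ('A', 'C'): A is most similar to C
print(nj_pair())       # ('A', 'B'): NJ still joins the true neighbors
```

In this matrix (A,B) and (C,D) tie for the lowest Q, and both are true cherries of the assumed tree; `min` simply returns the first one it meets.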

NJ, and even ME, are still phenetic. They work on similarity that has been corrected by a model, but it's still similarity; it's still shared character states (whether observed directly or recalculated from the observations by the use of a model). Parsimony works exclusively on shared _derived_ character states, not shared character states in total. That's what makes it phylogenetic.
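
A small sketch of that distinction, under loud assumptions: the three-taxon binary matrix below is invented for illustration, and state 0 is assumed to be ancestral (as if polarized by an outgroup scored all 0), so state 1 is derived. It counts, for each pair, all shared states versus shared derived states only. It is not a parsimony search, just the two ways of counting.

```python
from itertools import combinations

# Hypothetical character matrix; 0 = assumed ancestral state, 1 = derived.
# Z is imagined as a fast-evolving sister of Y; X retains ancestral states.
matrix = {
    "X": "00000000",
    "Y": "11000000",
    "Z": "11111110",
}

def shared_states(a, b):
    """Number of characters in which two taxa have the same state."""
    return sum(s == t for s, t in zip(matrix[a], matrix[b]))

def shared_derived(a, b):
    """Number of characters in which both taxa have the derived state."""
    return sum(s == t == "1" for s, t in zip(matrix[a], matrix[b]))

pairs = list(combinations(matrix, 2))
most_similar = max(pairs, key=lambda p: shared_states(*p))
most_synapomorphies = max(pairs, key=lambda p: shared_derived(*p))

print(most_similar)         # ('X', 'Y'): 6 shared states, all ancestral
print(most_synapomorphies)  # ('Y', 'Z'): 2 shared derived states
```

Here the uncorrected similarity count groups X with Y on shared ancestral states alone, while counting only derived states groups Y with Z.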

> (In fact, there is nothing non-phylogenetic about UPGMA. It gives
> you the right tree as long as the assumptions of the method are met
> -- i.e., there is a clock. Parsimony... gives you the right tree as
> long as the assumptions of the method are met -- i.e., rates of
> evolution do not vary significantly among OTUs.

UPGMA can only be right for the wrong reasons: it can only give you a tree that is congruent with a phylogenetic tree if there's little enough homoplasy in the data. When this condition is met, the phenetic tree happens to be identical to the phylogenetic tree. But you can't assume _a priori_ that there's little enough homoplasy in your dataset!

Adding a model can compensate for this assumption to varying extents. When you do that with parsimony, it isn't called parsimony anymore...

> It's true that the assumptions of parsimony are met more often than
> the assumptions of UPGMA, but their behavior can still be
> surprisingly similar in some cases: see Swofford et al. 2001.

Sure: in the simplest cases, in those where there's little enough homoplasy in the data, all methods (phenetic or phylogenetic) will give the same tree.

> Both are inferior to probabilistic
> methods such as Bayesian inference.)

Eh, that depends. Naturally I forgot the reference *sigh*, but I remember reading that BI is biased toward finding trees that are too symmetric. If the true/simulated tree has a Hennig comb at its base, BI commonly fails to find it and puts the OTUs of that comb into one or two small clades.

Also, Bayesian posterior probabilities are inflated for unknown reasons. (Bootstrap values are too low for likewise unknown reasons.)