
- *To*: DML <dinosaur@usc.edu>
- *Subject*: Phylogenetics was Re: "Ratite" polyphyly and paleognathous dromornithids
- *From*: David Marjanovic <david.marjanovic@gmx.at>
- *Date*: Sat, 18 Aug 2012 11:23:50 +0200
- *In-reply-to*: <CADHyUaRASdxtnUbQuV9Sm=_EKbJuP3DO+jT8VxkEXeU0kc9GzQ@mail.gmail.com>
- *References*: <CADHyUaRf+WRx6HCWrh8Uzhitg5di18mBzz9pGrKeSiAryk5t=Q@mail.gmail.com> <501D8089.8050203@gmx.at> <CADHyUaTV8G1SPsNzp7e5J=L4wgAgBXtEnAT3pdr+QGXcZyjzTQ@mail.gmail.com> <CADHyUaRASdxtnUbQuV9Sm=_EKbJuP3DO+jT8VxkEXeU0kc9GzQ@mail.gmail.com>
- *Reply-to*: david.marjanovic@gmx.at
- *Sender*: owner-DINOSAUR@usc.edu

Sorry for the delay. I got busy in meatspace...

On 08.08.2012 at 22:00, David Černý wrote:

> David Marjanović <david.marjanovic@gmx.at> wrote:

>> When rates per character are unequal, however -- and they usually
>> are --, model-based methods get into trouble when the model doesn't
>> contain enough rate categories. Parsimony does not assume any
>> correlation between the rates of any two characters and therefore
>> performs better in those situations.
>
> Is this Kolaczkowski & Thornton (2004) again? They showed that if
> the sequence evolution follows the Jukes-Cantor model and the
> proportion of heterotachous sites is between 32% and 68%, MP
> outperforms common-mechanism maximum likelihood. That's an
> interesting finding, but you seem to overestimate its importance.
> Using K2P+gamma instead of JC69 for the simulation is enough to make
> the difference almost completely disappear (Spencer et al. 2005).

Oh, yes, sorry.

> And, of course, that was in 2004. Since then, mixture models have
> been developed in a Bayesian framework to accommodate heterotachy by
> summing the likelihood for each site over multiple sets of branch
> lengths (Zhou et al. 2007; Kolaczkowski & Thornton 2008; Pagel &
> Meade 2008). That's far more flexible than a discrete gamma model
> with rate categories (where the ratio of each branch length to the
> others is constant across the categories), better than a partitioned
> model (because you don't have to divide characters into partitions a
> priori -- you don't even have to specify the number of partitions
> prior to the analysis),

That's great! I have to read up on those.
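A minimal sketch of the per-site summation described in the quoted paragraph, reduced to two sequences so that a "set of branch lengths" is a single JC69 distance. The function names, the example distances and the weights are assumptions made for illustration; none of this is taken from Zhou et al., Kolaczkowski & Thornton or Pagel & Meade.

```python
import math

def jc69_pair_prob(a, b, t):
    """Joint probability of bases a and b at one site under JC69.

    t is the distance between the two sequences in expected
    substitutions per site; base frequencies are all 1/4.
    """
    decay = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * decay   # P(no net change)
    p_diff = 0.25 - 0.25 * decay   # P(change to one specific other base)
    return 0.25 * (p_same if a == b else p_diff)

def mixture_site_likelihood(a, b, branch_lengths, weights):
    """Sum the site likelihood over several branch-length components."""
    return sum(w * jc69_pair_prob(a, b, t)
               for t, w in zip(branch_lengths, weights))

def mixture_log_likelihood(seq1, seq2, branch_lengths, weights):
    """Log-likelihood of an alignment: sites are independent, so logs add."""
    return sum(math.log(mixture_site_likelihood(a, b, branch_lengths, weights))
               for a, b in zip(seq1, seq2))

# Hypothetical example: 70% of the weight on a short distance, 30% on a long one.
print(mixture_log_likelihood("ACGTAC", "ACGAAC", [0.1, 2.0], [0.7, 0.3]))
```

On a real tree each component would be a full vector of branch lengths and the per-site term would come from the usual pruning calculation, but the weighted sum per site would have the same shape. No site has to be assigned to a component in advance, which is the contrast with a partitioned model drawn in the quote.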

> and more... well, _parsimonious_ than parsimony, because it's still
> far from parameterizing every branch-length/character combination
> separately. It still might lead to overparameterization, but
> reversible jump Markov chain Monte Carlo can take care of that
> (Pagel & Meade 2008).

Agreed. :-)

>> NJ, even ME, is still phenetic. They work on similarity that has
>> been corrected by a model, but it's still similarity, it's still
>> shared character states (whether observed directly or recalculated
>> from the observations by the use of a model). Parsimony works
>> exclusively on shared _derived_ character states, not shared
>> character states in total. That's what makes it phylogenetic.
>
> Nice, but if that doesn't guarantee it finds the right phylogeny
> more often than a non-phylogenetic method -- and by this point,
> every method except parsimony is "non-phylogenetic", as "work[ing]
> exclusively on shared derived character states" means dependence on
> a particular character-state assignment to the internal nodes of a
> fixed topology, which is a unique trait of parsimony -- then what's
> being phylogenetic good for? Is it a good idea to claim that only
> the possession of this property makes a method phylogenetic, if it
> actually isn't very helpful in inferring phylogenies? If methods
> working on all character states perform better than methods working
> exclusively on derived states, what justification is there left for
> using only the latter?
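The "character-state assignment to the internal nodes of a fixed topology" mentioned in the quote can be illustrated with Fitch's algorithm for a single character. This is only a sketch under assumptions: the nested-tuple tree encoding, the function name and the toy characters are made up for illustration, and it handles only unordered states on a rooted binary tree.

```python
def fitch(tree, states):
    """Return (candidate state set at this node, minimum number of steps).

    tree   -- a taxon name (str) or a pair (left_subtree, right_subtree)
    states -- dict mapping each taxon name to its observed state
    """
    if isinstance(tree, str):          # leaf: its state is observed
        return {states[tree]}, 0
    left_set, left_steps = fitch(tree[0], states)
    right_set, right_steps = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:                         # children agree: no extra change needed
        return shared, left_steps + right_steps
    # children disagree: one change is required somewhere below this node
    return left_set | right_set, left_steps + right_steps + 1

topology = (("A", "B"), ("C", "D"))

# Character congruent with the topology: one step.
print(fitch(topology, {"A": 0, "B": 0, "C": 1, "D": 1})[1])  # 1

# Character in conflict with the topology (homoplasy): two steps.
print(fitch(topology, {"A": 0, "B": 1, "C": 0, "D": 1})[1])  # 2
```

The step count depends on the state sets reconstructed at the internal nodes, and those are only defined once the topology is fixed. That is the dependence the quoted paragraph refers to.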

>> UPGMA can only be right for the wrong reasons: it can only give you
>> a tree that is congruent with a phylogenetic tree if there's little
>> enough homoplasy in the data. When this condition is met, the
>> phenetic tree happens to be identical to the phylogenetic tree. But
>> you can't assume _a priori_ that there's little enough homoplasy in
>> your dataset!
>
> I don't see how that's different from parsimony. Parsimony, too, can
> only give you a tree that is congruent with a phylogenetic tree if
> there's little enough variation in rates of evolution among the taxa
> in your data set. When this is true, it works just fine, but you
> can't assume it to be true a priori.

Fair enough; that's why ML and BI were developed.

>> Adding a model can compensate for this assumption to varying
>> extents. When you do that with parsimony, it isn't called parsimony
>> anymore...
>
> You cannot add a model to parsimony, because parsimony itself is a
> model. Or rather it's a nonparametric shortcut to several different
> models (Farris 1973; Goldman 1990; Tuffley & Steel 1997), which all
> exhibit some rather strange properties. Their number of parameters
> grows as fast as new data are added to the analysis -- F73 and G90
> achieve it by treating ancestral character states as nuisance
> parameters, TS97 by giving a different set of branch-length
> parameters to every single character (Huelsenbeck et al. 2008). This
> makes them statistically inconsistent and, by the way, extremely
> non-parsimonious.

> On the other hand, you can use a model to correct the data for
> unobserved changes, just as with neighbor-joining, and subject the
> resulting data matrix to a parsimony analysis (= to a
> maximum-likelihood analysis using one of the "parsimony models").
> Steel et al. (1993) described how to do it, it's still called
> parsimony, nobody does it. Apparently it's philosophically
> objectionable.

>> Sure: in the simplest cases, in those where there's little enough
>> homoplasy in the data, all methods (phenetic or phylogenetic) will
>> give the same tree
>
> No, that's not what I've been talking about. The case explored by
> Swofford et al. involved an extreme amount of homoplasy -- but
> between two adjacent branches that evolved much faster than the
> remaining two (the "Farris zone").

Oh, so long-branch repulsion would be expected, right?

> Parsimony grouped the long branches together (correctly) because of
> their homoplasies, UPGMA grouped the short branches together (also
> correctly) because of their symplesiomorphies.

Were there enough of those left, or were they independent reversals?
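A rough way to see why a similarity-based first join comes out right in the "Farris zone" but not in the Felsenstein zone is to compute expected JC69 p-distances on a four-taxon tree ((A,B),(C,D)). The branch lengths below are invented for illustration and are not the values used by Swofford et al.; the helper names are likewise hypothetical, and only the first clustering step (the closest pair) is shown, not a full UPGMA run.

```python
import math
from itertools import combinations

def expected_p_distance(path_length):
    """Expected proportion of differing sites under JC69 for a path length."""
    return 0.75 * (1.0 - math.exp(-4.0 * path_length / 3.0))

def closest_pair(terminal, internal):
    """Return the pair of taxa with the smallest expected p-distance.

    The true tree is ((A,B),(C,D)); `terminal` maps each taxon to its
    terminal branch length and `internal` is the central branch length.
    """
    side = {"A": 0, "B": 0, "C": 1, "D": 1}

    def path(x, y):
        # Taxa on opposite sides of the tree are also separated by the
        # internal branch.
        crossing = internal if side[x] != side[y] else 0.0
        return terminal[x] + terminal[y] + crossing

    return min(combinations("ABCD", 2),
               key=lambda pair: expected_p_distance(path(*pair)))

# Farris zone: the two long branches (A, B) are sister taxa.
farris = {"A": 1.0, "B": 1.0, "C": 0.05, "D": 0.05}
# Felsenstein zone: the two long branches (A, C) are not sister taxa.
felsenstein = {"A": 1.0, "B": 0.05, "C": 1.0, "D": 0.05}

print(closest_pair(farris, 0.05))       # ('C', 'D'): a true sister pair
print(closest_pair(felsenstein, 0.05))  # ('B', 'D'): not sisters on the true tree
```

In both cases the two short branches are joined simply because they have changed little. Under these assumed branch lengths that matches the true split in the first case and contradicts it in the second.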

>> Eh, that depends. Naturally I forgot the reference *sigh*, but I
>> remember reading that BI is biased toward finding too symmetric
>> trees. If the true/simulated tree has a Hennig comb at its base, BI
>> commonly fails to find it and puts the OTUs of that comb into one
>> or two small clades.
>
> Sounds interesting, although I couldn't find the reference either.
> However, even if true, it doesn't seem unsolvable; it should be
> possible to counter it by biasing the MCMC proposal mechanism in the
> right direction.

I suppose.

>> Also, Bayesian posterior probabilities are inflated for unknown
>> reasons. (Bootstrap values are too low for likewise unknown
>> reasons.)
>
> That's only true for some cases. If the model of evolution is
> correct, moderately overparameterized, or slightly oversimplified,
> the posterior probability of a clade corresponds extremely closely to
> the probability that the clade is correct given the data (Ronquist &
> Deans 2010). When the model is misspecified, posterior probabilities
> can be either inflated or too conservative.

> *Refs:*

Thanks! I'll check them out from Monday onwards.

