Reduced Consensus (Was: Afrotheria revisited)

Allen Hazen writes:
 >     There are two cladograms described as the strict consensus trees 
 > on the basis of 52 character traits.  One is said to be the consensus 
 > of 105 equally parsimonious trees, of length 225, and is pretty much 
 > phylogenetic grass.  On the same page there is another tree, the 
 > strict consensus of 4 equally parsimonious trees of length 214: a 
 > remarkable improvement, since it is almost totally resolved (3 
 > trichotomies, as opposed to the "16-otomy" in the first).  No comment 
 > is made about the difference between the two, and it took a minute of 
 > looking-- and counting the taxa covered in the two analyses-- to 
 > figure out what the difference was:   the 52 traits gave grass for 23 
 > taxa, but almost perfect resolution for 22.
 >      The procedure may well be legitimate (the taxon dropped is 
 > Arsinotherium, and I can imagine it being autapomorphic enough to 
 > confuse an analysis of the others), but it FEELS like throwing out 
 > inconvenient data and I'd have preferred some discussion.

Hi, Allen.  I've only just got my head around this myself, so I'm
perfectly positioned to explain it :-)

The procedure you're seeing here is called Reduced Consensus (or
sometimes Reduced Strict Consensus or Reduced Component Consensus or
some other variant).  The idea is that you include all your taxa in
the analysis, so that (in this case) _Arsinotherium_'s funny character
states have a chance to influence the character transitions.  Then,
you find your 105 MPTs, but you see that you seem to have a big
uninformative polytomy.  Closer investigation shows that nearly all
this uncertainty is the fault of a single taxon, though, which jumps
around between different positions (perhaps because it has unusually
contradictory states, or simply because it's scored for too few
characters to allow a firm position to be assigned).  So you chop
Arsinotherium out of the generated trees, and find that once you do
that, many of the previously differing trees are equivalent -- in this
case, leaving only four variants.
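
To make the pruning step concrete, here is a minimal sketch in Python.
It is not what any real phylogenetics package does internally: the
trees are toy rooted trees written as nested tuples, the taxon names
("Rogue", "A" to "D") are invented for illustration, and prune() and
canonical() are hypothetical helpers of my own.

```python
def prune(tree, taxon):
    """Return `tree` without `taxon`, collapsing nodes left with one child."""
    if isinstance(tree, str):
        return None if tree == taxon else tree
    kids = [k for k in (prune(child, taxon) for child in tree) if k is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)


def canonical(tree):
    """Order-independent form, so identical topologies compare equal."""
    if isinstance(tree, str):
        return tree
    return tuple(sorted((canonical(child) for child in tree), key=repr))


# Four toy "MPTs" (invented data): they differ only in where Rogue attaches.
mpts = [
    (("Rogue", "A"), ("B", ("C", "D"))),
    ("A", (("Rogue", "B"), ("C", "D"))),
    ("A", ("B", (("Rogue", "C"), "D"))),
    ("A", ("B", ("C", ("Rogue", "D")))),
]

before = {canonical(t) for t in mpts}
after = {canonical(prune(t, "Rogue")) for t in mpts}
print(len(before), "distinct trees before pruning,", len(after), "after")
```

Note that Rogue took part in building every one of those trees; it is
only removed afterwards, which is the whole point of the method.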

Why is this not "cheating"?  Because you are not discarding any
information (as you would be if you simply excluded Arsinotherium from
the analysis).  All you are doing is _not saying anything_ about
Arsinotherium's position.  And that's always OK.

Why is this necessary?  Because a single "rogue taxon" can break apart
many strict-consensus relationships.  Imagine you are doing a
phylogenetic analysis for the whole of Dinosauria.  You include a
fragmentary basal form -- let's call it Eodino.  It turns out that,
because it's so very close to the base, it's not 100% certain that
it's a saurischian.  In
one of your 200 MPTs, it comes out as an ornithischian.  That means
that when you make a strict consensus, you lose the classic
saurischian/ornithischian dichotomy at the bottom of your tree!  You
have a big basal dinosaurian polytomy with theropods, sauropodomorphs
and ornithischians all in there together.  But, really, your analysis
isn't suggesting that theropods might be more closely related to
ornithischians than to sauropodomorphs.  All your trees show those two
clades far away from each other -- it's just that in one of your
trees, Eodino is a basal theropod and in another it's a basal
ornithischian.  So by pruning it out of your trees and calculating the
reduced consensus, all you're doing is allowing your non-Eodino results
to shine through.
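
The Eodino scenario can be sketched the same way.  This is only an
illustration under simplifying assumptions: two toy rooted trees stand
in for the 200 MPTs, each major group is a single leaf, a "clade" is
just the set of leaves below a node, and leaf_sets() and
strict_consensus() are hypothetical helpers rather than anyone's
published implementation.

```python
def leaf_sets(tree):
    """Return (leaves below `tree`, set of clades as frozensets of leaves)."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves, clades = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = leaf_sets(child)
        leaves |= child_leaves
        clades |= child_clades
    clades.add(leaves)
    return leaves, clades


def strict_consensus(trees, drop=None):
    """Clades found in every tree, optionally ignoring the taxon `drop`."""
    per_tree = []
    for tree in trees:
        _, clades = leaf_sets(tree)
        reduced = {clade - {drop} for clade in clades}
        # Clades left with a single member carry no grouping information.
        per_tree.append({clade for clade in reduced if len(clade) > 1})
    return set.intersection(*per_tree)


# Toy stand-ins for the MPTs: Eodino is a basal theropod in one tree
# and a basal ornithischian in the other.
mpts = [
    ((("Eodino", "Theropoda"), "Sauropodomorpha"), "Ornithischia"),
    (("Theropoda", "Sauropodomorpha"), ("Eodino", "Ornithischia")),
]

saurischia = frozenset({"Theropoda", "Sauropodomorpha"})
print("Saurischia in strict consensus: ", saurischia in strict_consensus(mpts))
print("Saurischia in reduced consensus:",
      saurischia in strict_consensus(mpts, drop="Eodino"))
```

With Eodino included, the only grouping shared by both trees is the
root itself, which is the big basal polytomy.  Ignoring Eodino when
comparing the trees brings the theropod + sauropodomorph grouping back.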

Hope this helps.

 _/|_    ___________________________________________________________________
/o ) \/  Mike Taylor    <mike@indexdata.com>    http://www.miketaylor.org.uk
)_v__/\  "Debugging?  Klingons do not debug.  Our software does not coddle
         the weak." -- Klingon Programming Mantra