Taxon sampling in cladistic analyses - some results from DNA
A few loci later, I still find it almost impossible to get a Eufalconimorphae
clade based on "proper" gene sequences alone (i.e. without transposable
There are some interesting observations though.
For the following, if I say "robust" I mean "including resilience to changes in
taxon sample". Otherwise I say "strong" or similar. The two are most assuredly
NOT the same.
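To make the distinction concrete, here is a minimal sketch of "robust" in the above sense: the fraction of analyses, each run on a different taxon sample, that recover a given group as a clade. Everything in it is an assumption made for illustration: trees are written as nested tuples, the helper names (`leaves`, `clades`, `robustness`) are hypothetical, and the toy topologies with terminals A, B, C, D, X are invented, not results.

```python
def leaves(tree):
    """Return the set of terminal names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Return the leaf sets of all internal nodes (root included)."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clades(child)
    return found


def robustness(group, trees):
    """Fraction of trees that recover `group` as a clade.

    Each tree is assumed to come from a different taxon sample.
    Trees that lack a member of the group are skipped.
    Returns None if no tree contains the whole group.
    """
    group = frozenset(group)
    usable = [t for t in trees if group <= leaves(t)]
    if not usable:
        return None
    hits = sum(1 for t in usable if group in clades(t))
    return hits / len(usable)


# Invented toy topologies, one per (hypothetical) taxon sample.
toy_trees = [
    (("A", "B"), ("C", "D")),            # A+B recovered
    ((("A", "B"), "C"), ("D", "X")),     # A+B recovered, X added elsewhere
    (("A", ("C", "X")), ("B", "D")),     # adding X breaks up A+B
    (("C", "B"), "D"),                   # A not sampled: skipped
]

print(robustness({"A", "B"}, toy_trees))
```

A support value, by contrast, is computed within a single analysis; a group can score highly in each tree in which it appears and still come out poorly here.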
(The retroposon signal for Eufalconimorphae is *strong*. Its robustness has not
been tested at all, and this is why I think the paper should not have been
Ornithine decarboxylase (OCD, some part of the region between exons 6 and 8 is
usually sequenced) is the only locus I have found yet where there is a strong
signal for falconid+passerine monophyly. It is not very robust though, and
parrots are convergent to Aequornithes. I have not looked at why (i.e. what the
offending sequence is). OCD in general has more severe problems with finding
the correct rootpoint for subclades than comparable loci. It is also
remarkable for reliably clading accipitrids and cathartids, and for clading
half the "higher landbirds" with Aequornithes (inside them or sister to them)
while recovering the rest as a robust clade.
Otherwise, falconids closer to passerines than to accipitrids is slightly more
often found than accipitrids closer to passerines than to falconids. But
neither case is common or robust; usually the three are part of an
effectively unresolvable polytomy.
And here it gets interesting, because this polytomy *might* represent a
definite dichotomy in the Neoaves - Aequornithes vs "higher landbirds".
Almost always the following lineages clade consistently and rather robustly:
* "picocoracines" (s.l., including hoopoes etc)
"Almost" because you usually have one of these lineages for every locus that is
convergent with something entirely different. Like psittaciforms in OCD. But
the divergent lineage varies between loci both in its identity and its
attachment point outside "higher landbirds". I.e. it is not always or even
preferably the same lineage of "higher landbirds" that drops out, and those
that drop out do not attach to constant parts elsewhere in the phylogeny.
There is one exception: passerines. We know this from mt data already. If
passerines drop out, they usually go to the area between the base of the
putative "higher landbird" clade and the root of Neoaves. I.e. they may appear
basal to other "higher landbirds", form a polytomy with the latter and
Aequornithes, or go basal in or even to Neoaves.
This is likely misroot attraction[*], i.e. convergence between the hypothesized
base of passeriforms and the hypothesized base of Neoaves.
There is also one problem, or rather two or three: especially columbids and
cypselomorphs, and to a lesser extent psittaciforms, are genetically "wild"....
and they are all candidates for inclusion in this clade.
The cypselomorph base *might* be boosted with taxa to an extent that they clade
more readily, but perhaps the effects of heterothermy running deep in this
lineage permeate their genome. Pigeons and doves OTOH are just weird... it is
sometimes barely recognizable that you deal with the same locus as in your
comparison taxa, it doesn't align properly at all!
I suspect pigeons to be the culprits behind "Metaves". Basically, *everything*
that is so beset with unusual and long transposons in bFibInt7 that it refuses
to readily clade with anything else LBAs to columbiforms. That's my present
working hypothesis at least.
As to "Eufalconimorphae", the troublemaker is almost certainly _Colius_. No
matter what you find clading with passerines in the "higher landbirds" - remove
or add a mousebird, and it usually looks *very* different. The mousebird
doesn't even have to be near passerines (it usually is not *that* close). Its
effect seems to be the disruption of the picocoracines, which ramifies through
the "higher landbirds".
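The add/remove check described here can be sketched as follows: build one tree without the suspect taxon and one with it, prune the suspect from the second tree, and list the clades of the first tree that no longer exist. This is only an illustration under stated assumptions: trees are nested tuples, the function names (`prune`, `disrupted_clades`) are hypothetical, and the toy topologies with terminals A to E and suspect X are invented.

```python
def leaves(tree):
    """Return the set of terminal names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Return the leaf sets of all internal nodes (root included)."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clades(child)
    return found


def prune(tree, taxon):
    """Remove `taxon`, collapsing any node left with a single child."""
    if isinstance(tree, str):
        return None if tree == taxon else tree
    kept = [p for p in (prune(child, taxon) for child in tree) if p is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


def disrupted_clades(tree_without, tree_with, taxon):
    """Clades of `tree_without` that are absent from `tree_with`
    once `taxon` has been pruned from it again."""
    return clades(tree_without) - clades(prune(tree_with, taxon))


# Invented toy topologies; X plays the part of the suspect taxon.
without_x = ((("A", "B"), "C"), ("D", "E"))
with_x_disruptive = (("A", ("B", "X")), ("C", ("D", "E")))
with_x_harmless = ((("A", "B"), ("C", "X")), ("D", "E"))

print(disrupted_clades(without_x, with_x_disruptive, "X"))
print(disrupted_clades(without_x, with_x_harmless, "X"))
```

In the first comparison the clade A+B+C is lost although X attaches next to B, not C; in the second, X attaches without changing anything among the remaining taxa, so the result is empty.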
I have no idea yet why this is so. Upupiforms also have very unusual sequences
(as far as they have been sampled, which is OK but not outstanding), but with
_Colius_ it is less obvious: it aligns better than one might expect from the
effect it has.
Basically, adding a mousebird seems to create pseudoconvergence within the
analysis. "Pseudo-" because you have to scrape the bottom of the data to
recover a phylogenetic signal, so I basically suspect the algorithms simply
*invent* a "phylogenetic signal" (which then happens to be convergent) from
Mousebirds, then, are perhaps the #1 avian taxon which can only be permitted
into a mol analysis to test for destabilizing effects. Until the reason for
their odd behavior is known, it is dangerous to include them "for
completeness". Furthermore, I do not think that it is possible to resolve
their affinities based on molecular data until this has happened. All you will
end up with is a huge load of "phylogenetic signal" that may almost all be
invented out of whole cloth.
And this is perhaps the take-home message here: our analyses are by now
sophisticated to the point where, if they find no signal for the clear
dichotomies they are optimized to resolve the data into, they can invent one.
Essentially, we have advanced beyond the point where support values can be
depended upon as meaningful indicators of clade robustness independent of taxon
sample composition. I have a few dozen trees lying around which leave little
room for doubt as to this, and more are in the works. Support for
_Colius_+[whatever] is typically >0.75. Obviously, given that "[whatever]"
varies, this is at least in part artefactual.
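A quick way to see this kind of instability across a pile of trees is to tally what the suspect taxon ends up sister to in each one. The sketch below does only that; it ignores support values entirely. It assumes trees stored as nested tuples, the helper name `sister_of` is hypothetical, and the three toy trees (terminals A, B, C and suspect X) are invented for illustration.

```python
from collections import Counter


def leaves(tree):
    """Return the set of terminal names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def sister_of(tree, taxon):
    """Return the leaf set of the sister group of `taxon`,
    or None if the taxon is not in the tree."""
    if isinstance(tree, str):
        return None
    if taxon in tree:
        # `taxon` is a direct child of this node: its sister group is
        # everything else descending from the same node.
        others = [child for child in tree if child != taxon]
        return frozenset().union(*(leaves(child) for child in others))
    for child in tree:
        found = sister_of(child, taxon)
        if found is not None:
            return found
    return None


# Invented toy topologies; X plays the part of the suspect taxon.
toy_trees = [
    (("X", "A"), ("B", "C")),
    (("X", "B"), ("A", "C")),
    (("X", "A"), ("C", "B")),
]

partners = Counter(
    s for s in (sister_of(t, "X") for t in toy_trees) if s is not None
)
for group, count in partners.most_common():
    print(sorted(group), count)
```

If the tally is spread over several unrelated partners, each of them well supported in its own tree, the support values are clearly not measuring what one would like them to measure.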
But luckily, regarding mousebirds we already have this:
which builds upon the somewhat older and less complete
I have both papers, in case anyone needs them. The first step to resolve what
mousebirds *are* is to plug _Eocolius_ into a numerical analysis, e.g. the L&Z
matrix. It should be scorable, but perhaps not from the literature
(http://www.springerlink.com/content/q56148836757u02m/, I have this too). It is
no coliiform apparently, but might be the needed "missing link". If we could
narrow down the sister group of mousebirds beyond "unspecified 'higher
landbirds'" - and since they are essentially living fossils with an abundant
hypodigm, such an indication *cannot* come from DNA - this would help a lot with the DNA
(If I *had* to guess, I'd put my money on upupiforms/bucerotiforms or
trogons as sister to mousebirds. Very distant sister though. They are
"similarly weird" in molecular analyses, their total effect is much like that
of cypselomorphs - doesn't clade (or only barely so) but alters everything
around it and then some.
I need to check for which loci there are _Colius_ + _Urocolius_ sequences.
Perhaps it's just misrooting; then it could be solved via DNA. But considering
the work that has gone into quantitative analyses of the fossil record, it
would be a waste to disregard that.)
There is a definite need to control present-generation cladistic analyses for
taxon add/remove effects. This is obviously much easier for morph analyses,
because there you can use the qualitative assessment as guideline; you can tell
in advance which taxa are troublemaker candidates. For molecular analyses,
there are obvious cases like parrots sister to storks (OCD dataset)
I also think that the sequencing of the _Columba livia_ genome will allow us to
answer a lot of questions. Especially since with mallard, chicken and
zebrafinch we have a phylogenetic framework of "mainstream" taxa to compare
And I think that point-mutation indels can be analyzed conventionally, or at
least they do little harm and may carry a useful phylogenetic signal. However,
it is always good to check whether they occur in "weak" regions of a locus
(where indels and point mutations are frequent), in which case they may be
As regards transposable elements, any such analysis has to deal with the
caveats mentioned in the fallout of the "Pegasoferae" case before drawing any
Especially their distribution *within* a gene pool/species warrants attention.
I am not certain that interspecific variation is markedly higher here than
intraspecific variation. They are not called "transposable" without a reason -
if you find one at a particular position in a particular species, you cannot
per se assume it's present just the same in the sister *individual* to the one
sequenced. However, there is probably insufficient comparison data to solve
this question yet (_Gallus gallus_ and _Anas platyrhynchos_ are the only taxa
for which enough individuals seem to have been sequenced).
* I haven't found a better term. It is a distinct phenomenon from LBA, but it
is just as significant. You need a fairly good taxon sample to notice it
though, hence I'm not the first to discuss it (except on-list I think) but
there are no papers either. It's occasional lab talk, it has been mentioned in
phylogenetic studies when they were still using phenetics even, but the data
were insufficient to actually research it until a few years ago.
It is easy to test: detach the subtree and display it as an unrooted star phylogeny.
Misrooting does not significantly change the *relative* relationships among
lineages, only the *absolute* one.
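The same test can be done without drawing anything: reduce each version of the detached subtree to its unrooted bipartitions (splits) and compare. Two rooted trees with identical splits differ only in where the root sits. The sketch below is one way to do that, under assumptions of my own choosing: nested-tuple trees, hypothetical function names (`splits`, `differs_only_in_root`), and invented toy topologies with terminals A to E.

```python
def leaves(tree):
    """Return the set of terminal names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of the unrooted version of `tree`.

    Each split is a frozenset holding the two sides; sides with fewer
    than two terminals carry no grouping information and are skipped.
    """
    all_taxa = leaves(tree)
    result = set()

    def walk(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        other = all_taxa - side
        if len(side) > 1 and len(other) > 1:
            result.add(frozenset([side, other]))
        for child in node:
            walk(child)

    walk(tree)
    return result


def differs_only_in_root(tree_a, tree_b):
    """True if both trees have the same terminals and the same splits,
    i.e. the same unrooted topology."""
    return leaves(tree_a) == leaves(tree_b) and splits(tree_a) == splits(tree_b)


# Invented toy topologies.
reference = ((("A", "B"), "C"), ("D", "E"))
rerooted = ("A", ("B", ("C", ("D", "E"))))      # same tree, rooted on A
rearranged = ((("A", "C"), "B"), ("D", "E"))    # genuinely different

print(differs_only_in_root(reference, rerooted))
print(differs_only_in_root(reference, rearranged))
```

The rerooted tree looks quite different as a rooted diagram (A appears "basal" to everything else), yet it shares every split with the reference; the rearranged tree does not.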
PS: looking at the data, the mousebird problem may be best expressed as
Coliiformes having a fatal attraction to accipitrids. They don't clade, but
mousebirds alter tree topology to draw accipitrids away from passerines and/or
falcons. Given that all three seem to be pretty close relatives, this is
usually enough to push falcons 1-3 steps away from passerines.