Re: Keesey on a mathematical approach to defining clade names -- or -- Whatever Happened To Baby New Papers?

On 9/26/07, Roberto Takata <rmtakata@gmail.com> wrote:
> On 9/26/07, Mike Taylor <mike@indexdata.com> wrote:
> > Fine for node-based clades, not so hot for branch- or apomorphy-based,
> > nor for the more complex definition types encompassed by Mike's
> > calculus.
> A branch definition would not be difficult: we have just to *divide*
> by the sister-group clade-number.

Branch-based definitions use species or specimens (or genera or clades
within the sister group, outside of the PhyloCode) as external
specifiers, not the sister group itself. (And it's possible for a
branch-based clade to have multiple sister groups, although let's not
get into that right now.) Also, that strategy could potentially result
in numbers with unending sequences of digits after the decimal point --
something computers are terrible at storing exactly. You would have to
store the result as a pair of numbers, in which case the division is
superfluous, anyway.
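
To make the storage problem concrete, here is a minimal sketch. The two
clade numbers are made up for illustration (products of hypothetical
specifier primes); nothing here comes from an actual proposal.

```python
from fractions import Fraction

# Hypothetical clade numbers, each a product of primes assigned to specifiers.
ingroup_number = 2 * 3   # 6
sister_number = 7

# Floating-point division rounds the quotient to a finite binary value.
quotient = ingroup_number / sister_number

# An exact representation is just the pair (numerator, denominator).
exact = Fraction(ingroup_number, sister_number)

print(quotient)                                 # a rounded approximation
print(exact.numerator, exact.denominator)       # the two original numbers
print(Fraction(quotient) == exact)              # the float is not exactly 6/7
```

The exact form is nothing more than the two integers stored side by side,
so the division itself adds nothing.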

(And what about apomorphy-based definitions?)

Using primes to represent specifiers would be wasteful from a database
point of view, when you could just use integers as lookup keys and
form definitions based on lists of keys. Essentially that's all you're
doing, except you're limiting available keys to a subset of all
integers and making it very difficult to retrieve a specifier list
from the clade number. Finding the prime factors of a large integer
is not an easy calculation -- certainly a lot less easy than just
processing a list of integer keys. Points for
ingenuity, but not for efficiency. Too many unnecessary steps.
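
A rough sketch of the comparison, assuming a made-up assignment of
primes to three specifiers. The `factorize` helper is plain trial
division, written here only for illustration; it is not anyone's
proposed implementation.

```python
from math import prod

def factorize(n):
    """Return the prime factors of n in ascending order (trial division)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

# Prime-product strategy: hypothetical primes standing for three specifiers.
specifier_primes = [2, 5, 11]
clade_number = prod(specifier_primes)    # 110
recovered = factorize(clade_number)      # extra work just to get the list back

# Key-list strategy: any integers will do, and no decoding step is needed.
specifier_keys = [1, 3, 5]

print(clade_number, recovered)
print(specifier_keys)
```

Trial division is fine for a toy number like this, but the decoding cost
grows quickly with the size of the product, while the key list is
already in the form you want.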

Other database projects (as far as I know) use the "list of integer
keys" strategy (or something like it, e.g., tables that link a key in
a "clade definition" table to multiple keys in a "specifier" table).
But, as I point out in the paper, it's not possible to cover every
conceivable type of definition that way. That approach requires that
you classify every type of definition as one of a finite list of
types. That will work for most definitions in existence, but there are
also definitions with qualifying clauses, modified definitions, and
other oddballs. (Two examples are in my paper, including Clarke's [2004]
definition of _Ichthyornis_.)
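
For reference, the linked-table strategy might look something like the
sketch below. The table names, column names, and rows are all
hypothetical placeholders, not the schema of any real project. Note the
`type` column: every definition has to be filed under one of a fixed set
of types, which is exactly where the oddballs fall through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE specifier (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE clade_definition (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    type TEXT NOT NULL          -- e.g. 'node-based', 'branch-based'
);
CREATE TABLE definition_specifier (
    definition_id INTEGER NOT NULL REFERENCES clade_definition(id),
    specifier_id  INTEGER NOT NULL REFERENCES specifier(id),
    role          TEXT NOT NULL -- 'internal' or 'external'
);
""")

# Placeholder rows, for illustration only.
conn.executemany(
    "INSERT INTO specifier (id, name) VALUES (?, ?)",
    [(1, "Specifier A"), (2, "Specifier B"), (3, "Specifier C")],
)
conn.execute(
    "INSERT INTO clade_definition (id, name, type) "
    "VALUES (1, 'Example clade', 'node-based')"
)
conn.executemany(
    "INSERT INTO definition_specifier VALUES (?, ?, ?)",
    [(1, 1, "internal"), (1, 2, "internal")],
)

# Retrieving the specifiers of a definition is a simple join on integer keys.
rows = conn.execute(
    """
    SELECT s.name, ds.role
    FROM definition_specifier AS ds
    JOIN specifier AS s ON s.id = ds.specifier_id
    WHERE ds.definition_id = ?
    ORDER BY s.id
    """,
    (1,),
).fetchall()
print(rows)
```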

Math is more than just numbers. In fact, the math in my paper doesn't
use numbers at all, except briefly in the Appendix. Graph theory and
set theory are much more applicable to phylogenetics than arithmetic.
Phylogeny is branching, not linear.
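
As a toy illustration of the graph-and-set view (this is not the
formalism from the paper, just a sketch under simplifying assumptions),
a phylogeny can be stored as a mapping from each node to its direct
descendants, and clades fall out as sets. The tree and the function
names below are invented, and the code assumes a strict tree with the
external specifier outside the resulting clade.

```python
# Hypothetical toy phylogeny: each node maps to its direct descendants.
children = {
    "root": ["a", "b"],
    "a": ["c", "d"],
    "b": [],
    "c": [],
    "d": [],
}

def descendants(node):
    """Return the node together with everything descended from it."""
    result = {node}
    for child in children[node]:
        result |= descendants(child)
    return result

def ancestors(node):
    """Return the node together with all of its ancestors."""
    return {n for n in children if node in descendants(n)}

def node_based_clade(x, y):
    """The last common ancestor of x and y, plus all of its descendants."""
    common = ancestors(x) & ancestors(y)
    # In a tree, the last common ancestor has the smallest descendant set.
    last_common = min(common, key=lambda n: len(descendants(n)))
    return descendants(last_common)

def branch_based_clade(internal, external):
    """The earliest ancestor of internal that is not also an ancestor of
    external, plus all of its descendants."""
    candidates = ancestors(internal) - ancestors(external)
    earliest = max(candidates, key=lambda n: len(descendants(n)))
    return descendants(earliest)

print(sorted(node_based_clade("c", "d")))
print(sorted(branch_based_clade("c", "b")))
```

No arithmetic on clade "numbers" is involved anywhere: the definitions
are expressed entirely as operations on sets of nodes.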

Mike Keesey