Re: Reconstructing DNA (was Re: Dino-fuzz found in amber?)

On Sat, Sep 17, 2011 at 11:08 PM, Tim Williams <tijawi@gmail.com> wrote:
> And even then, I would not be convinced.  Your assertion is
> unjustifiable.

It is so well justified that molecular biologists apply that reasoning
routinely, and there are many published papers with reconstructed
ancestral peptide or DNA sequences.

>  There is just no way that because a chicken has CTT
> encoding leucine in the second amino acid of this particular peptide
> that we can say that _T. rex_ is also more likely to have CTT at the
> equivalent position in its (unknown) DNA sequence.

I've just shown you how. In the simplified version, we could use
parsimony. As we get more information - from a longer sequence - we
could estimate some parameters, e.g. the transition/transversion
ratio, codon usage bias, whether evolution was neutral or
selection-driven, and so on.

>  BTW, the zebra finch has proline here instead of leucine.  What does that 
> tell you?

It is even more informative, since the proline codons are CCT, CCC, CCA & CCG.

So it *corroborates* the C in the first position of the second codon.
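
A minimal sketch of that check, assuming the standard genetic code
(written in DNA letters, T instead of U). The helper names codons_for
and bases_at are made up for this illustration:

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """Return all codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def bases_at(codons, position):
    """Return the set of bases seen at a codon position (0, 1 or 2)."""
    return {c[position] for c in codons}


proline = codons_for("P")
leucine = codons_for("L")
print("Pro:", proline, "first position:", bases_at(proline, 0))
print("Leu:", leucine, "first position:", bases_at(leucine, 0))
```

Every proline codon starts with C, while leucine codons start with
either C or T, so a proline at this site is compatible only with C in
the first position.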

>>>  Why do you assume that the choice of base in the chicken is ancestral?
>> Excuse me. Where did I have assumed such a thing? I'm not assuming
>> chicken sequence as ancestral.
> Well, yes you are.  You reach this assumption in a roundabout fashion,
> but it is exactly what your assumption boils down to.

Nope. What was done was to look for the compatible nucleotides. In
T-rex, a priori, the first position of the second codon could be
either T (U in the mRNA) or C. Since in the chicken it is C, *by
parsimony* we conclude that C in the same position is more likely in
T-rex. A more or less independent test could be done, e.g., with the
zebra finch, which has a C in the same position.

Of course a different conclusion could be reached as more information
is added. In the same way as a phylogenetic tree topology could change
when more information is added (more OTUs or more characters...).

I didn't see the exact zebra finch codon for proline at the second aa,
but let's suppose that it is CCG.

With CCG in zebra finch and CTT in chicken, we get (C)T(N) in T-rex.
The third position of the T-rex codon remains unknown.

If the zebra finch codon is CCT, then it suggests that T-rex will have
T in that position.
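
The two cases above can be sketched as follows. This is only a crude
stand-in for parsimony: it picks the leucine codon(s) with the fewest
total mismatches against the relatives' codons, without any tree,
outgroup or substitution model. The names mismatches, best_codons and
consensus are hypothetical, and the zebra finch codons are the
supposed ones from the text, not real data.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def mismatches(a, b):
    """Count positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def best_codons(amino_acid, relatives):
    """Codons for amino_acid with the fewest total mismatches to relatives."""
    candidates = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    cost = {c: sum(mismatches(c, r) for r in relatives) for c in candidates}
    lowest = min(cost.values())
    return sorted(c for c in candidates if cost[c] == lowest)


def consensus(codons):
    """Collapse equally good codons; N marks an unresolved position."""
    return "".join(
        column[0] if len(set(column)) == 1 else "N" for column in zip(*codons)
    )


# T-rex has leucine here; chicken is CTT; zebra finch is supposed.
print(consensus(best_codons("L", ["CTT", "CCG"])))  # supposing CCG
print(consensus(best_codons("L", ["CTT", "CCT"])))  # supposing CCT
```

With CCG the third position stays unresolved (CTN); with CCT the
sketch settles on CTT. A real analysis would place the sequences on a
phylogenetic tree instead of pooling them like this.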

>> What I'm assuming is that chicken and
>> T-rex sequences are homologous, are derived from a common ancestral.
> The homology of the sequences is not really in doubt, since both
> peptide sequences derive from the same part of alpha-2 collagen from
> two animals.  You are not actually referring to 'homology' here, but
> to 'sequence identity'.  You are making a huge leap of faith by
> claiming that _T. rex_ probably has whatever codon the chicken has,
> unless proven otherwise.

Nope. I'm referring to homology. The sequences would not be homologous
if the similarity were a result of convergence - in that case the
analysis could not be done in this way.

Sequence identity is not necessary, as I've shown you above with the
zebra finch proline.

> This is a complete misuse/abuse of parsimony.  You only have a sample
> number of n=2 (chicken and the purported _T. rex_ sequence).

Yes, the fact that there are only two sequences limits the analysis -
it is implied (actually explicit, more than just implied) in what I
said: "Of course, comparing this just two sequences, using only
parsimony - since the peptide sequences is identifical, we will endup
assuming that the common ancestral will have the chicken sequence".
But what I'm showing you is the process of inference. As I've said
before, if we had the crocodile sequence the analysis could be

>> Of course, comparing this just two sequences, using only parsimony -
>> since the peptide sequences is identifical, we will endup assuming
>> that the common ancestral will have the chicken sequence. We could do
>> it, and assign a margin of error due to mutation fixation rate.
>> Lets say that the mutation rate is constant over the time and along
>> the sequence of, say, 1 SNPs per 100 mya. In this case, about 4 SNPs
>> is expected.
> You've lost me here.

If a substitution rate of 1 bp per 100 million years is used (it is
just for the sake of illustration, not intended to be a realistic
exercise), about 4 differences in the short sequence are to be
expected. In this case, the chicken sequence could be used for T-rex,
together with that margin of expected differences. (The same with the
zebra finch sequence, or the

Keep in mind that this is just a demonstration of how the comparative
analysis works. An actual analysis would be much more complex: more
sequences (including outgroups), a realistic molecular clock,
consideration of codon usage bias, a test for neutral evolution, etc.


Roberto Takata