
Re: Reconstructing DNA (was Re: Dino-fuzz found in amber?)



On Thu, Sep 22, 2011 at 3:01 AM, Tim Williams <tijawi@gmail.com> wrote:
> No, it won't.  You can obtain as many sequences as you like (bird,
> crocodilian, whatever) but it won't bring you any closer to knowing
> what the sequence of _T. rex_ was.  You may believe that it does, but
> it really won't.

It will. Your error here is in saying "won't bring you *any* closer".
I will show once more how it works.

We have a crocodile (Crocodylus johnsoni) peptide (COX3 fragment):
http://www.ncbi.nlm.nih.gov/protein/ADB07161.1
And the homologous Gallus gallus protein fragment and the corresponding
DNA sequence: http://www.ncbi.nlm.nih.gov/protein/YP_004300400.1,
http://www.ncbi.nlm.nih.gov/nuccore/NC_015238.2?report=fasta&from=8639&to=9422

I've done the analysis using only parsimony for the first 60 aa.

https://docs.google.com/spreadsheet/ccc?key=0Ap3350WTR-zndDY1MXRkWG1ESmpjNzFnYy1GbzVPc0E&hl=en_US

Of the 60 codons, 14 would be ambiguous at the first and second
positions when analysing only the aa sequence with the vertebrate
mitochondrial genetic code table. Using the G. gallus DNA sequence as a
guide, those positions were inferred correctly for 8 codons and wrongly
for 3, and could not be determined for 3. Surely that is an improvement.
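
A minimal sketch of the guided resolution described above, not the
spreadsheet itself. It assumes NCBI translation table 2 (vertebrate
mitochondrial) and a plain mismatch count as the parsimony criterion.
The names prefixes, ambiguous_positions and resolve_prefix are
hypothetical, and the example inputs are toy values, not the real
croc or chicken sequences.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial), codons in the
# order TTT, TTC, TTA, TTG, TCT, ... ("*" marks a stop codon).
AAS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AAS)}


def prefixes(aa):
    """Possible first-two-base prefixes of the codons for one amino acid."""
    return {codon[:2] for codon, a in CODON_TABLE.items() if a == aa}


def ambiguous_positions(peptide):
    """Indices whose first/second codon bases are not fixed by the aa alone."""
    return [i for i, aa in enumerate(peptide) if len(prefixes(aa)) > 1]


def resolve_prefix(aa, guide_codon):
    """Pick the prefix closest to the guide codon; None if there is a tie."""
    def mismatches(prefix):
        return sum(a != b for a, b in zip(prefix, guide_codon[:2]))

    candidates = sorted(prefixes(aa))
    best = min(mismatches(p) for p in candidates)
    winners = [p for p in candidates if mismatches(p) == best]
    return winners[0] if len(winners) == 1 else None


print(ambiguous_positions("MLSK"))  # [1, 2]  (only Leu and Ser are ambiguous)
print(resolve_prefix("L", "CTA"))   # CT
print(resolve_prefix("S", "ACC"))   # None  (TC and AG are equally distant)
```

Under this table only leucine (CTN or TTA/TTG) and serine (TCN or
AGT/AGC) have more than one possible prefix, and a tie corresponds to
a "could not be determined" case.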

>> Bird DNA sequences, crocodilian DNA sequences and
>> dinosaurian peptide sequences. I've showed how bird DNA sequence help
>> to complete croc DNA sequence with the info obtained with croc peptide
>> sequence.
>
>
> Did you really "show" this... or did you just speculate that it could be done?

I showed it with an example, showing how the dubious positions are
clarified.

> You're overlooking or ignoring a key point here.  A consensus sequence
> includes degenerate codon positions, where the base position can be
> two or more alternative bases.

It is not ignored. The point is that, under parsimony, we allow as few
mutations as possible (or we could use other parameters, such as codon
usage bias, transition/transversion rate and so on, to build a more
complex model).
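
As a rough illustration of how a transition/transversion weighting
could be layered on top of plain parsimony, here is a sketch. It
assumes NCBI translation table 2; the function names and the default
weight of 2.0 for transversions are hypothetical choices for the
example, not estimated values.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial), codons in the
# order TTT, TTC, TTA, TTG, TCT, ... ("*" marks a stop codon).
AAS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AAS)}
PURINES = {"A", "G"}


def substitution_cost(a, b, transversion_cost=2.0):
    """0 for identical bases, 1 for a transition, a chosen weight otherwise."""
    if a == b:
        return 0.0
    is_transition = (a in PURINES) == (b in PURINES)
    return 1.0 if is_transition else transversion_cost


def most_parsimonious_codons(aa, guide_codon, transversion_cost=2.0):
    """Codons for aa with the lowest total cost relative to the guide codon."""
    def cost(codon):
        return sum(substitution_cost(x, y, transversion_cost)
                   for x, y in zip(codon, guide_codon))

    candidates = [c for c, a in CODON_TABLE.items() if a == aa]
    best = min(cost(c) for c in candidates)
    return sorted(c for c in candidates if cost(c) == best)


# Leucine in the target, TTT (Phe) in the guide sequence:
print(most_parsimonious_codons("L", "TTT"))                         # ['CTT']
print(most_parsimonious_codons("L", "TTT", transversion_cost=1.0))  # ['CTT', 'TTA', 'TTG']
```

With equal weights the three single-step codons tie; weighting
transversions more heavily breaks the tie in favour of the T-to-C
transition at the first position.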

> No amount of hand-waving about "probabilities" and "parsimony" is going to
> tell you what base was at that position for _T. rex_.

In fact, the third position is more complicated (and that is the
reason I didn't work with it in the examples I've given in this
thread). But even it could be predicted better than by random
guessing. Of course, if we had the homologous protein sequences of
other dinos, it would help even more. In the case of the chicken/T-rex
pairing, we could use the expected substitution rate for the third
position. We could use the chicken/croc substitution rate as a proxy
or, if sufficient information is available, calculate it from a
comparison of the T-rex protein sequence and the chicken DNA sequence:
some aa positions would imply that the third position of the codon has
mutated, and others would imply that it is the same.
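
One way to read the comparison just described, as a sketch: for each
aligned position, check whether any codon for the observed amino acid
ends in the guide codon's third base. If none does, a third-position
change is forced; otherwise the guide's third base is merely
compatible. This assumes NCBI translation table 2, and the function
names and toy inputs are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial), codons in the
# order TTT, TTC, TTA, TTG, TCT, ... ("*" marks a stop codon).
AAS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AAS)}


def third_position_forced_change(aa, guide_codon):
    """True if no codon for aa can keep the guide codon's third base."""
    third_bases = {c[2] for c, a in CODON_TABLE.items() if a == aa}
    return guide_codon[2] not in third_bases


def min_third_position_change_rate(peptide, guide_codons):
    """Fraction of aligned positions where a third-base change is forced."""
    forced = [third_position_forced_change(aa, codon)
              for aa, codon in zip(peptide, guide_codons)]
    return sum(forced) / len(forced)


# Toy alignment: target peptide M-L-K against guide codons ATT, CTC, AAG.
print(third_position_forced_change("M", "ATT"))  # True  (Met is ATA or ATG)
print(third_position_forced_change("L", "CTC"))  # False (CTC itself codes Leu)
print(min_third_position_change_rate("MLK", ["ATT", "CTC", "AAG"]))
```

The resulting fraction is only a lower bound on third-position
divergence, since a compatible third base may still have changed
silently.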

> And this is all assuming that _T. rex_ conforms to the consensus sequence. 
>  Even if crocodilians and birds conform to CTN to
> encode a certain leucine residue, who's to say _T. rex_ didn't encode TTA or 
> TTG for this leucine?

It is parsimony. This is not to say - and I have pointed this out -
that T-rex surely had C in the first position. It is just that it is
more probable.

"We could not be 100% sure, but it is, under parsimony assumption,
likely that the leucine code used is CUC."
http://dml.cmnh.org/2011Sep/msg00174.html

"Of course we cannot rule out the possibility of point mutation
(either from T to C in chicken or C to T in T-rex)."
http://dml.cmnh.org/2011Sep/msg00204.html

> No, I'm not talking about practical use.  I'm talking about what's the point 
> of proposing a hypothetical DNA sequence for a _T. rex_
> peptide when there's no way of testing your hypothesis.

The technique can be tested with extant organisms - and I did so with
bird and croc sequences. And yes, it could potentially be tested for
T-rex. Let's put aside the possibility of ancient DNA being obtained.
We could eventually get more proteins.

> I'd say not.  If you're expressing a peptide transgenically, and you want to 
> maximize expression of that peptide, you'll want the codon
> usage to match that of the host organism that is expressing the peptide, not 
> the organism it came from.

It must be taken into account, but it is not the only factor. As I've
said: the secondary and tertiary structure of the mRNA is very
important. Even some post-translational modifications depend on
certain sequences.

[]s,

Roberto Takata