Comptes Rendus Biologies (Apr 2021)
On the origin of the genetic code: a 27-codon hypothetical precursor of an intricate 64-codon intermediate shaped the modern code
Abstract
The modern genetic code reveals numerous traces of specific relationships between the early codons which, together with its internal asymmetries, suggest a sequential appearance of the nucleobases in primitive RNA molecules. Keeping the hypothesis that triplet pairings between primitive RNA molecules were at the origin of the code, this work systematically examines complete codon–anticodon interaction matrices, assuming distinct pairing options at each position of the triplet duplexes. Application of these principles suggests that a 27-codon precursor with a reasonable coding capacity for short peptide synthesis could have started with primitive RNA molecules able to form two distinct pairs, with different free energies, between a single purine and two pyrimidines (such as G with C and U). Conservation of the same pairing options at positions 1 and 2 of codons upon the arrival of a second purine with distinct pairing preferences (such as A) generated a 64-codon intermediate code made of interrelated pairs or groups of codons (designated here as intricacy). The numerous traces of this hypothetical scheme that are visible in the standard and variant forms of the modern code demonstrate without ambiguity that the ancestral codon–anticodon duplexes required high-energy pairings at their central position (Watson–Crick) but tolerated less energetic pairings at the first codon position (G•U type). Combined with the sequential appearance of the nucleobases, the predicted codon intricacy allows a stepwise reconstruction of the evolution of the coding repertoire by simple a posteriori comparison with the modern code. This reconstruction reveals a remarkable internal coherence in terms of amino acid and tRNA synthetase recruitment.
The code started with a group of amino acids (Ala, Gly, Pro, Ser and Thr) that are now all activated by class II tRNA synthetases, before reaching an intermediate period during which up to 14 distinct amino acids could be encoded by a full set of intricate codons. The perfect coincidence between the last 6 amino acids predicted in this reconstruction and the speculated effect on proteins of the arrival of free atmospheric oxygen is striking, and suggests that the code reached its present form only after the great oxidation event.
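The codon counts quoted in the abstract follow from simple combinatorics: three nucleobases give 3^3 = 27 triplets, and four give 4^3 = 64. The short Python sketch below only illustrates that arithmetic. It assumes the example alphabet the abstract mentions (G, C and U, followed by A); the names `codons`, `early` and `full` are hypothetical, and the sketch does not model pairing energies or the codon–anticodon interaction matrices examined in the paper.

```python
from itertools import product


def codons(alphabet):
    """Return every triplet that can be written with the given nucleobases."""
    return ["".join(triplet) for triplet in product(alphabet, repeat=3)]


# Assumed example alphabet from the abstract: one purine (G), two pyrimidines (C, U).
early = codons("GCU")

# Same alphabet after the arrival of a second purine (A).
full = codons("GCUA")

# Triplets that only become available once A is present.
new_with_a = [c for c in full if "A" in c]

print(len(early), len(full), len(new_with_a))
```

Under these assumptions, 37 of the 64 triplets contain at least one A and therefore have no counterpart in the 27-codon set.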
Keywords