Codon Bits Step 1

The codon table has 3 elements of 4 cases each. This article converts it to 6 elements of 2 cases each, as the first step in aligning it with the Paleo Alphabet.[unfinished]

Nucleotides

Human DNA Codon Table

For review, the following is the human DNA codon table.

The following table is on the sense strand in the 5' to 3' direction.

Codons are 3 nucleotides, in order. In the following table the columns are marked N0, N1 and N2 for each nucleotide respectively. The amino acid name, short name, letter code, chemical properties and notes are also given.

N0 N1 N2 Amino Acid Abbrs Property Note
Section 1
T T T Phenylalanine Phe/F Nonpolar
T T C Phenylalanine Phe/F Nonpolar
T T A Leucine Leu/L Nonpolar
T T G Leucine Leu/L Nonpolar Possible Start
C T T Leucine Leu/L Nonpolar
C T C Leucine Leu/L Nonpolar
C T A Leucine Leu/L Nonpolar
C T G Leucine Leu/L Nonpolar
A T T Isoleucine Ile/I Nonpolar
A T C Isoleucine Ile/I Nonpolar
A T A Isoleucine Ile/I Nonpolar
A T G Methionine Met/M Nonpolar Possible Start
G T T Valine Val/V Nonpolar
G T C Valine Val/V Nonpolar
G T A Valine Val/V Nonpolar
G T G Valine Val/V Nonpolar Possible Start
Section 2
T C T Serine Ser/S Polar
T C C Serine Ser/S Polar
T C A Serine Ser/S Polar
T C G Serine Ser/S Polar
C C T Proline Pro/P Nonpolar
C C C Proline Pro/P Nonpolar
C C A Proline Pro/P Nonpolar
C C G Proline Pro/P Nonpolar
A C T Threonine Thr/T Polar
A C C Threonine Thr/T Polar
A C A Threonine Thr/T Polar
A C G Threonine Thr/T Polar
G C T Alanine Ala/A Nonpolar
G C C Alanine Ala/A Nonpolar
G C A Alanine Ala/A Nonpolar
G C G Alanine Ala/A Nonpolar
Section 3
T A T Tyrosine Tyr/Y Polar
T A C Tyrosine Tyr/Y Polar
T A A Ochre Stop
T A G Amber Stop
C A T Histidine His/H Basic
C A C Histidine His/H Basic
C A A Glutamine Gln/Q Polar
C A G Glutamine Gln/Q Polar
A A T Asparagine Asn/N Polar
A A C Asparagine Asn/N Polar
A A A Lysine Lys/K Basic
A A G Lysine Lys/K Basic
G A T Aspartic Acid Asp/D Acidic
G A C Aspartic Acid Asp/D Acidic
G A A Glutamic Acid Glu/E Acidic
G A G Glutamic Acid Glu/E Acidic
Section 4
T G T Cysteine Cys/C Polar
T G C Cysteine Cys/C Polar
T G A Opal Stop
T G G Tryptophan Trp/W Nonpolar
C G T Arginine Arg/R Basic
C G C Arginine Arg/R Basic
C G A Arginine Arg/R Basic
C G G Arginine Arg/R Basic
A G T Serine Ser/S Polar (Duplicate)
A G C Serine Ser/S Polar (Duplicate)
A G A Arginine Arg/R Basic (Duplicate)
A G G Arginine Arg/R Basic (Duplicate)
G G T Glycine Gly/G Nonpolar
G G C Glycine Gly/G Nonpolar
G G A Glycine Gly/G Nonpolar
G G G Glycine Gly/G Nonpolar

Adding Bits

The table above lists 3 columns with 4 cases each. The Paleo Alphabet tables reflect 6 bits of information, where each bit has 2 choices. Both describe the same 64 combinations (4 x 4 x 4 = 2 x 2 x 2 x 2 x 2 x 2 = 64), so the 2 tables must be reconciled in order to determine how Paleo letters are encoded in DNA.

What is Fixed?

The codon table given above has columns in the order of appearance along the DNA strands. So the columns here cannot be reordered without loss of fidelity to the underlying physical reality of the DNA data found through sequencing.

The related Paleo Folding tables have bits that map to tiers in the 3D folding system, but those can be rearranged if needed and still retain the bit-to-letter mappings contained within those tables.

So the codon table above must be expanded with bit data while retaining the order of the columns.

Strange Order

By inspection of the table above, if you consider the nucleotides as base 4 numbers, then the Least Significant Digit (LSD) is labeled N2, while the Most Significant Digit (MSD) is labeled N1.

You can see this because of the pace of change moving down the column. N2 is the fastest changing. N0 is next, and N1 is the slowest changing column.

The MSD would be expected to be N0, so something strange is going on. Remember, the table is designed to be in order, and shows the groupings of the amino acids produced by each group of letters.
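The pace of change in each column can be checked with a short sketch. It rebuilds the 64 rows in the same order as the table above and counts how often each column differs from the row before it. The names `rows`, `changes` and `pace` are illustrative only.

```python
from itertools import product

BASES = "TCAG"  # row order used in the codon table above

# Rebuild the 64 rows in table order: N1 selects the section,
# N0 the group of four within it, and N2 the row within the group.
rows = [(n0, n1, n2) for n1, n0, n2 in product(BASES, repeat=3)]

def changes(column):
    """Count how often a column's value differs from the row above it."""
    values = [row[column] for row in rows]
    return sum(a != b for a, b in zip(values, values[1:]))

pace = {name: changes(i) for i, name in enumerate(("N0", "N1", "N2"))}
print(pace)  # {'N0': 15, 'N1': 3, 'N2': 63}
```

N2 changes on every row, N0 on every fourth row, and N1 only at the three section boundaries, which matches the ordering described above.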

It could be that the final encoding for Paleo uses a different reference frame. If so, that frame would start at N1, then go left to N0 and then left again to N2, remembering of course that there are around 1,000,000,000 codons in order down the entire human genome.

This unusual reference frame will need to be studied experimentally in software later. For now we will operate under the assumption that the 5' to 3' direction, with no offset for the reference frame, is what is used for Paleo language encoding in the genome.
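As a minimal sketch of that working assumption, the helper below splits a sense-strand sequence, read 5' to 3', into codons. The function name `split_codons`, its `offset` parameter, and the sample sequence are hypothetical and only for illustration. A nonzero offset is how a later experiment might try a shifted reference frame.

```python
def split_codons(sequence, offset=0):
    """Split a 5'->3' sense-strand sequence into codons (N0, N1, N2).

    `offset` is the number of leading nucleotides skipped before the
    first codon; 0 is the working assumption stated above.
    Trailing nucleotides that do not fill a whole codon are dropped.
    """
    usable = sequence[offset:]
    return [usable[i:i + 3] for i in range(0, len(usable) - 2, 3)]

example = "ATGTTTGGATAA"  # made-up sequence, for illustration only
print(split_codons(example))            # ['ATG', 'TTT', 'GGA', 'TAA']
print(split_codons(example, offset=1))  # ['TGT', 'TTG', 'GAT']
```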

Adding Bits

Reading down each of the Nx columns in the table above, you can see that the order of the letter codes is T-C-A-G. Each of these 4 values must be assigned its own 2-bit code.

By inspection of section 3 of the table, you can see that when N2 is T or C the choice is a Don't Care for the amino acid produced. Tyrosine (TAT and TAC) is one example.
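The Don't Care pattern can be checked across the whole table, not just section 3. The sketch below encodes the standard genetic code as a 64-character string in T-C-A-G order (an assumption of this example, though it agrees row for row with the table above) and compares the amino acid for N2 = T against N2 = C for each of the 16 N0/N1 pairs. It then does the same for A against G. The names `AMINO`, `amino`, `tc_same` and `ag_same` are illustrative.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order (N0 slowest, N2 fastest);
# '*' marks a stop codon. It agrees with the table above.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

def amino(codon):
    """Return the one-letter amino acid code for a three-letter DNA codon."""
    i0, i1, i2 = (BASES.index(n) for n in codon)
    return AMINO[16 * i0 + 4 * i1 + i2]

prefixes = [a + b for a in BASES for b in BASES]  # the 16 (N0, N1) pairs
tc_same = [p for p in prefixes if amino(p + "T") == amino(p + "C")]
ag_same = [p for p in prefixes if amino(p + "A") == amino(p + "G")]

print(len(tc_same), "of 16 prefixes ignore T vs C in N2")  # 16 of 16
print(len(ag_same), "of 16 prefixes ignore A vs G in N2")  # 14 of 16
print("A/G exceptions:", sorted(set(prefixes) - set(ag_same)))  # ['AT', 'TG']
```

T against C never changes the amino acid, while A against G does so in two places (ATA/ATG and TGA/TGG). Note that this sketch treats the Ochre and Amber stops as the same outcome.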

The T/C Don't Care pattern suggests that the LSB of the 2-bit code selects T versus C, and that the MSB selects the pair TC against the pair AG. This observation lets us deduce the following bit-to-nucleotide table.

N B0 B1
T 0 0
C 0 1
A 1 0
G 1 1

This table shows how a 2-bit value encodes the 4 nucleotides, with B0 as the MSB and B1 as the LSB.
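In code, the mapping amounts to taking each nucleotide's position in the T-C-A-G order and splitting that position into its two bits. The following is a small sketch of that idea. The function names `nucleotide_bits` and `bits_nucleotide` are made up for this example.

```python
BASES = "TCAG"  # order in which the nucleotides appear down each column

def nucleotide_bits(n):
    """Return the (MSB, LSB) pair for one nucleotide, per the table above."""
    value = BASES.index(n)      # T=0, C=1, A=2, G=3
    return (value >> 1, value & 1)

def bits_nucleotide(msb, lsb):
    """Inverse mapping: recover the nucleotide from its two bits."""
    return BASES[(msb << 1) | lsb]

for n in BASES:
    print(n, *nucleotide_bits(n))
```

The loop prints the same four rows as the table, and `bits_nucleotide` confirms that the mapping can be reversed without loss.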

We can use this table to expand the full codon table and assign bits to each place.

Codon Table With Bits Assigned

The following table augments the stock codon table given above with 6 more columns. These columns add bit encoding for each letter in the codon table.

I have labeled these bits in their order of significance. Because the middle, or N1, column varies the slowest, it carries the most significant bits, labeled B0 and B1. N0 carries B2 and B3, and N2, which varies the fastest, carries the least significant bits, B4 and B5.

These bits should match the bits named the same way in the Paleo Alphabet table.

N0 B2 B3 N1 B0 B1 N2 B4 B5 Amino Acid Abbrs Property Note
Section 1
T 0 0 T 0 0 T 0 0 Phenylalanine Phe/F Nonpolar
T 0 0 T 0 0 C 0 1 Phenylalanine Phe/F Nonpolar
T 0 0 T 0 0 A 1 0 Leucine Leu/L Nonpolar
T 0 0 T 0 0 G 1 1 Leucine Leu/L Nonpolar Possible Start
C 0 1 T 0 0 T 0 0 Leucine Leu/L Nonpolar
C 0 1 T 0 0 C 0 1 Leucine Leu/L Nonpolar
C 0 1 T 0 0 A 1 0 Leucine Leu/L Nonpolar
C 0 1 T 0 0 G 1 1 Leucine Leu/L Nonpolar
A 1 0 T 0 0 T 0 0 Isoleucine Ile/I Nonpolar
A 1 0 T 0 0 C 0 1 Isoleucine Ile/I Nonpolar
A 1 0 T 0 0 A 1 0 Isoleucine Ile/I Nonpolar
A 1 0 T 0 0 G 1 1 Methionine Met/M Nonpolar Possible Start
G 1 1 T 0 0 T 0 0 Valine Val/V Nonpolar
G 1 1 T 0 0 C 0 1 Valine Val/V Nonpolar
G 1 1 T 0 0 A 1 0 Valine Val/V Nonpolar
G 1 1 T 0 0 G 1 1 Valine Val/V Nonpolar Possible Start
Section 2
T 0 0 C 0 1 T 0 0 Serine Ser/S Polar
T 0 0 C 0 1 C 0 1 Serine Ser/S Polar
T 0 0 C 0 1 A 1 0 Serine Ser/S Polar
T 0 0 C 0 1 G 1 1 Serine Ser/S Polar
C 0 1 C 0 1 T 0 0 Proline Pro/P Nonpolar
C 0 1 C 0 1 C 0 1 Proline Pro/P Nonpolar
C 0 1 C 0 1 A 1 0 Proline Pro/P Nonpolar
C 0 1 C 0 1 G 1 1 Proline Pro/P Nonpolar
A 1 0 C 0 1 T 0 0 Threonine Thr/T Polar
A 1 0 C 0 1 C 0 1 Threonine Thr/T Polar
A 1 0 C 0 1 A 1 0 Threonine Thr/T Polar
A 1 0 C 0 1 G 1 1 Threonine Thr/T Polar
G 1 1 C 0 1 T 0 0 Alanine Ala/A Nonpolar
G 1 1 C 0 1 C 0 1 Alanine Ala/A Nonpolar
G 1 1 C 0 1 A 1 0 Alanine Ala/A Nonpolar
G 1 1 C 0 1 G 1 1 Alanine Ala/A Nonpolar
Section 3
T 0 0 A 1 0 T 0 0 Tyrosine Tyr/Y Polar
T 0 0 A 1 0 C 0 1 Tyrosine Tyr/Y Polar
T 0 0 A 1 0 A 1 0 Ochre Stop
T 0 0 A 1 0 G 1 1 Amber Stop
C 0 1 A 1 0 T 0 0 Histidine His/H Basic
C 0 1 A 1 0 C 0 1 Histidine His/H Basic
C 0 1 A 1 0 A 1 0 Glutamine Gln/Q Polar
C 0 1 A 1 0 G 1 1 Glutamine Gln/Q Polar
A 1 0 A 1 0 T 0 0 Asparagine Asn/N Polar
A 1 0 A 1 0 C 0 1 Asparagine Asn/N Polar
A 1 0 A 1 0 A 1 0 Lysine Lys/K Basic
A 1 0 A 1 0 G 1 1 Lysine Lys/K Basic
G 1 1 A 1 0 T 0 0 Aspartic Acid Asp/D Acidic
G 1 1 A 1 0 C 0 1 Aspartic Acid Asp/D Acidic
G 1 1 A 1 0 A 1 0 Glutamic Acid Glu/E Acidic
G 1 1 A 1 0 G 1 1 Glutamic Acid Glu/E Acidic
Section 4
T 0 0 G 1 1 T 0 0 Cysteine Cys/C Polar
T 0 0 G 1 1 C 0 1 Cysteine Cys/C Polar
T 0 0 G 1 1 A 1 0 Opal Stop
T 0 0 G 1 1 G 1 1 Tryptophan Trp/W Nonpolar
C 0 1 G 1 1 T 0 0 Arginine Arg/R Basic
C 0 1 G 1 1 C 0 1 Arginine Arg/R Basic
C 0 1 G 1 1 A 1 0 Arginine Arg/R Basic
C 0 1 G 1 1 G 1 1 Arginine Arg/R Basic
A 1 0 G 1 1 T 0 0 Serine Ser/S Polar (Duplicate)
A 1 0 G 1 1 C 0 1 Serine Ser/S Polar (Duplicate)
A 1 0 G 1 1 A 1 0 Arginine Arg/R Basic (Duplicate)
A 1 0 G 1 1 G 1 1 Arginine Arg/R Basic (Duplicate)
G 1 1 G 1 1 T 0 0 Glycine Gly/G Nonpolar
G 1 1 G 1 1 C 0 1 Glycine Gly/G Nonpolar
G 1 1 G 1 1 A 1 0 Glycine Gly/G Nonpolar
G 1 1 G 1 1 G 1 1 Glycine Gly/G Nonpolar
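The bit columns in the table above can be regenerated mechanically. The sketch below assigns B2/B3 to N0, B0/B1 to N1 and B4/B5 to N2, as labeled in the table, and then reads B0 through B5 as one 6-bit number. The helper names `two_bits`, `codon_bits` and `codon_value` are hypothetical, and treating B0 as the most significant bit of a single number is this example's reading of the "order of significance" described above.

```python
from itertools import product

BASES = "TCAG"

def two_bits(n):
    """(MSB, LSB) for one nucleotide: T=00, C=01, A=10, G=11."""
    value = BASES.index(n)
    return (value >> 1, value & 1)

def codon_bits(codon):
    """Map a codon to its bits, keyed by the labels used in the table."""
    n0, n1, n2 = codon
    (b2, b3), (b0, b1), (b4, b5) = two_bits(n0), two_bits(n1), two_bits(n2)
    return {"B0": b0, "B1": b1, "B2": b2, "B3": b3, "B4": b4, "B5": b5}

def codon_value(codon):
    """Read B0..B5 as a 6-bit number, with B0 as the most significant bit."""
    bits = codon_bits(codon)
    value = 0
    for label in ("B0", "B1", "B2", "B3", "B4", "B5"):
        value = (value << 1) | bits[label]
    return value

# Rows in table order: N1 slowest, then N0, then N2.
table = [n0 + n1 + n2 for n1, n0, n2 in product(BASES, repeat=3)]
print(codon_bits("ATG"))
print([codon_value(c) for c in table[:8]])  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Under this labeling the rows of the table count from 0 to 63 in order, which is a quick way to confirm that the significance assigned to each column matches the row order.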