Why is the genetic code universal? Biosynthesis of protein and nucleic acids

In the body's metabolism, the leading role belongs to proteins and nucleic acids.
Protein substances form the basis of all vital cell structures, have an unusually high reactivity, and are endowed with catalytic functions.
Nucleic acids are part of the most important organelle of the cell - the nucleus - and are also found in the cytoplasm, ribosomes, mitochondria, etc. Nucleic acids play a primary role in heredity, in the variability of the organism, and in protein synthesis.

The plan for protein synthesis is stored in the cell nucleus, while the synthesis itself takes place outside the nucleus, so a delivery service is needed to carry the encoded plan from the nucleus to the site of synthesis. This delivery service is performed by RNA molecules.

The process starts in the cell nucleus: part of the DNA “ladder” unwinds and opens. This allows RNA letters to form bonds with the exposed DNA letters of one of the DNA strands. An enzyme joins the RNA letters together into a strand. This is how the letters of DNA are “rewritten” into the letters of RNA. The newly formed RNA chain separates, and the DNA “ladder” twists up again. The process of reading information from DNA and synthesizing RNA on it as a template is called transcription, and the synthesized RNA is called messenger RNA, or mRNA.

After further modifications, this encoded mRNA is ready. It leaves the nucleus and travels to the site of protein synthesis, where its letters are deciphered. Each set of three mRNA letters forms a “word” that stands for one specific amino acid.

Another type of RNA finds this amino acid, captures it with the help of an enzyme, and delivers it to the site of protein synthesis. This RNA is called transfer RNA, or tRNA. As the mRNA message is read and translated, the chain of amino acids grows. This chain twists and folds into a unique shape, creating one type of protein. Even the protein folding process is remarkable: a computer would need 10^27 (!) years to calculate all the folding options for an average-sized protein of 100 amino acids. Yet in the body it takes no more than one second to form a chain of 20 amino acids, and this process occurs continuously in all cells of the body.

Genes, genetic code and its properties.

About 7 billion people live on Earth. Apart from the 25-30 million pairs of identical twins, all people are genetically different: everyone is unique, with their own hereditary characteristics, character traits, abilities, and temperament.

These differences are explained by differences in genotypes - the sets of genes of an organism; each one is unique. The genetic characteristics of a particular organism are embodied in proteins, so the structure of one person's proteins differs, although very slightly, from that of another person's.

This does not mean that no two people share exactly the same proteins. Proteins that perform the same functions may be identical or may differ by only one or two amino acids. But there are no two people on Earth (with the exception of identical twins) in whom all proteins are the same.

Information about the primary structure of a protein is encoded as a sequence of nucleotides in a section of a DNA molecule, a gene - a unit of hereditary information of an organism. Each DNA molecule contains many genes. The totality of all the genes of an organism constitutes its genotype. Thus,

A gene is a unit of hereditary information of an organism, which corresponds to a separate section of DNA.

Hereditary information is encoded by means of the genetic code, which is universal for all organisms; organisms differ only in the order of the nucleotides that form their genes and encode their specific proteins.

The genetic code consists of triplets of DNA nucleotides, combined in different sequences (AAT, GCA, ACG, TGC, etc.), each of which encodes a specific amino acid (which will be built into the polypeptide chain).

Strictly speaking, the code is taken to be the sequence of nucleotides in an mRNA molecule, because mRNA copies the information from DNA (the process of transcription) and converts it into a sequence of amino acids in the synthesized protein molecules (the process of translation).
mRNA is built from the nucleotides A, C, G and U, whose triplets are called codons: the DNA triplet CGT becomes the mRNA triplet GCA, and the DNA triplet AAG becomes the triplet UUC. It is in mRNA codons that the genetic code is conventionally written down.
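The rewriting of DNA triplets into mRNA codons can be illustrated with a minimal Python sketch. The function name `transcribe` is a hypothetical helper chosen for this illustration, and the sketch ignores strand direction, just as the two examples in the text do.

```python
# Complementary pairing used when mRNA is synthesized on a DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_dna: str) -> str:
    """Return the mRNA letters complementary to a DNA template, base by base.

    Simplification (assumption of this sketch): 5'/3' orientation is ignored.
    """
    return "".join(DNA_TO_RNA[base] for base in template_dna.upper())


print(transcribe("CGT"))  # GCA
print(transcribe("AAG"))  # UUC
```

Both printed codons match the examples given in the text (CGT gives GCA, AAG gives UUC).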

Thus, the genetic code is a unified system for recording hereditary information in nucleic acid molecules in the form of a sequence of nucleotides. The genetic code is based on an alphabet of only four letters (nucleotides), distinguished by their nitrogenous bases: A, T, G, C.

Basic properties of the genetic code:

1. The genetic code is triplet. A triplet (codon) is a sequence of three nucleotides encoding one amino acid. Since proteins contain 20 amino acids, each of them obviously cannot be encoded by a single nucleotide: DNA has only four types of nucleotides, so 16 amino acids would remain uncoded. Two nucleotides are not enough either, since only 4^2 = 16 amino acids could then be encoded. The smallest number of nucleotides encoding one amino acid is therefore three, and the number of possible nucleotide triplets is 4^3 = 64.

2. Redundancy (degeneracy) of the code is a consequence of its triplet nature and means that one amino acid can be encoded by several triplets (since there are 20 amino acids and 64 triplets), with the exception of methionine and tryptophan, which are each encoded by only one triplet. In addition, some triplets perform specific functions: in an mRNA molecule the triplets UAA, UAG and UGA are stop codons, i.e. stop signals that terminate the synthesis of the polypeptide chain. The triplet corresponding to methionine (AUG), when located at the beginning of the coding sequence, also performs the function of initiating reading.

3. Unambiguity. Alongside redundancy, the code is unambiguous: each codon corresponds to only one specific amino acid.

4. Collinearity, i.e. the nucleotide sequence in a gene corresponds exactly to the sequence of amino acids in a protein.

5. The genetic code is non-overlapping and compact, i.e. it contains no “punctuation marks”. This means that reading does not allow codons (triplets) to overlap and that, starting at a certain codon, reading proceeds continuously, triplet after triplet, until a stop signal (stop codon) is reached.

6. The genetic code is universal, i.e. the nuclear genes of all organisms encode information about proteins in the same way, regardless of the level of organization and systematic position of these organisms.

Genetic code tables exist for decoding mRNA codons and constructing the chains of protein molecules.
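How such a table is used can be sketched in a few lines of Python. The sketch builds the standard codon table from a compact 64-letter string (one-letter amino acid codes, `*` for stop) and reads an mRNA from the first AUG, triplet after triplet, until a stop codon. The names `CODON_TABLE` and `translate` and the sample sequence are illustrative assumptions, not part of the original text.

```python
# Standard genetic code, codons ordered by first, second, third base in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG, without gaps or overlaps, up to a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiation codon found
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[pos:pos + 3]]
        if amino == "*":  # UAA, UAG or UGA: stop signal
            break
        protein.append(amino)
    return "".join(protein)


print(translate("GGAUGGCUGCUUAAGG"))  # MAA (methionine, alanine, alanine)
```

Reading starts at AUG, continues through GCU and GCU, and ends at UAA, which reflects the non-overlapping, punctuation-free reading described in property 5.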

Matrix synthesis reactions.

Reactions unknown in inanimate nature occur in living systems - matrix synthesis reactions.

In technology, the term “matrix” denotes a mold used for casting coins, medals, and typographic fonts: the hardened metal exactly reproduces all the details of the mold used for casting. Matrix synthesis resembles casting on a matrix: new molecules are synthesized in exact accordance with the plan laid down in the structure of existing molecules.

The matrix principle underlies the most important synthetic reactions of the cell, such as the synthesis of nucleic acids and proteins. These reactions ensure the exact, strictly specific sequence of monomer units in the synthesized polymers.

Here monomers are pulled in a directed way to a specific place in the cell - onto the molecules that serve as the matrix on which the reaction takes place. If such reactions occurred as a result of random collisions of molecules, they would proceed infinitely slowly. The synthesis of complex molecules on the template principle is carried out quickly and accurately. In matrix reactions, the role of the matrix is played by macromolecules of nucleic acids, DNA or RNA.

Monomeric molecules from which the polymer is synthesized - nucleotides or amino acids - in accordance with the principle of complementarity, are located and fixed on the matrix in a strictly defined, specified order.

Then the monomer units are “cross-linked” into a polymer chain, and the finished polymer is released from the matrix.

After that, the matrix is ready for the assembly of a new polymer molecule. Clearly, just as only one kind of coin or one letter can be cast in a given mold, so only one kind of polymer can be “assembled” on a given matrix molecule.

The matrix type of reaction is a specific feature of the chemistry of living systems. Such reactions are the basis of the fundamental property of all living things - the ability to reproduce their own kind.

Template synthesis reactions

1. DNA replication (from Latin replicatio - renewal) is the process of synthesizing a daughter molecule of deoxyribonucleic acid on the matrix of the parent DNA molecule. During the subsequent division of the mother cell, each daughter cell receives one copy of the DNA molecule, identical to the DNA of the original mother cell. This process ensures that genetic information is passed on accurately from generation to generation. DNA replication is carried out by an elaborate enzyme complex of 15-20 different proteins, called the replisome. The material for synthesis is free nucleotides present in the cytoplasm of cells. The biological meaning of replication lies in the accurate transfer of hereditary information from the mother molecule to the daughter molecules, which normally occurs during the division of somatic cells.

A DNA molecule consists of two complementary strands. These chains are held together by weak hydrogen bonds that can be broken by enzymes. The DNA molecule is capable of self-duplication (replication), and on each old half of the molecule a new half is synthesized.
In addition, an mRNA molecule can be synthesized on a DNA molecule, which then transfers the information received from DNA to the site of protein synthesis.
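The self-duplication described above, in which a new half is built on each old half, can be sketched as follows. This is a simplified illustration that ignores strand direction; the names `complement` and `replicate` are hypothetical helpers introduced only for this example.

```python
# Complementary base pairing in DNA.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Build the complementary strand, base by base."""
    return "".join(PAIR[base] for base in strand)


def replicate(duplex: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Return two daughter duplexes, each keeping one old strand and gaining one new one."""
    old_a, old_b = duplex
    daughter_1 = (old_a, complement(old_a))  # new strand built on old strand A
    daughter_2 = (complement(old_b), old_b)  # new strand built on old strand B
    return daughter_1, daughter_2


parent = ("ATGC", "TACG")
first, second = replicate(parent)
print(first == parent and second == parent)  # True: both copies match the parent
```

Each daughter duplex contains one strand of the parent molecule, which is why a copying error in a template strand is inherited by all subsequent copies.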

Information transfer and protein synthesis proceed according to a matrix principle, comparable to the operation of a printing press in a printing house. Information from DNA is copied many times. If errors occur during copying, they will be repeated in all subsequent copies.

Admittedly, some errors made when copying information from a DNA molecule can be corrected - the process of eliminating errors is called repair. The first of the reactions in the process of information transfer is the replication of the DNA molecule and the synthesis of new DNA chains.

2. Transcription (from Latin transcriptio - rewriting) - the process of RNA synthesis using DNA as a template, occurring in all living cells. In other words, it is the transfer of genetic information from DNA to RNA.

Transcription is catalyzed by the enzyme DNA-dependent RNA polymerase. RNA polymerase moves along the DNA molecule in the 3' → 5' direction. Transcription consists of the stages of initiation, elongation and termination. The unit of transcription is an operon, a fragment of a DNA molecule consisting of a promoter, a transcribed part and a terminator. mRNA consists of a single chain and is synthesized on DNA in accordance with the rule of complementarity, with the participation of an enzyme that activates the beginning and end of the synthesis of the mRNA molecule.

The finished mRNA molecule enters the cytoplasm and binds to ribosomes, where the synthesis of polypeptide chains occurs.

3. Translation (from Latin translatio - transfer, movement) - the process of protein synthesis from amino acids on a template of messenger RNA (mRNA), carried out by the ribosome. In other words, this is the process of translating the information contained in the nucleotide sequence of mRNA into the sequence of amino acids in the polypeptide.

4. Reverse transcription is the process of forming double-stranded DNA based on information from single-stranded RNA. It is called reverse transcription because the transfer of genetic information occurs in the “reverse” direction relative to transcription. The idea of reverse transcription was initially very unpopular because it contradicted the central dogma of molecular biology, which assumed that DNA is transcribed into RNA and then translated into proteins.

However, in 1970 Temin and Baltimore independently discovered an enzyme called reverse transcriptase (revertase), and the possibility of reverse transcription was finally confirmed. In 1975, Temin and Baltimore were awarded the Nobel Prize in Physiology or Medicine. Some viruses (such as the human immunodeficiency virus, which causes HIV infection) are able to transcribe RNA into DNA. HIV has an RNA genome that is copied into DNA, and this viral DNA can then be combined with the genome of the host cell. The main enzyme responsible for synthesizing DNA from RNA is revertase. One of its functions is to create complementary DNA (cDNA) from the viral genome. An associated enzyme, ribonuclease, cleaves the RNA, and revertase completes the cDNA into a DNA double helix. The cDNA is integrated into the host cell genome by integrase. As a result, the host cell synthesizes viral proteins, which form new viruses. In the case of HIV, apoptosis (cell death) of T-lymphocytes is also triggered. In other cases, the cell may remain a distributor of viruses.

The sequence of matrix reactions during protein biosynthesis can be represented in the form of a diagram.

Thus, protein biosynthesis is one of the types of plastic metabolism, during which hereditary information encoded in the genes of DNA is realized as a specific sequence of amino acids in protein molecules.

Protein molecules are essentially polypeptide chains made up of individual amino acids. But amino acids are not active enough to combine with each other on their own. Therefore, before they can combine and form a protein molecule, amino acids must be activated. This activation occurs under the action of special enzymes.

As a result of activation, the amino acid becomes more labile and, under the action of the same enzyme, binds to tRNA. Each amino acid corresponds to a strictly specific tRNA, which finds “its” amino acid and carries it to the ribosome.

Consequently, various activated amino acids arrive at the ribosome, each combined with its own tRNA. The ribosome is like a conveyor for assembling a protein chain from the various amino acids supplied to it.

Simultaneously with the tRNA on which its own amino acid “sits”, a “signal” arrives at the ribosome from the DNA contained in the nucleus. In accordance with this signal, one or another protein is synthesized in the ribosome.

The directing influence of DNA on protein synthesis is not exerted directly, but through a special intermediary - matrix or messenger RNA (mRNA), which is synthesized in the nucleus under the influence of DNA, so its composition reflects the composition of DNA. The RNA molecule is like a cast of the DNA mold. The synthesized mRNA enters the ribosome and, as it were, hands this structure a plan: in what order the activated amino acids entering the ribosome must be joined to each other for a specific protein to be synthesized. In other words, genetic information encoded in DNA is transferred to mRNA and then to protein.

The mRNA molecule enters the ribosome and threads through it. The segment currently located in the ribosome, defined by a codon (triplet), interacts in a completely specific manner with the matching triplet (anticodon) in the transfer RNA that brought the amino acid into the ribosome.

The transfer RNA with its amino acid recognizes a specific codon of the mRNA and binds to it; another tRNA with a different amino acid attaches to the next, neighboring section of mRNA, and so on until the entire mRNA chain has been read and all the amino acids have been lined up in the appropriate order, forming a protein molecule. The tRNA that delivered its amino acid to a specific part of the polypeptide chain is released from its amino acid and leaves the ribosome.

Then, back in the cytoplasm, it can pick up the required amino acid again and carry it to the ribosome. Protein synthesis involves not one but several ribosomes at once - polyribosomes.

The main stages of the transfer of genetic information:

1. Synthesis of mRNA on DNA as a template (transcription).
2. Synthesis of a polypeptide chain in ribosomes according to the program contained in mRNA (translation).

The stages are universal for all living beings, but the temporal and spatial relationships of these processes differ in pro- and eukaryotes.

In prokaryotes, transcription and translation can occur simultaneously because DNA is located in the cytoplasm. In eukaryotes, transcription and translation are strictly separated in space and time: the synthesis of the various RNAs occurs in the nucleus, after which the RNA molecules must leave the nucleus by passing through the nuclear membrane. The RNAs are then transported through the cytoplasm to the site of protein synthesis.

The genetic code is a system for recording hereditary information in nucleic acid molecules, based on a certain alternation of nucleotide sequences in DNA or RNA, forming codons corresponding to amino acids in a protein.

Properties of the genetic code.

The genetic code has several properties.

    Triplet nature.

    Degeneracy or redundancy.

    Unambiguity.

    Polarity.

    Non-overlapping.

    Compactness.

    Universality.

It should be noted that some authors also propose other properties of the code related to the chemical characteristics of the nucleotides included in the code or the frequency of occurrence of individual amino acids in the body’s proteins, etc. However, these properties follow from those listed above, so we will consider them there.

A. Triplet nature. The genetic code, like many complexly organized systems, has a smallest structural unit and a smallest functional unit. A triplet is the smallest structural unit of the genetic code; it consists of three nucleotides. A codon is the smallest functional unit of the genetic code. Typically, it is the triplets of mRNA that are called codons. In the genetic code, a codon performs several functions. Firstly, its main function is to encode a single amino acid. Secondly, a codon may not code for an amino acid, in which case it performs another function (see below). As the definitions show, a triplet is a concept that characterizes the elementary structural unit of the genetic code (three nucleotides), whereas a codon characterizes the elementary semantic unit of the genome: three nucleotides determine the attachment of one amino acid to the polypeptide chain.

The elementary structural unit was first worked out theoretically, and then its existence was confirmed experimentally. Indeed, 20 amino acids cannot be encoded with one or two nucleotides, because there are only 4 kinds of nucleotide. Triplets drawn from four nucleotides give 4^3 = 64 variants, which more than covers the number of amino acids found in living organisms (see Table 1).
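The counting argument can be checked directly by enumerating every possible “word” of one, two and three nucleotides. This is only an illustrative sketch; the helper name `codewords` is an assumption made for the example.

```python
from itertools import product

BASES = "ACGU"
AMINO_ACIDS = 20


def codewords(length: int) -> list[str]:
    """All nucleotide words of the given length over the four-letter alphabet."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]


for length in (1, 2, 3):
    count = len(codewords(length))
    verdict = "enough" if count >= AMINO_ACIDS else "not enough"
    print(length, count, verdict)
# 1 4 not enough
# 2 16 not enough
# 3 64 enough
```

Only words of three nucleotides give at least 20 distinct combinations, which is the reasoning behind the triplet code.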

The 64 nucleotide combinations presented in the table have two features. Firstly, of the 64 triplet variants only 61 are codons that encode an amino acid; they are called sense codons. Three triplets do not encode amino acids and are stop signals indicating the end of translation. These three triplets - UAA, UAG, UGA - are also called “meaningless” (nonsense) codons. A mutation that replaces one nucleotide in a triplet with another can turn a sense codon into a nonsense codon. This type of mutation is called a nonsense mutation. If such a stop signal arises inside a gene (in its coding part), protein synthesis will be interrupted at this point every time, and only the first part of the protein (before the stop signal) will be synthesized. A person with this pathology will lack the protein and will experience symptoms associated with this deficiency. For example, a mutation of this kind has been identified in the gene encoding the hemoglobin beta chain. A shortened, inactive hemoglobin chain is synthesized and is quickly destroyed. As a result, a hemoglobin molecule devoid of a beta chain is formed. Clearly, such a molecule is unlikely to perform its functions fully. A serious disease arises that develops as hemolytic anemia (beta-zero thalassemia, from the Greek word “thalassa”, sea, after the Mediterranean, where this disease was first discovered).

The mechanism of action of stop codons differs from the mechanism of action of sense codons. This follows from the fact that for all codons encoding amino acids, corresponding tRNAs have been found. No tRNAs were found for nonsense codons. Consequently, tRNA does not take part in the process of stopping protein synthesis.

The codon AUG (and sometimes GUG in bacteria) not only encodes an amino acid (methionine and valine, respectively) but also serves as the translation initiator.

B. Degeneracy or redundancy.

61 of the 64 triplets encode 20 amino acids. This threefold excess of triplets over amino acids means that two coding options were possible. Firstly, only 20 of the 64 codons might be involved in encoding the 20 amino acids; secondly, each amino acid might be encoded by several codons. Research has shown that nature used the latter option.

The advantage of this choice is obvious. If only 20 of the 64 triplet variants were involved in encoding amino acids, then 44 triplets (out of 64) would remain non-coding, i.e. meaningless (nonsense codons). Earlier we pointed out how dangerous it is for the cell when a mutation turns a coding triplet into a nonsense codon: this severely disrupts normal protein synthesis, ultimately leading to the development of diseases. At present three codons in our genome are nonsense; now imagine what would happen if the number of nonsense codons were about 15 times greater. Clearly, in such a situation the probability of a normal codon turning into a nonsense codon would be immeasurably higher.

A code in which one amino acid is encoded by several triplets is called degenerate or redundant. Almost every amino acid has several codons. Thus, the amino acid leucine can be encoded by six triplets - UUA, UUG, CUU, CUC, CUA, CUG. Valine is encoded by four triplets, phenylalanine by two, and only tryptophan and methionine are encoded by a single codon each. The property of recording the same information with different symbols is called degeneracy.
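The codon counts quoted above can be verified against the standard code table. The sketch below rebuilds the table from a compact string of one-letter amino acid codes (`*` marks a stop codon); the variable names are illustrative assumptions.

```python
from collections import Counter

# Standard genetic code, codons ordered by first, second, third base in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# How many codons encode each amino acid (degeneracy of the code).
codons_per_amino = Counter(CODON_TABLE.values())
leucine_codons = [codon for codon, amino in CODON_TABLE.items() if amino == "L"]

print(leucine_codons)  # ['UUA', 'UUG', 'CUU', 'CUC', 'CUA', 'CUG']
print(codons_per_amino["V"], codons_per_amino["F"])  # 4 2  (valine, phenylalanine)
print(codons_per_amino["W"], codons_per_amino["M"])  # 1 1  (tryptophan, methionine)
print(64 - codons_per_amino["*"])  # 61 sense codons
```

The table confirms six codons for leucine, four for valine, two for phenylalanine, and a single codon each for tryptophan and methionine.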

The number of codons designated for one amino acid correlates well with the frequency of occurrence of the amino acid in proteins.

And this is most likely not accidental. The higher the frequency of occurrence of an amino acid in a protein, the more often the codon of this amino acid is represented in the genome, the higher the likelihood of its damage by mutagenic factors. Therefore, it is clear that a mutated codon has a greater chance of encoding the same amino acid if it is highly degenerate. From this perspective, the degeneracy of the genetic code is a mechanism that protects the human genome from damage.

It should be noted that the term degeneracy is also used in molecular genetics in another sense. The bulk of the information in a codon is contained in its first two nucleotides; the base in the third position of the codon turns out to be of little importance. This phenomenon is called “degeneracy of the third base”, and it minimizes the effect of mutations. For example, the main function of red blood cells is to transport oxygen from the lungs to the tissues and carbon dioxide from the tissues to the lungs. This function is performed by the respiratory pigment hemoglobin, which fills the entire cytoplasm of the erythrocyte. Hemoglobin consists of a protein part, globin, which is encoded by the corresponding gene, and heme, which contains iron. Mutations in globin genes lead to the appearance of different hemoglobin variants. Most often, a mutation replaces one nucleotide with another and creates a new codon in the gene, which may encode a new amino acid in the hemoglobin polypeptide chain. Any nucleotide in a triplet can be replaced as a result of mutation - the first, second or third. Several hundred mutations are known that affect the integrity of the globin genes. About 400 of them involve the replacement of a single nucleotide in a gene and a corresponding amino acid replacement in the polypeptide. Of these, only 100 replacements lead to instability of hemoglobin and to diseases ranging from mild to very severe. The remaining 300 or so (about three quarters) do not affect hemoglobin function and do not lead to pathology. One of the reasons for this is the above-mentioned “degeneracy of the third base”, when a replacement of the third nucleotide in a triplet encoding serine, leucine, proline, arginine and some other amino acids produces a synonymous codon encoding the same amino acid.
Such a mutation will not manifest itself phenotypically. In contrast, a replacement of the first or second nucleotide in a triplet almost always leads to the appearance of a new hemoglobin variant. But even in this case there may be no severe phenotypic disorders, because the amino acid in hemoglobin may be replaced by another with similar physicochemical properties, for example when a hydrophilic amino acid is replaced by another hydrophilic amino acid.

Hemoglobin consists of the iron-porphyrin heme group (to which oxygen and carbon dioxide molecules attach) and a protein, globin. Adult hemoglobin (HbA) contains two identical α-chains and two β-chains. The α-chain contains 141 amino acid residues and the β-chain 146; the α- and β-chains differ in many amino acid residues. The amino acid sequence of each globin chain is encoded by its own gene. The gene encoding the α-chain is located in the short arm of chromosome 16, the β-gene in the short arm of chromosome 11. A substitution of the first or second nucleotide in the gene encoding the β-chain of hemoglobin almost always leads to the appearance of new amino acids in the protein, disruption of hemoglobin function and serious consequences for the patient. For example, replacing “C” in one of the CAU (histidine) triplets with “U” gives a new triplet, UAU, which encodes another amino acid, tyrosine. Phenotypically this manifests as a severe disease. Such a substitution of histidine by tyrosine at position 63 of the β-chain polypeptide destabilizes hemoglobin, and the disease methemoglobinemia develops. The replacement, as a result of mutation, of glutamic acid with valine at position 6 of the β-chain is the cause of a most severe disease, sickle cell anemia. Let us not continue this sad list. Let us only note that a replacement of the first or second nucleotide may produce an amino acid with physicochemical properties similar to those of the previous one. Thus, replacing the 2nd nucleotide in one of the triplets encoding glutamic acid (GAA) in the β-chain with “U” gives a new triplet (GUA) encoding valine, while replacing the first nucleotide with “A” gives the triplet AAA, encoding the amino acid lysine. Glutamic acid and lysine are similar in physicochemical properties - both are hydrophilic. Valine is a hydrophobic amino acid.
Therefore, replacing hydrophilic glutamic acid with hydrophobic valine significantly changes the properties of hemoglobin, which ultimately leads to the development of sickle cell anemia, while replacing hydrophilic glutamic acid with hydrophilic lysine changes the function of hemoglobin to a lesser extent - patients develop a mild form of anemia. When the third base is replaced, the new triplet may encode the same amino acid as the previous one. For example, if uracil in the CAU triplet is replaced by cytosine, giving the triplet CAC, practically no phenotypic changes will be detected in humans. This is understandable, because both triplets code for the same amino acid, histidine.
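The single-nucleotide replacements discussed above can be traced through the standard code table with a short Python sketch. The function `classify` and its return format are assumptions made for this illustration; amino acids are shown by their one-letter codes (E glutamic acid, V valine, K lysine, H histidine, Y tyrosine, `*` stop).

```python
# Standard genetic code, codons ordered by first, second, third base in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify(codon: str, position: int, new_base: str) -> tuple[str, str, str, str]:
    """Replace one base (position 0, 1 or 2) and describe the effect on the amino acid."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        kind = "synonymous"  # same amino acid, no change in the protein
    elif after == "*":
        kind = "nonsense"    # premature stop signal
    else:
        kind = "missense"    # a different amino acid is inserted
    return mutated, before, after, kind


print(classify("GAA", 1, "U"))  # ('GUA', 'E', 'V', 'missense')
print(classify("GAA", 0, "A"))  # ('AAA', 'E', 'K', 'missense')
print(classify("CAU", 0, "U"))  # ('UAU', 'H', 'Y', 'missense')
print(classify("CAU", 2, "C"))  # ('CAC', 'H', 'H', 'synonymous')
print(classify("GAA", 0, "U"))  # ('UAA', 'E', '*', 'nonsense')
```

The sketch only reports which amino acid results; whether a missense change is mild or severe depends on the physicochemical properties of the amino acids involved, as explained in the text.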

In conclusion, it is appropriate to emphasize that, from a general biological point of view, the degeneracy of the genetic code and the degeneracy of the third base are protective mechanisms that evolution has built into the unique structure of DNA and RNA.

C. Unambiguity.

Each triplet (except nonsense) encodes only one amino acid. Thus, in the direction codon - amino acid the genetic code is unambiguous, in the direction amino acid - codon it is ambiguous (degenerate).

Codon → amino acid: unambiguous

Amino acid → codon: degenerate

Here too, the need for unambiguity in the genetic code is obvious. Otherwise, when the same codon was translated, different amino acids would be inserted into the protein chain and, as a result, proteins with different primary structures and different functions would be formed. Cell metabolism would switch to a “one gene - several polypeptides” mode of operation. Clearly, in such a situation the regulatory function of genes would be completely lost.

D. Polarity.

Information is read from DNA and mRNA in one direction only. Polarity is important for determining higher-order structures (secondary, tertiary, etc.). Earlier we discussed how lower-order structures determine higher-order ones. Tertiary and higher-order structures begin to form as soon as the synthesized RNA chain leaves the DNA molecule or the polypeptide chain leaves the ribosome. While the free end of the RNA or polypeptide acquires its tertiary structure, the other end of the chain is still being synthesized on DNA (if RNA is being transcribed) or on a ribosome (if a polypeptide is being translated).

Therefore, the unidirectional process of reading information (during the synthesis of RNA and protein) is essential not only for determining the sequence of nucleotides or amino acids in the synthesized substance, but also for the strict determination of secondary, tertiary, etc. structures.

E. Non-overlapping.

The code may be overlapping or non-overlapping. Most organisms have a non-overlapping code. Overlapping code is found in some phages.

The essence of a non-overlapping code is that a nucleotide of one codon cannot simultaneously be a nucleotide of another codon. If the code were overlapping, then the sequence of seven nucleotides (GCUGCUG) could encode not two amino acids (alanine-alanine) (Fig. 33, A) as in the case of a non-overlapping code, but three (if there is one nucleotide in common) (Fig. 33, B) or five (if two nucleotides are common) (see Fig. 33, C). In the last two cases, a mutation of any nucleotide would lead to a violation in the sequence of two, three, etc. amino acids.

However, it has been established that a mutation of one nucleotide always disrupts the inclusion of one amino acid in a polypeptide. This is a significant argument that the code is non-overlapping.

Let us explain this using Figure 34. Bold lines show the triplets encoding amino acids in the non-overlapping and overlapping cases. Experiments have clearly shown that the genetic code is non-overlapping. Without going into the details of the experiment, we note what would happen if the third nucleotide in the sequence (see Fig. 34), U (marked with an asterisk), were replaced with some other nucleotide:

1. With a non-overlapping code, the protein controlled by this sequence would have a substitution of one (first) amino acid (marked with asterisks).

2. With an overlapping code in option A, a substitution would occur in two (first and second) amino acids (marked with asterisks). Under option B, the replacement would affect three amino acids (marked with asterisks).

However, numerous experiments have shown that when one nucleotide in DNA is disrupted, the disruption in the protein always affects only one amino acid, which is typical for a non-overlapping code.
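The argument can be checked mechanically. The following minimal Python sketch reads the sequence GCUGCUG with a frame that advances by three, two, or one nucleotide and counts how many triplets contain the third nucleotide (index 2). The function names are illustrative assumptions, not part of the original text.

```python
def read_triplets(seq: str, step: int) -> list[str]:
    """Triplets read from seq when the reading frame advances by `step` bases."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]


def triplets_containing(seq: str, step: int, position: int) -> list[str]:
    """Triplets that include the base at the given 0-based position."""
    return [
        seq[i:i + 3]
        for i in range(0, len(seq) - 2, step)
        if i <= position < i + 3
    ]


seq = "GCUGCUG"
non_overlapping = read_triplets(seq, 3)  # frame moves by a whole codon
overlap_one = read_triplets(seq, 2)      # neighbouring codons share one base
overlap_two = read_triplets(seq, 1)      # neighbouring codons share two bases

# Number of codons (and hence amino acids) touched by mutating base index 2
affected = {step: len(triplets_containing(seq, step, 2)) for step in (3, 2, 1)}
print(non_overlapping, overlap_one, overlap_two, affected)
```

With a step of three, only one triplet contains the mutated base, which matches the experimental observation that a single substitution changes a single amino acid.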

Fig. 34. A diagram explaining that the genetic code is non-overlapping (explanation in the text). Panels: A, non-overlapping code; B and C, overlapping code. The sequence shown is GCUGCUG; asterisks mark the amino acids affected by the substitution.

The non-overlapping nature of the genetic code is associated with another property: the reading of information begins from a certain point, the initiation signal. In mRNA this initiation signal is the AUG codon, which encodes methionine.

It should be noted that humans still have a small number of genes that deviate from the general rule and overlap.

e. Compactness.

There is no punctuation between codons. In other words, triplets are not separated from each other, for example, by one meaningless nucleotide. The absence of “punctuation marks” in the genetic code has been proven in experiments.

f. Universality.

The code is the same for all organisms living on Earth. Direct evidence of the universality of the genetic code was obtained by comparing DNA sequences with corresponding protein sequences. It turned out that all bacterial and eukaryotic genomes use the same sets of code values. There are exceptions, but not many.

The first exceptions to the universality of the genetic code were found in the mitochondria of some animal species. They concerned the terminator codon UGA, which in these mitochondria is read in the same way as the codon UGG, encoding the amino acid tryptophan. Other, rarer deviations from universality were also found.

DNA code system.

The genetic code of DNA consists of 64 triplets of nucleotides. These triplets are called codons. Each codon codes for one of the 20 amino acids used in protein synthesis. This gives some redundancy in the code: most amino acids are coded for by more than one codon.
One codon, AUG, performs two interrelated functions: it signals the beginning of translation and encodes the inclusion of the amino acid methionine (Met) in the growing polypeptide chain. The genetic code can be expressed either as RNA codons or as DNA codons. RNA codons are found in messenger RNA (mRNA), and these codons are read during the synthesis of polypeptides (a process called translation). Each mRNA molecule, in turn, acquires its nucleotide sequence during transcription from the corresponding gene.

All but two amino acids (Met and Trp) can be encoded by 2 to 6 different codons. However, the genomes of most organisms show that certain codons are favored over others. In humans, for example, alanine is encoded by GCC four times more often than by GCG. This probably reflects the greater efficiency with which the translation apparatus (for example, the ribosome) handles some codons.
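Codon preference of this kind is measured by simply counting codons in coding sequences. Below is a minimal sketch of such a tally; the short alanine-only fragment is a made-up example (an assumption for illustration, not real human data), and the function name is hypothetical.

```python
from collections import Counter


def codon_counts(coding_seq: str) -> Counter:
    """Count the in-frame codons of a coding sequence (trailing bases are ignored)."""
    usable = len(coding_seq) - len(coding_seq) % 3
    return Counter(coding_seq[i:i + 3] for i in range(0, usable, 3))


# Hypothetical fragment in which every codon encodes alanine
fragment = "GCCGCCGCGGCAGCC"
counts = codon_counts(fragment)
total = sum(counts.values())
for codon, n in counts.most_common():
    print(codon, n, round(n / total, 2))
```

Applied to all coding sequences of a genome, the same count gives the codon usage table from which ratios such as GCC versus GCG are derived.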

The genetic code is almost universal. The same codons are assigned to the same amino acids, and the start and stop signals are overwhelmingly the same in animals, plants and microorganisms. However, some exceptions have been found. Most involve assigning one or two of the three stop codons to an amino acid.

They line up in chains and thus produce sequences of genetic letters.

Genetic code

The proteins of almost all living organisms are built from only 20 types of amino acids. These amino acids are called canonical. Each protein is a chain or several chains of amino acids connected in a strictly defined sequence. This sequence determines the structure of the protein, and therefore all its biological properties.

First base C: CUU, CUC, CUA and CUG all encode leucine (Leu/L).

In some proteins, nonstandard amino acids, such as selenocysteine and pyrrolysine, are inserted by a ribosome reading the stop codon, depending on the sequences in the mRNA. Selenocysteine is now considered to be the 21st, and pyrrolysine the 22nd, amino acid that makes up proteins.

Despite these exceptions, all living organisms have common genetic codes: a codon consists of three nucleotides, where the first two are decisive; codons are translated by tRNA and ribosomes into an amino acid sequence.

Deviations from the standard genetic code.
Example | Codon | Normal meaning | Read as
Some species of Candida yeast | CUG | Leucine | Serine
Mitochondria, in particular in Saccharomyces cerevisiae | CU(U, C, A, G) | Leucine | Serine
Mitochondria of higher plants | CGG | Arginine | Tryptophan
Mitochondria (in all studied organisms without exception) | UGA | Stop | Tryptophan
Mitochondria in mammals, Drosophila, S. cerevisiae and many protozoa | AUA | Isoleucine | Methionine = Start
Prokaryotes | GUG | Valine | Start
Eukaryotes (rare) | CUG | Leucine | Start
Eukaryotes (rare) | GUG | Valine | Start
Prokaryotes (rare) | UUG | Leucine | Start
Eukaryotes (rare) | ACG | Threonine | Start
Mammalian mitochondria | AGC, AGU | Serine | Stop
Drosophila mitochondria | AGA | Arginine | Stop
Mammalian mitochondria | AG(A, G) | Arginine | Stop

History of ideas about the genetic code

However, in the early 60s of the 20th century, new data revealed the inconsistency of the “code without commas” hypothesis. Then experiments showed that codons, considered meaningless by Crick, could provoke protein synthesis in vitro, and by 1965 the meaning of all 64 triplets was established. It turned out that some codons are simply redundant, that is, a whole series of amino acids are encoded by two, four or even six triplets.


Wikimedia Foundation. 2010.

The genetic code is a way of encoding the sequence of amino acids in a protein molecule using the sequence of nucleotides in a nucleic acid molecule. The properties of the genetic code arise from the characteristics of this coding.

Each protein amino acid is matched to three consecutive nucleic acid nucleotides: a triplet, or codon. Each nucleotide can contain one of four nitrogenous bases. In RNA these are adenine (A), uracil (U), guanine (G) and cytosine (C). By combining nitrogenous bases (in this case, the nucleotides containing them) in different ways, you can get many different triplets: AAA, GAU, UCC, GCA, AUC, etc. The total number of possible combinations is 64, i.e. $4^3$.

The proteins of living organisms contain about 20 amino acids. If nature had “planned” to encode each amino acid not with three but with two nucleotides, the variety of such pairs would not be enough, since there would be only 16 of them, i.e. $4^2$.

Thus, the main property of the genetic code is its triplet nature. Each amino acid is encoded by a triplet of nucleotides.

Since there are significantly more possible triplets than amino acids used in biological molecules, the following property has been realized in living nature: the redundancy of the genetic code. Many amino acids came to be encoded not by one codon but by several. For example, the amino acid glycine is encoded by four different codons: GGU, GGC, GGA, GGG. Redundancy is also called degeneracy.

The correspondence between amino acids and codons is shown in tables. For example, these:

In relation to nucleotides, the genetic code has the following property: unambiguity (or specificity). Each codon corresponds to only one amino acid. For example, the GGU codon can only code for glycine and no other amino acid.

To repeat: redundancy means that several triplets can code for the same amino acid, while specificity means that each particular codon can code for only one amino acid.

There are no special punctuation marks in the genetic code (except for stop codons, which indicate the end of polypeptide synthesis). The function of punctuation marks is performed by the triplets themselves: the end of one means that another begins next. This implies the following two properties of the genetic code: continuity and non-overlapping. Continuity refers to the reading of triplets immediately one after another. Non-overlapping means that each nucleotide can be part of only one triplet. So the first nucleotide of the next triplet always comes after the third nucleotide of the previous triplet. A codon cannot begin with the second or third nucleotide of the preceding codon. In other words, the code does not overlap.

The genetic code has the property of universality. It is the same for all organisms on Earth, which indicates the unity of the origin of life. There are very rare exceptions to this. For example, some triplets in mitochondria and chloroplasts encode amino acids other than their usual ones. This may suggest that at the dawn of life there were slightly different variations of the genetic code.

Finally, the genetic code has noise immunity, which is a consequence of its redundancy. Point mutations, which sometimes occur in DNA, usually result in the replacement of one nitrogenous base with another. This changes the triplet. For example, it was AAA, but after the mutation it became AAG. However, such changes do not always lead to a change in the amino acid in the synthesized polypeptide, since both triplets, due to the redundancy of the genetic code, can correspond to the same amino acid. Considering that mutations are often harmful, the property of noise immunity is useful.
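Noise immunity can be quantified. The sketch below builds the standard codon table (one-letter amino acid symbols, `*` for stop) and, for each codon position, computes the share of single-base substitutions in sense codons that leave the amino acid unchanged. The helper name `synonymous_fraction` is an illustrative assumption.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order, one-letter amino acid symbols, '*' = stop
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_fraction(position: int) -> float:
    """Share of substitutions at `position` (0, 1 or 2) that keep the amino acid."""
    same = total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":          # only sense codons are mutated
            continue
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODON_TABLE[mutant] == aa
    return same / total


fractions = [synonymous_fraction(p) for p in range(3)]
print(CODON_TABLE["AAA"], CODON_TABLE["AAG"], fractions)
```

The AAA to AAG change from the text is one such silent substitution: both triplets encode lysine (K). Most of the silent substitutions fall on the third codon position.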

The genetic, or biological, code is one of the universal properties of living nature, proving the unity of its origin. Genetic code is a method of encoding the sequence of amino acids of a polypeptide using a sequence of nucleic acid nucleotides (messenger RNA or a complementary DNA section on which mRNA is synthesized).

There are other definitions.

The genetic code is the correspondence of each amino acid (a constituent of living proteins) to a specific sequence of three nucleotides. The genetic code is the relationship between nucleic acid bases and protein amino acids.

In the scientific literature, the genetic code does not mean the sequence of nucleotides in the DNA of an organism that determines its individuality.

It is incorrect to assume that one organism or species has one code, and another has another. The genetic code is how amino acids are encoded by nucleotides (i.e. principle, mechanism); it is universal for all living things, the same for all organisms.

Therefore, it is incorrect to say, for example, “The genetic code of a person” or “The genetic code of an organism,” which is often used in pseudo-scientific literature and films.

In these cases, we usually mean the genome of a person, an organism, etc.

The diversity of living organisms and the characteristics of their life activity is primarily due to the diversity of proteins.

The specific structure of a protein is determined by the order and quantity of the various amino acids that make up its composition. The amino acid sequence of the peptide is encrypted in DNA using a biological code. From the point of view of the diversity of the set of monomers, DNA is a more primitive molecule than a peptide. DNA consists of different alternations of just four nucleotides. This has long prevented researchers from considering DNA as the material of heredity.

How are amino acids coded by nucleotides?

1) Nucleic acids (DNA and RNA) are polymers consisting of nucleotides.

Each nucleotide can contain one of four nitrogenous bases: adenine (A), guanine (G), cytosine (C) or thymine (T). In the case of RNA, thymine is replaced by uracil (U).

When considering the genetic code, only nitrogenous bases are taken into account.

Then the DNA chain can be represented as their linear sequence. For example:

The mRNA section complementary to this code will be as follows:
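As an illustration with a made-up sequence (an assumption, not the article's own example), the following minimal sketch builds the mRNA complementary to a DNA template strand using the pairing rules A-U, T-A, G-C and C-G. The template is written 3′ to 5′ so that the resulting mRNA reads 5′ to 3′.

```python
# Base pairing between a DNA template strand and the mRNA synthesized on it
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') complementary to a DNA template given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


template = "TACCTAAGA"  # hypothetical DNA template strand
mrna = transcribe_template(template)
print(mrna)
```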

2) Proteins (polypeptides) are polymers consisting of amino acids.

In living organisms, 20 amino acids are used to build polypeptides (a few more are very rare). To designate them, you can also use one letter (although more often they use three - an abbreviation for the name of the amino acid).

The amino acids in a polypeptide are also connected linearly by a peptide bond. For example, suppose there is a section of a protein with the following sequence of amino acids (each amino acid is designated by one letter):

3) If the task is to encode each amino acid using nucleotides, then it comes down to how to encode 20 letters using 4 letters.

This can be done by matching letters of a 20-letter alphabet with words made up of several letters of a 4-letter alphabet.

If one amino acid is encoded by one nucleotide, then only four amino acids can be encoded.

If each amino acid is associated with two consecutive nucleotides in the RNA chain, then sixteen amino acids can be encoded.

Indeed, if there are four letters (A, U, G, C), then the number of their different pair combinations will be 16: (AU, UA), (AG, GA), (AC, CA), (UG, GU), (UC, CU), (GC, CG), (AA, UU, GG, CC).

[Brackets are used for ease of perception.] This means that only 16 different amino acids can be encoded with such a code (a two-letter word): each will have its own word (two consecutive nucleotides).

From mathematics, the formula for the number of combinations is $a^b = n$.

Here n is the number of different combinations, a is the number of letters in the alphabet (or the base of the number system), and b is the number of letters in the word (or digits in the number). If we substitute the 4-letter alphabet and two-letter words into this formula, we get $4^2 = 16$.

If three consecutive nucleotides are used as the code word for each amino acid, then $4^3 = 64$ different amino acids can be encoded, since 64 different combinations can be made from four letters taken in groups of three (for example, AUG, GAA, CAU, GGU, etc.). This is already more than enough to encode 20 amino acids.
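The counting argument is easy to reproduce. This short sketch enumerates all words of length one, two and three over the four-letter RNA alphabet and finds the shortest word length that can distinguish 20 amino acids; the variable names are illustrative.

```python
from itertools import product

ALPHABET = "AUGC"
AMINO_ACIDS = 20

# Number of distinct words of each length, obtained by explicit enumeration
word_counts = {n: len(list(product(ALPHABET, repeat=n))) for n in (1, 2, 3)}

# Shortest word length whose number of combinations covers all amino acids
min_length = min(n for n, count in word_counts.items() if count >= AMINO_ACIDS)

triplets = ["".join(p) for p in product(ALPHABET, repeat=3)]
print(word_counts, min_length, triplets[:4])
```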

It is precisely a three-letter code that is used in the genetic code. Three consecutive nucleotides coding for one amino acid are called a triplet (or codon).

Each amino acid is associated with a specific triplet of nucleotides.

In addition, since the number of triplets exceeds the number of amino acids, many amino acids are encoded by several triplets.

Three triplets do not code for any of the amino acids (UAA, UAG, UGA).

They mark the end of translation and are called stop codons (or nonsense codons).

The AUG triplet encodes not only the amino acid methionine, but also initiates translation (plays the role of a start codon).

Below are tables of the correspondence between amino acids and nucleotide triplets.

The first table makes it convenient to find the amino acid for a given triplet; the second, to find the triplets for a given amino acid.

Let's consider an example of the implementation of a genetic code. Let there be an mRNA with the following content:

Let's split the nucleotide sequence into triplets:

Let us associate each triplet with the amino acid of the polypeptide it encodes:

Methionine - Aspartic acid - Serine - Threonine - Tryptophan - Leucine - Leucine - Lysine - Asparagine - Glutamine

The last triplet is a stop codon.
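The same procedure can be sketched in Python. The mRNA below is a hypothetical sequence, one of many that would yield the peptide listed above; it is chosen here only for illustration and is not taken from the original article. Amino acids are printed with one-letter symbols (M = methionine, D = aspartic acid, S = serine, T = threonine, W = tryptophan, L = leucine, K = lysine, N = asparagine, Q = glutamine).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order, one-letter amino acid symbols, '*' = stop
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_triplets(mrna: str) -> list[str]:
    """Split an mRNA into consecutive, non-overlapping triplets."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - len(mrna) % 3, 3)]


def translate(mrna: str) -> str:
    """Translate triplet by triplet until a stop codon is reached."""
    peptide = []
    for codon in split_triplets(mrna):
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


mrna = "AUGGAUUCUACUUGGCUUUUAAAAAAUCAAUAA"  # hypothetical example sequence
triplets = split_triplets(mrna)
peptide = translate(mrna)
print(triplets, peptide)
```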

Properties of the genetic code

The properties of the genetic code are largely a consequence of the way amino acids are encoded.

The first and most obvious property is the triplet nature of the code.

It refers to the fact that the unit of code is a sequence of three nucleotides.

An important property of the genetic code is that it is non-overlapping. A nucleotide included in one triplet cannot be included in another.

That is, the sequence AGUGAA can only be read as AGU-GAA, but not, for example, like this: AGU-GUG-GAA. That is, if a GU pair is included in one triplet, it cannot already be a component of another.

Unambiguity of the genetic code means that each triplet corresponds to only one amino acid.

For example, the AGU triplet codes for the amino acid serine and nothing else.

On the other hand, several triplets can correspond to one amino acid. For example, the same serine, in addition to AGU, corresponds to the AGC codon. This property is called the degeneracy of the genetic code.

Degeneracy allows many mutations to remain harmless, since often replacing one nucleotide in DNA does not lead to a change in the value of the triplet. If you look closely at the table of amino acid correspondence to triplets, you can see that if an amino acid is encoded by several triplets, they often differ in the last nucleotide, i.e. it can be anything.
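Degeneracy is easy to see by inverting the codon table, that is, by grouping codons under the amino acid they encode. The sketch below does this for the standard code; amino acids use one-letter symbols (S = serine, M = methionine, W = tryptophan) and `*` stands for stop. The variable names are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order, one-letter amino acid symbols, '*' = stop
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Reverse lookup: amino acid -> list of codons that encode it
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

for amino_acid in sorted(codons_for, key=lambda aa: len(codons_for[aa])):
    print(amino_acid, len(codons_for[amino_acid]), codons_for[amino_acid])
```

Serine appears with six codons (four beginning with UC, plus AGU and AGC), while methionine and tryptophan have one each.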

Some other properties of the genetic code are also noted (continuity, noise immunity, universality, etc.).

Resilience as the adaptation of plants to living conditions. Basic reactions of plants to the action of unfavorable factors.

Plant resistance is the ability to withstand the effects of extreme environmental factors (soil and air drought).


This property was developed during the process of evolution and was genetically fixed. In areas with unfavorable conditions, stable ornamental forms and local varieties of drought-resistant cultivated plants have formed. A particular level of resistance inherent in plants is revealed only under the influence of extreme environmental factors.

As a result of the onset of such a factor, the irritation phase begins - a sharp deviation from the norm of a number of physiological parameters and their rapid return to normal. Then there is a change in metabolic rate and damage to intracellular structures. At the same time, all synthetic ones are suppressed, all hydrolytic ones are activated, and the overall energy supply of the body decreases. If the effect of the factor does not exceed the threshold value, the adaptation phase begins.

An adapted plant reacts less to repeated or increasing exposure to an extreme factor. At the organismal level, interaction between organs is added to the adaptation mechanisms. The weakening of the movement of water flows, mineral and organic compounds through the plant exacerbates competition between organs, and their growth stops.

Biological stability in plants is defined by the maximum value of the extreme factor at which plants still form viable seeds. Agronomic stability is determined by the degree of yield reduction. Plants are characterized by their resistance to a specific type of extreme factor: winter-hardy, gas-resistant, salt-resistant, drought-resistant.

The type of roundworms, unlike flatworms, have a primary body cavity - a schizocoel, formed due to the destruction of parenchyma that fills the gaps between the body wall and internal organs - its function is transport.

It maintains homeostasis. The body shape is round in diameter. The integument is cuticulated. The muscles are represented by a layer of longitudinal muscles. The intestine is through and consists of 3 sections: anterior, middle and posterior. The mouth opening is located on the ventral surface of the anterior end of the body. The pharynx has a characteristic triangular lumen. The excretory system is represented by protonephridia or special skin glands - hypodermal glands. Most species are dioecious and reproduce only sexually.

Development is direct, less often with metamorphosis. They have a constant cellular composition of the body and lack the ability to regenerate. The anterior intestine consists of the oral cavity, pharynx, and esophagus.

They do not have a middle or posterior section. The excretory system consists of 1-2 giant cells of the hypodermis. Longitudinal excretory canals lie in the lateral ridges of the hypodermis.

Properties of the genetic code. Evidence of triplet code. Decoding codons. Stop codons. The concept of genetic suppression.

The idea that a gene encodes information in the primary structure of a protein was made concrete by F. Crick in his sequence hypothesis, according to which the sequence of gene elements determines the sequence of amino acid residues in the polypeptide chain. The validity of the sequence hypothesis is proven by the colinearity of the structures of the gene and the polypeptide it encodes. The most significant development in 1953 was the consideration that the code is most likely triplet.

DNA base pairs (A-T, T-A, G-C, C-G) can encode only 4 amino acids if each pair corresponds to one amino acid. As you know, proteins contain 20 basic amino acids. If we assume that each amino acid corresponds to 2 base pairs, then 16 amino acids (4*4) can be encoded, which is again not enough.

If the code is triplet, then 64 codons (4*4*4) can be made from 4 base pairs, which is more than enough to encode 20 amino acids. Crick and his colleagues assumed that the code was triplet; that there were no “commas”, i.e. separating marks, between the codons; and that the code within a gene is read from a fixed point in one direction. In the summer of 1961, Nirenberg and Matthaei reported the decoding of the first codon and proposed a method for establishing the composition of codons in a cell-free protein synthesis system.

Thus, the codon for phenylalanine was deciphered as UUU in mRNA. Further, as a result of the application of methods developed by Khorana, Nirenberg and Leder, a code dictionary in its modern form was compiled in 1965. The occurrence of mutations in T4 phages caused by the loss or addition of bases was evidence of the triplet nature of the code (property 1). These deletions and additions, which lead to frame shifts when “reading” the code, were eliminated only by restoring the correct reading frame; this prevented the appearance of mutants. These experiments also showed that triplets do not overlap, that is, each base can belong to only one triplet (property 2).
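The logic of the frame-shift experiments can be sketched with a toy sequence. In the minimal example below (a simplified illustration, not the actual phage data), one inserted base changes every downstream triplet, while three inserted bases restore the original reading frame further along. The repeated CAU message and the helper names are assumptions made for the example.

```python
def codons(seq: str) -> list[str]:
    """Read a sequence as consecutive triplets from its first base."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def insert(seq: str, pos: int, base: str) -> str:
    """Return seq with `base` inserted before index `pos`."""
    return seq[:pos] + base + seq[pos:]


original = "CAUCAUCAUCAUCAU"                    # toy message: CAU repeated
one = insert(original, 3, "G")                  # a single insertion
three = insert(insert(one, 7, "G"), 11, "G")    # two further insertions

print(codons(original))
print(codons(one))     # every triplet after the insertion is altered
print(codons(three))   # the frame is restored after the third insertion
```

Only a shift by a multiple of three bases brings the downstream triplets back into register, which is what a triplet code predicts.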

Most amino acids have several codons. A code in which the number of amino acids is less than the number of codons is called degenerate (property 3), i.e. a given amino acid can be encoded by more than one triplet. In addition, three codons do not code for any amino acid at all (“nonsense codons”) and act as a “stop signal.” A stop codon is the end point of a functional unit of DNA, the cistron. Stop codons are the same in all species and are represented as UAA, UAG, UGA. A notable feature of the code is that it is universal (property 4).

In all living organisms, the same triplets code for the same amino acids.

The existence of three types of mutant codon terminators and their suppression have been demonstrated in E. coli and yeast. The discovery of suppressor genes that “interpret” nonsense alleles of different genes indicates that the translation of the genetic code can change.

Mutations affecting the anticodon of tRNAs change their codon specificity and create the possibility of suppression of mutations at the translational level. Suppression at the translational level can occur due to mutations in the genes encoding certain ribosomal proteins. As a result of these mutations, the ribosome “makes mistakes,” for example, in reading nonsense codons and “interprets” them using some non-mutant tRNAs. Along with genotypic suppression acting at the translation level, phenotypic suppression of nonsense alleles is also possible: when the temperature decreases, when cells are exposed to aminoglycoside antibiotics that bind to ribosomes, for example streptomycin.

22. Reproduction of higher plants: vegetative and asexual. Sporulation, spore structure, equal and heterosporous. Reproduction as a property of living matter, i.e. the ability of an individual to give rise to its own kind, existed in the early stages of evolution.

Forms of reproduction can be divided into 2 types: asexual and sexual. Asexual reproduction itself is carried out without the participation of germ cells, with the help of specialized cells - spores. They are formed in the organs of asexual reproduction - sporangia as a result of mitotic division.

During its germination, the spore reproduces a new individual, similar to the mother, with the exception of spores of seed plants, in which the spore has lost the function of reproduction and dispersal. Spores can also be formed by reduction division, with single-celled spores spilling out.

Reproduction of plants by means of vegetative organs (part of a shoot, leaf or root; for example, bulbs or cuttings), or by division of unicellular algae in half, is called vegetative.

Sexual reproduction is carried out by special sex cells - gametes.

Gametes are formed as a result of meiosis, there are female and male. As a result of their fusion, a zygote appears, from which a new organism subsequently develops.

Plants differ in the types of gametes. In some unicellular organisms the organism itself functions as a gamete at certain times. Organisms of different sexes (acting as gametes) merge; this sexual process is called hologamy. If male and female gametes are morphologically similar and mobile, these are isogametes, and the sexual process is isogamous. If female gametes are somewhat larger and less mobile than male ones, then these are heterogametes, and the process is heterogamy. In oogamy, female gametes are very large and immobile, while male gametes are small and mobile.


Genetic code - correspondence between DNA triplets and protein amino acids

The need to encode the structure of proteins in the linear sequence of nucleotides of mRNA and DNA is dictated by the fact that during translation:

  • there is no correspondence between the number of monomers in the mRNA matrix and the product - the synthesized protein;
  • there is no structural similarity between RNA and protein monomers.

This eliminates the complementary interaction between the matrix and the product - the principle by which the construction of new DNA and RNA molecules is carried out during replication and transcription.

From this it becomes clear that there must be a “dictionary” that allows one to find out which sequence of mRNA nucleotides ensures the inclusion of amino acids in a protein in a given sequence. This “dictionary” is called the genetic, biological, nucleotide, or amino acid code. It allows you to encrypt the amino acids that make up proteins using a specific sequence of nucleotides in DNA and mRNA. It is characterized by certain properties.

Triplet nature. One of the main questions in determining the properties of the code was the number of nucleotides that determine the inclusion of one amino acid in the protein.

It was found that the coding elements in the encryption of an amino acid sequence are indeed triplets of nucleotides, or triplets, which were named "codons".

The meaning of codons.

It was established that, out of 64 codons, 61 triplets encode the inclusion of amino acids in the synthesized polypeptide chain, while the remaining 3 (UAA, UAG, UGA) do not encode the inclusion of amino acids in the protein and were originally called meaningless, or nonsense, codons. However, it was later shown that these triplets signal the completion of translation, and therefore they came to be called termination or stop codons.

The codons of mRNA and triplets of nucleotides in the coding strand of DNA with the direction from the 5′ to the 3′ end have the same sequence of nitrogenous bases, except that in DNA instead of uracil (U), characteristic of mRNA, there is thymine (T).
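Because the coding strand and the mRNA share the same base sequence, converting one into the other is a single substitution. A minimal sketch, using a made-up coding-strand fragment (an assumption for illustration) and a hypothetical helper name:

```python
def coding_strand_to_mrna(coding_5_to_3: str) -> str:
    """mRNA with the same 5'->3' sequence as the DNA coding strand, with U for T."""
    return coding_5_to_3.upper().replace("T", "U")


coding_strand = "ATGGATTCTTAA"  # hypothetical coding-strand fragment
mrna = coding_strand_to_mrna(coding_strand)
print(mrna)
```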

Specificity.

Each codon corresponds to only one specific amino acid. In this sense, the genetic code is strictly unambiguous.

Table 4-3. Main components of the protein synthesizing system

Required components | Functions
1. Amino acids | Substrates for protein synthesis
2. tRNA | tRNAs act as adapters. Their acceptor end interacts with amino acids, and their anticodon interacts with the codon of the mRNA
3. Aminoacyl-tRNA synthetases | Each aa-tRNA synthetase catalyzes the specific binding of one of 20 amino acids to the corresponding tRNA
4. mRNA | The template contains a linear sequence of codons that determine the primary structure of proteins
5. Ribosomes | Ribonucleoprotein subcellular structures that are the site of protein synthesis
6. Energy sources |
7. Protein factors of initiation, elongation and termination | Specific extraribosomal proteins required for the translation process (12 initiation factors: eIF; 2 elongation factors: eEF1, eEF2; and termination factors: eRF)
8. Magnesium ions | Cofactor that stabilizes ribosome structure

Notes: eIF (eukaryotic initiation factors) - initiation factors; eEF (eukaryotic elongation factors) - elongation factors; eRF (eukaryotic releasing factors) - termination factors.

Degeneracy. There are 61 triplets in mRNA and DNA, each of which encodes the inclusion of one of 20 amino acids in the protein.

It follows from this that in information molecules the inclusion of the same amino acid in a protein is determined by several codons. This property of the biological code is called degeneracy.

In humans, only 2 amino acids are encoded by a single codon, Met and Trp, while Leu, Ser and Arg are encoded by six codons each, and Ala, Val, Gly, Pro and Thr by four codons each (Table

Redundancy of coding sequences is the most valuable property of a code, since it increases the stability of the information flow to the adverse effects of the external and internal environment. When determining the nature of the amino acid to be included in a protein, the third nucleotide in a codon is not as important as the first two. As can be seen from table. 4-4, for many amino acids, replacing a nucleotide in the third position of a codon does not affect its meaning.

Linearity of information recording.

During translation, mRNA codons are “read” from a fixed starting point sequentially and do not overlap. The information record does not contain signals indicating the end of one codon and the beginning of the next. The AUG codon is the initiation codon and is read both at the beginning and in other parts of the mRNA as Met. The triplets following it are read sequentially without any gaps until the stop codon, at which the synthesis of the polypeptide chain is completed.
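Reading from a fixed starting point can be sketched in a few lines. The function below is a simplified illustration, not a model of the ribosome: it assumes the standard code, one-letter amino acid symbols, and that translation starts at the first AUG found in the string. The name `translate` and both example sequences are made up for the demonstration.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna):
    """Read codons from the first AUG, three bases at a time, until a stop."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    # Step by 3 so that codons neither overlap nor leave gaps.
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":          # stop codon: synthesis ends
            break
        peptide.append(aa)
    return "".join(peptide)


normal = "GGAUGUUUAUGGCAUAAGG"       # AUG UUU AUG GCA UAA
shifted = "GGAUGCUUUAUGGCAUAAGG"     # one extra C right after the start codon

print(translate(normal))   # internal AUG is read as Met
print(translate(shifted))  # every codon after the insertion is different
```

Because there are no separators between codons, the single inserted nucleotide in the second sequence changes every downstream triplet, and the original stop codon is no longer read in frame.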

Universality.

Until recently, it was believed that the code was absolutely universal, i.e. the meaning of code words is the same for all studied organisms: viruses, bacteria, plants, amphibians, mammals, including humans.

However, one exception later became known: mitochondrial mRNA contains 4 triplets whose meaning differs from that in mRNA of nuclear origin. In mitochondrial mRNA, the triplet UGA encodes Trp, AUA encodes Met, and AGA and AGG are read as additional stop codons.
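The effect of these reassignments can be illustrated by translating one sequence with two tables. The sketch assumes the vertebrate mitochondrial changes UGA to Trp, AUA to Met, and AGA and AGG to stop, applied on top of the standard code; `MITO_VERTEBRATE`, `translate` and the example mRNA are hypothetical names and data chosen for the illustration.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed vertebrate mitochondrial reassignments (one-letter symbols, * = stop).
MITO_VERTEBRATE = dict(STANDARD)
MITO_VERTEBRATE.update({"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"})


def translate(mrna, table):
    """Translate an in-frame mRNA (starting at its first base) with a given table."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


mrna = "AUGAUAUGAAGAUAA"  # AUG AUA UGA AGA UAA
print(translate(mrna, STANDARD))         # stops at UGA
print(translate(mrna, MITO_VERTEBRATE))  # reads through UGA, stops at AGA
```

The same message gives a two-residue product with the standard table and a three-residue product with the mitochondrial one, which is why the table in use has to be stated whenever a sequence is translated.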

Colinearity of gene and product.

In prokaryotes, a linear correspondence between the codon sequence of a gene and the amino acid sequence in the protein product has been found, or, as they say, there is colinearity between the gene and the product.

Table 4-4. Genetic code

First base   Second base
             U          C          A          G
U            UUU Phe    UCU Ser    UAU Tyr    UGU Cys
             UUC Phe    UCC Ser    UAC Tyr    UGC Cys
             UUA Leu    UCA Ser    UAA *      UGA *
             UUG Leu    UCG Ser    UAG *      UGG Trp
C            CUU Leu    CCU Pro    CAU His    CGU Arg
             CUC Leu    CCC Pro    CAC His    CGC Arg
             CUA Leu    CCA Pro    CAA Gln    CGA Arg
             CUG Leu    CCG Pro    CAG Gln    CGG Arg
A            AUU Ile    ACU Thr    AAU Asn    AGU Ser
             AUC Ile    ACC Thr    AAC Asn    AGC Ser
             AUA Ile    ACA Thr    AAA Lys    AGA Arg
             AUG Met    ACG Thr    AAG Lys    AGG Arg
G            GUU Val    GCU Ala    GAU Asp    GGU Gly
             GUC Val    GCC Ala    GAC Asp    GGC Gly
             GUA Val    GCA Ala    GAA Glu    GGA Gly
             GUG Val    GCG Ala    GAG Glu    GGG Gly

Notes: U - uracil; C - cytosine; A - adenine; G - guanine; * - termination codon.

In eukaryotes, the base sequences in a gene that are colinear with the amino acid sequence of the protein are interrupted by introns.

Therefore, in eukaryotic cells, the amino acid sequence of a protein is colinear with the sequence of exons in a gene or mature mRNA after post-transcriptional removal of introns.
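The relationship between exons, mature mRNA and the protein can be sketched as follows. Everything here is a toy assumption: the pre-mRNA string, the exon coordinates, and the helper names `splice` and `translate` are invented for the illustration, and real splicing depends on splice-site signals that this sketch ignores.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def splice(pre_mrna, exons):
    """Join the exon segments given as (start, end) half-open index pairs."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def translate(mrna):
    """Translate from the first base until a stop codon (one-letter symbols)."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


# Hypothetical transcript: exon 1 (AUGUUU), intron (GUAAGCAG), exon 2 (GCAUAA).
pre_mrna = "AUGUUU" + "GUAAGCAG" + "GCAUAA"
exons = [(0, 6), (14, 20)]

mature = splice(pre_mrna, exons)
print(mature, translate(mature))
```

In this toy case the protein sequence follows the joined exons codon by codon, which is the sense in which the product is colinear with the mature mRNA rather than with the uninterrupted gene.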

Nitrogenous bases of DNA and RNA nucleotides
  1. Purines: adenine, guanine
  2. Pyrimidines: cytosine, thymine (uracil in RNA)

Codon: a triplet of nucleotides encoding a specific amino acid.

Table 1. Amino acids commonly found in proteins
Name                 Abbreviation
1. Alanine           Ala
2. Arginine          Arg
3. Asparagine        Asn
4. Aspartic acid     Asp
5. Cysteine          Cys
6. Glutamic acid     Glu
7. Glutamine         Gln
8. Glycine           Gly
9. Histidine         His
10. Isoleucine       Ile
11. Leucine          Leu
12. Lysine           Lys
13. Methionine       Met
14. Phenylalanine    Phe
15. Proline          Pro
16. Serine           Ser
17. Threonine        Thr
18. Tryptophan       Trp
19. Tyrosine         Tyr
20. Valine           Val

The genetic code, also called the amino acid code, is a system for recording information about the sequence of amino acids in a protein using the sequence of nucleotide residues in DNA, each of which contains one of 4 nitrogenous bases: adenine (A), guanine (G), cytosine (C) or thymine (T). However, since the double-stranded DNA helix does not take part directly in protein synthesis, which uses an RNA copy of one of its strands, the code is written in the language of RNA, which contains uracil (U) instead of thymine. For the same reason, it is customary to describe the code as a sequence of nucleotides rather than of nucleotide pairs.
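The switch from the DNA alphabet to the RNA alphabet can be shown with a minimal sketch. It assumes DNA strands are given as strings written 5' to 3'; the function names `transcribe_coding` and `transcribe_template` and the example sequences are hypothetical.

```python
# Base pairing used when RNA is built on a DNA template strand.
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_coding(coding_dna):
    """mRNA has the same sequence as the coding strand, with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def transcribe_template(template_dna):
    """Build the mRNA (5'->3') complementary to a template strand given 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(template_dna.upper()))


coding = "ATGTTT"      # coding strand, 5'->3'
template = "AAACAT"    # the complementary template strand, also written 5'->3'

print(transcribe_coding(coding))
print(transcribe_template(template))
```

Both calls yield the same mRNA, AUGUUU, which is why a codon such as UUU can be written directly in RNA letters without referring to the nucleotide pair in DNA.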

The genetic code is represented by certain code words, called codons.

The first code word was deciphered by Nirenberg and Matthaei in 1961. They prepared an extract from E. coli containing ribosomes and the other factors necessary for protein synthesis. The result was a cell-free system for protein synthesis that could assemble proteins from amino acids if the necessary mRNA was added to the medium. By adding synthetic RNA consisting only of uracil residues to the medium, they found that a protein consisting only of phenylalanine (polyphenylalanine) was formed. Thus it was established that the nucleotide triplet UUU (a codon) corresponds to phenylalanine. Over the next 5-6 years, all codons of the genetic code were determined.

The genetic code is a kind of dictionary that translates text written with four nucleotides into protein text written with 20 amino acids. The remaining amino acids found in protein are modifications of one of the 20 amino acids.

Properties of the genetic code

The genetic code has the following properties.

  1. Triplet nature: each amino acid corresponds to a triplet of nucleotides. It is easy to calculate that there are $4^3 = 64$ codons. Of these, 61 are sense codons and 3 are nonsense (termination, or stop) codons.
  2. Continuity (no separating marks between codons): the absence of intragenic punctuation marks.

    Within a gene, each nucleotide is part of a meaningful codon. In 1961 Francis Crick, Sydney Brenner and their colleagues experimentally demonstrated the triplet nature of the code and its continuity (compactness).

    The essence of the experiment: a "+" mutation is the insertion of one nucleotide; a "-" mutation is the loss of one nucleotide.

    A single mutation ("+" or "-") at the beginning of a gene, or a double mutation of the same sign ("++" or "--"), spoils the entire gene.

    A triple mutation ("+++" or "---") at the beginning of a gene spoils only part of the gene, because the third change restores the original reading frame.

    A quadruple "+" or "-" mutation again spoils the entire gene.

    The experiment was carried out on two adjacent phage genes and showed that

    1. the code is triplet and there is no punctuation inside the gene
    2. there are punctuation marks between genes
  3. Presence of intergenic punctuation marks: among the triplets there are initiation codons (which begin protein biosynthesis) and termination codons (which mark the end of protein biosynthesis).

    Conventionally, the AUG codon, the first after the leader sequence, also belongs to punctuation marks. It functions as a capital letter. In this position it encodes formylmethionine (in prokaryotes).

    At the end of each gene encoding a polypeptide there is at least one of the 3 stop codons, or stop signals: UAA, UAG, UGA. They terminate translation.

  4. Colinearity: correspondence between the linear sequence of codons in the mRNA and the sequence of amino acids in the protein.
  5. Specificity: each amino acid corresponds only to certain codons, which cannot be used for another amino acid.
  6. Unidirectionality: codons are read in one direction, from the first nucleotide to the subsequent ones (5' to 3' along the mRNA).
  7. Degeneracy, or redundancy: one amino acid can be encoded by several triplets (there are 20 amino acids and 64 possible triplets, 61 of them sense codons, so on average each amino acid corresponds to about 3 codons); the exceptions are methionine (Met) and tryptophan (Trp).

    The reason for the degeneracy of the code is that the main semantic load is carried by the first two nucleotides of the triplet, and the third is less important. Hence the code degeneracy rule: if two codons have the same first two nucleotides and their third nucleotides belong to the same class (purine or pyrimidine), then they code for the same amino acid.

    However, there are two exceptions to this ideal rule. These are the AUA codon, which by the rule should correspond to methionine rather than isoleucine, and the UGA codon, which is a stop codon although by the rule it should correspond to tryptophan. The degeneracy of the code evidently has adaptive significance.

  8. Universality: all of the above properties of the genetic code are characteristic of all living organisms.

    Codon   Universal code   Mitochondrial codes
                             Vertebrates   Invertebrates   Yeast   Plants
    UGA     STOP             Trp           Trp             Trp     STOP
    AUA     Ile              Met           Met             Met     Ile
    CUA     Leu              Leu           Leu             Thr     Leu
    AGA     Arg              STOP          Ser             Arg     Arg
    AGG     Arg              STOP          Ser             Arg     Arg

    The principle of code universality was shaken by Barrell's discovery in 1979 of the code of human mitochondria, an ideal code in which the degeneracy rule holds. In the mitochondrial code, the UGA codon corresponds to tryptophan and AUA to methionine, as required by the code degeneracy rule.

    Perhaps at the beginning of evolution, all simple organisms had the same code as mitochondria, and then it underwent slight deviations.

  9. Non-overlapping: each triplet of the genetic text is independent of the others, and each nucleotide belongs to only one triplet. The figure shows the difference between an overlapping and a non-overlapping code.

    In 1976 the DNA of phage φX174 was sequenced. It has single-stranded circular DNA consisting of 5375 nucleotides. The phage was known to encode 9 proteins. For 6 of them, genes located one after another were identified.

    It turned out that there is overlap. Gene E lies entirely within gene D; its start codon appears as a result of a frame shift of one nucleotide. Gene J begins where gene D ends: the start codon of gene J overlaps the stop codon of gene D as a result of a two-nucleotide shift. Such an arrangement is called a reading frameshift, that is, a shift by a number of nucleotides that is not a multiple of three. To date, overlap has been shown only for a few phages.

  10. Noise immunity: characterized by the ratio of the number of conservative substitutions to the number of radical substitutions.

    Nucleotide substitution mutations that do not lead to a change in the class of the encoded amino acid are called conservative. Nucleotide substitution mutations that lead to a change in the class of the encoded amino acid are called radical.

    Since the same amino acid can be encoded by different triplets, some substitutions in triplets do not lead to a change in the encoded amino acid (for example, UUU -> UUC leaves phenylalanine). Some substitutions change an amino acid to another from the same class (non-polar, polar, basic, acidic), other substitutions also change the class of the amino acid.

    In each triplet, 9 single substitutions can be made: there are three ways to choose the position to change (1st, 2nd or 3rd), and the selected letter (nucleotide) can be changed to 4 - 1 = 3 other letters. The total number of possible nucleotide substitutions in the sense codons is 61 × 9 = 549.

    By direct calculation using the genetic code table, you can verify that, of these, 23 substitutions produce termination codons, 134 do not change the encoded amino acid, 230 change the amino acid but not its class, and 162 change the amino acid class, i.e. are radical. Of the 183 substitutions of the 3rd nucleotide, 7 produce terminators and 176 are conservative. Of the 183 substitutions of the 1st nucleotide, 9 produce terminators, 114 are conservative and 60 are radical. Of the 183 substitutions of the 2nd nucleotide, 7 produce terminators, 74 are conservative and 102 are radical.
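Part of this direct calculation can be reproduced with a short script. The sketch below assumes the standard code with one-letter amino acid symbols and "*" for stop; it counts only the categories that do not depend on how amino acids are grouped into classes (total substitutions, substitutions to a stop codon, and substitutions that keep the same amino acid). The function name `count_substitutions` is illustrative.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_substitutions(table):
    """Classify every single-nucleotide substitution in the sense codons."""
    totals = {"all": 0, "to_stop": 0, "synonymous": 0}
    to_stop_by_pos = [0, 0, 0]          # 1st, 2nd, 3rd codon position
    for codon, aa in table.items():
        if aa == "*":                   # only sense codons are mutated
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                totals["all"] += 1
                if table[mutant] == "*":
                    totals["to_stop"] += 1
                    to_stop_by_pos[pos] += 1
                elif table[mutant] == aa:
                    totals["synonymous"] += 1
    return totals, to_stop_by_pos


totals, to_stop_by_pos = count_substitutions(CODON_TABLE)
print(totals, to_stop_by_pos)
```

The split of the remaining substitutions into conservative and radical ones is not attempted here, because it depends on which amino acids are assigned to the non-polar, polar, basic and acidic classes.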

