13 Transcription: Synthesis of RNA

Synthesis of RNA from a DNA template is called transcription. Genes are transcribed by enzymes called RNA polymerases that generate a single-stranded RNA identical in sequence (with the exception of U in place of T) to one of the strands of the double-stranded DNA. The DNA strand that directs the sequence of nucleotides in the RNA by complementary base pairing is the template strand. The RNA strand that is initially generated is the primary transcript. The DNA template is copied in the 3′-to-5′ direction, and the RNA transcript is synthesized in the 5′-to-3′ direction. RNA polymerases differ from DNA polymerases in that they can initiate the synthesis of new strands in the absence of a primer.

In addition to catalyzing the polymerization of ribonucleotides, RNA polymerases must be able to recognize the appropriate gene to transcribe, the appropriate strand of the double-stranded DNA to copy, and the start point of transcription (Fig. 13.1). Specific sequences on DNA, called promoters, determine where the RNA polymerase binds and how frequently it initiates transcription. Other regulatory sequences, such as promoter-proximal elements and enhancers, also affect the frequency of transcription.

FIGURE 13.1 Regions of a gene. A gene is a segment of DNA that functions as a unit to generate an RNA product or, through the processes of transcription and translation, a polypeptide chain. The transcribed region of a gene contains the template for synthesis of an RNA, which begins at the start point. A gene also includes regions of DNA that regulate production of the encoded product, such as a promoter region. In a structural gene, the transcribed region contains the coding sequences that dictate the amino acid sequence of a polypeptide chain.

In bacteria, a single RNA polymerase produces the primary transcript precursors for all three major classes of RNA: messenger RNA (mRNA), ribosomal RNA (rRNA), and transfer RNA (tRNA). Because bacteria do not contain nuclei, ribosomes bind to mRNA as it is being transcribed, and protein synthesis occurs simultaneously with transcription.

Eukaryotic genes are transcribed in the nucleus by three different RNA polymerases, each principally responsible for one of the major classes of RNA. The primary transcripts are modified and trimmed to produce the mature RNAs. The precursors of mRNA (called pre-mRNA) have a guanosine “cap” added at the 5′-end and a poly(A) “tail” at the 3′-end. Exons, which contain the coding sequences for the proteins, are separated in pre-mRNA by introns, regions that have no coding function. During splicing reactions, introns are removed and the exons connected to form the mature mRNA. In eukaryotes, tRNA and rRNA precursors are also modified and trimmed, although not as extensively as pre-mRNA.

THE WAITING ROOM

Lisa N. is a 4-year-old girl of Mediterranean ancestry whose height and body weight are below the 20th percentile for girls of her age. She tires easily and complains of loss of appetite and shortness of breath on exertion. A dull pain has been present in her right upper quadrant for the last 3 months and she appears pale. Initial laboratory studies indicate a severe anemia (decreased red blood cell count) with a hemoglobin of 7.0 g/dL (reference range, 12 to 16 g/dL). A battery of additional hematologic tests reveals that Lisa N. has β⁺-thalassemia, intermediate type.

The thalassemias are a heterogeneous group of hereditary anemias that constitute the most common gene disorder in the world, with a carrier rate of almost 5%. The disease was first discovered in countries around the Mediterranean Sea and was named for the Greek word “thalassa,” meaning “sea.” However, it is also present in areas extending into India and China that are near the Equator.

The thalassemia syndromes are caused by mutations that decrease or abolish the synthesis of the α- or β-chains in the adult hemoglobin A tetramer. Individual syndromes are named according to the chain whose synthesis is affected and the severity of the deficiency. Thus, in β⁰-thalassemia, the superscript 0 denotes none of the β-chain is present; in β⁺-thalassemia, the + denotes a partial reduction in the synthesis of the β-chain. More than 170 different mutations have been identified that cause β-thalassemia; most of these interfere with the transcription of β-globin mRNA or its processing or translation.

Isabel S., a patient with HIV (see Chapters 11 and 12), has developed a cough with gray, slightly blood-tinged sputum. A chest X-ray indicates a cavitary infiltrate in the right upper lung field. A stain of sputum shows the presence of acid-fast bacilli, suggesting a diagnosis of pulmonary tuberculosis caused by Mycobacterium tuberculosis.

Catherine T. picked mushrooms in a wooded area near her home. A few hours after eating one small mushroom, she experienced mild nausea and diarrhea. She brought a mushroom with her to the hospital emergency room. A poison expert identified it as Amanita phalloides (the “death cap”). These mushrooms contain the toxin α-amanitin.

Sarah L., a 28-year-old computer programmer, notes increasing fatigue, pleuritic chest pain, and a nonproductive cough. In addition, she complains of joint pains, especially in her hands. A rash on both cheeks and the bridge of her nose (“butterfly rash”) has been present for the last 6 months. Initial laboratory studies reveal a subnormal white blood cell count and a mild reduction in hemoglobin. Tests result in a diagnosis of systemic lupus erythematosus (SLE) (frequently called lupus).

The measurement of hemoglobin levels in blood is important for the appropriate diagnosis of many diseases, such as anemia. Laboratories measure hemoglobin content by first exposing the sample (usually lysed blood cells, to release the hemoglobin from the red blood cells) to an oxidizing agent, which converts the ferrous iron in hemoglobin to its ferric state. The level of ferric iron is then determined with a second reagent (either a cyanide or an azide derivative), which reacts with the ferric iron and generates a colored product, whose concentration can be determined spectrophotometrically.

I. Action of RNA Polymerase

Transcription, the synthesis of RNA from a DNA template, is carried out by RNA polymerases (Fig. 13.2). Like DNA polymerases, RNA polymerases catalyze the formation of ester bonds between nucleotides that base-pair with the complementary nucleotides on the DNA template. Unlike DNA polymerases, RNA polymerases can initiate the synthesis of new chains in the absence of primers. They also lack the 3′-to-5′ exonuclease activity found in DNA polymerases, although they do perform rudimentary error-checking through a different mechanism. A strand of DNA serves as the template for RNA synthesis and is copied in the 3′-to-5′ direction. Synthesis of the new RNA molecule occurs in the 5′-to-3′ direction. The ribonucleoside triphosphates adenosine triphosphate (ATP), guanosine triphosphate (GTP), cytidine triphosphate (CTP), and uridine triphosphate (UTP) serve as the precursors. Each nucleotide base sequentially pairs with the complementary deoxyribonucleotide base on the DNA template (A, G, C, and U pair with T, C, G, and A, respectively). The polymerase forms an ester bond between the α-phosphate on the ribose 5′-hydroxyl of the nucleotide precursor and the ribose 3′-hydroxyl at the end of the growing RNA chain. The cleavage of a high-energy phosphate bond in the nucleoside triphosphate and release of pyrophosphate (from the β- and γ-phosphates) provide the energy for this polymerization reaction. Subsequent cleavage of the pyrophosphate by a pyrophosphatase also helps to drive the polymerization reaction forward by removing a product. The overall error rate of RNA polymerase is 1 error in 100,000 bases.
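To give a feel for what an error rate of 1 in 100,000 implies, the following minimal sketch estimates how likely a transcript of a given length is to be free of misincorporated bases. It assumes, purely for illustration, that errors occur independently and with equal probability at every position; the function names are hypothetical and not part of any library.

```python
ERROR_RATE = 1 / 100_000  # per-base error rate quoted in the text


def expected_errors(length_nt: int, error_rate: float = ERROR_RATE) -> float:
    """Average number of misincorporated bases in a transcript of length_nt."""
    if length_nt < 0:
        raise ValueError("length_nt must be non-negative")
    return length_nt * error_rate


def error_free_probability(length_nt: int, error_rate: float = ERROR_RATE) -> float:
    """Probability that a transcript contains no errors at all.

    Assumption (illustrative only): each base is an independent trial.
    """
    if length_nt < 0:
        raise ValueError("length_nt must be non-negative")
    return (1.0 - error_rate) ** length_nt
```

Under these assumptions, a short transcript is almost always error-free, whereas a transcript of 100,000 bases carries one error on average.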

FIGURE 13.2 RNA synthesis. The α-phosphate from the added nucleotide connects the ribosyl groups.

Patients with AIDS frequently develop tuberculosis. After Isabel S.’s sputum stain suggested that she had tuberculosis, a multidrug antituberculous regimen, which includes an antibiotic of the rifamycin family (rifampin), was begun. A culture of her sputum was taken to confirm the diagnosis.

Rifampin inhibits bacterial RNA polymerase, selectively killing the bacteria that cause the infection. The nuclear RNA polymerase from eukaryotic cells is not affected. Although rifampin can inhibit the synthesis of mitochondrial RNA, the concentration required is considerably higher than that used for treatment of tuberculosis.

RNA polymerases must be able to recognize the start point for transcription of each gene and the appropriate strand of DNA to use as a template. They also must be sensitive to signals that reflect the need for the gene product and control the frequency of transcription. A region of regulatory sequences called the promoter (often composed of smaller sequences called boxes or elements), usually contiguous with the transcribed region, controls the binding of RNA polymerase to DNA and identifies the start point (see Fig. 13.1). The frequency of transcription is controlled by regulatory sequences within the promoter and near it (promoter-proximal elements) and by other regulatory sequences, such as enhancers (also called distal-promoter elements), that may be located at considerable distances, sometimes thousands of nucleotides, from the start point. Both the promoter-proximal elements and the enhancers interact with proteins that stabilize RNA polymerase binding to the promoter.

II. Types of RNA Polymerases

Bacterial cells have a single RNA polymerase that transcribes DNA to generate all of the different types of RNA (mRNA, rRNA, and tRNA). The RNA polymerase of Escherichia coli contains five subunits (2α, β, β′, and ω), which form the core enzyme. Another protein called a σ (sigma) factor binds the core enzyme and directs binding of RNA polymerase to specific promoter regions of the DNA template. The σ factor dissociates shortly after transcription begins. E. coli has a number of different σ factors that recognize the promoter regions of different groups of genes. The major σ factor is σ⁷⁰, a designation related to its molecular weight of 70,000 Da.

In contrast to prokaryotes, eukaryotic cells have three nuclear RNA polymerases (Table 13.1). Polymerase I produces most of the rRNAs, polymerase II produces mRNA and microRNAs (which regulate gene expression and are discussed in more detail in Chapter 15), and polymerase III produces small RNAs, such as tRNA and 5S rRNA. All of these RNA polymerases have the same mechanism of action. However, they recognize different types of promoters. Mitochondria have their own RNA polymerase, which transcribes genes located on the mitochondrial genome.

TABLE 13.1 Products of Nuclear Eukaryotic RNA Polymerases

POLYMERASE	PRODUCT
RNA polymerase I	rRNA
RNA polymerase II	mRNA + microRNA (miRNA)
RNA polymerase III	tRNA + other small RNAs

A. Sequences of Genes

Double-stranded DNA consists of a coding strand and a template strand (Fig. 13.3). The DNA template strand is the strand that is actually used by RNA polymerase during the process of transcription. It is complementary and antiparallel both to the coding (nontemplate) strand of the DNA and to the RNA transcript produced from the template. Thus, the coding strand of the DNA is identical in base sequence and direction to the RNA transcript, except, of course, that wherever this DNA strand contains a T, the RNA transcript contains a U. By convention, the nucleotide sequence of a gene is represented by the letters of the nitrogenous bases of the coding strand of the DNA duplex. It is written from left to right in the 5′-to-3′ direction.

FIGURE 13.3 Relationship between the coding strand of DNA (also known as the sense strand, or the nontemplate strand), the DNA template strand (also known as the antisense strand), the mRNA transcript, and the protein produced from the gene. The bases in mRNA are used in sets of three (called codons) to specify the order of the amino acids inserted into the growing polypeptide chain during the process of translation (see Chapter 14).

During translation, mRNA is read 5′ to 3′ in sets of three bases, called codons, that determine the amino acid sequence of the protein (see Fig. 13.3). Thus, the base sequence of the coding strand of the DNA can be used to determine the amino acid sequence of the protein. For this reason, when gene sequences are given, they refer to the coding strand.
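The relationship among the template strand, the coding strand, and the transcript can be sketched in a few lines of code. This is only an illustration under stated assumptions: the template is supplied as a string written 3′ to 5′ (the direction in which it is copied), the function names are hypothetical, and the nine-base example sequence is invented.

```python
# Base-pairing rules: template DNA base -> RNA base added to the transcript.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base-pairing rules between the two DNA strands.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a template written 3'->5'."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand(template_3_to_5: str) -> str:
    """Return the coding (nontemplate) DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in template_3_to_5.upper())


def codons(rna_5_to_3: str) -> list[str]:
    """Split an RNA into consecutive sets of three bases, ignoring leftovers."""
    usable = len(rna_5_to_3) - len(rna_5_to_3) % 3
    return [rna_5_to_3[i:i + 3] for i in range(0, usable, 3)]


# Invented example template, written 3'->5'.
example_template = "TACGGCATT"
example_rna = transcribe(example_template)        # "AUGCCGUAA"
example_coding = coding_strand(example_template)  # "ATGCCGTAA"
```

In this example the transcript matches the coding strand base for base, with U in place of T, which is why gene sequences are reported as the coding strand.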

A gene consists of the transcribed region and the regions that regulate transcription of the gene (e.g., promoter and enhancer regions) (Fig. 13.4). The base in the coding strand of the gene serving as the start point for transcription is numbered +1. This nucleotide corresponds to the first nucleotide incorporated into the RNA at the 5′-end of the transcript. Subsequent nucleotides within the transcribed region of the gene are numbered +2, +3, and so on, toward the 3′-end of the gene. Untranscribed sequences to the left of the start point, known as the 5′-flanking region of the gene, are numbered −1, −2, −3, and so on, starting with the nucleotide (−1) immediately to the left of the start point (+1) and moving from right to left. By analogy to a river, the sequences to the left of the start point are said to be upstream from the start point and those to the right are said to be downstream.
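The numbering convention, which has no position 0, can be made concrete with a small sketch. It assumes the coding strand is stored as a string written 5′ to 3′ and indexed from 0, with `start_index` marking the start point; both function names are hypothetical.

```python
def gene_position(index: int, start_index: int) -> int:
    """Convert a 0-based index on the coding strand to gene numbering.

    The start point is +1 and the base immediately to its left is -1,
    so the numbering skips 0.
    """
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset


def describe(position: int) -> str:
    """Name the region in which a numbered position lies."""
    if position == 0:
        raise ValueError("there is no position 0 in gene numbering")
    if position == 1:
        return "start point"
    return "downstream" if position > 1 else "upstream"
```

For a start point at index 10, index 9 maps to −1 (upstream) and index 11 maps to +2 (downstream).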

FIGURE 13.4 A schematic view of a eukaryotic gene, and steps required to produce a protein product. The gene consists of promoter and transcribed regions. The transcribed region contains introns, which do not contain coding sequences for proteins, and exons, which do carry coding sequences for proteins. The first RNA form produced is heterogeneous nuclear RNA (hnRNA), which contains both intronic and exonic sequences. The hnRNA is modified such that a cap is added at the 5′-end (cap site) and a poly(A) tail is added to the 3′-end. The introns are removed (a process called splicing) to produce the mature mRNA, which leaves the nucleus to direct protein synthesis in the cytoplasm. Py is pyrimidine (C or T). Although the TATA box is still included in this figure for historical reasons, only 12.5% of eukaryotic promoters contain this sequence.

B. Recognition of Genes by RNA Polymerase

For genes to be expressed, RNA polymerase must recognize the appropriate point at which to start transcription and the strand of the DNA to transcribe (the template strand). RNA polymerase also must recognize which genes to transcribe because transcribed genes are only a small fraction of the total DNA. The genes that are transcribed differ from one type of cell to another and change with alterations in physiologic conditions. These signals in DNA that RNA polymerase recognizes are called promoters. Promoters are sequences in DNA (often composed of smaller sequences called boxes or elements) that determine the start point and the frequency of transcription. Because they are located on the same molecule of DNA and near the gene they regulate, they are said to be cis-acting (i.e., “cis” refers to acting on the same side). Proteins that bind to these DNA sequences and facilitate or prevent the binding of RNA polymerase are said to be trans-acting.

C. Promoter Regions of Genes for mRNA

The binding of RNA polymerase and the subsequent initiation of gene transcription involves a number of consensus sequences in the promoter regions of the gene (Fig. 13.5). A consensus sequence is the sequence that is most commonly found in a given region when many genes are examined. In prokaryotes, an adenine- and thymine-rich consensus sequence in the promoter determines the start point of transcription by binding proteins that facilitate the binding of RNA polymerase. In the prokaryote E. coli, this consensus sequence is TATAAT, which is known as the TATA or Pribnow box. It is centered about −10 and is recognized by the sigma factor σ⁷⁰. A similar sequence in the −25 region of about 12.5% of eukaryotic genes has a consensus sequence of TATA(A/T)A. (The [A/T] in the fifth position indicates that either A or T occurs with equal frequency.) This eukaryotic sequence is also known as a TATA box, but it is sometimes named the Hogness or Hogness–Goldberg box after its discoverers. Other consensus sequences involved in binding of RNA polymerase are found farther upstream in the promoter region (see Fig. 13.5) or downstream after the transcriptional start signal. Bacterial promoters contain a sequence TTGACA in the −35 region. Eukaryotes frequently have disparate sequences, such as the TFIIB-recognition element (a GC-rich sequence, abbreviated as BRE), the initiator element, the downstream promoter element (DPE), and the motif ten element (MTE). The DPE and MTE are found downstream from the transcription start site. Eukaryotic genes also contain promoter-proximal elements (in the region of −100 to −200), which are sites that bind other gene regulatory proteins. Genes vary in the number of such sequences present. 
An analysis of close to 10,000 promoter sequences indicated that the initiator element was the most common element in these promoters (about 50%), whereas BRE and DPE were present in about 15% of the promoters, and TATA the least abundant, at 12.5% of the promoters.
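The idea of a consensus sequence, the base most commonly found at each position when many genes are compared, can be sketched as follows. The six aligned hexamers are invented for illustration and are not real promoter data; the function names are likewise hypothetical.

```python
from collections import Counter


def consensus(aligned: list[str]) -> str:
    """Return the most common base at each column of equal-length sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    # zip(*aligned) yields one tuple of bases per alignment column.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*aligned))


def mismatches(sequence: str, reference: str) -> int:
    """Count positions at which a sequence differs from a reference."""
    if len(sequence) != len(reference):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(sequence, reference))


# Invented -10 region sequences (hypothetical, for illustration only).
EXAMPLE_BOXES = ["TATAAT", "TATGAT", "TACAAT", "TATAAT", "GATAAT", "TATACT"]
```

With these invented sequences, the consensus is TATAAT even though only two of the six sequences match it exactly; individual promoters typically differ from the consensus at one or more positions.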

FIGURE 13.5 Prokaryotic and eukaryotic promoters. The promoter-proximal region contains binding sites for transcription factors that can accelerate the rate at which RNA polymerase binds to the promoter. BRE, TFIIB-recognition element; DPE, downstream promoter element; Inr, initiator element; MTE, motif ten element; Pu, purine; Py, pyrimidine.

Lisa N. has a β⁺-thalassemia classified clinically as β-thalassemia intermedia. She produces an intermediate amount of functional β-globin chains (her hemoglobin is 7.0 g/dL; normal is 12 to 16 g/dL). β-Thalassemia intermedia is usually the result of two different mutations (one that mildly affects the rate of synthesis of β-globin and one severely affecting its rate of synthesis) or, less frequently, homozygosity for a mild mutation in the rate of synthesis, or a complex combination of mutations. For example, mutations within the promoter region of the β-globin gene could result in a significantly decreased rate of β-globin synthesis in an individual who is homozygous for the allele, without completely abolishing synthesis of the protein.

Two of the point mutations that result in a β⁺-phenotype are within the TATA box (A → G or A → C in the −28 to −31 region) for the β-globin gene. These mutations reduce the accuracy of the start point of transcription so that only 20% to 25% of the normal amount of β-globin is synthesized. Other mutations that also reduce the frequency of β-globin transcription have been observed farther upstream in the promoter region (−87 C → G and −88 C → T).

In bacteria, a number of protein-producing genes may be linked together and controlled by a single promoter. This genetic unit is called an operon (Fig. 13.6). One mRNA is produced that contains the coding information for all of the proteins encoded by the operon. Proteins bind to the promoter and either inhibit or facilitate transcription of the operon. Repressors are proteins that bind to a region in the promoter known as the operator and inhibit transcription by preventing the binding of RNA polymerase to DNA. Activators are proteins that stimulate transcription by binding within the −35 region or upstream from it, facilitating the binding of RNA polymerase. (Operons are described in more detail in Chapter 15.)

FIGURE 13.6 Bacterial operon. A cistron encodes a single polypeptide chain. In bacteria, a single promoter may control transcription of an operon containing many cistrons. A single polycistronic mRNA is transcribed. Its translation produces a number of polypeptide chains.

In eukaryotes, proteins known as general transcription factors (or basal factors) bind to the TATA box (or other promoter elements, in the case of TATA-less promoters) and facilitate the binding of RNA polymerase II, the polymerase that transcribes mRNA (Fig. 13.7). This binding process involves at least six basal transcription factors (labeled as TFIIs, transcription factors for RNA polymerase II). The TATA-binding protein (TBP), which is a component of TFIID, initially binds to the TATA box. TFIID consists of both the TBP and a number of transcriptional coactivators. Components of TFIID will also recognize initiator and DPE boxes in the absence of a TATA box. TFIIA and TFIIB interact with TBP. RNA polymerase II binds to the complex of transcription factors and to DNA and is aligned at the start point for transcription. TFIIE, TFIIF, and TFIIH subsequently bind, cleaving adenosine triphosphate (ATP), and transcription of the gene is initiated.

FIGURE 13.7 Transcription apparatus. The TATA-binding protein (TBP), a component of TFIID, binds to the TATA box. Transcription factors TFII A and B bind to TBP. RNA polymerase binds, then TFII E, F, and H bind. This complex can transcribe at a basal level. Some coactivator proteins are present as a component of TFIID, and these can bind to other regulatory DNA-binding proteins (called specific transcription factors or transcriptional activators). TFIID also recognizes the initiator element (Inr) and the DPE in the case of TATA-less promoters (see Fig. 13.5).

With only these transcription (or basal) factors and RNA polymerase II attached (the basal transcription complex), the gene is transcribed at a low or basal rate. TFIIH plays a number of roles in both transcription and DNA repair. In both processes, it acts as an ATP-dependent DNA helicase, unwinding DNA for either transcription or repair to occur. Two of the forms of xeroderma pigmentosum (XPB and XPD; see Chapter 12) arise from mutations within two different helicase subunits of TFIIH. TFIIH also contains a kinase activity, and RNA polymerase II is phosphorylated by this factor during certain phases of transcription.

The rate of transcription can be further increased by binding of other regulatory DNA-binding proteins to additional gene regulatory sequences (such as the promoter-proximal or enhancer regions). These regulatory DNA-binding proteins are called gene-specific transcription factors (or transactivators) because they are specific to the gene involved (see Chapter 15). They interact with coactivators in the basal transcription complex. These are depicted in Figure 13.7 under the general term “coactivators.” Coactivators consist of transcription associated factors (TAFs) that interact with transcription factors through an activation domain on the transcription factor (which is bound to DNA). The TAFs interact with other factors (described as the mediator proteins), which in turn interact with the RNA polymerase complex. These interactions are further discussed in Chapter 15.

III. Transcription of Bacterial Genes

In bacteria, binding of RNA polymerase with a σ factor to the promoter region of DNA causes the two DNA strands to unwind and separate within a region approximately 10 to 20 nucleotides in length. As the polymerase transcribes the DNA, the untranscribed region of the helix continues to separate, whereas the transcribed region of the DNA template rejoins its DNA partner (Fig. 13.8). The σ factor is released when the growing RNA chain is approximately 10 nucleotides long. The elongation reactions continue until the RNA polymerase encounters a transcription termination signal. One type of termination signal involves the formation of a hairpin loop in the transcript, preceding a number of U residues. The second type of mechanism for termination involves the binding of a protein, the rho factor, which causes release of the RNA transcript from the template in an energy-requiring mechanism. The signal for both termination processes is the sequence of bases in the newly synthesized RNA.

FIGURE 13.8 An overview of transcription at the site of RNA synthesis.

A cistron is a region of DNA that encodes a single polypeptide chain. In bacteria, mRNA is usually generated from an operon as a polycistronic transcript (one that contains the information to produce a number of different proteins). Because bacteria do not contain a nucleus, the polycistronic transcript is translated as it is being transcribed. This process is known as coupled transcription–translation. This transcript is not modified and trimmed, and it does not contain introns (regions within the coding sequence of a transcript that are removed before translation occurs). Several different proteins are produced during translation of the polycistronic transcript, one from each cistron (see Fig. 13.6).

In prokaryotes, rRNA is produced as a single, long transcript that is cleaved to produce the 16S, 23S, and 5S ribosomal RNAs. tRNA is also cleaved from larger transcripts (Fig. 13.9). One of the cleavage enzymes, RNase P, is a protein containing an RNA molecule. This RNA actually catalyzes the cleavage reaction.

FIGURE 13.9 Bacterial ribosomal RNA (rRNA) and transfer RNA (tRNA) transcripts. One large precursor is cleaved (at arrows) to produce 16S, 23S, and 5S rRNA and some tRNAs.

IV. Transcription of Eukaryotic Genes

The process of transcription in eukaryotes is similar to that in prokaryotes. RNA polymerase binds to the transcription factor complex in the promoter region and to the DNA, the helix unwinds within a region near the start point of transcription, DNA strand separation occurs, synthesis of the RNA transcript is initiated, and the RNA transcript is elongated, copying the DNA template. The DNA strands separate as the polymerase approaches and rejoin as the polymerase passes.

One of the major differences between eukaryotes and prokaryotes is that eukaryotes have more elaborate mechanisms for processing the transcripts, particularly the precursors of mRNA (pre-mRNA). Eukaryotes also have three polymerases, rather than just the one present in prokaryotes. Other differences are that eukaryotic mRNA usually contains the coding information for only one polypeptide chain and that eukaryotic RNA is transcribed in the nucleus and migrates to the cytoplasm, where translation occurs. Thus, coupled transcription–translation does not occur in eukaryotes.

A. Synthesis of Eukaryotic mRNA

In eukaryotes, extensive processing of the primary transcript occurs before the mature mRNA is formed and can migrate to the cytosol where it is translated into a protein product. RNA polymerase II synthesizes a large primary transcript from the template strand that is capped at the 5′-end as it is transcribed (Fig. 13.10). The transcript also rapidly acquires a poly(A) tail at the 3′-end. Pre-mRNAs thus contain untranslated regions at both the 5′- and 3′-ends (the leader and trailing sequences, respectively). These untranslated regions are retained in the mature mRNA. The coding region of the pre-mRNA, which begins with the start codon for protein synthesis and ends with the stop codon, contains both exons and introns. Exons consist of the nucleotide codons that dictate the amino acid sequence of the eventual protein product. Between the exons, interspersing regions called introns contain nucleotide sequences that are removed by splicing reactions to form the mature RNA. The mature RNA thus contains a leader sequence (that includes the cap), a coding region comprising exons, and a trailing sequence that includes the poly(A) tail.
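The processing steps described above (capping, polyadenylation, and splicing) can be summarized in a toy model. This is a sketch under stated assumptions, not a description of the cellular machinery: exons are given as half-open (start, end) index pairs on the pre-mRNA string, everything between consecutive exons is treated as intron, the cap is represented only by a label, and the default tail length is an arbitrary placeholder. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MatureMRNA:
    cap: str            # label for the 5'-cap (not a sequence)
    body: str           # joined exons, written 5'->3'
    poly_a_length: int  # number of A residues in the 3'-tail


def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments in order, discarding the introns between them."""
    previous_end = 0
    for start, end in exons:
        # Exons must be ordered, non-overlapping, and inside the transcript.
        if not (previous_end <= start < end <= len(pre_mrna)):
            raise ValueError(f"invalid exon interval: {(start, end)}")
        previous_end = end
    return "".join(pre_mrna[start:end] for start, end in exons)


def process(pre_mrna: str, exons: list[tuple[int, int]],
            poly_a_length: int = 200) -> MatureMRNA:
    """Cap, splice, and polyadenylate a toy pre-mRNA (placeholder tail length)."""
    return MatureMRNA(cap="7-methylguanosine",
                      body=splice(pre_mrna, exons),
                      poly_a_length=poly_a_length)


# Invented example: two exons separated by one intron.
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUCAG", "UUUUAA"
pre = exon1 + intron + exon2
exon_coords = [(0, len(exon1)), (len(exon1) + len(intron), len(pre))]
mature = process(pre, exon_coords)
```

Here `mature.body` is the two exons joined end to end, with the intron removed.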

FIGURE 13.10 Overview of eukaryotic messenger RNA (mRNA) synthesis. Transcription produces heterogeneous nuclear RNA (hnRNA; also known as pre-mRNA) from the DNA template. hnRNA processing involves addition of a 5′-cap and a poly(A) tail and splicing to join exons and remove introns. The product, mRNA, migrates to the cytoplasm, where it will direct protein synthesis.

There are three different types of methyl caps:

CAP 0 refers to the methylated guanosine (on the nitrogen at the seven position, N⁷) added in the 5′-to-5′ linkage to the mRNA; CAP 1 refers to CAP 0 with the addition of a methyl group to the 2′-carbon of ribose on the nucleotide (N₁) at the 5′-end of the chain; and CAP 2 refers to CAP 1 with the addition of another 2′-methyl group to the next nucleotide (N₂). The methyl groups are donated by S-adenosylmethionine (SAM).

Once SAM donates its methyl group, it must be regenerated by reactions that require the vitamins folate and B₁₂. Thus, formation of mRNA is also one of the processes affected by a deficiency of these vitamins.

Tags: Marks Basic Medical Biochemistry A Clinical Approach

Aug 7, 2022 | Posted by admin in BIOCHEMISTRY | Comments Off
