Synthesis of mRNAs

An overview of mRNA synthesis and degradation is shown in Figure 16-1.

Figure 16-1 Synthesis and degradation of eukaryotic mRNAs. Nascent mRNA transcripts are transcribed by RNA polymerase II. Formation of the 5′ cap structure and cleavage and polyadenylation of the 3′ end of the mRNA both occur cotranscriptionally and involve factors that are recruited by the C-terminal domain (CTD) of the transcribing polymerase (see Fig. 15-4). The termination of transcription requires both the recognition of the site of polyadenylation and the activity of the 5′-exonuclease Rat1, which degrades the nascent RNA transcripts. Rat1 binds to the polymerase CTD via Rtt103. Pre-mRNA splicing can either be cotranscriptional or occur shortly after transcript release, and recruitment of splicing factors is not strongly dependent on the CTD. In human cells, the spliceosome deposits the exon-junction complex (EJC) around 24 nucleotides upstream of the site of splicing. Several steps in nuclear mRNA maturation also are subject to surveillance. In yeast, nuclear pre-mRNAs can be either 3′ degraded by the nuclear exosome complex or decapped and 5′ degraded by the exonuclease Rat1. Nuclear decapping requires the Lsm2–8 complex and is probably performed by the Dcp1/2 decapping complex. Once in the cytoplasm, the mRNA is translated into proteins and undergoes degradation. Several different mRNA degradation pathways have been identified. A, Nonsense-mediated decay (NMD). If the EJCs all lie within or very close to the ORF, they will be displaced by the translating ribosomes. However, if an EJC lies beyond the end of the ORF, it will remain on the translated mRNA. This is taken as evidence that translation has terminated prematurely and triggers the NMD pathway. Recognition of the EJC requires the Upf1/2/3 surveillance complex, which also interacts with the ribosomes as they terminate translation. In yeast, NMD triggers both rapid decapping and 5′ degradation, without prior deadenylation, and 3′ degradation by the exosome. B, General mRNA turnover. 
During translation, most mRNAs undergo progressive poly(A) tail shortening. Loss of the poly(A) tail leads to rapid degradation. As in the nucleus, cytoplasmic mRNAs can be degraded from either the 5′ or the 3′ end. 5′ degradation occurs largely in a specialized cytoplasmic region termed the P body in yeast or cytoplasmic foci in human cells. Here, the mRNAs are decapped by the Dcp1/2 heterodimer and then degraded by the cytoplasmic 5′-exonuclease Xrn1. Both activities are strongly stimulated by the cytoplasmic Lsm1–7 complex. Alternatively, deadenylated mRNAs can be 3′ degraded by the cytoplasmic exosome. C, ARE-mediated decay. In this pathway, specific A+U rich elements (AREs) are recognized by ARE-binding proteins (ARE-BP) in the nucleus. These are transported to the cytoplasm in association with the mRNA and recruit the cytoplasmic exosome to rapidly degrade the RNA. D, Nonstop decay. If the mRNA lacks a translation termination codon, the first translating ribosome will stall and be trapped at the 3′ end of the RNA. The Ski7 protein, which is associated with the cytoplasmic exosome complex, is believed to release the stalled ribosome and target the RNA for 3′ degradation by the exosome. Note that this legend provides detail beyond the text.
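The EJC-based rule in panel A can be written out as a short Python sketch. It is only an illustration of the logic in the legend: the function names, the 0-based coordinates, and the fixed 24-nucleotide offset are assumptions, and the legend's allowance for EJCs "very close to" the ORF is not modeled.

```python
EJC_OFFSET = 24  # approximate distance (nt) upstream of each splice junction, human cells


def ejc_positions(exon_lengths):
    """Return approximate EJC positions on a spliced mRNA.

    exon_lengths: exon lengths in 5'->3' order (hypothetical input format).
    One EJC is placed about 24 nt upstream of every exon-exon junction.
    """
    positions = []
    junction = 0
    for length in exon_lengths[:-1]:  # the last exon has no downstream junction
        junction += length
        positions.append(max(junction - EJC_OFFSET, 0))
    return positions


def predicted_nmd_target(exon_lengths, orf_end):
    """Return True if any EJC lies beyond the end of the ORF.

    orf_end: position just past the stop codon. EJCs upstream of this point
    are assumed to be displaced by translating ribosomes; an EJC at or beyond
    it remains bound and is taken as a sign of premature termination.
    """
    return any(pos >= orf_end for pos in ejc_positions(exon_lengths))
```

For three exons of 100, 150, and 200 nucleotides, the sketch places EJCs at positions 76 and 226. A stop codon in the last exon leaves no EJC downstream, whereas a stop codon ending at position 120 leaves the second EJC in place.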

mRNA Capping and Polyadenylation

Two distinguishing features set mRNA apart from other RNAs: a 5′ cap structure and a 3′ poly(A) tail. Both of these elements help to protect the mRNA against degradation and act synergistically to promote translation in the cytoplasm.

The mRNA cap is an unusual structure. It consists of an inverted 7-methylguanosine residue, which is joined onto the body of the mRNA by a 5′-triphosphate-5′ linkage (Fig. 16-2). Cap addition involves three enzymatic activities: A 5′ RNA triphosphatase cleaves the 5′ triphosphate on the nascent transcript to a diphosphate; RNA guanylyltransferase forms a covalent enzyme–GMP complex and then caps the RNA by transferring this to the diphosphate; and RNA (guanine-7) methyltransferase covalently alters the guanosine base by methylation, generating m⁷G. In addition, the first encoded nucleotides are frequently modified by methylation of the 2′ hydroxyl position on the ribose group, but the functional significance of these internal modifications is currently unclear.
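The order of the three capping activities can be followed in a minimal Python sketch. The string notation for the 5′ end (`pppN`, `ppN`, `GpppN`, `m7GpppN`) and the function names are illustrative assumptions, not a model of the enzymes themselves.

```python
def rna_triphosphatase(rna):
    """Cleave the 5' triphosphate to a diphosphate: pppN -> ppN."""
    if not rna.startswith("pppN"):
        raise ValueError("expected a 5' triphosphate end (pppN)")
    return rna[1:]


def rna_guanylyltransferase(rna):
    """Transfer GMP onto the diphosphate end: ppN -> GpppN."""
    if not rna.startswith("ppN"):
        raise ValueError("expected a 5' diphosphate end (ppN)")
    return "Gp" + rna


def guanine_7_methyltransferase(rna):
    """Methylate the cap guanosine at N7: GpppN -> m7GpppN."""
    if not rna.startswith("GpppN"):
        raise ValueError("expected an unmethylated cap (GpppN)")
    return "m7" + rna


def cap_transcript(rna):
    """Apply the three activities in order and return every intermediate."""
    intermediates = [rna]
    for enzyme in (rna_triphosphatase,
                   rna_guanylyltransferase,
                   guanine_7_methyltransferase):
        rna = enzyme(rna)
        intermediates.append(rna)
    return intermediates
```

Calling `cap_transcript("pppN-RNA")` walks through the diphosphate and unmethylated-cap intermediates before reaching the m⁷G cap. Each step raises an error if given the wrong substrate, which mirrors the fixed order of the reactions.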

Figure 16-2 mRNAs have a distinctive 5′ cap structure. A, The 5′ ends of mRNAs are blocked by an inverted guanosine residue that is attached to the body of the mRNA by a 5′–5′ triphosphate linkage. The N7 position of the guanosine is methylated (red). The first encoded nucleotide of the mRNA (Nuc 1) is also methylated on the 2′-hydroxyl of the ribose ring. The second nucleotide (Nuc 2) may also be methylated. B, Capping of mRNAs is a multistep process.

During 3′ processing, the nascent pre-mRNA is cleaved by an endonuclease, and a tail of adenosine residues is added by poly(A) polymerase. Around 200 to 250 A residues are added to mRNAs in human cells, and around 70 to 90 are added in yeast. Cleavage and polyadenylation are performed by a large complex containing approximately 20 proteins that recognizes sequences in the mRNA, of which the best defined is a highly conserved AAUAAA motif located upstream of the site of polyadenylation (Fig. 16-3).

Figure 16-3 Signals for pre-mRNA polyadenylation. A, Poly(A) tails are added to pre-mRNAs following transcription. After pol II transcribes the protein-coding region of the mRNA, it encounters two sequence elements: AAUAAA and a GU-rich element. These act as signals for the assembly of a large 3′ processing complex that cleaves the nascent pre-mRNA, releasing it from the transcription complex, and adds a tail of around 200 adenosine residues. B, The poly(A) signal is highly conserved in vertebrates.
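As a rough illustration of signal-directed 3′ processing, the following Python sketch finds AAUAAA motifs in a transcript, cuts downstream of the first one, and appends a poly(A) tail. The function names and the `cleavage_offset` value are assumptions made for the example, because the text does not give the distance between the motif and the cleavage site. The GU-rich element and the processing complex are not modeled.

```python
POLY_A_SIGNAL = "AAUAAA"


def normalize(rna):
    """Upper-case the sequence and accept DNA-style input by converting T to U."""
    return rna.upper().replace("T", "U")


def find_poly_a_signals(rna):
    """Return the 0-based start position of every AAUAAA motif."""
    rna = normalize(rna)
    hits = []
    start = rna.find(POLY_A_SIGNAL)
    while start != -1:
        hits.append(start)
        start = rna.find(POLY_A_SIGNAL, start + 1)
    return hits


def cleave_and_polyadenylate(rna, cleavage_offset=20, tail_length=200):
    """Cleave downstream of the first AAUAAA and add a poly(A) tail.

    cleavage_offset: assumed distance (nt) from the end of the motif to the
    cut site; an illustrative value, not taken from the text.
    tail_length: number of A residues added (about 200 in human cells).
    Returns None if the transcript has no AAUAAA motif.
    """
    rna = normalize(rna)
    hits = find_poly_a_signals(rna)
    if not hits:
        return None
    cut = min(hits[0] + len(POLY_A_SIGNAL) + cleavage_offset, len(rna))
    return rna[:cut] + "A" * tail_length
```

Passing `tail_length=80` would approximate the shorter tails added in yeast.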

Links between mRNA Processing and Transcription

The processes of cap addition and 3′ cleavage and polyadenylation are both linked to transcription of the mRNA by RNA polymerase II and occur cotranscriptionally on the nascent RNA (Fig. 16-1). The C-terminal domain (CTD) of the largest subunit of RNA polymerase II (RNA pol II) consists of many copies of a seven-amino-acid repeat (YSPTSPS), which undergo reversible modification by phosphorylation (see Fig. 15-4). A pronounced change in the CTD phosphorylation pattern coincides with the release of the polymerase from initiation mode into processive elongation mode. Immediately following transcription initiation, the repeats are largely phosphorylated on the serine residue at position 5. This modification is lost, while serine 2 phosphorylation increases, as the polymerase moves along the transcript. Capping of the 5′ end of the mRNA occurs by the time the transcript is approximately 25 to 30 nucleotides long, and the capping enzyme interacts with the serine 5 phosphorylated CTD. This and other interactions with the polymerase result in strong allosteric activation of capping activity. In contrast, the cleavage and polyadenylation factors involved in 3′ end processing are recruited by interaction with the CTD phosphorylated at serine 2.

The termination of transcription by RNA polymerase II is dependent on RNA processing. Termination requires recognition of the poly(A) site by the cleavage and polyadenylation factors. These are carried with the transcribing polymerase, and their offloading might make the polymerase competent for termination. Cleavage of the nascent transcript also allows the entry of a 5′ exonuclease—an enzyme that can degrade RNA from the 5′ end in a 3′ direction. This enzyme, which is called Rat1 in yeast and Xrn2 in humans, then chases after the transcribing polymerase, degrading the newly transcribed RNA strand as it goes. When the exonuclease catches the polymerase, it stimulates termination of transcription. This is referred to as the Torpedo model for transcription termination.

Human β-globin mRNA precursors contain an additional cleavage site (termed the cotranscriptional cleavage site) downstream of the site of polyadenylation. The cotranscriptional cleavage site RNA sequence has intrinsic self-cleavage activity in the absence of proteins. Such an RNA is referred to as a self-cleaving ribozyme. This cleavage provides an entry site for the Xrn2 nuclease, allowing more efficient termination.

Regulated 3′ End Formation on Histone mRNAs

A different 3′ end processing system is seen for mRNAs encoding the major, replication-dependent histone proteins. These are highly expressed only during DNA replication, when they must package the newly synthesized DNA. A sequence in the 3′ untranslated region (3′ UTR) of these mRNAs is recognized by base pairing to a small RNA: the U7 snRNA. In addition, a specific stem-loop structure is recognized by a protein that is referred to as the stem-loop binding protein. Endonuclease cleavage generates the mature 3′ end of the mRNA, which is not polyadenylated but is protected by the stem-loop binding protein. The efficiency of histone mRNA synthesis is increased during DNA replication at least in part by increased abundance of stem-loop binding protein. Minor histone variants that are synthesized throughout the cell cycle are polyadenylated like other mRNAs.

Pre-mRNA Splicing

Important experiments in the 1950s and 1960s established that genes were collinear with their protein products. It therefore came as a considerable surprise when, in the late 1970s, it emerged that genes in animals and plants frequently had numerous strikingly large inserts whose sequence was not included in the mature mRNA or the protein product. It turns out that most human pre-mRNAs undergo splicing reactions, in which specific regions are cut out and the remaining RNA is covalently rejoined. The regions that will form the mRNA are termed exons, and the bits that are cut out (and are normally degraded) are called introns. In unicellular eukaryotes, introns are generally a few hundred nucleotides in length or shorter. In metazoans, however, they are often several kilobases in length, and pre-mRNAs can contain many introns. It is therefore remarkable that all of the sites can be precisely identified and spliced.

Signals for Splicing

The signals in the pre-mRNA that identify the introns and exons are recognized by a combination of proteins and a group of small RNAs called the small nuclear RNAs (snRNAs). The snRNAs function in complexes with proteins in small nuclear ribonucleoprotein (snRNP) particles. Splicing occurs in a large complex termed the spliceosome, within which the pre-mRNA assembles together with five snRNAs (U1, U2, U4, U5, and U6) and around 100 different proteins. Particularly important protein splicing factors are members of a large group of SR-proteins, so named because they contain domains rich in serine-arginine dipeptides.

Three conserved sequences within introns play key roles in their accurate recognition by the splicing machinery (Fig. 16-4). These lie immediately adjacent to the 5′ splice site and the 3′ splice site and surrounding an internal region that will form the intron branch point during the splicing reaction. The U1 and U6 snRNAs have sequences that are complementary to the 5′ splice site, while U2 is complementary to the branch point region.

Figure 16-4 Signals and mechanism of pre-mRNA splicing. The precursors to most mRNAs in humans and other eukaryotes contain regions (introns) that will not form part of the mature mRNA and do not encode protein products. During pre-mRNA splicing, the introns are removed and the flanking regions (exons) are ligated. A, Introns contain three conserved sequence elements that are recognized during splicing. These lie at the 5′ and 3′ splice sites and surrounding the branch-point adenosine within the intron. Numbers indicate the degree of conservation at each position in mammalian pre-mRNAs. The branch point sequence is much more highly conserved between different pre-mRNAs in yeast. The region between the branch point and the 3′ splice site frequently contains a run of pyrimidine residues, which is referred to as the polypyrimidine tract. B, Pre-mRNA splicing involves two catalytic steps. An attack by the branch-point adenosine on the 5′ splice site releases the 5′ exon and leaves the intron as a circularized molecule (referred to as the intron lariat) that is still joined to the 3′ exon. In the second step, the 3′ end of the 5′ exon attacks the 3′ splice site, releasing the joined exons and the free intron lariat. The lariat is subsequently linearized (debranched) and degraded.

While the spliceosome will finally bring together the sequences at each end of the intron, it is believed that the splicing machinery initially recognizes the exons in a reaction termed exon definition. This makes sense because mRNA exons are generally quite small—up to a few hundred nucleotides in length—whereas the introns can be many kilobases long.

No sequences in the exons are strictly required for splicing, but there are important stimulatory elements termed exonic splicing enhancers (ESEs), which generally bind members of the SR-protein family. The ESEs have two major functions: They stimulate the use of the flanking 5′ and 3′ splice sites, promoting exon definition, and they prevent the exon in which they are located from being included in an intron. This latter function is particularly important in ensuring that all introns are spliced out without the splicing machinery skipping from the 5′ end of one intron to the 3′ end of a downstream intron.

The Pre-mRNA Splicing Reaction

The splicing reaction proceeds in two steps (Fig. 16-4). In the first, the 5′–3′ phosphate linkage that joins the 5′ exon to the first nucleotide of the intron, at the 5′ splice site, is attacked and broken by the 2′ hydroxyl group of an adenosine residue within the intron. This reaction leaves the 5′ end of the intron attached to that adenosine residue via an unusual 5′–2′ phosphate linkage. Since this adenosine remains attached to the flanking nucleotides by conventional 5′ and 3′ phosphodiester bonds, this creates a circular molecule with a tail that includes the 3′ exon. This structure is termed the intron lariat, and the adenosine to which the 5′ end of the intron is attached is termed the branch point, because it has a branched structure. In the second step of splicing, the free 3′ hydroxyl on the 5′ exon is used to attack and break the linkage between the last nucleotide of the intron and the 3′ exon, at the 3′ splice site. This leaves the 5′ and 3′ exons joined by a conventional 5′–3′ linkage and releases the intron as a lariat. The lariat is linearized by the debranching enzyme and is probably rapidly degraded from both ends by exonucleases.

The initial steps in splicing are the recognition of the 5′ splice site by the U1 snRNA and the binding of U2 snRNA to the branch-point region, assisted by SR-proteins (Fig. 16-5). Base pairing between U2 and the pre-mRNA leaves a single adenosine bulged out of a helix and available for interaction with the 5′ splice site. The U4 and U6 snRNAs then join the spliceosome as a base-paired duplex, within a large complex that also contains the U5 snRNA. The U4 and U6 base pairing is opened, and the liberated U6 sequences displace U1 at the 5′ splice site. They also bind to U2—bringing the 5′ splice site and branch point into close proximity. At this point, the first enzymatic step of splicing occurs. This reaction is believed to be directly catalyzed by the intricate structure of the snRNA/pre-mRNA interactions rather than by the protein components of the spliceosome. The 5′ splice site is attacked and broken by the ribose 2′ hydroxyl group of the adenosine residue that is bulged out of the U2-intron duplex. The U5 snRNA and its associated proteins are responsible for holding onto the now free 5′ exon and correctly aligning it with the 3′ exon for the second catalytic step of splicing.

Figure 16-5 Small nuclear RNAs play key roles in pre-mRNA splicing. Although shown as RNAs, the snRNAs function in large RNA-protein complexes termed snRNPs. Despite this fact, the major steps in both intron recognition and catalysis are believed to be performed by the snRNAs. The 5′ splice site and intron branch point are recognized by base pairing to the U1 and U2 snRNAs, respectively. The U5 snRNA enters the spliceosome in a complex with U4 and U6, which are tightly base-paired. U5 forms contacts with both the 5′ and 3′ exons. U4 releases U6, which base-pairs to U2 and then displaces U1 in binding to the 5′ splice site. Within this very complex RNA structure, the 2′ hydroxyl group on the branch point adenosine, which is bulged out of the duplex between U2 and the pre-mRNA, attacks the phosphate group at the junction between the 5′ exon and the intron. In a transesterification reaction, the phosphate backbone is broken at the 5′ splice site. The 5′ exon is released with a 3′ OH group, and the 5′ phosphate of the intron is transferred onto the 2′ position of the ribose on the branch point adenosine, creating the intron lariat structure. U5 retains the 5′ exon and aligns it for a second transesterification reaction, during which the 3′ hydroxyl on the 5′ exon attacks the 3′ splice site, joining the exons and releasing the intron lariat.

Both catalytic steps in splicing are technically termed transesterification reactions, because nucleotides are linked by phosphodiester bonds, and the new bond is made at the same time as the old bond is broken. For this reason, the splicing reactions do not, in principle, require any input of energy. However, the assembly and subsequent disassembly of the spliceosome require numerous ATPases. Most of these belong to a family of proteins that are generally termed RNA helicases. These are believed to use the energy of ATP hydrolysis to catalyze structural rearrangements within the assembling and disassembling spliceosome.

AT-AC Introns

The large majority of human mRNA splice sites have a GU dinucleotide at the 5′ splice site and AG at the 3′ splice site (Fig. 16-4). However, a minor group of introns contain different consensus splicing signals and are termed AT-AC (pronounced “attack”) introns because of the identities of the nucleotides located at the 5′ and 3′ splice sites. The splicing of the AT-AC introns involves a distinct set of snRNAs (U11, U12, U4atac, and U6atac), which replace U1, U2, U4, and U6, respectively. Only U5 is common to both spliceosomes. However, the underlying splicing mechanism is believed to be the same for both classes of intron.
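The distinction between the two intron classes can be sketched in Python by checking the terminal dinucleotides of an intron. This is a deliberate simplification: the function name and return values are assumptions for the example, the minor-spliceosome snRNAs are written here as U4atac and U6atac, and classification by terminal dinucleotides alone ignores the other consensus signals that the spliceosomes recognize.

```python
# snRNAs of each spliceosome, as listed in the text.
SPLICEOSOME_SNRNAS = {
    "major": ["U1", "U2", "U4", "U5", "U6"],
    "minor": ["U11", "U12", "U4atac", "U5", "U6atac"],
}


def classify_intron(intron):
    """Classify an intron by its first and last two nucleotides.

    Accepts RNA or DNA-style input (T is converted to U).
    Returns "major" for GU...AG, "minor" for AU...AC (AT-AC introns),
    and None for anything else or for sequences shorter than 4 nt.
    """
    seq = intron.upper().replace("T", "U")
    if len(seq) < 4:
        return None
    ends = (seq[:2], seq[-2:])
    if ends == ("GU", "AG"):
        return "major"
    if ends == ("AU", "AC"):
        return "minor"
    return None
```

Looking up the result in `SPLICEOSOME_SNRNAS` gives the snRNA set expected to act on the intron, and intersecting the two lists recovers U5 as the only shared snRNA.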