47
Glycoproteins
OBJECTIVES
After studying this chapter, you should be able to:
Have a general appreciation of the importance of glycobiology and glycomics, and in particular of glycoproteins, in health and disease.
Know the principal sugars found in glycoproteins.
Be aware of the several major classes of glycoproteins (N-linked, O-linked, and GPI-linked).
Understand the major features of the pathways of biosynthesis and degradation of O- and N-linked glycoproteins.
Understand the importance of advanced glycation end-products in causing tissue damage in diabetes mellitus.
Be able to indicate the involvement of glycoproteins in inflammation and in a host of conditions including I-cell disease, congenital disorders of glycosylation, paroxysmal nocturnal hemoglobinuria, and cancer.
Be familiar with the concept that many microorganisms, such as influenza virus, attach to cell surfaces via sugar chains.
BIOMEDICAL IMPORTANCE
Glycobiology is the study of the roles of sugars in health and disease. The glycome is the entire complement of sugars, whether free or present in more complex molecules, of an organism. Glycomics, an analogous term to genomics and proteomics, is the comprehensive study of glycomes, including genetic, physiologic, pathologic, and other aspects.
One major class of molecules included in the glycome is glycoproteins. These are proteins that contain oligosaccharide chains (glycans) covalently attached to their polypeptide backbones. It has been estimated that approximately 50% of eukaryotic proteins have sugars attached, so that glycosylation (enzymic attachment of sugars) is the most frequent posttranslational modification of proteins. Nonenzymic attachment of sugars to proteins can also occur, and is referred to as glycation. This process can have serious pathologic consequences (eg, in poorly controlled diabetes mellitus). Glycoproteins are one class of glycoconjugate or complex carbohydrate—equivalent terms used to denote molecules containing one or more carbohydrate chains covalently linked to protein (to form glycoproteins or proteoglycans) or lipid (to form glycolipids). (Proteoglycans are discussed in Chapter 48 and glycolipids in Chapter 15.) Almost all the plasma proteins of humans—with the notable exception of albumin—are glycoproteins. Many proteins of cellular membranes (Chapter 40) contain substantial amounts of carbohydrate. A number of the blood group substances are glycoproteins, whereas others are glycosphingolipids. Certain hormones (eg, chorionic gonadotropin) are glycoproteins. A major problem in cancer is metastasis, the phenomenon whereby cancer cells leave their tissue of origin (eg, the breast), migrate through the bloodstream to some distant site in the body (eg, the brain), and grow there in an unregulated manner, with catastrophic results for the affected individual. Many cancer researchers think that alterations in the structures of glycoproteins and other glycoconjugates on the surfaces of cancer cells are important in the phenomenon of metastasis.
GLYCOPROTEINS OCCUR WIDELY & PERFORM NUMEROUS FUNCTIONS
Glycoproteins occur in most organisms, from bacteria to humans. Many viruses also contain glycoproteins, some of which have been much investigated, in part because they often play key roles in viral attachment to cells (eg, HIV-1 and influenza A virus). Numerous proteins with diverse functions are glycoproteins (Table 47-1); their carbohydrate content ranges from 1% to over 85% by weight.
TABLE 47–1 Some Functions Served by Glycoproteins
Many studies have been conducted in an attempt to define the precise roles oligosaccharide chains play in the functions of glycoproteins. Table 47-2 summarizes results from such studies. Some of the functions listed are firmly established; others are still under investigation.
TABLE 47–2 Some Functions of the Oligosaccharide Chains of Glycoproteins
OLIGOSACCHARIDE CHAINS ENCODE BIOLOGIC INFORMATION
An enormous number of glycosidic linkages can be generated between sugars. For example, three different hexoses may be linked to each other to form over 1000 different trisaccharides. The conformations of the sugars in oligosaccharide chains vary depending on their linkages and proximity to other molecules with which the oligosaccharides may interact. It is now established that certain oligosaccharide chains encode biologic information and that this depends upon their constituent sugars, their sequences, and their linkages. For instance, mannose 6-phosphate residues target newly synthesized lysosomal enzymes to that organelle (see later). The biologic information that sugars contain is expressed via interactions between specific sugars, either free or in glycoconjugates, and proteins (such as lectins; see below) or other molecules. These interactions lead to changes in cellular activity. Thus, deciphering the so-called sugar code of life (one of the principal aims of glycomics) entails elucidating all of the interactions that sugars and sugar-containing molecules participate in, and also the results of these interactions on cellular behavior. This will not be an easy task, considering the diversity of glycans found in cells.
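The scale of this diversity can be illustrated with a rough count. The sketch below is not the calculation behind the figure quoted above; it is a minimal Python enumeration under simplifying assumptions that are not taken from the chapter: each of three different hexoses is used once, every glycosidic bond is either α or β, every acceptor sugar offers four hydroxyl positions for attachment, and each sugar may adopt either of two ring forms (pyranose or furanose). The configuration of the reducing end is ignored. The function name and its parameters are illustrative.

```python
from itertools import permutations

def count_trisaccharides(n_positions=4, n_anomers=2, n_ring_forms=2):
    """Count trisaccharides built from three different hexoses, each used once.

    Assumptions (illustrative, not from the chapter): every glycosidic bond has
    n_anomers possible configurations and can attach to n_positions hydroxyl
    groups on the acceptor sugar; each sugar has n_ring_forms ring forms.
    """
    sugars = ("A", "B", "C")
    bond_choices = n_anomers * n_positions  # ways to form one glycosidic bond

    # Linear chains: an ordered sequence of the three sugars joined by two bonds.
    linear = len(list(permutations(sugars))) * bond_choices ** 2

    # Branched chains: one sugar carries the other two on two different positions.
    # Choose the branch-point sugar, assign distinct positions to the two
    # substituents, then choose the anomeric configuration of each bond.
    branched = len(sugars) * n_positions * (n_positions - 1) * n_anomers ** 2

    ring_factor = n_ring_forms ** len(sugars)
    return (linear + branched) * ring_factor

print(count_trisaccharides(n_ring_forms=1))  # one ring form only
print(count_trisaccharides())                # two ring forms per sugar
```

Under these assumptions the count is 528 with a single ring form and 4224 when two ring forms are allowed, so even a crude model passes 1000 structures for only three sugars.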
TECHNIQUES ARE AVAILABLE FOR DETECTION, PURIFICATION, STRUCTURAL ANALYSIS & SYNTHESIS OF GLYCOPROTEINS
A variety of methods used in the detection, purification, and structural analysis of glycoproteins are listed in Table 47-3. The conventional methods used to purify proteins and enzymes are also applicable to the purification of glycoproteins. Once a glycoprotein has been purified, the use of mass spectrometry and high-resolution NMR spectroscopy can often identify the structures of its glycan chains. Analysis of glycoproteins can be complicated by the fact that they often exist as glycoforms; these are proteins with identical amino acid sequences but somewhat different oligosaccharide compositions. Although linkage details are not stressed in this chapter, it is critical to appreciate that the precise natures of the linkages between the sugars of glycoproteins are of fundamental importance in determining the structures and functions of these molecules.
TABLE 47–3 Some Important Methods Used to Study Glycoproteins
Impressive advances are also being made in synthetic chemistry, allowing synthesis of complex glycans that can be tested for their biologic and pharmacologic activity. In addition, methods have been developed that use simple organisms, such as yeasts, to secrete human glycoproteins of therapeutic value (eg, erythropoietin) into their surrounding medium.
EIGHT SUGARS PREDOMINATE IN HUMAN GLYCOPROTEINS
About 200 monosaccharides are found in nature; however, only eight are commonly found in the oligosaccharide chains of glycoproteins (Table 47-4). Most of these sugars were described in Chapter 14. N-acetylneuraminic acid (NeuAc) is usually found at the termini of oligosaccharide chains, attached to subterminal galactose (Gal) or N-acetylgalactosamine (GalNAc) residues. The other sugars listed are generally found in more internal positions. Sulfate is often found in glycoproteins, usually attached to Gal, GalNAc, or GlcNAc.
TABLE 47–4 The Principal Sugars Found in Human Glycoproteins1
NUCLEOTIDE SUGARS ACT AS SUGAR DONORS IN MANY BIOSYNTHETIC REACTIONS
It is important to understand that in most biosynthetic reactions it is not the free sugar or phosphorylated sugar that is involved, but rather the corresponding nucleotide sugar. The first nucleotide sugar to be reported was uridine diphosphate glucose (UDP-Glc); its structure is shown in Figure 19–2. The common nucleotide sugars involved in the biosynthesis of glycoproteins are listed in Table 47-4; the reasons some contain UDP and others guanosine diphosphate (GDP) or cytidine monophosphate (CMP) are not clear. Many of the glycosylation reactions involved in the biosynthesis of glycoproteins utilize these compounds (see below). The anhydro nature of the linkage between the phosphate group and the sugars is of the high-energy, high-group-transfer-potential type (Chapter 11). The sugars of these compounds are thus “activated” and can be transferred to suitable acceptors provided appropriate transferases are available.
Most nucleotide sugars are formed in the cytosol, generally from reactions involving the corresponding nucleoside triphosphate. CMP-sialic acids are formed in the nucleus. Formation of uridine diphosphate galactose (UDP-Gal) requires the following two reactions in mammalian tissues.
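The two reactions can be written as follows. This is the standard formulation, with the enzyme names as they are commonly given; PP_i denotes inorganic pyrophosphate.

```latex
\begin{align*}
\text{UTP} + \text{glucose 1-phosphate} &\rightleftharpoons \text{UDP-Glc} + \text{PP}_i
  && \text{(UDP-Glc pyrophosphorylase)} \\
\text{UDP-Glc} &\rightleftharpoons \text{UDP-Gal}
  && \text{(UDP-Glc epimerase)}
\end{align*}
```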
Because many glycosylation reactions occur within the lumen of the Golgi apparatus, carrier systems (permeases, transporters) are necessary to transport nucleotide sugars across the Golgi membrane. Systems transporting UDP-Gal, GDP-Man, and CMP-NeuAc into the cisternae of the Golgi apparatus have been described. They are antiport systems; that is, the influx of one molecule of nucleotide sugar is balanced by the efflux of one molecule of the corresponding nucleotide (eg, UMP, GMP, or CMP) formed from the nucleotide sugars. This mechanism ensures an adequate concentration of each nucleotide sugar inside the Golgi apparatus. UMP is formed from UDP-Gal in the above process as follows.
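The formation of UMP can be written as two steps. This is the standard formulation, with "protein" standing for any suitable acceptor glycoprotein and P_i for inorganic phosphate; the enzyme names are those commonly given.

```latex
\begin{align*}
\text{UDP-Gal} + \text{protein} &\longrightarrow \text{protein-Gal} + \text{UDP}
  && \text{(galactosyltransferase)} \\
\text{UDP} + \text{H}_2\text{O} &\longrightarrow \text{UMP} + \text{P}_i
  && \text{(nucleoside diphosphate phosphatase)}
\end{align*}
```

The UMP produced in the second step is the nucleotide exchanged for incoming UDP-Gal by the antiport system.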
EXO- & ENDOGLYCOSIDASES FACILITATE STUDY OF GLYCOPROTEINS
A number of glycosidases of defined specificity have proved useful in examining structural and functional aspects of glycoproteins (Table 47-5). These enzymes act at either external (exoglycosidases) or internal (endoglycosidases) positions of oligosaccharide chains. Examples of exoglycosidases are neuraminidases and galactosidases; their sequential use removes terminal NeuAc and subterminal Gal residues from most glycoproteins. Endoglycosidases F and H are examples of the latter class; these enzymes cleave the oligosaccharide chains at specific GlcNAc residues close to the polypeptide backbone (ie, at internal sites; Figure 47–5) and are thus useful in releasing large oligosaccharide chains for structural analyses. A glycoprotein can be treated with one or more of the above glycosidases to analyze the effects on its biologic behavior of removal of specific sugars.
TABLE 47–5 Some Glycosidases Used to Study the Structure and Function of Glycoproteins1
THE MAMMALIAN ASIALOGLYCOPROTEIN RECEPTOR IS INVOLVED IN CLEARANCE OF CERTAIN GLYCOPROTEINS FROM PLASMA BY HEPATOCYTES
Experiments performed by Ashwell and his colleagues in the early 1970s played an important role in focusing attention on the functional significance of the oligosaccharide chains of glycoproteins. They treated rabbit ceruloplasmin (a plasma protein; see Chapter 50) with neuraminidase in vitro. This procedure exposed subterminal Gal residues that were normally masked by terminal NeuAc residues. Neuraminidase-treated radioactive ceruloplasmin was found to disappear rapidly from the circulation, in contrast to the slow clearance of the untreated protein. Very significantly, when the Gal residues exposed by treatment with neuraminidase were removed by treatment with a galactosidase, the clearance rate of the protein returned to normal. Further studies demonstrated that liver cells contain a mammalian asialoglycoprotein receptor that recognizes the Gal moiety of many desialylated plasma proteins and leads to their endocytosis. This work indicated that an individual sugar, such as Gal, could play an important role in governing at least one of the biologic properties (ie, time of residence in the circulation) of certain glycoproteins. This greatly strengthened the concept that oligosaccharide chains could contain biologic information.
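The logic of these experiments can be captured in a toy model. The Python sketch below is purely illustrative: the chain is a hypothetical outer branch listed from its nonreducing terminus inward, the two enzyme functions simply strip a matching terminal residue, and the clearance rule is a deliberate oversimplification in which an exposed terminal Gal means rapid uptake by the hepatocyte receptor.

```python
def neuraminidase(chain):
    """Remove a terminal NeuAc residue, if present (exoglycosidase action)."""
    return chain[1:] if chain and chain[0] == "NeuAc" else chain

def galactosidase(chain):
    """Remove a terminal Gal residue, if present (exoglycosidase action)."""
    return chain[1:] if chain and chain[0] == "Gal" else chain

def clearance(chain):
    """Toy rule: an exposed terminal Gal is recognized by the liver receptor."""
    return "rapid" if chain and chain[0] == "Gal" else "slow"

# Hypothetical outer branch, listed from the nonreducing terminus inward.
native = ["NeuAc", "Gal", "GlcNAc", "Man"]
desialylated = neuraminidase(native)
degalactosylated = galactosidase(desialylated)

for label, chain in [("untreated", native),
                     ("neuraminidase", desialylated),
                     ("neuraminidase + galactosidase", degalactosylated)]:
    print(f"{label}: terminal {chain[0]}, clearance {clearance(chain)}")
```

Only the desialylated chain, with Gal at its terminus, is cleared rapidly in this model; masking Gal with NeuAc or removing it altogether restores slow clearance, mirroring the observations described above.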
LECTINS CAN BE USED TO PURIFY GLYCOPROTEINS & TO PROBE THEIR FUNCTIONS
Lectins are carbohydrate-binding proteins that agglutinate cells or precipitate glycoconjugates; a number of lectins are themselves glycoproteins. Immunoglobulins that react with sugars are not considered lectins. Lectins contain at least two sugar-binding sites; proteins with a single sugar-binding site will not agglutinate cells or precipitate glycoconjugates. The specificity of a lectin is usually defined by the sugars that are best at inhibiting its ability to cause agglutination or precipitation. Enzymes, toxins, and transport proteins can be classified as lectins if they bind carbohydrate. Lectins were first discovered in plants and microbes, but many lectins of animal origin are now known. The mammalian asialoglycoprotein receptor described above is an important example of an animal lectin. Some important lectins are listed in Table 47-6. Much current research is centered on the roles of various animal lectins in the mechanisms of action of glycoproteins, some of which are discussed below (eg, with regard to the selectins).
TABLE 47–6 Some Important Lectins
Numerous lectins have been purified and are commercially available; three plant lectins that have been widely used experimentally are listed in Table 47-7. Among many uses, lectins have been employed to purify specific glycoproteins, as tools for probing the glycoprotein profiles of cell surfaces, and as reagents for generating mutant cells deficient in certain enzymes involved in the biosynthesis of oligosaccharide chains.
TABLE 47–7 Three Plant Lectins and the Sugars with Which They Interact1
THERE ARE THREE MAJOR CLASSES OF GLYCOPROTEINS
Based on the nature of the linkage between their polypeptide chains and their oligosaccharide chains, glycoproteins can be divided into three major classes (Figure 47–1): (1) those containing an O-glycosidic linkage (ie, O-linked), involving the hydroxyl side chain of serine or threonine and a sugar such as N-acetylgalactosamine (GalNAc-Ser[Thr]); (2) those containing an N-glycosidic linkage (ie, N-linked), involving the amide nitrogen of asparagine and N-acetylglucosamine (GlcNAc-Asn); and (3) those linked to the carboxyl terminal amino acid of a protein via a phosphoryl-ethanolamine moiety joined to an oligosaccharide (glycan), which in turn is linked via glucosamine to phosphatidylinositol (PI). This latter class is referred to as glycosylphosphatidylinositol-anchored (GPI-anchored, or GPI-linked) glycoproteins. Members of this class, among other functions, are involved in directing certain glycoproteins to the apical or basolateral areas of the plasma membrane (PM) of some polarized epithelial cells (see Chapter 40 and below). Other minor classes of glycoproteins also exist.
FIGURE 47–1 Depictions of (A) an O-linkage (N-acetylgalactosamine to serine), (B) an N-linkage (N-acetylglucosamine to asparagine), and (C) a glycosylphosphatidylinositol (GPI) linkage. The GPI structure shown is that linking acetylcholinesterase to the plasma membrane of the human red blood cell. The carboxyl terminal amino acid is glycine joined in amide linkage via its COOH group to the NH2 group of phosphorylethanolamine, which in turn is joined to a mannose residue. The core glycan contains three mannose and one glucosamine residues. The glucosamine is linked to inositol, which is attached to phosphatidic acid. The site of action of PI-phospholipase C (PI-PLC) is indicated. The structure of the core glycan is shown in the text. This particular GPI contains an extra fatty acid attached to inositol and also an extra phosphorylethanolamine moiety attached to the middle of the three mannose residues. Variations found among different GPI structures include the identity of the carboxyl terminal amino acid, the molecules attached to the mannose residues, and the precise nature of the lipid moiety.
The number of oligosaccharide chains attached to one protein can vary from one to 30 or more, with the sugar chains ranging from one or two residues in length to much larger structures. Many proteins contain more than one type of sugar chain; for instance, glycophorin, an important red cell membrane glycoprotein (Chapter 52), contains both O- and N-linked oligosaccharides.
GLYCOPROTEINS CONTAIN SEVERAL TYPES OF O-GLYCOSIDIC LINKAGES
At least four subclasses of O-glycosidic linkages are found in human glycoproteins. (1) The GalNAc-Ser(Thr) linkage shown in Figure 47–1 is the predominant linkage. Two typical oligosaccharide chains found in members of this subclass are shown in Figure 47–2. Usually a Gal or a NeuAc residue is attached to the GalNAc, but many variations in the sugar compositions and lengths of such oligosaccharide chains are found. This type of linkage is found in mucins (see below). (2) Proteoglycans contain a Gal-Gal-Xyl-Ser trisaccharide (the so-called link trisaccharide). (3) Collagens contain a Gal-Hydroxylysine (Hyl) linkage. (Subclasses [2] and [3] are discussed further in Chapter 48.) (4) Many nuclear proteins (eg, certain transcription factors) and cytosolic proteins contain side chains consisting of a single GlcNAc attached to a serine or threonine residue (GlcNAc-Ser[Thr]).
FIGURE 47–2 Structures of two O-linked oligosaccharides found in (A) submaxillary mucins and (B) fetuin and in the sialoglycoprotein of the membrane of human red blood cells. (Modified and reproduced, with permission, from Lennarz WJ: The Biochemistry of Glycoproteins and Proteoglycans. Plenum Press, 1980. Reproduced with kind permission from Springer Science and Business Media.)
Mucins Have a High Content of O-Linked Oligosaccharides & Exhibit Repeating Amino Acid Sequences
Mucins are glycoproteins with two major characteristics: (1) a high content of O-linked oligosaccharides (the carbohydrate content of mucins is generally more than 50%); and (2) the presence of variable numbers of tandem repeats (VNTRs) of peptide sequence in the center of their polypeptide backbones, to which the O-glycan chains are attached in clusters (Figure 47–3). These sequences are rich in serine, threonine, and proline. Although O-glycans predominate, mucins often contain a number of N-glycan chains. Both secretory and membrane-bound mucins occur. The former are found in the mucus present in the secretions of the gastrointestinal, respiratory, and reproductive tracts. Mucus consists of about 94% water and 5% mucins, with the remainder being a mixture of various cell molecules, electrolytes, and remnants of cells. Secretory mucins generally have an oligomeric structure and thus often have a very high molecular mass. The oligomers are composed of monomers linked by disulfide bonds. Mucus exhibits a high viscosity and often forms a gel. These qualities are functions of its content of mucins. The high content of O-glycans confers an extended structure on mucins. This is in part explained by steric interactions between their GalNAc moieties and adjacent amino acids, resulting in a chain-stiffening effect so that the conformations of mucins often become those of rigid rods. Intermolecular noncovalent interactions between various sugars on neighboring glycan chains contribute to gel formation. The high content of NeuAc and sulfate residues found in many mucins confers a negative charge on them. With regard to their functions, mucins help lubricate and form a protective physical barrier on epithelial surfaces. Membrane-bound mucins participate in various cell-cell interactions (eg, involving selectins; see below).
The density of oligosaccharide chains makes it difficult for proteases to approach their polypeptide backbones, so that mucins are often resistant to their action. Mucins also tend to “mask” certain surface antigens. Many cancer cells form excessive amounts of mucins; these mucins may mask certain surface antigens on such cells and thus protect the cells from immune surveillance. Mucins also carry cancer-specific peptide and carbohydrate epitopes (an epitope, also called an antigenic determinant, is a site on an antigen recognized by an antibody). Some of these epitopes have been used to stimulate an immune response against cancer cells.
FIGURE 47–3 Much simplified schematic of a mucin. O-glycans (blue) are shown attached to two of many VNTR regions (red). N-glycans may also be present. Mucins generally contain cysteines (not shown) near their N and C termini, which are involved in polymerization via disulfide bridges. Other domains (D) near their N-termini are also involved in polymerization. Membrane-bound mucins contain transmembrane and cytosolic domains, in addition to larger extracellular domains containing O-glycans.
The genes encoding the polypeptide backbones of a number of mucins derived from various tissues (eg, pancreas, small intestine, trachea and bronchi, stomach, and salivary glands) have been cloned and sequenced. These studies have revealed new information about the polypeptide backbones of mucins (size of tandem repeats, potential sites of N-glycosylation, etc) and ultimately should reveal aspects of their genetic control. Some important properties of mucins are summarized in Table 47-8.
TABLE 47–8 Some Properties of Mucins
The Biosynthesis of O-Linked Glycoproteins Uses Nucleotide Sugars
The polypeptide chains of O-linked and other glycoproteins are encoded by mRNA species; because most glycoproteins are membrane-bound or secreted, they are generally translated on membrane-bound polyribosomes (Chapter 37). Hundreds of different oligosaccharide chains of the O-glycosidic-type exist. These glycoproteins are built up by the stepwise donation of sugars from nucleotide sugars, such as UDP-GalNAc, UDP-Gal, and CMP-NeuAc. The enzymes catalyzing this type of reaction are membrane-bound glycoprotein glycosyltransferases. Generally, synthesis of one specific type of linkage requires the activity of a correspondingly specific transferase. The factors that determine which specific serine and threonine residues are glycosylated have not been identified but are probably found in the peptide structure surrounding the glycosylation site. The enzymes assembling O-linked chains are located in the Golgi apparatus, sequentially arranged in an assembly line with terminal reactions occurring in the trans-Golgi compartments.
The major features of the biosynthesis of O-linked glycoproteins are summarized in Table 47-9.
TABLE 47–9 Summary of Main Features of O-Glycosylation
N-LINKED GLYCOPROTEINS CONTAIN AN Asn-GlcNAc LINKAGE
N-Linked glycoproteins are distinguished by the presence of the Asn-GlcNAc linkage (Figure 47–1). They form the major class of glycoproteins and have been much studied, since the most readily accessible glycoproteins (eg, plasma proteins) mainly belong to this group. The class includes both membrane-bound and circulating glycoproteins. The principal difference between this and the previous class, apart from the nature of the amino acid to which the oligosaccharide chain is attached (Asn vs Ser or Thr), concerns their biosynthesis.
Complex, Hybrid & High-Mannose Are the Three Major Classes of N-Linked Oligosaccharides
There are three major classes of N-linked oligosaccharides: complex, hybrid, and high-mannose (Figure 47–4). Each type shares a common pentasaccharide, Man3GlcNAc2—shown within the boxed area in Figure 47–4 and depicted also in Figure 47–5—but they differ in their outer branches. The presence of the common pentasaccharide is explained by the fact that all three classes share an initial common mechanism of biosynthesis. Glycoproteins of the complex type generally contain terminal NeuAc residues and underlying Gal and GlcNAc residues, the latter often constituting the disaccharide N-acetyllactosamine. Repeating N-acetyllactosamine units— [Galβ1-3/4GlcNAcβ1-3]n (poly-N-acetyllactosaminoglycans)—are often found on N-linked glycan chains. I/i blood group substances belong to this class. The majority of complex-type oligosaccharides contain two, three, or four outer branches (Figure 47–4), but structures containing five branches have also been described. The oligosaccharide branches are often referred to as antennae, so that bi-, tri-, tetra-, and penta-antennary structures may all be found. A bewildering number of chains of the complex type exist, and that indicated in Figure 47–4 is only one of many. Other complex chains may terminate in Gal or Fuc. High-mannose oligosaccharides typically have two to six additional Man residues linked to the pentasaccharide core. Hybrid molecules contain features of both of the two other classes.
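The compositions and naming conventions described above can be tabulated with a short sketch. The Python below is only an illustration built from the statements in this section (a Man3GlcNAc2 core, two to six additional Man residues in high-mannose chains, and two to five antennae in complex chains); the function and variable names are hypothetical.

```python
CORE = {"Man": 3, "GlcNAc": 2}  # common pentasaccharide core, Man3GlcNAc2

def high_mannose_compositions(min_extra=2, max_extra=6):
    """List compositions formed by adding 2 to 6 Man residues to the core."""
    return [f"Man{CORE['Man'] + extra}GlcNAc{CORE['GlcNAc']}"
            for extra in range(min_extra, max_extra + 1)]

ANTENNARY_PREFIX = {2: "bi", 3: "tri", 4: "tetra", 5: "penta"}

def antennary_name(n_branches):
    """Name a complex-type chain by its number of outer branches (antennae)."""
    return f"{ANTENNARY_PREFIX[n_branches]}-antennary"

print(high_mannose_compositions())
print(antennary_name(3))
```

The first call lists the compositions from Man5GlcNAc2 to Man9GlcNAc2; the second returns "tri-antennary". Composition alone does not define a structure, since the linkages between the residues also vary.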
FIGURE 47–4 Structures of the major types of asparagine-linked oligosaccharides. The boxed area encloses the pentasaccharide core common to all N-linked glycoproteins. (Reproduced, with permission, from Kornfeld R, Kornfeld S: Assembly of asparagine-linked oligosaccharides. Annu Rev Biochem 1985;54:631. Copyright © 1985 by Annual Reviews. Reprinted with permission.)
FIGURE 47–5 Schematic of the pentasaccharide core common to all N-linked glycoproteins and to which various outer chains of oligosaccharides may be attached. The sites of action of endoglycosidases F and H are also indicated.
The Biosynthesis of N-Linked Glycoproteins Involves Dolichol-P-P-Oligosaccharide
Leloir and his colleagues described the occurrence of a dolichol-pyrophosphate-oligosaccharide (Dol-P-P-oligosaccharide), which subsequent research showed to play a key role in the biosynthesis of N-linked glycoproteins. The oligosaccharide chain of this compound generally has the structure R-GlcNAc2Man9Glc3 (R = Dol-P-P). The sugars of this compound are first assembled on the Dol-P-P backbone, and the oligosaccharide chain is then transferred en bloc to suitable Asn residues of acceptor apoglycoproteins during their synthesis on membrane-bound polyribosomes. All N-glycans have a common pentasaccharide core structure (Figure 47–5).
To form high-mannose