CHAPTER OUTLINE
Secondary Structure in Proteins
Tertiary Structure of Proteins
Forces Controlling Protein Structure
High-Yield Terms
Fibrous protein: any protein that is generally insoluble in water and exists in an elongated and rigid conformation; most structural proteins are fibrous
Globular protein: any protein that is generally soluble in water and exists in more compact spherical conformation; most functional proteins are globular
Primary Structure in Proteins
The primary structure of peptides and proteins refers to the linear number and order of the amino acids present. The convention for the designation of the order of amino acids is that the N-terminal end (ie, the end bearing the residue with the free α-amino group) is to the left (and the number 1 amino acid) and the C-terminal end (ie, the end with the residue containing a free α-carboxyl group) is to the right.
Murray RK, Bender DA, Botham KM, Kennelly PJ, Rodwell VW, Weil PA. Harper’s Illustrated Biochemistry, 29th ed. New York, NY: McGraw-Hill; 2012.
Secondary Structure in Proteins
The ordered array of amino acids in a protein confers regular conformational forms upon that protein. These conformations constitute the secondary structures of a protein. In general, proteins fold into 2 broad classes of structure termed globular proteins or fibrous proteins. Globular proteins are compactly folded and coiled, whereas fibrous proteins are more filamentous or elongated. It is the partial double-bond character of the peptide bond that defines the conformations a polypeptide chain may assume. Within a single protein, different regions of the polypeptide chain may assume different conformations determined by the primary sequence of the amino acids.
The α-Helix
The α-helix is a common secondary structure encountered in proteins of the globular class. The formation of the α-helix is spontaneous and is stabilized by H-bonding between amide nitrogens and carbonyl oxygens of peptide bonds spaced 4 residues apart. This orientation of H-bonding produces a helical coiling of the peptide backbone such that the R-groups lie on the exterior of the helix and perpendicular to its axis (Figure 5-1).
FIGURE 5-1: Orientation of the main chain atoms of a peptide about the axis of an α helix. Murray RK, Bender DA, Botham KM, Kennelly PJ, Rodwell VW, Weil PA. Harper’s Illustrated Biochemistry, 29th ed. New York, NY: McGraw-Hill; 2012.
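The 4-residue spacing of the helical H-bonds can be made concrete with a short sketch. It assumes an idealized, uninterrupted helix and numbers residues from 1 at the N-terminus; the function name `helix_hbond_pairs` is hypothetical and chosen only for illustration. In this idealized picture the carbonyl oxygen of residue i accepts a hydrogen bond from the amide nitrogen of residue i + 4.

```python
def helix_hbond_pairs(n_residues):
    """Return (acceptor, donor) residue numbers for an idealized alpha-helix.

    Residues are numbered from 1 at the N-terminus. The carbonyl oxygen of
    residue i is paired with the amide nitrogen of residue i + 4, so a
    helix shorter than 5 residues has no such pairs.
    """
    return [(i, i + 4) for i in range(1, n_residues - 3)]


if __name__ == "__main__":
    for acceptor, donor in helix_hbond_pairs(10):
        print(f"C=O of residue {acceptor} ... H-N of residue {donor}")
```

Real helices deviate from this pattern at their ends and wherever a helix-disrupting residue occurs, so the list is best read as the ideal case.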
Not all amino acids favor the formation of the α-helix due to steric constraints of the R-groups. Amino acids such as A, D, E, I, L, and M favor the formation of α-helices, whereas G and P favor disruption of the helix. This is particularly true for P since it is a pyrrolidine-based imino acid (HN=) whose structure significantly restricts movement about the peptide bond in which it is present, thereby interfering with extension of the helix. The disruption of the helix is important as it introduces additional folding of the polypeptide backbone to allow the formation of globular proteins.
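As a rough illustration of these preferences, the sketch below labels each residue of a one-letter sequence using only the two groups named above. The function names are hypothetical, residues outside both groups are simply reported as unclassified, and the result is a crude annotation rather than a secondary structure prediction.

```python
HELIX_FAVORING = set("ADEILM")  # residues the text lists as favoring helices
HELIX_DISRUPTING = set("GP")    # residues the text lists as disrupting helices


def classify_residues(sequence):
    """Label each residue as 'favoring', 'disrupting', or 'unclassified'."""
    labels = []
    for aa in sequence.upper():
        if aa in HELIX_FAVORING:
            labels.append("favoring")
        elif aa in HELIX_DISRUPTING:
            labels.append("disrupting")
        else:
            labels.append("unclassified")
    return labels


def disruptor_positions(sequence):
    """Return 1-based positions (counted from the N-terminus) of G and P."""
    return [i for i, aa in enumerate(sequence.upper(), start=1)
            if aa in HELIX_DISRUPTING]


if __name__ == "__main__":
    peptide = "AELGMP"  # made-up example sequence
    print(classify_residues(peptide))
    print(disruptor_positions(peptide))
```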
β-Sheets
An α-helix is composed of a single linear array of helically disposed amino acids, whereas β-sheets are composed of 2 or more different regions, each a stretch of at least 5 to 10 amino acids. The folding and alignment of stretches of the polypeptide backbone alongside one another to form β-sheets is stabilized by H-bonding between amide nitrogens and carbonyl oxygens. However, the H-bonding residues are present in adjacently opposed stretches of the polypeptide backbone as opposed to a linearly contiguous region of the backbone in the α-helix. β-sheets are said to be pleated. This is due to positioning of the α-carbons of the polypeptide backbone, which alternate above and below the plane of the sheet. β-sheets are either parallel or antiparallel. In parallel sheets, adjacent peptide chains proceed in the same direction (ie, the direction of N-terminal to C-terminal ends is the same), whereas in antiparallel sheets adjacent chains are aligned in opposite directions. β-sheets can be depicted in ball and stick format or as ribbons in certain protein formats (Figure 5-2).
FIGURE 5-2: Spacing and bond angles of the hydrogen bonds of antiparallel and parallel pleated β sheets. Arrows indicate the direction of each strand. Hydrogen bonds are indicated by dotted lines with the participating α-nitrogen atoms (hydrogen donors) and oxygen atoms (hydrogen acceptors) shown in blue and red, respectively. Backbone carbon atoms are shown in black. For clarity in presentation, R groups and hydrogen atoms are omitted. Top: Antiparallel β sheet. Pairs of hydrogen bonds alternate between being close together and wide apart and are oriented approximately perpendicular to the polypeptide backbone. Bottom: Parallel β sheet. The hydrogen bonds are evenly spaced but slant in alternate directions. Murray RK, Bender DA, Botham KM, Kennelly PJ, Rodwell VW, Weil PA. Harper’s Illustrated Biochemistry, 29th ed. New York, NY: McGraw-Hill; 2012.
Super-Secondary Structure
Some proteins contain an ordered organization of secondary structures that form distinct functional domains or structural motifs. Examples include the helix-turn-helix domain of bacterial proteins that regulate transcription and the leucine zipper, helix-loop-helix and zinc finger domains of eukaryotic transcriptional regulators. These domains are termed super-secondary structures.
Tertiary Structure of Proteins
Tertiary structure refers to the complete 3-dimensional structure of the polypeptide units of a given protein. Included in this description is the spatial relationship of different secondary structures to one another within a polypeptide chain and how these secondary structures themselves fold into the 3-dimensional form of the protein. Secondary structures of proteins often constitute distinct domains. Therefore, tertiary structure also describes the relationship of different domains to one another within a protein (Figure 5-3).
FIGURE 5-3: Examples of the tertiary structure of proteins. Top: The enzyme triose phosphate isomerase complexed with the substrate analog 2-phosphoglycerate (red). Note the elegant and symmetrical arrangement of alternating β sheets (light blue) and α helices (green), with the β sheets forming a β-barrel core surrounded by the helices. (Adapted from Protein Data Bank ID no. 1o5x.) Bottom: Lysozyme complexed with the substrate analog penta-N-acetyl chitopentaose (red). The color of the polypeptide chain is graded along the visible spectrum from purple (N-terminal) to tan (C-terminal). Notice how the concave shape of the domain forms a binding pocket for the pentasaccharide, the lack of β sheet, and the high proportion of loops and bends. (Adapted from Protein Data Bank ID no. 1sfb). Murray RK, Bender DA, Botham KM, Kennelly PJ, Rodwell VW, Weil PA. Harper’s Illustrated Biochemistry, 29th ed. New York, NY: McGraw-Hill; 2012.
Forces Controlling Protein Structure
The interaction of different domains is governed by several forces, including hydrogen bonding, hydrophobic interactions, electrostatic interactions, and van der Waals forces.
Hydrogen Bonding
Polypeptides contain numerous proton donors and acceptors both in their backbone and in the R-groups of the amino acids. The environment in which proteins are found also contains the ample H-bond donors and acceptors of the water molecule. H-bonding, therefore, occurs not only within and between polypeptide chains but with the surrounding aqueous medium.
Hydrophobic Forces
Proteins are composed of amino acids that contain either hydrophilic or hydrophobic R-groups. It is the nature of interaction of different R-groups with the aqueous environment that plays a major role in shaping protein structure. The spontaneous folded state of globular proteins is a reflection of a balance between the opposing energetics of H-bonding between hydrophilic R-groups and the aqueous environment and the repulsion from the aqueous environment by the hydrophobic R-groups. The hydrophobicity of certain amino acid R-groups tends to drive them away from the exterior of proteins and into the interior. This driving force restricts the available conformations into which a protein may fold.
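One way to put a number on the hydrophobic character of a sequence is to average a per-residue hydropathy score. The sketch below uses the Kyte-Doolittle hydropathy scale, which is a technique brought in here for illustration and is not part of this chapter; the function name `mean_hydropathy` is likewise hypothetical. Positive averages indicate a predominantly hydrophobic sequence and negative averages a predominantly hydrophilic one.

```python
# Kyte-Doolittle hydropathy values (one-letter amino acid codes).
# Using this particular scale is an assumption made for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Return the average Kyte-Doolittle score of a one-letter sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    for peptide in ("ILVF", "KDES"):  # made-up example peptides
        print(peptide, round(mean_hydropathy(peptide), 2))
```

A sequence average says nothing about where in the folded protein a residue ends up; it only summarizes the composition that the hydrophobic driving force acts on.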
Electrostatic Forces
Electrostatic forces are mainly of 3 types: charge-charge, charge-dipole, and dipole-dipole. Typical charge-charge interactions that favor protein folding are those between oppositely charged R-groups such as K or R and D or E. A substantial component of the energy involved in protein folding is charge-dipole interactions. This refers to the interaction of ionized R-groups of amino acids with the dipole of the water molecule.
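A simple tally of the charged R-groups named above gives a first impression of how many charge-charge partners a sequence can offer. The sketch assumes K and R carry a positive charge and D and E a negative charge, and it ignores histidine, the chain termini, and any shift in ionization caused by the local environment. The function name `charge_summary` is hypothetical.

```python
POSITIVE = set("KR")  # R-groups treated as positively charged (assumption)
NEGATIVE = set("DE")  # R-groups treated as negatively charged (assumption)


def charge_summary(sequence):
    """Count charged R-groups and return a crude net charge."""
    seq = sequence.upper()
    positive = sum(aa in POSITIVE for aa in seq)
    negative = sum(aa in NEGATIVE for aa in seq)
    return {"positive": positive, "negative": negative,
            "net": positive - negative}


if __name__ == "__main__":
    print(charge_summary("MKRDEALK"))  # made-up example peptide
```

Whether a given pair of oppositely charged residues actually interacts depends on their proximity in the folded structure, which the sequence alone does not reveal.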
van der Waals Forces
There are both attractive and repulsive van der Waals forces that control protein folding. Attractive van der Waals forces involve the interactions among induced dipoles that arise from fluctuations in the charge densities that occur between adjacent uncharged nonbonded atoms. Repulsive van der Waals forces involve the interactions that occur when uncharged nonbonded atoms come very close together but do not induce dipoles. The repulsion is the result of the electron-electron repulsion that occurs as 2 clouds of electrons begin to overlap.
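The balance between attraction and repulsion is often modeled with the Lennard-Jones 12-6 potential. That model is introduced here as an illustration and is not specified in this chapter; the parameters `epsilon` (well depth) and `sigma` (separation at which the energy is zero) are set to arbitrary unit values. The energy is positive (repulsive) at very short separations, reaches its minimum of -epsilon at r = 2^(1/6) × sigma, and approaches zero at long range.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones 12-6 energy for two nonbonded atoms separated by r.

    epsilon and sigma are illustrative unit values, not measured parameters.
    """
    if r <= 0:
        raise ValueError("separation must be positive")
    ratio6 = (sigma / r) ** 6
    # The 12th-power term models electron cloud overlap (repulsion);
    # the 6th-power term models induced-dipole attraction.
    return 4.0 * epsilon * (ratio6 ** 2 - ratio6)


if __name__ == "__main__":
    for r in (0.9, 1.0, 2 ** (1 / 6), 1.5, 3.0):
        print(f"r = {r:.3f}  energy = {lennard_jones(r):+.4f}")
```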
Although van der Waals forces are extremely weak relative to other forces governing conformation, it is the huge number of such interactions that occur in large protein molecules that make them significant to the folding of proteins.
Quaternary Structure
Many proteins contain 2 or more different polypeptide chains that are held in association by the same noncovalent forces that stabilize the tertiary structures of proteins. Proteins with multiple polypeptide chains are oligomeric proteins. The overall structure formed by the interaction of at least 2 protein subunits in an oligomeric protein is known as quaternary structure. Oligomeric proteins can be composed of multiple identical polypeptide chains or multiple distinct polypeptide chains. Proteins with identical subunits are termed homo-oligomers. Proteins containing several distinct polypeptide chains are termed hetero-oligomers. Hemoglobin, the oxygen-carrying protein of the blood, is a heterotetrameric protein that can be considered one of the most clinically significant oligomeric proteins in the body.
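The distinction between homo-oligomers and hetero-oligomers can be expressed as a small classification over a list of subunit labels. This is a sketch only: the function name `classify_oligomer` is hypothetical, and the hemoglobin example assumes the adult form with 2 α and 2 β chains.

```python
def classify_oligomer(subunits):
    """Classify a protein from a list of subunit labels.

    Identical labels denote identical polypeptide chains.
    """
    if len(subunits) < 2:
        return "monomer"
    if len(set(subunits)) == 1:
        return "homo-oligomer"
    return "hetero-oligomer"


if __name__ == "__main__":
    hemoglobin = ["alpha", "alpha", "beta", "beta"]  # assumed adult form
    print(len(hemoglobin), "subunits:", classify_oligomer(hemoglobin))
```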
High-Yield Concept
In the broadest terms, fibrous proteins are involved in imparting structural characteristics to cellular constituents, whereas globular proteins represent the bulk of functional proteins in the cell, for example, the enzymes.
Major Protein Forms
All proteins in the human body can be grouped into one of 2 broad categories called fibrous or globular proteins.
Fibrous proteins are so called because they fold into long filamentous shapes and exhibit strong, inflexible structural characteristics. In addition, fibrous proteins are generally insoluble in water. The predominant fibrous proteins are the collagens, keratins, and elastins, which are used to construct connective tissues, muscle fibers, tendons, and the matrix of bones. As the name implies, globular proteins form spherical shapes and, unlike fibrous proteins, are more flexible and can more actively interact with the aqueous environment.
Collagens
In their various forms, the collagen proteins are the most abundant proteins in the body and represent one of the most clinically significant families of fibrous proteins (see Clinical Box 5-1). There are at least 28 different types of functional collagen protein whose subunits are encoded by 30 distinct collagen genes (see Table 39-1).
CLINICAL BOX 5-1: CONNECTIVE TISSUE DISORDERS
Collagens are the most abundant proteins in the body. Alterations in collagen structure arising from abnormal genes or abnormal processing of collagen proteins result in numerous diseases, including Larsen syndrome, scurvy, osteogenesis imperfecta (OI), and Ehlers-Danlos syndrome (EDS).
Ehlers-Danlos syndrome is actually the name associated with at least 9 distinct disorders that are biochemically and clinically distinct yet all manifest structural weakness in connective tissue as a result of defective collagen structure. These 9 disorders are designated EDS I-VIII and X, with EDS I and EDS II being referred to as classical EDS. The disorder that was originally designated as EDS IX is now more commonly called X-linked cutis laxa or occipital horn syndrome. The major manifestations of EDS are skin fragility and hyperextensibility and joint hypermobility. To date, mutations in 8 genes involved in collagen synthesis or processing have been identified as causing the EDS phenotypes. EDS I results from defects in the COL5A1 and COL5A2 genes. EDS II also results from mutations in COL5A1, but the symptoms of this form of the disease are less severe. EDS IV is called the arterial or vascular form of the disease and results from defects in the COL3A1 gene. EDS VI is the ocular or kyphoscoliosis form of the disease. This form of EDS results from deficiencies in the activity of lysyl hydroxylase, which is responsible for proper posttranslational processing of certain collagen types. Lysyl hydroxylase is encoded by the procollagen-lysine 2-oxoglutarate 5-dioxygenase 1 (PLOD1) gene.
OI also encompasses more than one disorder. At least 4 biochemically and clinically distinguishable maladies have been identified as OI, all of which are characterized by bone fragility leading to multiple fractures and resultant bone deformities. All 4 forms of OI are due to defects in type I collagens. Type I OI is a mild form of the disease and also the most commonly encountered with a frequency of approximately 1 in 10,000 live births. Type I OI results from null mutations in the COL1A1 gene. Type II OI is the most severe form and is referred to as the perinatal lethal form. These infants exhibit characteristic facial features that include dark sclera, a beaked nose, and an extremely soft calvarium. Type II OI results from mutations in both the COL1A1 and COL1A2 genes. The outlook for type II OI patients is grim with life spans of only minutes to a few months. Death is usually the result of congestive heart failure, pulmonary insufficiency, or infection.
Each of the collagens contains a region of triple helix in which 3 polypeptide chains, each a left-handed helix, wind around one another into a right-handed superhelix; this structure is imparted by a high density of glycine and proline residues (Figure 5-4). In some collagens the entire molecule is composed of this triple helix, whereas in other forms the triple helix represents only a small portion of the overall molecule. The glycine residues within the triple helical portion of the molecule are present in the repeating sequence Gly-X-Y. The designation X and Y refers to the fact that any amino acid can occupy those positions. However, within the triple helix, proline (Pro) and hydroxyproline (Hyp) are present at high frequency. Hydroxyproline arises from a posttranslational modification of proline residues within collagen molecules via the action of a vitamin C–dependent prolyl hydroxylase. The presence of both proline and hydroxyproline imparts rigidity to the triple helix of the collagen molecule. Another posttranslationally hydroxylated amino acid found in several types of collagen molecule is hydroxylysine (Hyl). Collagen molecules are also modified by glycosylation, specifically on Hyl residues in the molecule.
FIGURE 5-4: Triple helical structure of a typical collagen molecule. Reproduced with permission of themedicalbiochemistrypage, LLC.
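The Gly-X-Y repeat lends itself to a simple sequence check. The sketch below assumes one-letter amino acid codes and a sequence that begins at the first glycine of the repeat; the function name `is_gly_x_y_repeat` is hypothetical. Because hydroxyproline is a posttranslational modification, it appears as proline (P) in a gene-derived sequence.

```python
def is_gly_x_y_repeat(sequence):
    """Return True if every third residue, starting with the first, is Gly.

    The sequence must consist of whole triplets; X and Y may be any residue.
    """
    seq = sequence.upper()
    if not seq or len(seq) % 3 != 0:
        return False
    return all(seq[i] == "G" for i in range(0, len(seq), 3))


if __name__ == "__main__":
    for peptide in ("GPPGPAGPK", "GPPAPAGPK"):  # made-up example peptides
        print(peptide, is_gly_x_y_repeat(peptide))
```

A single non-glycine residue in the first position of a triplet is enough to fail the check.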