A multitude of different proteins can be formed from only 20 common amino acids because these amino acids can be linked together in an enormous variety of sequences determined by the genetic code. The sequence of amino acids, its primary structure, determines the way a protein folds into a unique three-dimensional structure, which is its native conformation. Once it is folded, the three-dimensional structure of a protein forms binding sites for other molecules, thereby dictating the function of the protein in the body. In addition to creating binding sites, a protein must fold in such a way that it is flexible, stable, able to function in the correct site in the cell, and capable of being degraded by cellular enzymes.
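The scale of this sequence variety can be made concrete with simple arithmetic. The sketch below assumes that each position in a chain can independently hold any of the 20 common amino acids, so a chain of n residues has 20^n possible sequences. The function name and the chain lengths are illustrative choices, not values taken from the text.

```python
# Count the distinct linear sequences (primary structures) possible for a
# chain of a given length, assuming any of the 20 common amino acids can
# occupy each position independently.
NUM_COMMON_AMINO_ACIDS = 20


def possible_sequences(chain_length: int) -> int:
    """Return 20 ** chain_length, the number of possible sequences."""
    if chain_length < 0:
        raise ValueError("chain_length must be non-negative")
    return NUM_COMMON_AMINO_ACIDS ** chain_length


if __name__ == "__main__":
    for n in (2, 10, 100):  # illustrative lengths only
        print(f"{n:>3} residues: {possible_sequences(n):.3e} possible sequences")
```

Even a short peptide of 10 residues has more than ten trillion possible sequences under this assumption, which is why 20 building blocks suffice for a very large number of different proteins.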
Levels of Protein Structure. Protein structure is described in terms of four different levels: primary, secondary, tertiary, and quaternary (Fig. 7.1). The primary structure of a protein is the linear sequence of amino acids in the polypeptide chain. Secondary structure consists of local regions of polypeptide chains formed into structures that are stabilized by a repeating pattern of hydrogen bonds, such as the regular structures called α-helices and β-sheets. The rigidity of the peptide backbone determines the types of secondary structure that can occur. The tertiary structure involves folding of the secondary structural elements into an overall three-dimensional conformation. In globular proteins such as myoglobin, the tertiary structure generally forms a densely packed hydrophobic core with polar amino acid side chains on the outside. Some proteins exhibit quaternary structure, the combination of two or more subunits, each composed of a polypeptide chain.
Domains and Folds. The tertiary structure of a globular protein can be made up of structural domains, regions of structure that fold independently. Multiple domains can be linked together to form a functional protein. Within a domain, a combination of secondary structural elements forms a fold, such as the nucleotide-binding fold or an actin fold. Folds are defined by their similarity in a number of different proteins.
Quaternary Structure. Assembly of globular polypeptide subunits into a multisubunit complex can provide the opportunity for cooperative binding of ligands (e.g., O2 binding to hemoglobin), form binding sites for complex molecules (e.g., antigen binding to immunoglobulin), and increase stability of the protein. The polypeptide chains of fibrous proteins such as collagen are aligned along an axis, have repeating elements, and are extensively linked to each other through hydrogen and covalent bonds.
Ligand Binding. Proteins form binding sites for specific molecules, called ligands (e.g., adenosine triphosphate [ATP] or O2), or for another protein. The affinity of a binding site for its ligand is characterized quantitatively by an association or affinity constant, Ka (or its dissociation constant, Kd, in which Kd = 1/Ka).
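The reciprocal relationship between Ka and Kd can be sketched in a few lines. The conversion follows directly from Kd = 1/Ka as stated above. The fractional-occupancy formula is an added assumption: it is the standard expression for a single class of independent binding sites and does not describe cooperative binding. Function names and units are illustrative.

```python
def dissociation_constant(ka: float) -> float:
    """Convert an association constant Ka (in 1/M) to Kd (in M): Kd = 1 / Ka."""
    if ka <= 0:
        raise ValueError("Ka must be positive")
    return 1.0 / ka


def fraction_bound(ligand_conc: float, kd: float) -> float:
    """Fraction of binding sites occupied at free ligand concentration [L].

    Assumption (not from the text): one class of independent sites,
    so fraction = [L] / (Kd + [L]).
    """
    if ligand_conc < 0 or kd <= 0:
        raise ValueError("ligand_conc must be >= 0 and kd must be > 0")
    return ligand_conc / (kd + ligand_conc)


if __name__ == "__main__":
    kd = dissociation_constant(1e6)  # hypothetical Ka of 1e6 per molar
    for conc in (0.1 * kd, kd, 10 * kd):
        print(f"[L] = {conc:.1e} M -> fraction bound = {fraction_bound(conc, kd):.2f}")
```

Under this simple model, half of the sites are occupied when the free ligand concentration equals Kd, so a smaller Kd (larger Ka) corresponds to tighter binding.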
Folding of Proteins. The primary structure of a protein dictates the way that it folds into its tertiary structure, which is a stable conformation that is identical to the shape of other molecules of the same protein (i.e., its native conformation). Chaperonins act as templates to overcome the kinetic barrier to reaching a stable conformation. Prion proteins cause neurodegenerative diseases by acting as a template for misfolding. Heat, acid, and other agents cause proteins to denature; that is, to unfold or refold and lose their native three-dimensional conformation.
THE WAITING ROOM
Will S., who has sickle cell anemia, was readmitted to the hospital with symptoms indicating that he was experiencing another sickle cell crisis (see Chapter 6). Sickle cell disease is a result of improper aggregation of hemoglobin within the red blood cell.
Anne J. is a 54-year-old woman who arrived at the hospital 4 days ago, about 5 hours after she began to feel chest pain (see Chapter 6). In the emergency department, the physician drew blood for the measurement of cardiac troponin T subunit (cTN-T). The result of this test supported the diagnosis of an acute myocardial infarction (MI), and Anne J. was hospitalized.
Amy L. is a 62-year-old woman who presented with weakness, fatigue, an enlarged tongue (macroglossia), and edema. She had signs and symptoms of cardiac failure, including electrocardiographic abnormalities. Initial laboratory studies showed a serum creatinine of 1.9 mg/dL (reference range [females], 0.5 to 1.1 mg/dL), indicating mild renal failure. A urinalysis indicated the presence of a moderate proteinuria. She was subsequently diagnosed with amyloidosis/AL secondary to a plasma cell dyscrasia. Amyloidosis is a term that encompasses many diseases that share as a common feature the extracellular deposition of pathologic insoluble fibrillar proteins called amyloid in organs and tissues. In Amy L.’s disease, amyloidosis/AL, the amyloid is derived from immunoglobulin light chains (AL = amyloidosis, light-chain–related), and is the most common form of amyloidosis.
Dianne A. returned to her physician’s office for a routine visit to monitor her treatment (see Chapters 4, 5, and 6). Her physician drew blood for an HbA1c (pronounced hemoglobin A-1-c) determination. Her HbA1c was 8.5%, which was above the normal level of <5.6% for a person without diabetes, and of <7.0% for a person with controlled diabetes.
I. General Characteristics of Three-Dimensional Structure
The overall conformation of a protein, the particular position of the amino acid side chains in three-dimensional space, determines the function of the protein.
A. Descriptions of Protein Structure
Proteins are generally grouped into major structural classifications: globular proteins, fibrous proteins, and transmembrane proteins. Globular proteins are usually soluble in aqueous medium and resemble irregular balls. The fibrous proteins are geometrically linear, arranged around a single axis, and have a repeating unit structure. Another general classification, transmembrane proteins, consists of proteins that have one or more regions aligned to cross the lipid membrane (see Fig. 6.10). DNA-binding proteins, although members of the globular protein family, are sometimes classified separately and are considered in Chapter 15.
The structure of proteins is often described according to levels called primary, secondary, tertiary, and quaternary structure (see Fig. 7.1). The primary structure is the linear sequence of amino acid residues joined through peptide bonds to form a polypeptide chain. The secondary structure refers to recurring structures (such as the regular structure of the α-helix) that form in short localized regions of the polypeptide chain. The overall three-dimensional conformation of a protein is its tertiary structure, the summation of its secondary structural elements. The quaternary structure is the association of polypeptide subunits in a geometrically specific manner. The forces involved in a protein folding into its final conformation are primarily noncovalent interactions. These interactions include the attraction between positively and negatively charged molecules (ionic interactions), the hydrophobic effect, hydrogen bonding, and van der Waals interactions (the nonspecific attraction between closely packed atoms).
B. Requirements of the Three-Dimensional Structure
The overall three-dimensional structure of a protein must meet certain requirements to enable the protein to function in the cell or extracellular medium of the body. The first requirement is the creation of a binding site that is specific for just one molecule or a group of molecules with similar structural properties. The specific binding sites of a protein usually define its role. The three-dimensional structure must also exhibit the degrees of flexibility and rigidity appropriate to its specific function. Some rigidity is essential for the creation of binding sites and for a stable structure (i.e., a protein that is excessively flexible would have the potential to be dysfunctional). However, flexibility and mobility in structure enable the protein to fold as it is synthesized and to adapt as it binds other proteins and small molecules. The three-dimensional structure must have an external surface that is appropriate for its environment (e.g., cytoplasmic proteins need to keep polar amino acids on the surface to remain soluble in an aqueous environment). In addition, the conformation must be stable, with little tendency to undergo refolding into a form that cannot fulfill its function or that precipitates in the cell. Finally, the protein must have a structure that can be degraded when it is damaged or no longer needed in the cell.
II. The Three-Dimensional Structure of the Peptide Backbone
The amino acids in a polypeptide chain are joined sequentially by peptide bonds between the carboxyl group of one amino acid and the α-amino group of the next amino acid in the sequence (Fig. 7.2). Usually, the peptide bond assumes a trans configuration in which successive α-carbons and their R groups are located on opposite sides of the peptide bond.
The polypeptide backbone can bend only in a very restricted way. The peptide bond itself is a hybrid of two resonance structures, one of which has double-bond character, so that the carboxyl and amide groups that form the bond must remain planar (see Fig. 7.2). However, rotation within certain allowed angles (torsion angles) can occur around the bond between the α-carbon and the α-amino group and around the bond between the α-carbon and the carbonyl group. This rotation is subject to steric constraints that maximize the distance between atoms in the different amino acid side chains and prohibit torsion (rotation) angles that place the side-chain atoms too close to each other. These folding constraints, which depend on the specific amino acids present, limit the secondary and tertiary structures that can be formed from the polypeptide chain.
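A torsion angle is measured from the positions of four consecutive atoms, as the rotation about the bond joining the middle two. The following is a minimal sketch of that calculation using only the standard library. The function name and the coordinates in the usage lines are hypothetical, idealized points rather than real atomic coordinates, and the sign of the result depends on the convention chosen. (The two rotatable backbone angles described above are conventionally called φ and ψ.)

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p0, p1, p2, p3) -> float:
    """Return the torsion (dihedral) angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    if length == 0:
        raise ValueError("p1 and p2 must be distinct points")
    axis = tuple(c / length for c in b1)
    # Keep only the parts of b0 and b2 that are perpendicular to the bond axis.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    cos_part = _dot(v, w)
    sin_part = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(sin_part, cos_part))


if __name__ == "__main__":
    # Idealized points: outer atoms on opposite sides of the central bond (trans).
    print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
    # Outer atoms on the same side of the central bond (cis).
    print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
```

In this idealized geometry the trans arrangement corresponds to an angle of magnitude 180 degrees and the cis arrangement to 0 degrees, which matches the description of successive α-carbons lying on opposite sides of a trans peptide bond.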
III. Secondary Structure
Regions within polypeptide chains form recurring, localized structures known as secondary structures. The two regular secondary structures called the α-helix and the β-sheet contain repeating elements formed by hydrogen bonding between atoms of the peptide bonds. Other regions of the polypeptide chain form nonregular, nonrepetitive secondary structures such as loops and coils.
A. The α-Helix
The α-helix is a common secondary structural element of globular proteins, membrane-spanning domains, and DNA-binding proteins. It has a rigid, stable conformation that maximizes hydrogen bonding while staying within the allowed rotation angles of the polypeptide backbone. The α-helix is stabilized by hydrogen bonds between each carbonyl oxygen atom of the peptide backbone and the amide hydrogen (N–H) of an amino acid residue located four residues farther down the chain (Fig. 7.3). Thus, each peptide bond is connected by hydrogen bonds to the peptide bond four amino acid residues ahead of it and four amino acid residues behind it in the amino acid sequence. The core of the helix is tightly packed, thereby maximizing association energies between atoms. The trans side chains of the amino acids project backward and outward from the helix, thereby avoiding steric hindrance with the polypeptide backbone and with each other (Fig. 7.4). The amino acid proline, because of its ring structure, cannot form the necessary bond angles to fit within an α-helix. Thus, proline is known as a “helix breaker” and is not found in the middle of α-helical regions of proteins, but can be found at the first or second position of an α-helical region.
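The hydrogen-bonding pattern described above, carbonyl oxygen of residue i to amide hydrogen of residue i + 4, can be enumerated directly. This is a minimal sketch for an idealized helix segment. The function name and the 1-based residue numbering are illustrative assumptions, and real helices may deviate at their ends.

```python
def alpha_helix_hbonds(num_residues: int, offset: int = 4):
    """List (carbonyl_residue, amide_residue) pairs for an idealized helix.

    Residues are numbered from 1. Each carbonyl oxygen of residue i is
    paired with the amide hydrogen of residue i + offset.
    """
    return [(i, i + offset) for i in range(1, num_residues - offset + 1)]


if __name__ == "__main__":
    for carbonyl, amide in alpha_helix_hbonds(12):
        print(f"C=O of residue {carbonyl:>2} ... H-N of residue {amide:>2}")
```

For a 12-residue segment this gives eight backbone hydrogen bonds. The first four amide groups and the last four carbonyl groups have no partner within the segment.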
B. β-Sheets
β-Sheets are a second type of regular secondary structure that maximizes hydrogen bonding between the peptide backbones while maintaining the allowed torsion angles. In β-sheets, the hydrogen bonding usually occurs between regions of separate neighboring polypeptide strands aligned parallel to each other (Fig. 7.5A). Thus, the carbonyl oxygen of one peptide bond is hydrogen-bonded to the amide hydrogen of a peptide bond on an adjacent strand. This pattern contrasts with the α-helix, in which the peptide backbone hydrogen bonds are located within the same strand. Optimal hydrogen bonding occurs when the sheet is bent (pleated) to form β-pleated sheets.
The β-pleated sheet is described as parallel if the polypeptide strands run in the same direction (as defined by their amino and carboxyl terminals) and antiparallel if they run in opposite directions. Antiparallel strands are often the same polypeptide chain folded back on itself, with simple hairpin turns or long runs of polypeptide chain connecting the strands. The amino acid side chains of each polypeptide strand alternate between extending above and below the plane of the β-sheet (see Fig. 7.5). Parallel sheets tend to have hydrophobic residues on both sides of the sheets; antiparallel sheets usually have a hydrophobic side and a hydrophilic side. Frequently, sheets twist in one direction.
The hydrogen-bonding pattern differs slightly between parallel and antiparallel β-sheets (see Fig. 7.5B). In an antiparallel sheet, the atoms involved in hydrogen bonding are directly opposite each other; in a parallel β-sheet, the atoms involved in the hydrogen bonding are slightly skewed from one another, such that one amino acid is hydrogen-bonded to two others in the opposite strand. Because the hydrogen bonds are at an angle in parallel β-sheets, the bonds are weaker than in an antiparallel β-sheet.
C. Nonrepetitive Secondary Structures
α-Helices and β-pleated sheets are patterns of regular structure with a repeating element—the ordered formation of hydrogen bonds. In contrast, bends, loops, and turns are nonregular secondary structures that do not have a repeating element of hydrogen bond formation. They are characterized by an abrupt change of direction and are often found on the protein surface. For example, β-turns are short regions that usually involve four successive amino acid residues. They often connect strands of antiparallel β-sheets (Fig. 7.6). The surface of large globular proteins usually has at least one omega loop: a structure with a neck like the capital Greek letter omega (Ω).
D. Patterns of Secondary Structure
Figure 7.7 is a three-dimensional drawing of a globular domain in the soluble enzyme lactate dehydrogenase (LDH). It illustrates the combination of secondary structural elements to form patterns. This LDH domain is typical of globular proteins, which average approximately 31% α-helical structure and approximately 28% β-pleated sheets (with a wide range of variation). The helices of globular domains have an average span of approximately 12 residues, corresponding to approximately three to four helical turns, although many are much longer. The β-sheets, represented in diagrams by an arrow for each strand, are an average of six residues long and six strands wide (2 to 15 strands). Like the β-sheet in the LDH domain, they generally twist to the right rather than lie flat (see Fig. 7.7). Most globular domains, such as this LDH domain, also contain motifs. Motifs are relatively small arrangements of secondary structure that are recognized in many different proteins. For example, certain of the β-strands are connected with α-helices to form the βαβαβ structural motif.
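The statement that a 12-residue helix spans about three to four turns can be checked with a short calculation. The sketch below assumes the commonly cited geometry of an ideal α-helix, 3.6 residues per turn and a rise of 1.5 Å per residue; these two constants are not given in the text above and are stated here as assumptions.

```python
# Assumed values for an ideal alpha-helix (not stated in the text above).
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE_ANGSTROM = 1.5


def helix_turns(num_residues: int) -> float:
    """Number of helical turns spanned by a helix of the given length."""
    return num_residues / RESIDUES_PER_TURN


def helix_length_angstrom(num_residues: int) -> float:
    """Approximate length of the helix along its axis, in angstroms."""
    return num_residues * RISE_PER_RESIDUE_ANGSTROM


if __name__ == "__main__":
    n = 12  # average helix span quoted for globular domains
    print(f"{n} residues: {helix_turns(n):.1f} turns, "
          f"about {helix_length_angstrom(n):.0f} angstroms long")
```

With these assumed constants, 12 residues correspond to a little over three turns, consistent with the range quoted in the text.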
The remaining polypeptide segments connecting the helices and β-sheets are said to have a coil or loop conformation (see Fig. 7.7). Although some of the connecting segments recognized in many proteins have been given names (like the Ω-loops), other segments such as those in this LDH domain appear disordered or irregular. These nonregular regions, generally called coils, should never be referred to as “random coils.” They are neither truly disordered nor random; they are stabilized through specific hydrogen bonds dictated by the primary sequence of the protein and do not vary from one molecule of the protein to another of the same protein.
The nonregular coils, loops, and other segments are usually more flexible than the relatively rigid helices and β-pleated sheets. They often form hinge regions that allow segments of the polypeptide chain to move as a compound binds or to move as the protein folds around another molecule.
IV. Tertiary Structure
The tertiary structure of a protein is the pattern of the secondary structural elements folding into a three-dimensional conformation, as shown for the LDH domain in Figure 7.7. The three-dimensional structure is flexible and dynamic, with rapidly fluctuating movement in the exact positions of amino acid side chains and domains. These fluctuating movements take place without unfolding of the protein. They allow ions and water to diffuse through the structure and provide alternative conformations for ligand binding. As illustrated with examples later in this chapter, this three-dimensional structure is designed to serve all aspects of the protein’s function. It creates specific and flexible binding sites for ligands (the compounds that bind), illustrated with actin and myoglobin. The tertiary structure also maintains residues on the surface appropriate for the protein’s cellular location: polar residues for cytosolic proteins and hydrophobic residues for transmembrane proteins (illustrated with the β2-adrenergic receptor). Flexibility is one of the most important features of protein structure. The forces that maintain tertiary structure are hydrogen bonds, ionic bonds, van der Waals interactions, the hydrophobic effect, and disulfide bond formation.
A. Domains in the Tertiary Structure
The tertiary structure of large complex proteins is often described in terms of physically independent regions called structural domains. You can usually identify domains from visual examination of a three-dimensional figure of a protein, such as the three-dimensional figure of G-actin shown in Figure 7.8. Each domain is formed from a continuous sequence of amino acids in the polypeptide chain that are folded into a three-dimensional structure independently of the rest of the protein, and two domains are connected through a simpler structure such as a loop (e.g., the hinge region of Fig. 7.8). The structural features of each domain can be discussed independently of another domain in the same protein, and the structural features of one domain may not match those of other domains in the same protein.
B. Folds in Globular Proteins
Folds are relatively large patterns of three-dimensional structure that have been recognized in many proteins, including proteins from different branches of the phylogenetic tree. Over 1,000 folds have now been recognized, and it is predicted that there are only a few thousand different folds for all the proteins that have ever existed. A characteristic activity is associated with each fold, such as adenosine triphosphate (ATP) binding and hydrolysis (the actin fold, see Fig. 7.8) or NAD+ (oxidized nicotinamide adenine dinucleotide) binding (the nucleotide-binding fold, see Fig. 7.7).
C. The Solubility of Globular Proteins in an Aqueous Environment
Most globular proteins are soluble in the cell. In general, the core of a globular domain has a high content of amino acids with nonpolar side chains (valine, leucine, isoleucine, methionine, and phenylalanine), out of contact with the aqueous medium (the hydrophobic effect). This hydrophobic core is densely packed to maximize attractive van der Waals forces, which exert themselves over short distances. The charged polar amino acid side chains (arginine, histidine, lysine, aspartate, and glutamate) are generally located on the surface of the protein, where they form ion pairs (salt bridges, ionic interactions) or are in contact with aqueous solvent. Charged side chains often bind inorganic ions (e.g., K⁺, PO₄³⁻, or Cl⁻) to decrease repulsion between like charges. When charged amino acids are located on the interior, they are generally involved in forming specific binding sites. The polar uncharged amino acid side chains of serine, threonine, asparagine, glutamine, tyrosine, and tryptophan are also usually found on the surface of the protein, but they may occur in the interior, hydrogen-bonded to other side chains. Cysteine disulfide bonds (the bond formed by two cysteine sulfhydryl groups) are sometimes involved in the formation of tertiary structure, where they add stability to the protein. However, their formation in soluble globular proteins is infrequent.
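The residue groupings in the paragraph above can be captured in a small lookup. This sketch encodes only the three lists given in the text, using the standard three-letter amino acid abbreviations; the function name and the location labels are illustrative, and residues the paragraph does not place in one of these three lists are reported as unclassified rather than guessed.

```python
# Residue groups exactly as listed in the paragraph above.
NONPOLAR_CORE = {"Val", "Leu", "Ile", "Met", "Phe"}
CHARGED_POLAR = {"Arg", "His", "Lys", "Asp", "Glu"}
UNCHARGED_POLAR = {"Ser", "Thr", "Asn", "Gln", "Tyr", "Trp"}


def typical_location(residue: str) -> str:
    """Return where a residue is generally found in a soluble globular protein."""
    if residue in NONPOLAR_CORE:
        return "core"
    if residue in CHARGED_POLAR:
        return "surface"
    if residue in UNCHARGED_POLAR:
        return "surface or hydrogen-bonded interior"
    return "unclassified"  # not assigned to a group in the paragraph above


if __name__ == "__main__":
    for res in ("Leu", "Lys", "Ser", "Gly"):
        print(f"{res}: {typical_location(res)}")
```

These are tendencies rather than rules: as the text notes, charged residues do occur in the interior when they help form specific binding sites.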
D. Tertiary Structure of Transmembrane Proteins
Transmembrane proteins, such as the β2-adrenergic receptor, contain membrane-spanning domains and intra- and extracellular domains on either side of the membrane (Fig. 7.9). Many ion-channel proteins, transport proteins, neurotransmitter receptors, and hormone receptors contain similar membrane-spanning segments that are α-helices with hydrophobic residues exposed to the lipid bilayer. These rigid helices are connected by loops containing hydrophilic amino acid side chains that extend into the aqueous medium on both sides of the membrane. In the β2-adrenergic receptor, the helices clump together so that the extracellular loops form a surface that acts as a binding site for the hormone adrenaline (epinephrine)—our fight-or-flight hormone. The binding site is sometimes referred to as a binding domain (a functional domain), even though it is not formed from a continuous segment of the polypeptide chain. Once adrenaline binds to the receptor, a conformational change in the arrangement of rigid helical structures is transmitted to the intracellular domains that form a binding site for another signaling protein, a heterotrimeric G-protein (a guanosine triphosphate [GTP]-binding protein composed of three different subunits, which is described further in Chapter 10). Thus, receptors require both rigidity and flexibility to transmit signals across the cell membrane.
As discussed in Chapter 6, transmembrane proteins usually have a number of posttranslational modifications that provide additional chemical groups to fulfill requirements of the three-dimensional structure. As shown in Fig. 7.9 (and see Fig. 6.13), the amino terminus of the β2-adrenergic receptor (residues 1 to 34) extends out of the membrane and has branched high-mannose oligosaccharides linked through N-glycosidic bonds to the amide of asparagine. Part of the receptor is anchored in the lipid plasma membrane by a palmitoyl group that forms a thioester with the –SH residue of a cysteine. The carboxyl terminus, which extends into the cytoplasm, has a number of serine and threonine phosphorylation sites (shown as red circles) that regulate receptor activity.
V. Quaternary Structure
The quaternary structure of a protein refers to the association of individual polypeptide chain subunits in a geometrically and stoichiometrically specific manner. Many proteins function in the cell as dimers, tetramers, or oligomers, proteins in which two, four, or more subunits, respectively, have combined to make one functional protein. The subunits of a particular protein always combine in the same number and in the same way, because the binding between the subunits is dictated by the tertiary structure, which is dictated by the primary structure, which is determined by the genetic code.
A number of different terms are used to describe subunit structure. The prefix homo- or hetero- is used to describe identical or different subunits, respectively, of two-, three-, or four-subunit proteins (e.g., heterotrimeric G-proteins have three different subunits). A protomer is the unit structure composed of nonidentical subunits. For example, adult hemoglobin consists of two α- and two β-chains and is a tetramer (α2β2). One α-β pair can be considered a protomer. In contrast, F-actin is an oligomer, a multisubunit protein composed of identical G-actin subunits. Multimer is sometimes used as a more generic term to designate a complex with many subunits of more than one type.
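The naming scheme above can be expressed as a small function that maps a subunit composition to a descriptive term. This is a simplified sketch: the function name, the dictionary input format, and the choice to label larger complexes as "oligomer" (one subunit type) or "multimer" (more than one type) are assumptions made for illustration, following the usage described in the text.

```python
SIZE_NAMES = {2: "dimer", 3: "trimer", 4: "tetramer"}


def describe_complex(subunits: dict) -> str:
    """Name a complex from a mapping of subunit type to copy number."""
    counts = {name: n for name, n in subunits.items() if n > 0}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one subunit is required")
    if total == 1:
        return "monomer"
    identical = len(counts) == 1
    if total in SIZE_NAMES:
        prefix = "homo" if identical else "hetero"
        return prefix + SIZE_NAMES[total]
    # Larger assemblies: generic terms as used in the text.
    return "oligomer" if identical else "multimer"


if __name__ == "__main__":
    print(describe_complex({"alpha": 2, "beta": 2}))              # adult hemoglobin
    print(describe_complex({"alpha": 1, "beta": 1, "gamma": 1}))  # G-protein
    print(describe_complex({"G-actin": 100}))                     # F-actin, hypothetical count
```

The copy number used for F-actin is a placeholder; the text specifies only that F-actin is built from many identical G-actin subunits.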
The contact regions between the subunits of globular proteins resemble the interior of a single-subunit protein; they contain closely packed nonpolar side chains, hydrogen bonds involving the polypeptide backbones and their side chains, and occasional ionic bonds or salt bridges. The subunits of globular proteins are very rarely held together by interchain disulfide bonds and never by other covalent bonds. In contrast, fibrous and other structural proteins may be extensively linked to other proteins through covalent bonds.
Assembly into a multisubunit structure increases the stability of a protein. The increase in size increases the number of possible interactions between amino acid residues and therefore makes it more difficult for a protein to unfold and refold. As a result, many soluble proteins are composed of two or four identical or nearly identical subunits with an average size of approximately 200 amino acids. Assembly into a multisubunit protein can also aid the protein's function.
A multisubunit structure has many advantages besides increased stability. It may enable the protein to exhibit cooperativity between subunits in binding ligands (illustrated later with hemoglobin) or to form binding sites with a high affinity for large molecules (illustrated with antigen binding to the immunoglobulin molecule, IgG). An additional advantage of a multisubunit structure is that the different subunits can have different activities and cooperate in a common function. Examples of enzymes that have regulatory subunits or exist as multiprotein complexes are provided in Chapter 9.
VI. Quantitation of Ligand Binding
In the examples of tertiary structure we have discussed, the folding of a protein created a three-dimensional binding site for a ligand (NAD+ for the LDH domain 1, ATP for G-actin, or adrenaline for the β2-adrenergic receptor). The binding affinity of a protein for a ligand is described quantitatively by its association constant, Ka, which is the equilibrium constant for the binding reaction of a ligand (L) with a protein (P) (Equation 7.1).
Equation 7.1. The association constant, Ka, for a binding site on a protein.
Consider a reaction in which a ligand (L) binds to a protein (P) to form a ligand–protein complex (LP) with a rate constant of k1. LP dissociates with a rate constant of k2:
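The equation introduced by the sentence above is not reproduced in this text. A standard reconstruction is sketched here, assuming the usual mass-action treatment in which k1 is the rate constant for association and k2 is the rate constant for dissociation, and using the relationship Kd = 1/Ka given earlier in the chapter:

```latex
\[
L + P \;\underset{k_2}{\overset{k_1}{\rightleftharpoons}}\; LP
\]
\[
K_a \;=\; \frac{k_1}{k_2} \;=\; \frac{[LP]}{[L]\,[P]} \;=\; \frac{1}{K_d}
\]
```

At equilibrium the rate of association, k1[L][P], equals the rate of dissociation, k2[LP], and rearranging that equality gives the ratio shown.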