Structure, Nomenclature, and Properties of Proteins and Amino Acids

Martha H. Stipanuk, PhD

Proteins were first recognized as a distinct class of biological molecules in the eighteenth century by Antoine Fourcroy and others, evidenced by the ability of egg whites, wheat gluten, plasma albumin, and fibrin (from clotted blood) to coagulate when treated with heat or acid. The Dutch chemist Gerardus Johannes Mulder carried out elemental analysis of common proteins and found that nearly all proteins had a similar empirical formula, C400H620N100O120P1S1, leading him to conclude that all “albuminous” compounds might be composed mainly of a single type of compound. The name “protein” (from the Greek word proteios, meaning “primary”) was first given to this class of molecules in 1838 by Mulder’s associate, Jöns Jakob Berzelius.

We now know that proteins, often called polypeptides, are made up of amino acid residues linked by peptide bonds. Proteins are involved in essentially every process that takes place in cells, and these proteins have a remarkable diversity of functions. Proteins function as enzymes, transcription factors, binding proteins, transmembrane transporters and channels, hormones, immunoglobulins, motor proteins, receptors, structural proteins, and signaling proteins.

The human genome contains about 23,000 protein-coding genes, and proteins make up 20% to 50% of the dry mass of the adult human body, with fat being the other major component. These proteins and peptides are synthesized using amino acids as the building blocks, much as complex carbohydrates are synthesized using sugar residues as the building blocks. In addition to the important role of amino acids as precursors for protein synthesis, amino acids have important roles as intermediates in metabolism; as precursors of nonpeptide compounds, such as the neurotransmitter γ-aminobutyric acid and the coenzyme nicotinamide adenine dinucleotide; and as precursors for synthesis of several unique small peptides, such as glutathione.

In terms of diet, humans and other animals must consume protein to meet needs for amino acids, including specific amino acids that cannot be synthesized by the organism. Protein in human diets comes from both animal sources, such as meat, milk, fish, and eggs, and plant sources, such as cereal seeds (wheat, rice, maize) and legume seeds (soybeans, peanuts). The major proteins in meat and fish include myofibrillar, sarcoplasmic, and connective tissue proteins. The major protein in egg is ovalbumin in the egg white, which is present as a source of protein for growth of the embryo, whereas the major protein in milk is casein, which is present as a source of protein for the mammalian newborn. The majority of protein in plant seeds consists of globulins or storage proteins that are synthesized during seed development and stored in protein bodies. These proteins are hydrolyzed during seed germination to provide nutrients for the developing seedling.

The Proteinogenic Amino Acids

All peptides and proteins, regardless of their origin, are constructed from amino acids that are covalently linked together, usually in a linear sequence. Twenty-one amino acids are naturally incorporated into polypeptides in mammals. Twenty of these are directly encoded by the universal genetic code. The twenty-first of the amino acid precursors for protein synthesis, selenocysteine, is incorporated into a small number of proteins by a unique cotranslational mechanism requiring special secondary structure in the messenger RNA (mRNA) (i.e., a selenocysteine insertion sequence, SECIS) that causes the UGA stop codon to encode selenocysteine (see Chapter 39). Another unusual amino acid called pyrrolysine is considered the twenty-second proteinogenic amino acid, but it is found only in some methane-producing enzymes in methanogenic archaea. The structures of the 20 common amino acids and selenocysteine are shown in Figure 5-1.

Amino acids have distinctive “side chains” or “R” groups that give each amino acid size, shape, and characteristics that dictate solubility and electrochemical properties. A given R group confers novel, sometimes unique, chemical properties to an amino acid, and amino acids are often classified based on the chemical properties of their respective R groups. With such diverse building blocks, it is easy to understand how peptides and proteins can be designed for complex activities.

Chirality and Optical Rotation

As shown in Figure 5-2, each amino acid contains an amino group and a carboxylic acid group. Both of these functional moieties are bonded directly to a central carbon atom designated as the α-carbon. Except for glycine, the α-carbon for each of the amino acids has four different functional groups bonded to it: an amino group, a carboxylic acid group, hydrogen, and its R group. The α-carbon of glycine does not have an R group or side chain, so two hydrogen atoms are attached to the α-carbon.

The presence of four different functional groups creates a chiral center. A chiral center exists when the arrangement of groups around an atom cannot be superimposed on its mirror image. For all amino acids (with the exception of glycine), there are two nonsuperimposable, mirror-image forms. These two forms are referred to as stereoisomers, designated as L- and D-isomers. This terminology comes from the Latin terms laevus and dexter or levo and dextro, meaning left and right, respectively. The D- and L-isomers of a given amino acid will rotate plane-polarized light in opposite directions, but amino acids are not designated D or L by the direction in which they themselves rotate light. Instead, the L and D convention for amino acid stereochemistry refers to the optical activity of the isomer of glyceraldehyde from which the amino acid can theoretically be synthesized. D-Glyceraldehyde is dextrorotatory, whereas L-glyceraldehyde is levorotatory. Thus the designation L or D in combination with the given name of an amino acid implies a specific spatial configuration around the amino acid’s α-carbon.

Proline also deserves special comment, because its R group is joined both at the α-carbon and amino group to form a 5-membered ring. Thus, the α-amino nitrogen of proline has two alkyl substituents but only one hydrogen in its unprotonated state. For this reason, proline is referred to as a secondary amino acid or imino acid. The α-carbon of proline remains a chiral center.

Although the L and D designations remain in common usage for most amino acids, another system for assigning stereochemistry, the RS system, is used most often in organic chemistry. The symbol R comes from the Latin rectus for “right,” and S comes from the Latin sinister for “left.” The RS system denotes the absolute stereochemistry of the molecule, with each stereogenic center in a molecule being assigned a prefix (R or S) according to whether its configuration is right- or left-handed. In order to make the R or S assignment, relative priority values are assigned to each of the four substituents on the chiral carbon based on the atomic number of the atom directly attached to the chiral carbon (highest to lowest), with ties broken by moving outward along each substituent, according to the Cahn-Ingold-Prelog rules. Almost all of the amino acids in proteins are S at the α-carbon. The exceptions are cysteine and selenocysteine, which are R because the sulfur or selenium atom raises the priority of the side chain, and glycine, whose α-carbon is not chiral.

In proteins and peptides, amino acids are found almost exclusively in the L form, although D-amino acids are found in some bacterial proteins and peptides (Petsko and Ringe, 2004). The almost exclusive presence of L-amino acids in proteins indicates that reactions that involve amino acid and protein synthesis must be highly stereospecific. The metabolic pathways for amino acid synthesis create predominantly amino acids in their L forms. Moreover, the biological machinery required for protein assembly recognizes L-amino acids almost exclusively. It should be noted that D-aspartate and D-serine are produced by the mammalian brain by enzymes that catalyze the racemization of L-aspartate and L-serine, respectively, and these D-amino acids are involved in activation of the N-methyl-D-aspartate type of excitatory amino acid receptors (Wolosker et al., 2008).

The Acid and Base Characteristics of Amino Acids

In aqueous solutions, amino acids are easily ionized. The most abundant ionic species present when amino acids are dissolved in an aqueous medium at neutral pH are shown in Figure 5-1, and the pKas for all dissociable groups are shown in Table 5-1. The acid dissociation constant Ka is used to define characteristics of titratable groups in organic acids and amines. The negative log of the dissociation constant Ka is called the pKa of the titratable group. In a practical sense, this means that when the pH is equal to the pKa, the associated (AH, protonated) and dissociated (A, unprotonated) species will be present in equal molar concentrations.

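$$K_a = \frac{[\mathrm{H^+}]\,[\mathrm{A}]}{[\mathrm{AH}]}$$

$$\log K_a = \log [\mathrm{H^+}] + \log \frac{[\mathrm{A}]}{[\mathrm{AH}]}$$

$$-\log K_a = -\log [\mathrm{H^+}] - \log \frac{[\mathrm{A}]}{[\mathrm{AH}]}$$

$$\mathrm{p}K_a = \mathrm{pH} - \log \frac{[\mathrm{A}]}{[\mathrm{AH}]}$$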

The pKas of carboxylic acid groups are relatively low, usually 2 to 4, so these groups are almost always negatively charged at physiological pH. Amino groups have pKas that are relatively high, usually 9 to 11, so these groups are almost always positively charged at physiological pH. Most amino acids have neutral side chains at physiological pH and have an overall net charge of 0. However, these amino acids still have a positively charged amino group and a negatively charged carboxyl group. Thus they are not uncharged molecules, nor are they cations or anions. The name “zwitterion” or “dipolar ion” is given to such molecules that have both positive and negative charges but a net charge of zero.

In a nonhydrated state, most amino acids exist as nonvolatile crystalline solids. In this case, an internal transfer of a hydrogen ion from the −COOH group to the −NH2 group of the amino acid leaves an ion with both a negative charge and a positive charge. The zwitterionic character of amino acids causes them to be held together by electrostatic forces, or ionic bonds, in a crystalline lattice (i.e., analogous to the crystalline lattice of sodium chloride and other salt crystals). These ionic attractions between oppositely charged ions are strong, and consequently thermal decomposition of amino acids usually requires high temperatures (e.g., above 200°C).

Ionizable groups of amino acids can be characterized by titrating a solution of the amino acid with acid or base to obtain a titration curve. The types and number of the functional groups capable of reacting with or exchanging a hydrogen ion (proton) influence the shape of this curve. Addition of base (or acid) will result in a rapid change in the pH of the solution when no group is being titrated, whereas a much slower rate of change in the pH of the solution will be observed when an ionizable group is being titrated. For example, alanine contains two titratable groups: one carboxylic acid group and one amino group. In aqueous solution at a very low or acidic pH (i.e., a high hydrogen ion concentration), both the amino group and the carboxylic acid group of alanine will be protonated, and as a result alanine will be positively charged. If base is gradually added (to decrease the concentration of hydrogen ions), the carboxylic acid group will lose its proton and alanine will become a zwitterion with one negative charge and one positive charge. If more base is added to increase the pH, eventually the positively charged amino group will lose its proton and alanine will become negatively charged.

The presence of a titratable group can be easily observed on a titration curve as a marked decrease in the change in pH per unit of base added; this will appear as a flattening of the curve when pH is plotted on the vertical axis and units of base are plotted on the horizontal axis. In essence, the titratable group acts as a buffer to resist changes in pH by donating protons to neutralize the base that is added. A curve obtained by the titration of histidine, which contains three titratable functional groups, is shown in Figure 5-3. On a titration curve, the pKa can be observed as the point of inflection near the center of the “plateau.” With pH on the vertical axis, this inflection point is where the curvature changes from concave down to concave up. For histidine in Figure 5-3, three pKas can be detected: the carboxyl group has a pKa = 1.82, the imidazole group has a pKa = 6.0, and the α-amino group has a pKa = 9.17.
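
The relationship between the three histidine pKas and its charge can be sketched numerically. The short Python example below is an illustration only, not part of the original chapter: it takes the three pKa values quoted above, assumes that each group ionizes independently according to the Henderson-Hasselbalch relationship, and uses hypothetical function names chosen for this sketch.

```python
# pKa values for free histidine as quoted in the text (Figure 5-3).
PKA_CARBOXYL = 1.82
PKA_IMIDAZOLE = 6.0
PKA_AMINO = 9.17


def fraction_deprotonated(ph, pka):
    """Fraction of a titratable group in its unprotonated form.

    Follows from pH - pKa = log([A]/[AH]); assumes the group behaves
    independently of the other groups (an idealization).
    """
    ratio = 10 ** (ph - pka)  # [A]/[AH]
    return ratio / (1 + ratio)


def histidine_net_charge(ph):
    """Average net charge of free histidine at the given pH."""
    # Carboxyl group: neutral when protonated, -1 when deprotonated.
    carboxyl = -fraction_deprotonated(ph, PKA_CARBOXYL)
    # Imidazole and alpha-amino groups: +1 when protonated, neutral otherwise.
    imidazole = 1 - fraction_deprotonated(ph, PKA_IMIDAZOLE)
    amino = 1 - fraction_deprotonated(ph, PKA_AMINO)
    return carboxyl + imidazole + amino


if __name__ == "__main__":
    for ph in (1.0, 1.82, 4.0, 6.0, 7.585, 9.17, 12.0):
        print(f"pH {ph:6.3f}: net charge {histidine_net_charge(ph):+.3f}")
```

Under these assumptions the net charge passes through about +1.5, +0.5, and −0.5 at the three pKas, and is close to zero midway between the imidazole and α-amino pKas (pH of about 7.6).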

Because pKa is a log10 scale, a 1.0 unit change in pH on either side of the pKa will be associated with a tenfold change in the ratio of the associated and dissociated species, and a 2.0 unit change in pH on either side of the pKa will be associated with a 100-fold change in the ratio.

$$\mathrm{pH} - \mathrm{p}K_a = \log \frac{[\mathrm{A}]}{[\mathrm{AH}]}$$

Thus if the pKa for an ionizable group is 6.0, the ratio of the unprotonated to the protonated species will be 0.01 (mainly protonated) at pH 4.0 and 100 (mainly unprotonated) at pH 8.0. On the titration curve, the rate of change in pH per unit of base (or acid) added increases as one moves away from the pKa of a titratable group.
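
The arithmetic behind these two ratios can be written out as a short worked example. The fraction-unprotonated values in the last two lines are derived here from the ratios and are not figures quoted in the text.

```latex
\begin{align*}
\frac{[\mathrm{A}]}{[\mathrm{AH}]} &= 10^{\,\mathrm{pH} - \mathrm{p}K_a} \\[4pt]
\text{pH } 4.0,\ \mathrm{p}K_a = 6.0: \quad \frac{[\mathrm{A}]}{[\mathrm{AH}]} &= 10^{-2} = 0.01 \\
\text{pH } 8.0,\ \mathrm{p}K_a = 6.0: \quad \frac{[\mathrm{A}]}{[\mathrm{AH}]} &= 10^{2} = 100 \\[4pt]
\text{fraction unprotonated at pH } 4.0 &= \frac{0.01}{1 + 0.01} \approx 0.0099 \\
\text{fraction unprotonated at pH } 8.0 &= \frac{100}{1 + 100} \approx 0.990
\end{align*}
```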

When amino acids are incorporated into peptides, they lose their ability to form zwitterions because the α-carboxyl and α-amino groups are in peptide linkage with other amino acid residues. Other than the charges due to ionization of the C-terminal carboxyl group and N-terminal amino group, it is ionization of the R groups of the amino acids in the polypeptide that determines the electrical charge of the macromolecule. Aspartate and glutamate have carboxylate groups on their side chains, whereas lysine has an ε-amino group and arginine has a basic guanidinium group; these groups are normally charged at physiological pH. In contrast, the imidazole ring of histidine (pKa = 6.0) and the thiol group (−SH) of cysteine (pKa = 8.3) have pKas that are closer to neutral and undergo partial ionization within the range of physiological pH, meaning that relatively small shifts in cellular pH can change the charge of these residues. The seleno group of selenocysteine has a pKa of about 5.2, such that selenocysteine residues are mostly ionized at physiological pH.
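
To put rough numbers on "partial ionization," the sketch below applies the Henderson-Hasselbalch relationship to the three side-chain pKas quoted above. It is an illustration under stated assumptions: the pH values 7.4 and 7.0 are chosen here as examples of a small shift and are not taken from the text, the function and dictionary names are hypothetical, and the quoted pKas can differ substantially for residues inside a folded protein.

```python
# Side-chain pKa values as quoted in the text.
SIDE_CHAIN_PKA = {
    "histidine (imidazole)": 6.0,
    "cysteine (thiol)": 8.3,
    "selenocysteine (selenol)": 5.2,
}


def fraction_deprotonated(ph, pka):
    """Fraction of the group that has lost its proton at the given pH."""
    return 1 / (1 + 10 ** (pka - ph))


if __name__ == "__main__":
    # 7.4 and 7.0 are assumed example values for a small pH shift.
    for ph in (7.4, 7.0):
        print(f"pH {ph}")
        for group, pka in SIDE_CHAIN_PKA.items():
            frac = fraction_deprotonated(ph, pka)
            print(f"  {group:26s} {frac:6.1%} deprotonated")
```

Note that the deprotonated form is the neutral one for the imidazole ring but the negatively charged one for the thiol and selenol groups, so the same calculation gives the uncharged fraction for histidine and the charged fraction for cysteine and selenocysteine.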

The ionization state of these side chains affects the physical and chemical properties of proteins and is important for their interactions with other proteins, substrates or ligands, and other macromolecules as well as for their physiological functions. Within chromatin, the basic amino acid residues in histones form ionic bonds with the acidic sugar–phosphate backbone of DNA. Acidic amino acid residues are involved in chelation of calcium ions by calcium-binding proteins. The histidine side chain and the carboxylate of acidic amino acids often serve as coordinating ligands for metals in metalloproteins. Within the native protein structure, pKa values for ionizable groups can be substantially altered because of interactions with nearby residues or the hydrophobicity of the interior of the protein. Such alterations can be critical for the catalytic function of proteins such as enzymes (Harris and Turner, 2002).

Hydrophobicity or Hydrophilicity of Amino Acid Residues

In addition to differences in size and charge, amino acids also differ in hydrophobicity or hydrophilicity (i.e., the tendency to interact with a polar or nonpolar solvent or environment). This property of amino acid R groups can vary widely, ranging from totally nonpolar or hydrophobic (water insoluble) to polar or hydrophilic (water soluble). The hydrophobic character of amino acid residues is believed to be the major driving force in protein folding. The amino acid residues with high positive hydropathy scores (e.g., isoleucine and valine) tend to repel the aqueous environment and consequently tend to pack together in the interior of the protein to avoid contact with water. On the other hand, amino acid residues with high negative hydropathy scores (e.g., arginine and lysine) will most likely be found on the surface of the protein in contact with the aqueous environment.

Jack Kyte and Russell Doolittle (1982) proposed a hydropathy index that is now widely used to predict aspects of protein structure; this scale assigns negative numbers to the most hydrophilic side chains and positive numbers to the most hydrophobic side chains (see Table 5-1). Other scales have been developed, some of which assign quite different values to some of the amino acids. Efforts to develop better methods of predicting protein structure continue. An example of the use of a hydropathy index to predict the transmembrane segments of a protein sequence is shown in Figure 5-4. Transmembrane segments of transmembrane proteins can be predicted from the average hydrophobicity scores for small regions of the polypeptide chain (e.g., segments of 9 to 19 amino acids). Transmembrane regions of proteins, which must pass through the lipid bilayers of cell membranes, tend to have high hydropathy scores (greater than 1.6 units).
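
The sliding-window averaging described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce Figure 5-4: the hydropathy values are the published Kyte-Doolittle (1982) scale entered here directly rather than copied from Table 5-1, the example sequence is made up for demonstration, the function names are hypothetical, and the 19-residue window and 1.6 cutoff are simply the figures mentioned in the paragraph above.

```python
# Kyte-Doolittle hydropathy index, one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Hypothetical sequence: a 19-residue hydrophobic stretch flanked by
# charged residues. It is not a real protein.
EXAMPLE_SEQUENCE = "KRDEKRDEKR" + "LIVALIVALIVALIVALIV" + "KRDEKRDEKR"


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every window; index i is the window starting at residue i."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    if window > len(scores):
        return []
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def candidate_transmembrane_windows(sequence, window=19, threshold=1.6):
    """Start positions of windows whose mean hydropathy exceeds the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [start for start, mean in enumerate(profile) if mean > threshold]


if __name__ == "__main__":
    profile = hydropathy_profile(EXAMPLE_SEQUENCE)
    for start, mean in enumerate(profile):
        flag = " <- above cutoff" if mean > 1.6 else ""
        print(f"residues {start + 1:2d}-{start + 19:2d}: {mean:+.2f}{flag}")
```

Windows that overlap the hydrophobic stretch only partially score lower, which is why a plot of these means against position shows a peak rather than a sharp step.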

FIGURE 5-4 A hydropathy plot. Hydrophobic regions are usually found in the interior of proteins, whereas hydrophilic regions are likely to interact with aqueous or ionic environments. See Table 5-1 for hydropathic index values for given amino acids. A hydropathy plot (A) identifies regions of a polypeptide chain that contain amino acids that are predominantly hydrophobic (shaded) or, in contrast, hydrophilic in nature. The example shown could apply to a transmembrane protein (B). Transmembrane proteins often cross the lipid bilayers of cell membranes. The hydrophobic regions (positive hydropathic values) are associated with the interior of the membrane, whereas the hydrophilic regions may extend into the cytoplasmic compartment or toward the exterior of the cell.

Modifications of Amino Acid Side Chains

As was noted in the previous section, the properties of the common amino acids can vary markedly depending upon their innate characteristics (e.g., size, charge, polarity, and hydropathy). The characteristics of some amino acids can also be altered by additional enzymatic and nonenzymatic modifications of the R group. Such modifications can occur as cotranslational or posttranslational events following incorporation of the amino acid into proteins (see Chapter 13). These specific amino acid modifications are introduced to modulate or modify a given chemical property. Posttranslational modifications extend the structures and properties of amino acids in proteins well beyond those of the 20 (or 21, if selenocysteine is included) amino acids used for protein translation in mammals.

Specific chemical properties can be altered subtly or dynamically by posttranslational modification. Some examples of posttranslational modifications of amino acid R groups in proteins include the methylation of lysine and histidine residues, the acetylation of lysine residues, the hydroxylation of proline and lysine residues, and the carboxylation of glutamate residues (Figure 5-5). These types of modifications of R groups are essential in defining the structural and functional properties of proteins. For example, the γ-carboxylation of glutamate residues in prothrombin and other blood clotting factors is critical for calcium binding and the proper function of these clotting factors in the blood clotting cascade.


Feb 26, 2017 | Posted by in PHARMACY | Comments Off on Structure, Nomenclature, and Properties of Proteins and Amino Acids
