Amino Acids, Peptides, and Proteins

Chapter 21





Amino acids, peptides, and proteins are crucial for virtually all biological processes. Amino acids have diverse roles in metabolism, neurotransmission, and intercellular signaling, as well as serving as structural subunits of peptides and proteins. Peptides include many hormones, signaling molecules, and protein fragments that are of physiologic and diagnostic significance. Nucleic acids provide the information or software program, and proteins serve as the hardware that performs most cellular functions. Proteins are multifunctional and constitute the machinery of life. Biologically, they (1) form many important intracellular and extracellular structures; (2) generate energy through catalysis and electron transfer; (3) produce motility through contractile elements; (4) assemble molecules; (5) serve as ion channels and pumps; (6) act as carriers; (7) perform immune defense; (8) serve as receptors, hormones, and cytokines for intercellular regulation; and (9) constitute signaling networks for intracellular regulation.


In humans, more than 20,000 genes encode proteins. Completion of the sequence of the human genome defines much of the sequence information for proteins and facilitates the identification of protein and peptide sequences. The number of proteins is greater than the number of genes, however, because of variable splicing of messenger RNA (mRNA), somatic recombination and mutation, proteolytic processing, and numerous post-translational modifications of proteins. The Human Antibody Initiative has developed specific antibodies for thousands of human proteins, and these antibodies have been used to assemble a Human Protein Atlas that uses immunohistochemistry to define the expression of more than 5000 proteins in different tissues of the body (http://www.proteinatlas.org/; accessed April 19, 2011).124 These efforts serve as major resources for identification of tissue sources of proteins and potential assay reagents. Many proteins are secreted or leak out of cells in response to cellular injury or turnover, allowing their analysis in biological fluids such as blood and urine. The proteome represents the complete set of proteins in an organism or subcompartment of an organism such as the plasma space.3 Initial efforts of the Human Proteome Organization’s (HUPO’s) Plasma Proteome Project tentatively identified sequences of about 3000 gene products in plasma, and many structural variants of these gene products may exist.114,117 Several databases include extensive information about the sequence and post-translational modification of proteins. The U.S. National Library of Medicine and the Swiss Institute of Bioinformatics maintain gene and protein databases and sequence analysis tools that are available via the Internet (http://www.ncbi.nlm.nih.gov/ and http://ca.expasy.org/; both accessed April 19, 2011). The HIP2 Database (Healthy Human Individual’s Integrated Plasma Proteome; http://bio.informatics.iupui.edu/HIP2/; accessed April 19, 2011) includes data on more than 12,000 protein entries.134 The Human Protein Reference Database and Human Proteinpedia are additional resources (http://www.hprd.org/; accessed April 19, 2011).101 Most databases are designed mainly to assist with peptide and protein identification, and more data are needed related to the abundance of components in healthy and diseased populations, which is the usual basis for diagnostic applications.2,3,60


This chapter begins with a discussion of the properties and metabolism of amino acids. Inherited disorders of amino acid metabolism are discussed in Chapter 58. A general description of protein structure is followed by information on several high-abundance proteins in plasma and several other fluids including cerebrospinal fluid. This chapter provides limited discussion of urinary proteins, which are covered in Chapters 25 and 48. Virtually all tissues contribute proteins to plasma via secretion or cell injury. The resulting diversity of plasma proteins leads to high informational content and broad clinical utility of the analysis of proteins in plasma or serum. Other proteins are discussed in the chapters on enzymes (see Chapter 22), tumor markers (see Chapter 24), lipoproteins (see Chapter 27), hormones (see Chapter 29), and specific disease processes. Chapter 32 describes analysis of hemoglobins, which are the major protein component of whole blood and are major plasma components when red blood cell lysis occurs.



Amino Acids


Amino acids are major metabolic intermediates and the basic structural units of proteins. Their measurement in physiological fluids assists with fundamental studies of metabolism and the diagnosis of pathologic processes and inherited conditions.



Basic Biochemistry


Amino acids are organic compounds containing both an amino group (-NH2) and a carboxyl group (-COOH) or another acidic group such as sulfonic acid (-SO3H). Technically, proline, hydroxyproline, and sarcosine are imino acids (-NH-), but they usually are grouped together with amino acids. Those occurring in proteins are α-amino acids, with amino groups linked to the α-carbon, as diagrammed here:



H2N-CH(R)-COOH


R represents a variety of different sidechains as listed in Table 21-1. Amino acids in physiologic fluids also include β-amino acids such as β-alanine and taurine, and γ-amino acids such as γ-aminobutyric acid (GABA).



With the exception of glycine, all α-amino acids are asymmetric about the α-carbon, with four different groups linked to this carbon. Most α-amino acids in humans, including all of the amino acids incorporated into protein, have the L-configuration. Small quantities of D-amino acids occur in physiological fluids. In most cases, these are not known to have specific functions. An exception is D-serine, which represents 5 to 20% of total serine in cerebrospinal fluid; D-serine may serve as a neurotransmitter.41 Amino acids with the D-configuration occur in some bacterial products, foods, and pharmaceuticals. D-Amino acid oxidases in liver and kidney convert D-amino acids to ketoacids, which can be metabolized. L-Amino acids in proteins undergo slow racemization to a mixture of L- and D-amino acids over many years. Aspartic acid undergoes the most rapid racemization and can be used to estimate the time of synthesis of proteins undergoing very slow turnover, such as lens proteins in the eye or collagens in intervertebral disks, where half-lives may exceed 50 years.145 Two amino acids, threonine and isoleucine, have a second asymmetric carbon, and the stereoisomers are referred to as allothreonine and alloisoleucine. Table 21-1 diagrams the structures of the 21 amino acids that are encoded by codons in messenger RNA and incorporated into proteins. Twenty amino acids are incorporated into most proteins. Selenocysteine is a special case of an amino acid synthesized on a specific transfer RNA and incorporated into a few sites in only about 25 proteins.92 Many additional amino acids are generated by post-translational modification of proteins or by metabolism.



Acid-Base Properties of Amino Acids


Acid-base properties of amino acids depend on the amino and carboxyl groups attached to the α-carbon and on the basic or acidic groups occurring on some sidechains (R). In the physiologic pH range of plasma near pH 7.4, the carboxyl group is dissociated, and the amino group is protonated to give the following structure:



⁺H3N-CH(R)-COO⁻


Thus, at neutral pH, amino acids have both positively and negatively charged groups and occur as zwitterions.





The pH at which an ionizable group, such as an amino or carboxyl group, occurs equally in charged and uncharged forms is referred to as the pK for that group. Amino acids have two or more pKs, including one for the carboxyl, one for the amino group, and an additional pK if an ionizable sidechain is present. The isoelectric point (pI) is the pH where an amino acid or other molecule has a net charge of 0. For a typical neutral amino acid such as glycine, the pI of 5.97 is midway between the pK1 of 2.34 for the carboxylic acid and the pK2 of 9.60 for the amino group. Ionization constants for amino acids are given in Table 21-2. The pKs of amino acid sidechains in proteins vary somewhat because of the influence of neighboring amino acids.
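The relationship between the pKs and the pI can be checked numerically. The following is a minimal sketch, not a validated method: it assumes each ionizable group behaves independently according to the Henderson-Hasselbalch equation, and the function names `net_charge` and `isoelectric_point` are illustrative only. The glycine pK values are those quoted above.

```python
def net_charge(ph, groups):
    """Net charge of a molecule at a given pH.

    groups is a list of (pK, kind) pairs, where kind is "acid" for a group
    that goes from 0 to -1 on deprotonation (e.g., carboxyl) and "base" for
    a group that goes from +1 to 0 on deprotonation (e.g., amino).
    """
    total = 0.0
    for pk, kind in groups:
        if kind == "acid":
            # fraction deprotonated carries charge -1
            total -= 1.0 / (1.0 + 10 ** (pk - ph))
        else:
            # fraction protonated carries charge +1
            total += 1.0 / (1.0 + 10 ** (ph - pk))
    return total


def isoelectric_point(groups, lo=0.0, hi=14.0, tol=1e-6):
    """Find the pH of zero net charge by bisection.

    Net charge falls monotonically as pH rises, so one zero crossing exists.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, groups) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Glycine: pK1 = 2.34 (carboxyl), pK2 = 9.60 (amino), as quoted in the text
glycine = [(2.34, "acid"), (9.60, "base")]
pi_glycine = isoelectric_point(glycine)
print(round(pi_glycine, 2))  # 5.97, the midpoint of the two pKs
```

For an amino acid with an ionizable sidechain, a third `(pK, kind)` pair would be added to the list; the sidechain pK values would need to be taken from Table 21-2.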



The buffering capacity of ionizable groups is primarily in a pH range within ±1 of the pK for the respective groups. Amino acids and proteins therefore have a limited buffering capacity near physiologic pH, mainly caused by the contribution of imidazole sidechains of histidine. Amino acids serve as buffers at pHs near the pKs of their ionizable groups; glycine, for example, sometimes is used as a buffer near pH 2.5 or 9.5.
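The ±1 pH rule can be illustrated with a short calculation. This sketch assumes a single, independent ionizable group whose buffer capacity is proportional to f(1 − f), where f is the protonated fraction from the Henderson-Hasselbalch equation. The glycine pKs are from the text; the imidazole pK of 6.0 is an assumed, typical free-histidine value that is not given in this section and should be checked against Table 21-2.

```python
def fraction_protonated(ph, pk):
    """Fraction of an ionizable group in its protonated form."""
    return 1.0 / (1.0 + 10 ** (ph - pk))


def relative_buffer_capacity(ph, pk):
    """Buffer capacity of one group relative to its maximum at pH = pK.

    Capacity is proportional to f * (1 - f), which peaks at f = 0.5,
    so dividing by 0.25 scales the maximum to 1.0.
    """
    f = fraction_protonated(ph, pk)
    return 4.0 * f * (1.0 - f)


GLYCINE_PK1 = 2.34        # carboxyl group (from the text)
GLYCINE_PK2 = 9.60        # amino group (from the text)
ASSUMED_IMIDAZOLE_PK = 6.0  # assumption: typical histidine sidechain value

for label, pk in [("glycine carboxyl", GLYCINE_PK1),
                  ("glycine amino", GLYCINE_PK2),
                  ("imidazole (assumed pK)", ASSUMED_IMIDAZOLE_PK)]:
    at_pk = relative_buffer_capacity(pk, pk)
    one_unit_away = relative_buffer_capacity(pk + 1.0, pk)
    at_plasma_ph = relative_buffer_capacity(7.4, pk)
    print(f"{label}: {at_pk:.2f} at pK, {one_unit_away:.2f} at pK+1, "
          f"{at_plasma_ph:.3f} at pH 7.4")
```

Under these assumptions, capacity falls to about one third of its maximum one pH unit from the pK, and the α-amino and α-carboxyl groups contribute very little at pH 7.4 compared with the imidazole group.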



Hydrophobicity, Solubility, and Stability of Amino Acids


Sidechains produce variation in the properties of different amino acids. Table 21-1 shows the structures of amino acids, molecular weights, and the Kyte and Doolittle index of the hydrophobicity of their sidechains.83 Amino acids with longer aliphatic or aromatic sidechains, such as isoleucine, leucine, and phenylalanine, are more hydrophobic than those with shorter sidechains, such as alanine, indicating lower water solubility. Neutral amino acids having polar groups such as hydroxyl or amide groups in their sidechains are more hydrophilic. Acidic amino acids have sidechains with carboxylic acids, and basic amino acids have sidechains with amino, guanidine, or imidazole groups. Acidic and basic sidechains generally are highly polar and water soluble. The structural diversity of sidechains contributes to the participation of amino acids in many metabolic pathways and the ability to form proteins with wide variation in structure and physical properties. The occurrence of an imino rather than an amino group in proline results in some differences in physical properties and reactivity of proline and in the geometry of peptides containing proline.
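A hydropathy index of this kind is typically applied by averaging sidechain values along a sequence. The sketch below uses the values commonly tabulated for the Kyte and Doolittle scale (they should be checked against Table 21-1, which is not reproduced here). The function names and the example sequence are hypothetical and serve only as an illustration.

```python
# Kyte-Doolittle hydropathy values by one-letter code (as commonly tabulated;
# verify against Table 21-1). Positive = more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy of a peptide sequence (one-letter codes)."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def hydropathy_profile(sequence, window=5):
    """Sliding-window average hydropathy along a sequence."""
    seq = sequence.upper()
    return [mean_hydropathy(seq[i:i + window])
            for i in range(len(seq) - window + 1)]


# Hypothetical peptide: a hydrophobic stretch followed by charged residues
example = "AAILFKKDE"
profile = hydropathy_profile(example, window=5)
print([round(x, 2) for x in profile])
```

Windows with high positive averages suggest segments likely to be buried or membrane associated; the window length is a free parameter chosen by the analyst.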


Amino acids generally occur freely in solution in plasma and other physiologic fluids, with a few exceptions. The thiol groups of cysteine, homocysteine, and small peptides such as cysteinylglycine and glutathione oxidize easily and become linked to other molecules via disulfides in the extracellular space, which is an oxidizing environment. This contrasts with the cytoplasm, which is predominantly a reducing environment maintained by a high concentration of glutathione in reduced form.127 In plasma, cysteine occurs as cystine (dimeric cysteine linked via a disulfide) or as a mixed disulfide with albumin or other proteins. About half of cysteine is covalently bound to protein and can be released by the addition of reducing agents. Analysis of amino acids without reduction detects only the cystine component. Homocysteine, cysteinylglycine, and glutathione are similarly distributed as mixed disulfides with cysteine or protein.157 During storage of plasma specimens, some change in the distribution of homocysteine occurs, and the proportion bound to protein increases.59,157 Tryptophan is another amino acid that occurs approximately 50% bound to albumin, although in this case the binding is noncovalent.


Most amino acids are water soluble and stable in plasma specimens, again with a few exceptions. Solubility of cystine and a few of the more hydrophobic amino acids can be limiting in some disorders of transport or metabolism. Crystals of cystine, leucine, or tyrosine may deposit when concentrations become elevated in urine in cystinuria or tyrosinemia or within intracellular compartments in cystinosis. Glutamine degrades in solution as the result of intramolecular cyclization to pyroglutamic acid (also termed 5-oxoproline) with release of ammonia. Quantification of glutamine therefore requires rapid processing of specimens and frozen storage, similar to requirements for ammonia analysis. Homocysteine (usually measured as total homocysteine after reduction) is stable in plasma but requires rapid processing of blood specimens to avoid elevation resulting from continuing release from cells.128 Arginine can be degraded in specimens with increased arginase concentrations; some arginase is present in erythrocytes and may be increased by hemolysis.112



Amino Acid Metabolism


Amino acids participate in many metabolic pathways, in addition to serving as a substrate for protein synthesis.31 In the healthy state, women require ≈46 g/d and men ≈56 g/d of dietary protein (0.8 g/kg body weight), and substantial increases in demand occur during growth and in many disease states.118 Dietary protein is digested by proteases in the stomach and small intestine to yield amino acids. Endogenous protein turnover serves as another source of free amino acids. Dietary sources and endogenous turnover, therefore, serve as dual sources of amino acids for protein synthesis. Eight amino acids used for protein synthesis—(1) isoleucine, (2) leucine, (3) lysine, (4) methionine, (5) phenylalanine, (6) threonine, (7) tryptophan, and (8) valine—are not synthesized by humans and therefore are considered “essential” constituents of the diet. Meat, milk, eggs, and fish contain a full range of essential amino acids. Gelatin is deficient in tryptophan, and individual plant sources of protein may be deficient in lysine, methionine, or tryptophan. Diets based on a single source of plant protein may be deficient in some amino acids. When liver function is compromised,162 cysteine and tyrosine become essential because they are not converted from their usual precursors methionine and phenylalanine. Arginine may be conditionally essential.112 Essential amino acids usually have been defined for the entire body, but additional amino acids may be essential for specific tissues or for some human cells in culture. As an example, administration of asparaginase to deplete asparagine is useful therapy for acute lymphoblastic leukemia, because the lymphoblasts are not able to synthesize asparagine. Glutamine is considered to be an important metabolic substrate for immune cells and enterocytes.132 Supplementation of glutamine and arginine may be beneficial in critically ill patients.162


Requirements for dietary protein to maintain nitrogen balance increase per unit body weight during infancy and childhood, when protein is needed for growth, and in pregnancy, lactation, and states of protein loss or catabolism.24,80 Daily requirements rise to as much as 3.5 to 4 g protein/kg body weight for premature infants.50 A diet severely deficient in protein and consisting primarily of high-starch foods can lead to kwashiorkor, with decreased serum albumin, immune deficiency, edema, ascites, growth failure, apathy, and many other symptoms.67 Marasmus results when both protein and energy sources such as carbohydrates are deficient; protein-calorie starvation causes generalized wasting of muscles and subcutaneous tissues, with less edema. Inadequate nutrition is frequently a problem in surgical, burn, or trauma patients who are in a catabolic state and have decreased food intake.69 Negative nitrogen balance can contribute to delayed wound healing and impaired immunity. Measurements of plasma markers such as albumin or prealbumin are indicators of adequacy of the amino acid supply. Protein intake is related to urinary excretion of urea and of acid in the form of sulfate. A high-protein diet promotes excretion of acidic urine, whereas a vegetarian diet promotes a neutral urine. In kidney disease, high protein intake appears to be harmful, and protein restriction has been used as a therapeutic intervention to slow the progress of kidney disease.89
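The per-kilogram figures quoted in this section convert to daily amounts by simple multiplication. The sketch below is illustrative arithmetic only, not a dietary recommendation. It reads the 3.5 to 4 g/kg figure for premature infants as a total daily requirement. The body weights of 57.5 kg and 70 kg are back-calculated from the quoted 46 and 56 g/d at 0.8 g/kg, and the 1.5 kg infant weight is a hypothetical example.

```python
def daily_protein_g(body_weight_kg, g_per_kg=0.8):
    """Daily protein requirement in grams for a given body weight.

    The default of 0.8 g/kg is the healthy-adult figure quoted in the text.
    """
    if body_weight_kg <= 0 or g_per_kg <= 0:
        raise ValueError("body weight and requirement must be positive")
    return body_weight_kg * g_per_kg


# Reference weights implied by the quoted adult values (back-calculated)
women_g = daily_protein_g(57.5)   # about 46 g/d
men_g = daily_protein_g(70.0)     # about 56 g/d

# Hypothetical 1.5 kg premature infant at 3.5 to 4 g/kg
preterm_low = daily_protein_g(1.5, g_per_kg=3.5)
preterm_high = daily_protein_g(1.5, g_per_kg=4.0)

print(round(women_g, 1), round(men_g, 1), preterm_low, preterm_high)
```

On a per-kilogram basis, the premature infant figure is roughly four to five times the adult value, which is the point the text makes about growth demands.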


Amino acids normally are released from dietary protein by degradation with pepsin in the acidic environment of the stomach and with pancreatic hydrolases added in the duodenum.31 Digestion of protein may be impaired by gastric or pancreatic disorders. Uptake by intestinal and other cells occurs by active transport with several separate transport systems that handle neutral amino acids, acidic amino acids, basic amino acids, and cystine, glycine, and proline. Amino acids are also actively transported by γ-glutamyltransferases, which link them to glutamic acid transferred from glutathione. Active transport maintains higher concentrations of amino acids inside cells than outside in the extracellular space and, in renal tubules, reclaims most amino acids that undergo glomerular filtration.70


Amino acids are critical intermediates in many metabolic pathways, including the urea cycle for converting ammonia to urea and the alanine cycle for transferring nitrogen and fuel sources from muscle to liver, both described later. Other major pathways include ammonia generation in the kidney from glutamine and glutamic acid, glutathione formation and reduction to maintain a reducing environment intracellularly, and the glutathione cycle for cellular uptake of molecules.127 Amino acids are precursors for many hormones and signaling molecules such as thyroid hormones, catecholamines, serotonin, melatonin, nitric oxide, and hydrogen sulfide. Activated pathways for metabolism of tryptophan and cysteine are effector pathways for interferon action.139 Serine is a major source of one-carbon units transferred by tetrahydrofolic acid for purine synthesis, methylation of deoxyuridylic acid to thymidylic acid, and conversion of homocysteine to methionine. Glycine, aspartic acid, glutamine, and serine contribute atoms to purine and pyrimidine synthesis. Glycine and arginine are precursors for creatine synthesis. Methionine serves as a methyl donor after activation as S-adenosylmethionine for a wide range of methylation reactions, including creatine synthesis, protein methylation, and choline synthesis, and yields homocysteine as a byproduct. Several amino acids participate in conjugation reactions that serve as excretory pathways and generate products such as glycine or taurine conjugates with bile acids. Cysteine and glutathione form mercapturates with reactive compounds as a protective mechanism. Metabolism of the sulfur-containing amino acids generates sulfate, which is excreted in urine in increased amounts with consumption of high-protein diets.


The alanine cycle is a metabolic pathway that allows muscle cells to use amino acids as a fuel source and to export excess nitrogen in the form of alanine, which then is metabolized in the liver as diagrammed in Figure 21-1. In muscle cells, amino groups are transferred from amino acids to pyruvate by aminotransferases to form alanine. Alanine is released from muscle cells and is taken up by the liver, where aminotransferases transfer the amino group from alanine to 2-oxoglutarate, forming pyruvate and glutamic acid. Pyruvate can serve as a substrate for gluconeogenesis or for energy, and glutamate serves as a nitrogen donor for urea synthesis. This pathway explains the need for high aminotransferase activities in liver and muscle.



The urea cycle converts ammonia to urea in the liver (see Chapter 50). Ammonia, bicarbonate, and ATP join in formation of carbamoyl phosphate, which is transferred to ornithine to form citrulline. Aspartic acid and citrulline are condensed to form argininosuccinic acid, which, in turn, is cleaved to arginine and fumaric acid. Arginase hydrolyzes arginine to urea and ornithine to allow the cycle to repeat. Net changes from the cycle include input of ammonia, bicarbonate, aspartic acid, and energy in the form of ATP and output of urea and fumaric acid. Urea usually is viewed simply as a waste product, with urea production related to protein intake. However, one beneficial action of urea is its contribution to the ability of kidneys to concentrate urine. Urea is the main component accounting for high osmolality in the renal medulla, and maximal urinary concentrating ability decreases when protein intake is low and urea production is decreased.
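The net inputs and outputs of the cycle can be verified by summing the individual steps and confirming that the intermediates cancel. The sketch below is a bookkeeping illustration only. The ATP, ADP, AMP, and phosphate coefficients are standard textbook stoichiometry that is assumed here rather than stated in this section, and water is tracked only for the arginase step.

```python
from collections import Counter

# Each step is (substrates, products) with stoichiometric coefficients.
# Nucleotide and phosphate coefficients are assumed textbook values.
UREA_CYCLE = [
    # carbamoyl phosphate synthesis
    ({"NH3": 1, "HCO3-": 1, "ATP": 2},
     {"carbamoyl phosphate": 1, "ADP": 2, "Pi": 1}),
    # transfer to ornithine
    ({"carbamoyl phosphate": 1, "ornithine": 1},
     {"citrulline": 1, "Pi": 1}),
    # condensation with aspartic acid
    ({"citrulline": 1, "aspartate": 1, "ATP": 1},
     {"argininosuccinate": 1, "AMP": 1, "PPi": 1}),
    # cleavage to arginine and fumaric acid
    ({"argininosuccinate": 1},
     {"arginine": 1, "fumarate": 1}),
    # arginase
    ({"arginine": 1, "H2O": 1},
     {"urea": 1, "ornithine": 1}),
]


def net_reaction(steps):
    """Sum all steps and return (consumed, produced) dictionaries."""
    balance = Counter()
    for substrates, products in steps:
        for name, n in substrates.items():
            balance[name] -= n
        for name, n in products.items():
            balance[name] += n
    consumed = {k: -v for k, v in balance.items() if v < 0}
    produced = {k: v for k, v in balance.items() if v > 0}
    return consumed, produced


consumed, produced = net_reaction(UREA_CYCLE)
print("consumed:", consumed)
print("produced:", produced)
```

Ornithine, citrulline, argininosuccinate, arginine, and carbamoyl phosphate appear in neither dictionary, which reflects their role as intermediates that are regenerated or consumed within one turn of the cycle.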



Amino Acid Concentrations


Plasma amino acid concentrations are high during the first days of life, especially in premature neonates, but they tend to be low in infants with low birth weight for gestational age because of placental insufficiency. Plasma amino acid concentrations vary by about 30% during the day; therefore, blood specimens should be collected at the same time each day. Values are highest in midafternoon and lowest in early morning. This diurnal variation affects detection of heterozygous defects in metabolism.70


Most amino acids in blood undergo efficient glomerular filtration but are reabsorbed in proximal renal tubules by saturable transport systems (see Chapter 48). Increased renal excretion of amino acids (aminoaciduria) results from overload from increased plasma concentrations or from tubular impairment related to hereditary disorders or tubular injury. Amino acid excretion in urine varies with renal tubular maturation. Premature infants, especially during the first week, have a generalized aminoaciduria; even at full term, amino acid excretion is greater than in normal adults. In the urine of normal adults, glycine is the most abundant amino acid, followed by histidine, taurine, glutamine, serine, and alanine. Amounts of 1-methylhistidine may be high, depending on meat intake.


For adults, CSF concentrations of most amino acids, except for glutamine, are several-fold lower than in plasma. Glutamine is the main component, with usual concentrations of 0.4 to 1.0 mmol/L that are slightly higher than in plasma. Most amino acids other than alanine, serine, and threonine have CSF concentrations <0.03 mmol/L. Newborns have a smaller gradient of amino acid concentrations versus plasma than do adults.


Cells generally have intracellular amino acid concentrations higher than plasma concentrations. Intracellular concentrations of glutathione are in the mmol/L range. Many cells maintain high concentrations of taurine, which may serve as an antioxidant and as a component for regulation of intracellular osmolality. Regulation of intracellular amino acid concentrations may be one mechanism for controlling intracellular osmolality, particularly in tissues such as the renal medulla and brain.5



Clinical Implications of Amino Acid Concentrations


Historically, clinical laboratory assessments of plasma and urinary amino acids have been used primarily to detect or monitor inborn errors of metabolism, as detailed in Chapter 58. Measurement of plasma homocysteine is of clinical interest, as increased homocysteine is an indicator of deficiency of folic acid or vitamin B12 and is correlated with risk of cardiovascular disease.71,128 Fasting concentrations of plasma homocysteine are relatively constant within individuals and are correlated with risk for cardiovascular disease and thrombosis. Many hypotheses have been advanced for how homocysteine might be a cause of cardiovascular disease or thrombosis, but these hypotheses have been undermined by some recent intervention trials that have not observed decreased cardiovascular disease in response to lowering of homocysteine by increased intake of folic acid and vitamin B12.71 However, even if homocysteine does not cause cardiovascular disease, it could still serve as a measurable risk factor.


Measurement of amino acids has not been extensively used in clinical practice for monitoring of patients with nutritional, metabolic, infectious, or psychiatric disorders. In part, this is because plasma amino acid concentrations do not always reflect severe deficiencies in malnutrition states; most amino acid metabolism and function are intracellular. Also amino acid analysis has been expensive, with limited availability and slow turnaround time. Metabolomic studies may provide increased insight into metabolic pathways and potential clinical applications of measurement of amino acids. Several amino acid concentrations are of potential clinical interest. Drops in tryptophan concentration are associated with depression and other psychiatric disease.107 Tryptophan degradation along the kynurenine pathway and cysteine metabolism by cysteine dioxygenase are stimulated by interferon, so that quantification of tryptophan, cysteine, and their metabolites is an indicator of immune activation.139 Concentrations of arginine and citrulline are indicators of activation of the nitric oxide synthase pathway and adequacy of the substrate arginine for nitric oxide synthesis. This pathway may be important for pathogenesis of vascular disorders in sickle cell disease, asthma, and other disorders and may be a pharmacologic target.43,111 Increased concentrations of asymmetric dimethylarginine, which inhibits nitric oxide synthase, may be physiologically significant in the development of cardiovascular and kidney disease.75 Glutamine concentration may be monitored to assess the nutritional adequacy of postsurgical patients and those with other conditions.132,162 Measurements of intracellular concentrations of glutathione or ratios of reduced and oxidized glutathione may serve as measures of oxidative stress, nutritional depletion in cancer or other states, or overdose with toxic compounds such as acetaminophen. 
Advances in the ability to measure amino acids rapidly and efficiently may expand clinical application of amino acid measurements.



Analysis of Amino Acids


Methods used to measure amino acids in biological samples are detailed in Chapter 58. The standard method for many years was cation-exchange chromatography with spectrophotometric detection after postcolumn reaction with ninhydrin, as developed by Stein and Moore in the 1950s. This method quantifies 30 to 40 components in plasma and urine. However, newborn screening programs usually apply tandem mass spectrometry, and an increasing range of chromatographic and mass spectrometric methods is being applied to amino acid analysis. Analytical procedures usually involve deproteinization prior to analysis, either by precipitation with acids or organic solvents or by ultrafiltration. This yields good recovery of most amino acids, with two exceptions: recovery of tryptophan may vary as the result of protein binding, and substantial amounts of cysteine, homocysteine, and thiol-containing peptides such as cysteinylglycine and glutathione are linked via disulfide bonds to proteins.157 Recovery of total cysteine and homocysteine requires reduction of specimens before precipitation steps. Most amino acids are stable in blood specimens, except for glutamine, which undergoes intramolecular cyclization to form pyroglutamic acid (5-oxoproline) with the release of ammonia. Specimens should be processed rapidly and stored frozen to preserve glutamine. Rapid processing of specimens is also needed for accurate determination of homocysteine to avoid effects of homocysteine release from blood cells.128





Several methods are available for measurement of total homocysteine, including immunoassay after enzymatic conversion to S-adenosylhomocysteine, enzyme cycling reactions, chromatographic or electrophoretic methods after derivatization, chromatographic analysis with electrochemical detection, and tandem mass spectrometry.



Peptides and Proteins


This section describes the basic biochemistry of peptides and proteins, and provides details on several high-abundance plasma proteins and selected methods for protein analysis used in the clinical laboratory.



Peptide and Protein Structure


A peptide bond is the amide bond formed between the amino group of one amino acid and the carboxyl group of a second amino acid. An example of a peptide bond joining two amino acids is diagrammed with the peptide bond enclosed in brackets:



H2N-CH(R1)-[CO-NH]-CH(R2)-COOH


Isopeptide bonds refer to amide bonds linking amino acids via amino acid sidechains; an example is glutathione, in which the γ-carboxyl of glutamic acid is linked to the amino group of cysteine. Polymers of several amino acids are termed peptides, and amino acid sequences are described from the amino- or N-terminus, the amino acid with a free α-amino group. The final amino acid in the chain is the carboxy- or C-terminus. Peptide bonds, similar to all amide bonds, are planar structures that fix the geometry about the bonds, so that the major conformational flexibility of the peptide backbone results from rotation about the axes of the two bonds to the α-carbon. Short peptides are identified by the number of constituent amino acids, such as dipeptide, tripeptide, tetrapeptide, or pentapeptide. When peptides reach sufficient length to have a defined globular structure—about 50 or more residues—polypeptides begin to be referred to as proteins. The terms proteose and peptone refer to partial digestion products of proteins, sometimes used as a source of amino acids for bacterial culture or other purposes.


The structures of peptides and proteins consist of five elements:



1. Primary structure is the sequence of amino acids in a peptide or protein. Post-translational modifications of amino acids contribute to increased diversity.


2. Secondary structure is the restriction of the peptide backbone by hydrogen bonds between different peptide bonds. Elements of secondary structure include α-helix, β-sheet, and β-turn. An α-helix has about 3.6 residues per turn, and hydrogen bonds are between amide oxygens and amide hydrogens 4 residues later. A β-sheet involves hydrogen bonds between the peptide bonds of adjacent peptide chains arranged in parallel or antiparallel configurations. Random coils refer to segments of peptide that lack defined secondary structure. Turns are commonly located on the surface of proteins and represent frequent sites of mutation because fewer constraints are imposed on sidechain packing than on interior regions of a protein.


3. Tertiary structure refers to folding of the polypeptide chain and elements of secondary structure into a compact three-dimensional shape. Folding is a complex process driven by energy minimization of intramolecular and solvent interactions. Hydrophobic groups tend to fold into the interior with less exposure to solvent, and charged and polar sidechains tend to be located on the surface, where sidechains are exposed to solvent. The three-dimensional structure is stabilized by intramolecular hydrogen bonds, van der Waals forces, and hydrophobic interactions. Disulfide bonds between cysteine residues stabilize three-dimensional structure. High stability of a structure is reflected by maintenance of structures up to high temperatures. Denaturation of protein refers to unfolding that occurs with temperature change or in the presence of organic solvents, detergents, or reagents that disrupt hydrogen bonds. Limited denaturation can be reversible, but extensive unfolding and denaturation of proteins often lead to aggregation and precipitation, which is difficult to reverse.


4. Quaternary structure refers to the incorporation of two or more polypeptide chains or subunits into a larger unit. Examples are creatine kinase with two subunits and lactate dehydrogenase and hemoglobin with four subunits.


5. Ligands and prosthetic groups provide additional functional and structural elements, such as metals in metalloenzymes, heme in hemoglobin and cytochromes, and lipids in lipoproteins. Proteins without their associated ligands are often referred to as apoproteins (e.g., apotransferrin without iron, apolipoproteins without lipid).
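The α-helix geometry described under secondary structure can be expressed in a few lines. This sketch takes the 3.6 residues per turn and the 4-residue hydrogen bond spacing from the text and derives the angular position of each residue around the helix axis (100 degrees per residue). The function names are illustrative, residues are numbered from 1, and an ideal, uninterrupted helix is assumed.

```python
RESIDUES_PER_TURN = 3.6   # from the text
HBOND_OFFSET = 4          # amide oxygen of residue i to amide hydrogen of i + 4


def helix_hbond_pairs(n_residues):
    """List (i, i + 4) backbone hydrogen bond pairs in an ideal alpha-helix.

    i is the residue supplying the amide oxygen; i + 4 supplies the amide
    hydrogen. Residues are numbered from 1.
    """
    return [(i, i + HBOND_OFFSET)
            for i in range(1, n_residues - HBOND_OFFSET + 1)]


def helical_wheel_angles(n_residues):
    """Angular position (degrees) of each residue viewed down the helix axis."""
    angles = []
    for i in range(1, n_residues + 1):
        total = round((i - 1) * 360.0 / RESIDUES_PER_TURN, 1)
        angles.append(total % 360.0)
    return angles


pairs = helix_hbond_pairs(10)
angles = helical_wheel_angles(5)
print(pairs)
print(angles)
```

Residues spaced three or four positions apart fall on the same face of the helix, which is why hydrophobic residues repeating at that spacing can form one nonpolar face.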


Many proteins are organized with subassemblies of smaller structural units or domains somewhat analogous to a string of beads. Similar domains may occur in different proteins, but diversity in structure is possible through assembly of domains in different combinations and three-dimensional configurations. This is exemplified by a number of proteins of the coagulation and complement systems that are formed from different combinations of small globular domains. Many gene products have arisen from duplication of common ancestral genes. Homologous genes not only commonly have sequence homology but fold into similar three-dimensional structures. The serpin (originally from serine proteinase inhibitor) superfamily consists of more than 1000 related proteins in different organisms that have been subdivided into 16 clades (lettered A through P).88 Humans have 36 serpins, of which 29 are protease inhibitors and 7 lack protease inhibitor function.88 Serpins that act as protease inhibitors in plasma include α1-antitrypsin, α1-antichymotrypsin, α2-antiplasmin, antithrombin III, C1 inhibitor, heparin cofactor II, protein C inhibitor, and plasminogen activator inhibitor-1. Some serpins without known protease inhibitor function are cortisol-binding globulin, thyroxine-binding globulin, angiotensinogen, intracellular proteins, heat shock protein 47, and the tumor suppressor maspin. Serpins illustrate how a common structure is adapted to multiple functions. Extensive information about structures of serpins provides insight into disorders of protein conformation and folding discussed later. Other examples of families of plasma proteins are the albumin and lipocalin families. 
The albumin family includes albumin, α-fetoprotein, and afamin.121 The lipocalin family includes several plasma proteins such as α1-acid glycoprotein, retinol-binding protein, apolipoprotein D, α1-microglobulin, prostaglandin D synthase (β-trace), β-lactoglobulin, neutrophil gelatinase-associated lipocalin (NGAL), inter-α-trypsin inhibitor, and C8 γ-chain.171 Lipocalins generally have a barrel-shaped structure that is well suited to serve as a carrier for small molecules.



Disorders of Protein Folding


Protein folding is an error-prone process, and many molecular chaperones work to refold, prevent aggregation of misfolded proteins, or degrade misfolded proteins.28,44 Several heat shock proteins that increase in response to a variety of stresses are molecular chaperones. When cells detect increased accumulation of misfolded proteins, an adaptive mechanism—the unfolded protein response—is activated.135 This response increases production of chaperones and slows general protein synthesis to allow more time to fold new proteins. Despite these protective mechanisms, several families of age-related, genetic, and infectious diseases appear connected to disorders of protein folding and protein aggregation. Prion diseases are infectious diseases in which the transmissible agent may be protein that catalyzes misfolding of endogenous proteins. In Alzheimer’s disease, accumulation of deposits of amyloid may contribute to pathogenesis. Polyglutamine diseases result from genetic expansion of repeat units encoding glutamine and are associated with Huntington’s disease or other neurodegenerative disorders; polyglutamine sequences aggregate as β-sheets.169 TDP-43 proteinopathies are neurodegenerative disorders, including amyotrophic lateral sclerosis, resulting from aggregation of transactive response DNA-binding protein.81 Several inherited disorders related to mutations in specific proteins probably result from problems in protein folding. In α1-antitrypsin deficiency, hepatic injury results from aggregation and accumulation of misfolded protein.14,144 In cystic fibrosis, the cystic fibrosis transmembrane conductance regulator (CFTR) protein has a one–amino acid deletion that may influence interactions with chaperones and the stability of the protein. Accumulation of misfolded proteins has been suggested as a pathogenic mechanism contributing to vascular, cardiac, and beta cell failure in diabetes.44,135



Protein Synthesis and Processing


Proteins are synthesized by ribosomes reading from the 5′-end of mRNA. Triplet codons in mRNA are matched with complementary sequence in transfer RNA carrying specific amino acids. Protein synthesis begins with an AUG codon encoding methionine, and the polypeptide chain is synthesized from the N-terminus. As originally outlined by Blobel in the signal hypothesis, proteins that are secreted, located in vesicular compartments, or oriented on the external surface of cell membranes usually contain an N-terminal signal peptide about 15 to 30 amino acids in length. Signal peptides interact with signal recognition particles and attach ribosomes to the endoplasmic reticulum (ER). Nascent peptide chains are inserted through the membrane of the ER as the protein is synthesized. Signal peptides of most secretory proteins are removed even before synthesis of the entire protein chain is completed. Some post-translational modifications such as asparagine-linked, also termed N-linked, glycosylation occur in the ER.100 This modification entails assembly on a dolichol carrier of a branched oligosaccharide ending with multiple mannose subunits, and the oligosaccharide is transferred en bloc to asparagine residues that are in the sequence Asn-Xxx-(Ser or Thr), where Xxx can be any amino acid but proline. Within the cisternal compartments of the ER and Golgi apparatus, many biosynthetic processing steps occur. The structures of N-linked oligosaccharides are modified into a complex set of structures that usually end in sialic acids; a variety of O-linked oligosaccharides are attached, primarily to Ser and Thr, and many modifications of sidechains occur, such as phosphorylation, sulfation, hydroxylation of Pro, Lys and Asp, and carboxylation of Glu at selected sites in proteins. 
Oligosaccharide processing may change in different physiologic states and may serve as a marker for disease.45 Numerous molecular chaperones assist with folding and disulfide bond formation of proteins, including peptidylprolyl isomerases and disulfide isomerases.28


Many proteins and peptides are synthesized as larger precursors. Proinsulin, discovered as the precursor for insulin by Steiner in the 1960s, served as the prototype of a peptide precursor. Proinsulin is a single polypeptide chain that sequentially contains the B-chain of insulin, the connecting peptide (C-peptide), and the A-chain of insulin. The C-peptide is excised by proteases and insulin is produced, with A- and B-chains linked by two disulfides. Synthesis of insulin as a precursor assists in appropriate formation of disulfide bonds. Most small bioactive peptides (e.g., gastrin, ACTH, angiotensin, bradykinin, vasopressin, thyrotropin-releasing hormone) are synthesized as precursors, and the active peptides are liberated by proteolysis either before secretion or extracellularly. Many larger proteins are synthesized as precursors as well, including albumin, haptoglobin, and C3 and C4. Synthesis as precursors aids in appropriate folding and assembly of disulfides for proteins such as haptoglobin, C3, and C4, which are cleaved into multiple peptide chains.


Different sets of post-translational modifications occur on intracellular proteins. N- and O-linked glycosylation typical of extracellular proteins does not occur; instead, reversible addition of N-acetylglucosamine to Ser and Thr is noted.49 Reversible phosphorylation of proteins by signaling cascades of kinases and phosphatases serves as an important regulatory system. Many other biosynthetic modifications influence protein function and intracellular localization. Modification of histones and nuclear proteins by acetylation and methylation affects transcription. Coupling of the small protein ubiquitin is a mechanism for targeting proteins for degradation into peptides by proteasomes. This is an important process for degrading misfolded or damaged proteins, for protein turnover, and for generating peptides that are presented for immune surveillance and activation.53


In addition to biosynthetic modifications, proteins undergo chemical modification by free radicals and other reactants. As an example, glucose reacts with amine groups to generate reversible Schiff’s bases; some of the reaction products rearrange to more stable glucose conjugates, and the products are referred to as glycated proteins. Amounts of hemoglobin A1c and glycated plasma protein measured as fructosamine serve as measures of exposure to glucose. Sidechain carbonyls serve as an indicator of free radical generation. Nitrotyrosine can be assessed as a measure of exposure to reactants. Oxidants can modify a number of amino acid sidechains, including oxidation of methionine to methionine sulfoxide and sulfone and cysteine to cysteic acid. Albumin serves as a model for many chemical modifications of plasma proteins.105,121



Physical Properties of Proteins


Variable amino acid sequences and three-dimensional structures of different proteins result in variation in physical properties. Tyrosine and tryptophan residues absorb light at 280 nm, and the abundance of these amino acids determines the strength of absorbance of a peptide or protein. Absorption at 280 nm, therefore, is used to quantify a purified protein. Tryptophan residues are intrinsic fluorescent groups with variable fluorescence efficiency and lifetimes, depending on their local environment. Some proteins, such as hemoglobin, contain heme or other chromophores that can be assessed at visible wavelengths. Peptide bonds have strong absorbance at wavelengths below 220 nm, so that spectrophotometric detection in this range detects all peptides and proteins, and absorbance is much greater than at 280 nm. Ionizable groups in proteins exert a strong effect on physical properties and changes in structure that occur in response to changes in pH. Common ranges of pKs are shown in Table 21-2, although the ionization of individual amino acids may be influenced strongly by neighboring amino acids. Differing physical properties serve as the basis of methods to separate proteins (see also Chapter 12). Some important characteristics include the following:



1. Differential solubility. The solubility of proteins is affected by pH, ionic strength, temperature, and the dielectric constant (addition of solvents such as ethanol). Changing solvent pH affects the net charges of a protein; at its pI (net charge zero), a protein in polar solvent usually has its lowest solubility. Changing ionic strength affects the hydration and solubility of proteins. “Salting-in” and “salting-out” procedures were early methods for separating and characterizing proteins. Serum was originally divided into albumin, which is soluble in water, and globulins, which require salt to remain in solution. Albumin also stays in solution at high concentrations of salts such as ammonium sulfate that precipitate globulins. Addition of organic solvents and polyethylene glycol has been used for differential precipitation. Fractional precipitation of plasma with ethanol, using protocols developed by Cohn and coworkers, leads to several Cohn fractions that are enriched in immunoglobulins, α- and β-globulins, or albumin (fraction V). Polyethylene glycols induce precipitation by steric exclusion and, therefore, preferentially precipitate large proteins or complexes.


2. Molecular size. Separation of small and large molecules can be achieved by dialysis or ultrafiltration. Size exclusion chromatography, ultracentrifugation, and electrophoresis perform size separations under native conditions where proteins and peptides are in native globular states or under denaturing conditions. Addition of reducing agents allows separation of disulfide-linked components. Polyacrylamide gel electrophoresis in the presence of the denaturing detergent sodium dodecylsulfate is a method for estimating the molecular weight of polypeptide chains in proteins.


3. Molecular mass. Advances in mass spectrometry allow the determination of masses of peptides and proteins with increasing accuracy. Peptides and proteins can be ionized by matrix-assisted laser desorption/ionization (MALDI) or by electrospray ionization. Highest detection sensitivity and mass accuracy are for peptides, and tandem mass spectrometry can be applied for sequence analysis of peptides. Large proteins with extensive mass heterogeneity due to variable modifications pose challenges related to their high mass and the formation of multiple peaks.


4. Electrical charge. Ion-exchange chromatography, isoelectric focusing, and electrophoresis separate peptides and proteins based on charge.


5. Surface adsorption. Adsorption of peptides and proteins to particles has been used as the basis for separations. Hydrophobic interaction chromatography and reverse-phase chromatography rely on differential hydrophobic interactions of peptides or proteins.


6. Affinity chromatography. Specific ligands, antibodies, or other recognition molecules have been used to separate peptides or proteins selectively.
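The use of absorbance at 280 nm to quantify a purified protein, mentioned at the start of this section, follows from the Beer-Lambert law (A = ε × c × l). The sketch below estimates a molar absorptivity from the numbers of tryptophan and tyrosine residues and then converts an absorbance reading to concentration. The per-residue absorptivities are commonly cited approximate literature values and are an assumption here, not values from this chapter; the sequence and absorbance reading are hypothetical.

```python
# Approximate molar absorptivities at 280 nm, in M^-1 cm^-1.
# ASSUMPTION: commonly cited literature estimates, not taken from this chapter.
EPS_TRP = 5500
EPS_TYR = 1490
EPS_CYSTINE = 125  # per disulfide bond; a minor contribution


def molar_absorptivity_280(sequence, n_disulfides=0):
    """Estimate the molar absorptivity at 280 nm from the Trp and Tyr content."""
    seq = sequence.upper()
    return (seq.count("W") * EPS_TRP
            + seq.count("Y") * EPS_TYR
            + n_disulfides * EPS_CYSTINE)


def concentration_molar(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: A = epsilon * c * l, so c = A / (epsilon * l)."""
    if epsilon <= 0:
        raise ValueError("No Trp, Tyr, or cystine: A280 cannot be used for this protein")
    return a280 / (epsilon * path_cm)


# Hypothetical peptide and absorbance reading, for illustration only.
epsilon = molar_absorptivity_280("AWYKWY")
print(epsilon, concentration_molar(0.699, epsilon))
```

A protein with no tryptophan, tyrosine, or disulfides gives an estimated absorptivity of zero, which is why the function refuses to compute a concentration in that case; detection below 220 nm, which relies on the peptide bond, does not have this limitation.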



Plasma Proteins


Plasma is a complex mixture of proteins representing thousands of gene products.2,57,117 The most abundant products are proteins secreted directly into the circulation primarily by the liver, and immunoglobulins contributed by lymphatic tissue. Classical methods for protein fractionation and purification over several decades led to isolation and characterization of about 100 of the most abundant plasma proteins.3 The 12 most abundant proteins represent more than 95% of total protein mass, and a progressive decline in abundance of other components has been noted.2,60 Albumin alone represents more than 50% of the total mass of protein and an even higher proportion of the number of molecules, so that albumin is the main factor in colloid osmotic pressure (oncotic pressure). For some purposes, such as evaluation of contributions to oncotic pressure, inhibitory capacity of proteases, and binding capacity for ions, drugs, or small molecules, it is useful to express abundance of proteins on a molar basis rather than as mass abundance, and a slightly different set of proteins is identified that includes some lower molecular weight proteins that do not represent major components by mass. Lists of the 30 most abundant proteins by mass and molecular abundance are provided in Table 21-3.



It is apparent, for example, that although α2-macroglobulin (AMG) is the most abundant protease inhibitor by mass abundance, α1-antitrypsin and several other protease inhibitors have a higher molar concentration and capacity for protease inhibition.
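The difference between mass abundance and molar abundance comes down to dividing the mass concentration by the molecular weight. The sketch below illustrates the arithmetic for the two protease inhibitors mentioned above. The concentrations and molecular weights are rounded, illustrative figures chosen for this example (assumptions, not values from Table 21-3).

```python
def molar_concentration_umol_per_l(mass_g_per_l, molecular_weight_da):
    """Convert g/L to micromol/L: (g/L) / (g/mol) gives mol/L, times 1e6."""
    if molecular_weight_da <= 0:
        raise ValueError("Molecular weight must be positive")
    return mass_g_per_l / molecular_weight_da * 1e6


# ASSUMPTION: rounded, illustrative values (g/L, Da), not taken from Table 21-3.
proteins = {
    "alpha2-macroglobulin": (2.0, 720_000),
    "alpha1-antitrypsin": (1.4, 52_000),
}

for name, (mass, mw) in proteins.items():
    umol = molar_concentration_umol_per_l(mass, mw)
    print(f"{name}: {mass} g/L is about {umol:.1f} micromol/L")
```

With these figures, the larger protein is more abundant by mass but roughly tenfold less abundant on a molar basis, which is the pattern the text describes.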


The most appropriate way to express protein abundance depends on the method of analysis. Signal intensities of stained bands in gels and chromatographic separations with photometric detection usually relate to protein mass. Signal responses in immunoassays and mass spectrometry usually correspond to numbers of particles (i.e., the molar abundance of proteins). Components with highest molar abundance in Table 21-3 are easiest to detect by immunoassay (allowing use of less sensitive methods such as immunoturbidimetry and immunonephelometry) and generally will yield tryptic peptides with the highest abundance for mass spectrometric detection.60 Use of molar abundance allows better comparison of the stoichiometry of molecular forms of proteins and different proteins that differ substantially in molecular weight. Currently, concentrations of proteins are typically expressed in units of mass or molarity. High-abundance proteins most often are reported in units of mass per volume by clinical laboratories in the United States.



The Plasma Proteome and Peptidome


Plasma contains a complex mixture of thousands of proteins and peptides. The complete set of proteins is referred to as the proteome. Current efforts to compile a database of components found in plasma include more than 12,000 protein entries.134 Considerable variation among proteins is related to post-translational and genetic variation.114 In addition to the diversity of proteins, plasma contains complex mixtures of peptide components—the peptidome—which only recently has been fully appreciated.122 Although many peptides are cleared rapidly by renal clearance and peptidolysis, some peptides accumulate bound to albumin or other carrier proteins.122 The peptidome is of potential diagnostic interest because of its large diversity and information content, reflection of activity of physiologic proteases, and suitability for direct analysis by mass spectrometry. Many approaches for proteome analysis are known, including (1) two-dimensional electrophoresis and chromatography, (2) microarrays of antibodies or other proteins, and (3) mass spectrometry with electrospray ionization or matrix-assisted laser desorption/ionization.2,29,56,117,134,149 Analysis of multiple components expands the ability to examine changes in patterns of protein concentration versus traditional analysis of one protein at a time, and high-resolution separation techniques offer the opportunity to identify many structural changes, as well as quantitative changes, in proteins. Sequence analysis can be performed rapidly on small peptides, allowing identification of components. Simultaneous analysis of many peptides or proteins offers opportunities to empirically identify protein markers for disease without knowing beforehand which protein to test for. 
It has been suggested that mass spectrometric patterns of peptides might serve as signatures of disease that could be interpreted by complex pattern recognition by computers.122 However, it is not clear how usual standards of laboratory practice, such as calibration, quality control, and verification of accuracy, would be assessed.55 The U.S. Food and Drug Administration has proposed guidelines for in vitro diagnostic multivariate index assays to address special issues raised by this approach. Although untargeted empirical analysis may be a useful discovery approach, most methods for multiplex analysis of peptides and proteins for clinical use are likely to rely on targeted analysis. Substantial advances have been made in the quantitative analysis of proteins by digesting proteins into peptides with the protease trypsin and then quantitating released tryptic peptides by tandem mass spectrometry using isotopically labeled internal standard peptides. It has been proposed that proteotypic peptides (peptides that are unique to a single protein) for more than 20,000 proteins should be identified to provide a means of quantifying virtually any human protein.2 Peptide quantification via tandem mass spectrometry is being applied to measurement even of relatively low-abundance proteins such as thyroglobulin and troponins,54,79 and it may serve as a valuable method for measuring bioactive peptides. Multiplex immunoassays are coming into routine clinical laboratory use as the result of microparticle or planar arrays.78,90 Products currently available for clinical use measure several components; highly multiplexed assays still present some challenges with respect to reproducibility and quality control.32
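The targeted approach described above, digestion with trypsin followed by quantification against an isotopically labeled internal standard peptide, can be sketched in two small steps. This is a simplified illustration under stated assumptions: trypsin is modeled as cleaving after Lys or Arg except when the next residue is Pro, with no missed cleavages; the labeled and unlabeled peptides are assumed to give equal signal response; and digestion is assumed to be complete. The function names, sequence, and peak areas are hypothetical.

```python
import re


def tryptic_digest(sequence, min_length=1):
    """In silico digest: cleave after K or R unless the next residue is P.

    Simplified model with no missed cleavages.
    """
    seq = sequence.upper()
    # Zero-width split points: preceded by K or R, not followed by P.
    peptides = re.split(r"(?<=[KR])(?!P)", seq)
    return [p for p in peptides if len(p) >= min_length]


def quantify_by_isotope_dilution(light_area, heavy_area, heavy_spike_conc):
    """Analyte concentration = (light/heavy peak-area ratio) * spiked standard concentration.

    ASSUMPTION: equal response for light and heavy forms and complete digestion.
    """
    if heavy_area <= 0:
        raise ValueError("Internal standard peak area must be positive")
    return light_area / heavy_area * heavy_spike_conc


# Hypothetical sequence and peak areas, for illustration only.
print(tryptic_digest("MKWVTFISLLRPGAKDEHR"))
print(quantify_by_isotope_dilution(2.0e5, 4.0e5, 10.0))  # same units as the spike
```

Selecting a proteotypic peptide would additionally require checking that a candidate peptide occurs in only one protein of the proteome, which is outside the scope of this sketch.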



Plasma Protein Concentrations


The characteristic of proteins that has been applied most frequently is their plasma concentration, and the range of plasma concentrations of different proteins measured in clinical assays extends well over 10 logs of concentration.3,60 Relative to the highest abundance protein, albumin, concentrations of plasma proteins decrease progressively over several logs of concentration for proteins released from tissues of smaller cell mass, and the ability of endogenous proteins to enter the circulation generally decreases hierarchically: (1) proteins secreted directly into plasma, (2) cell membrane proteins shed into the circulation, (3) secretory proteins in exocrine secretions, (4) high-abundance cytoplasmic proteins, (5) low-abundance cytoplasmic proteins, (6) transmembrane proteins, and (7) organellar proteins that must traverse more than one membrane to exit cells. Many proteins serve as useful markers of physiology and disease, as these processes alter the production or release of proteins into plasma. Changes in plasma concentrations of secretory proteins usually reflect changes in synthetic rates, secretion, or clearance, and plasma concentrations of cytoplasmic and organellar proteins tend to reflect cellular injury and rates of leakage into plasma. In addition to endogenous sources of proteins, multiple exogenous sources of proteins include infectious organisms, dietary sources, and therapeutic interventions. For diagnostic purposes, clinical laboratories currently assess only a small proportion of the thousands of gene products in the circulation and identify only a small proportion of diversity of post-translational processing. Antibodies and T-cell receptors represent special cases wherein somatic recombination and mutation result in repertoires of millions of sequences and binding specificities. A small portion of this diversity is assessed by serological or functional assay.


Plasma concentrations of proteins depend not only on rates of production and efficiency of entry to the circulation, but also on rates of clearance. Proteins and peptides substantially smaller than albumin are cleared from the circulation by glomerular filtration unless they are bound to larger carriers, such as small apolipoproteins bound to lipoprotein particles or retinol-binding protein bound to prealbumin.116 Peptides and small proteins not bound to carriers are cleared with half-lives of about 2 hours under conditions of normal kidney function and accumulate to higher concentrations in kidney failure. Examples of proteins and peptides that increase dramatically in renal failure include β2-microglobulin, cystatin C, immunoglobulin light chains, parathyroid hormone fragments, complement factor D, Clara cell protein (CC16), atrial natriuretic peptide, interleukins, and, to a lesser extent, retinol-binding protein. Accumulation of components in renal failure identifies components with renal clearance161 as a primary clearance mechanism; hundreds of increased plasma peptide components can be detected by mass spectrometry.136 Many bioactive peptides in plasma, such as insulin, intact parathyroid hormone, and kinins, have much shorter circulating half-lives of only a few minutes, indicating receptor-mediated uptake or degradation by exopeptidases or endopeptidases.56 Proteins the size of albumin or larger generally have an upper limit of about 7 days for their circulating half-lives because of pinocytosis and degradation by cells. Two exceptions to this are albumin and immunoglobulin G, which have a receptor-mediated process to recycle these proteins from pinocytotic vesicles. The recycling mechanism extends the half-life of albumin and immunoglobulin several-fold.131,165 Many proteins are subject to uptake by specific receptors or degradation by proteolysis. Half-lives <7 days suggest clearance mechanisms other than bulk pinocytosis.
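The practical consequence of the half-lives quoted above can be illustrated with a simple decay calculation. The sketch assumes single-compartment, first-order elimination, a simplification that the chapter does not itself state, and the function names are hypothetical.

```python
import math


def fraction_remaining(elapsed_hours, half_life_hours):
    """Fraction of the initial amount left after a given time (first-order decay)."""
    if half_life_hours <= 0:
        raise ValueError("Half-life must be positive")
    return 0.5 ** (elapsed_hours / half_life_hours)


def elimination_rate_constant(half_life_hours):
    """First-order rate constant k = ln(2) / half-life, in 1/hour."""
    return math.log(2) / half_life_hours


# Half-lives taken from the ranges in the text; the 24-hour window is arbitrary.
examples = [
    ("small unbound peptide, half-life about 2 hours", 2.0),
    ("large protein at the 7-day pinocytosis limit", 7 * 24.0),
]
for label, t_half in examples:
    left = fraction_remaining(24.0, t_half)
    print(f"{label}: {left:.4f} of the initial amount remains after 24 hours")
```

Under this model, a small peptide cleared by glomerular filtration is almost entirely gone within a day, whereas a large protein cleared only by bulk pinocytosis has lost about a tenth of its initial amount.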
