Genetic Variation: Effect on Nutrient Utilization and Metabolism

Patrick J. Stover

Zhenglong Gu


Genetic variation contributes to individual phenotypic differences within and among human populations, including metabolic traits and differential susceptibility to common chronic and metabolic diseases. Metabolic impairments are integral components of chronic diseases, developmental anomalies, cancers, neurologic disorders, and most other pathologic processes. Often they precede anatomic and other signs of disease. Clinical investigations of inborn errors of metabolism provided some of the earliest and most conclusive evidence that (a) metabolic impairments are heritable, (b) genes can modify nutrient use and metabolism, (c) metabolic impairments cause disease, and (d) the functional consequences of genetic mutations can be attenuated significantly by targeted nutritional therapies that compensate for and, less often, avoid genetically induced metabolic impairments.

Phenylketonuria provides a classic paradigm that demonstrates the potential effectiveness of diet in modifying deleterious phenotypes that result from genetic mutations that alter metabolism. Phenylalanine-restricted diets lessen and may even prevent severe cognitive deficits in children who carry mutations in the phenylalanine hydroxylase gene (1). Inborn errors of metabolism are generally recessive and are relatively rare in most populations, and the initiation or progression of the associated disorders can be managed by diet or nutrition in some, but not all, cases.

Inborn errors of metabolism are typically monogenic disorders that follow Mendelian modes of inheritance and therefore are well characterized with respect to their molecular and genetic bases. However, the most prevalent human metabolic disorders are complex polygenic diseases with contributions from multiple low-penetrant susceptibility alleles, and the risks associated with these alleles are modifiable by both lifestyle and environmental factors, including one or more dietary components.

The genetic and biochemical causes of many cancers and chronic diseases, including cardiovascular disease and type 2 diabetes mellitus, remain unidentified. These disorders do not conform to classic Mendelian inheritance patterns, and therefore genetic approaches based on “simple” linkage analyses are not always possible. Genomic approaches, enabled by the availability of complete genome sequences from several mammalian species and generation of a comprehensive catalog of human genetic variation, have been successful in identifying susceptibility genotypes that modify metabolism, change nutritional requirements, and contribute to metabolic disease. Furthermore, through evolutionary genomics, the origins and consequences of human genetic variation are decipherable, and allelic variants and interacting environmental risk factors that impair metabolic pathways or modify optimal dietary requirements can be inferred.

Origin of Human Genetic Variation

The pattern of human genetic variation is determined by interactions among different evolutionary forces. The generation of primary sequence differences in DNA is a function of the DNA mutation rate; the expansion of the mutation within a population is a function of recombination, demographic history (e.g., fluctuations in effective population size, substructure, and migration), selection (the effect of the mutation on an organism’s fitness), and random processes (genetic drift) (2, 3). Not all sequence variation has phenotypic consequences (2). DNA sequence that does not affect function can mutate freely without consequences, whereas changes in DNA sequences that encode information or function may alter physiologic processes, and therefore the propagation and expansion of such sequences will be more constrained.

Most human genetic variations present in noncoding regions, including those found in intronic and intergenic regions, are assumed to be selectively neutral and therefore a function of the DNA mutation rate (2), which is estimated to be approximately 2.5 × 10⁻⁸ mutations per nucleotide per generation. However, this rate is not uniformly distributed across the whole genome (4). The highest mutation rates for a human gene are approximately 1 × 10⁻⁵ per generation (5).
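
As a rough back-of-the-envelope illustration of the neutral mutation rate quoted above, the following Python sketch multiplies the per-nucleotide rate by a genome size of about 3.2 billion base pairs (the figure this chapter gives later for the primary sequence). The function name is hypothetical, and treating the rate as uniform is a simplifying assumption; as noted above, the real rate varies across the genome.

```python
def expected_new_mutations(rate_per_site, genome_size_bp):
    """Expected number of new mutations per genome copy per generation,
    assuming (unrealistically) a uniform per-site mutation rate."""
    return rate_per_site * genome_size_bp


MUTATION_RATE = 2.5e-8   # mutations per nucleotide per generation (from the text)
GENOME_SIZE_BP = 3.2e9   # approximate genome size in base pairs (from the text)

per_genome_copy = expected_new_mutations(MUTATION_RATE, GENOME_SIZE_BP)
per_zygote = 2 * per_genome_copy  # a zygote carries two genome copies

print(f"about {per_genome_copy:.0f} new mutations per genome copy per generation")
print(f"about {per_zygote:.0f} new mutations per zygote per generation")
```

Under these assumptions the product is about 80 new mutations per genome copy per generation, which is only an order-of-magnitude estimate.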

Many factors contribute to DNA mutation rates. DNA replication and recombination do not occur with complete fidelity and thereby account for a significant portion of observable mutation rates. Polymerase error rates and DNA mutations are affected by nutrients including iron, B vitamins, and antioxidants. For example, inhibition of folate-dependent deoxythymidine monophosphate synthesis results in the misincorporation of deoxyuridine triphosphate into DNA (6). Purine and pyrimidine bases within DNA also undergo spontaneous chemical mutation; cytosine spontaneously deaminates to uracil with a frequency of 100 mutations/genome/day, and purine nucleotides undergo depurination mutations at a rate of 3000 mutations/genome/day. DNA repair systems are effective in detecting and correcting most of these mutations (7).

Genotoxic xenobiotics, both natural products and synthetic chemicals, are present in the food supply and can modify DNA chemically and increase mutation rates. One class of natural compounds, aflatoxins, can dramatically increase DNA mutation rates, trigger cancers in somatic cells, and lead to localized cancer epidemics (8). DNA mutation rates are affected by dietary antioxidants (9) as well as by excesses in pro-oxidant nutrients including iron (10). However, only mutations that occur in the germ line contribute to a species’ heritable genetic variation.

DNA mutation rates and polymorphism frequency vary throughout the human genome. Such region-specific differences within the genome have been attributed to the frequency of DNA recombination and to the mutagenic potential of specific nucleotide sequences. The most common genetic mutation within the human genome is the C to T transition (11). The sequence CpG is enriched in the promoter regions of mammalian genes and is recognized by DNA methylases, which convert the cytosine base (C) to methylcytosine (meC). meC density within the genome is modifiable by dietary folate and one-carbon donors, and fetal methylation patterns established in utero can be metastable and can influence gene expression into adulthood (12).

Cytosine methylation influences the transcription rates of genes by altering the affinity of DNA binding transcription factors or by enabling the recruitment of meC binding proteins that serve to silence gene transcription, or both. DNA methylation usually is associated with gene silencing and is critical for the inactivation of imprinted genes and X chromosome inactivation. Mutations at CpG dinucleotides occur approximately 10 times more frequently than at other loci, presumably because meC deaminates spontaneously to thymine (T), whereas C deaminates to uracil. Uracil is recognized as foreign to DNA and is excised by the DNA repair enzymes, whereas T is not recognized as foreign. The sequence CpG is underrepresented in the human genome, and its frequency has decreased throughout evolution, consistent with that inherent instability (11).
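
The underrepresentation of CpG described above is often summarized as an observed-to-expected ratio: the number of CG dinucleotides found in a sequence divided by the number expected if C and G occurred independently. The Python sketch below computes that ratio. The statistic is a commonly used summary and is not defined in this chapter, and the two short sequences are made-up examples rather than real genomic data.

```python
def cpg_observed_expected(seq):
    """Ratio of observed CG dinucleotides to the number expected if
    C and G were placed independently: (#CG * N) / (#C * #G)."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    # "CG" cannot overlap with itself, so str.count finds every occurrence
    cg_count = seq.count("CG")
    if c_count == 0 or g_count == 0:
        return 0.0
    return (cg_count * n) / (c_count * g_count)


# Hypothetical sequences, chosen only to contrast the two extremes
cpg_depleted = "ATGCATTGCAACTGTACCAGTGACTTGACA"
cpg_rich = "CGCGGCGCTACGCGATCGCG"

print("depleted:", cpg_observed_expected(cpg_depleted))
print("rich:    ", cpg_observed_expected(cpg_rich))
```

A ratio well below 1 indicates fewer CpG sites than base composition alone would predict, which is the pattern the text attributes to meC deamination; a ratio near or above 1 is typical of CpG-rich promoter regions.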

DNA recombination rates also vary throughout the human genome. Recombination creates new combinations of alleles by reshuffling existing genetic variation. The recombination rate has been estimated at approximately 1 to 1.33 cM/Mb; however, it is very heterogeneous across the human genome: approximately 33,000 “recombination hotspots” account for approximately 50% to 60% of the crossover events, but they occupy only approximately 6% of the human genome sequence (3, 13, 14, 15, 16). Investigators have observed that genes involved in interactions with the environment (e.g., immunity, cell adhesion, signaling) tend to be located in genomic regions with high recombination rates, whereas genes without such roles tend to lie in regions of low recombination (17). Recombination is also correlated with levels of genetic variation, a finding indicating that recombination itself may be mutagenic (18).

Mutations that expand within a population contribute to genetic variation as polymorphisms, and this process is the basis for the molecular evolution of genomes. The expansion of mutations within a population occurs through the process of genetic drift or natural selection. Drift is a stochastic process that results from chance assortment of chromosomes at meiosis. Only a few of all possible zygotes are generated or survive to reproduce (19); therefore, mutations can expand in the absence of selection through random fluctuations in the transmission of alleles from one generation to the next, resulting from the random sampling of gametes. Because drift generally has a greater impact on allele frequencies in smaller populations, human demographic history has been a major force in shaping human genetic variation. Severe reductions in population size (bottlenecks) can lead to a reduction in genetic variability, whereas rapid expansions can increase genetic variation (3).
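
The dependence of drift on population size can be illustrated with a small simulation. The Python sketch below uses a Wright-Fisher style model, a standard textbook idealization that this chapter does not itself describe: each generation, every gene copy is drawn at random from the previous generation's allele frequency, with no selection or mutation. The population sizes, replicate counts, and function names are arbitrary choices for illustration.

```python
import random


def wright_fisher(pop_size, start_freq, generations, rng):
    """Simulate neutral drift of one allele in a diploid population.

    Each generation, 2 * pop_size gene copies are sampled at random
    from the previous generation's allele frequency.
    Returns the allele frequency at generation 0, 1, ..., generations."""
    n_copies = 2 * pop_size
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        count = sum(1 for _ in range(n_copies) if rng.random() < freq)
        freq = count / n_copies
        trajectory.append(freq)
    return trajectory


def fraction_fixed_or_lost(pop_size, replicates=100, generations=50, seed=1):
    """Share of replicate populations in which the allele has drifted
    to frequency 0 (lost) or 1 (fixed) by the final generation."""
    rng = random.Random(seed)
    finished = 0
    for _ in range(replicates):
        final = wright_fisher(pop_size, 0.5, generations, rng)[-1]
        if final in (0.0, 1.0):
            finished += 1
    return finished / replicates


for size in (10, 200):
    print(f"N = {size}: fraction fixed or lost = {fraction_fixed_or_lost(size)}")
```

In this sketch the small population loses variation far more often than the large one over the same number of generations, which mirrors the effect of a bottleneck described above.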

Migration and population admixture also affect allele frequency. Modern humans originated in Africa, and small subpopulations migrated to the rest of the world within the past 100,000 years (2). As a consequence, African populations have more genetic variations than other populations (20, 21, 22). Investigators have shown that significantly more deleterious variations exist in the European populations than in the African populations, a finding indicating that genetic variation caused by demographic history has significant health consequences (23). Specific diseases, such as breast cancer, Tay-Sachs disease, Gaucher disease, Niemann-Pick disease, and familial hypercholesterolemia within the Old Order Amish and Hutterite populations may be accounted for by demographic history (19).

Selection is another important evolutionary force that shapes human genetic variation. Most substitutions in the genome are functionally neutral and do not have fitness consequences on their carriers. However, more and more genetic loci have been found to deviate from the neutral null model under various statistical tests, and the results suggest adaptive evolution. When a new mutation arises that affects fitness (i.e., the capability of its carriers to reproduce and propagate their genotype) in specific environmental contexts, it will be subject to natural selection, which is defined as the differential contribution of genetic variants to future generations. The three general types of selection are positive, purifying, and balancing selection.

When a new mutation increases the fitness of its carriers, positive selection (adaptive evolution) drives the allele to high frequency in a population. Lactase persistence provides a good example of positive selection (2). Purifying selection, also called negative selection, drives deleterious alleles to low frequency or extinction.

Balancing selection occurs when an allele has heterozygote advantage or it is selected only when it reaches a specific frequency (frequency-dependent selection) (24). One of the best examples of balancing selection is illustrated by the variation in the hemoglobin gene, in which heterozygosity of a variant gene confers resistance against malaria infection, whereas homozygosity results in sickle cell anemia.
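
The three types of selection described above can be contrasted with the standard one-locus, two-allele viability selection model from population genetics. The Python sketch below is a minimal illustration of that model, assuming random mating and an effectively infinite population. The fitness values are hypothetical and chosen only to show the qualitative behavior; they are not measured values for hemoglobin or any other gene.

```python
def next_frequency(p, w_aa, w_ab, w_bb):
    """One generation of viability selection at a biallelic locus.

    p is the frequency of allele A; genotypes AA, AB, and BB have
    relative fitnesses w_aa, w_ab, and w_bb (random mating assumed)."""
    q = 1.0 - p
    mean_fitness = p * p * w_aa + 2 * p * q * w_ab + q * q * w_bb
    return p * (p * w_aa + q * w_ab) / mean_fitness


def iterate(p0, fitnesses, generations=2000):
    """Apply selection repeatedly and return the final frequency of A."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, *fitnesses)
    return p


# Hypothetical relative fitnesses (w_AA, w_AB, w_BB) for each scenario
scenarios = {
    "positive (A favored)": (1.00, 0.95, 0.90),
    "purifying (A deleterious)": (0.90, 0.95, 1.00),
    "balancing (heterozygote advantage)": (0.90, 1.00, 0.20),
}

for name, fitnesses in scenarios.items():
    print(f"{name}: final frequency of A = {iterate(0.05, fitnesses):.3f}")
```

With heterozygote advantage, writing the homozygote fitnesses as 1 - s and 1 - t gives an interior equilibrium for allele A at t / (s + t), so both alleles persist; in the other two scenarios one allele approaches fixation and the other is driven toward loss.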

Because selection changes rates of molecular evolution at defined loci within the genome, not all genes are expected to evolve at the same rate. Comparison of mammalian genome sequences has permitted the identification of genes that have undergone accelerated evolution (25). These rapidly evolving genes are assumed to enable adaptation and thus to have been positively selected because adaptive mutations expand within populations at accelerated rates relative to neutral mutations. The proportion of amino acid substitutions that result from positive selection is estimated to be 35% to 45% (26). Specific examples of adaptive evolution include glucose-6-phosphate dehydrogenase (G6PD) in malaria (27), the lactase gene (LCT) in lactase persistence (20), amylase in starch digestion (28), and C-C chemokine receptor 5 (CCR5) in immune defense (29).

Comparison of mammalian genome sequences provides evidence that environmental exposures, including pathogens and dietary components, have been selective forces throughout evolution. These selective forces have influenced the generation of polymorphic alleles that alter the use and metabolism of dietary components and may be responsible for the generation of metabolic disease alleles across ethnically diverse human populations (27, 30). Variations that result from positive selection are expected to arise from region-specific selective factors. Therefore, the prevalence of specific functional polymorphisms may be associated with specific geographic or ethnic human populations to the degree that different selective pressures are operative across populations.

Specific allelic variants may be adaptive only in certain environments and neutral or less favored in others (24, 31). For example, the relatively high prevalence of the E6V polymorphism in the β-globin gene is likely the result of an adaptation to the region-specific environmental challenge of the malaria parasite in African populations. This disease allele has high frequency in the population because it enhanced fitness toward the region-specific environmental challenge of malaria in heterozygotes. Identifying and understanding the mechanism for the adaptive evolution of gene variants facilitate the discovery of human disease alleles. For example, a “thrifty gene” hypothesis was proposed to explain the epidemic of obesity and type 2 diabetes (5). The putatively advantageous mutations may have resulted in more efficient adaptations to fasting conditions (e.g., more rapid decreases in basal metabolism) or physiologic responses that facilitate excessive intakes in times of plenty. Adaptive alleles may be recessive disease alleles, or they may become disease alleles even in heterozygote individuals when the environmental conditions change profoundly, such as those brought about by the advent of civilization and agriculture, including alterations in the nature and abundance of the food supply (5).


The primary sequence of the human genome contains approximately 3.2 billion nucleotide base pairs that are organized into chromosomes that range in size from 50 million to 250 million base pairs. The first human genome sequence was obtained from 5 to 10 persons of diverse ethnic and geographic backgrounds or ancestry (2). The human genome, including both nuclear and mitochondrial DNA, contains an estimated 23,000 genes that serve as templates for 35,000 transcripts that encode information required for the synthesis of all cellular proteins, although a biologic function has not yet been determined for all human genes (32). Other genes encode functional RNA molecules, including tRNAs, small nuclear RNAs, ribosomal RNA, and microRNA (33), which serve various roles in protein synthesis, mRNA processing, or gene expression regulation (34, 35).

Genes account for approximately 2% of the total human DNA primary sequence; the remaining DNA is termed noncoding and serves structural or regulatory roles, or has no known role. The number of genes encoded within the genome does not limit the biologic complexity of the mammalian cell. A single gene can encode more than one RNA or protein product through posttranscriptional and posttranslational processing reactions, including RNA editing, alternative splicing, protein splicing, and other modifications (e.g., differential phosphorylation) (36, 37). As a result of such RNA and protein processing and modification reactions, more than 100,000 proteins with distinct primary sequences can be derived from the human genome.

Human genetic variation is a product of complex and reciprocal interactions among the genome and environmental exposures and is manifested through the formation and propagation of primary sequence alterations in DNA (38). Primary sequence variation among humans is referred to as polymorphism and constitutes one of the molecular bases for human phenotypic variation, including variations in human behavior, morphology, and susceptibility to disease (38).

Polymorphisms arise in populations through the independent and sequential processes of genetic mutation followed by expansion of the mutant allele within the population, and environment can modify both these processes. Human genetic variation was originally estimated to be approximately 0.1% (39). However, with improvements in technology that enabled the identification of structural rearrangements, investigators now estimate a 1% to 3% difference between any two sets of human chromosomes (40, 41). Human genetic variations are usually categorized into common and rare, according to the minor allele frequency (MAF, the frequency of the less common allele) in human populations. Common variations, also called polymorphisms, have an MAF of at least 1% in human populations (38). Genetic variants meeting the MAF threshold include single nucleotide changes and structural alterations, and they can result from mutations ranging from a single nucleotide base change to alterations of several hundred bases through deletions, insertions, translocations, inversions, and duplications (17).
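
The MAF-based distinction between common and rare variants can be made concrete with a short calculation. The Python sketch below computes the minor allele frequency at a single biallelic site from a list of observed alleles and applies the 1% threshold given above. The function names and the allele counts are hypothetical, and sites with more than two alleles are deliberately left out because conventions for them differ.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the less common allele at a biallelic site.

    `alleles` is a flat list of observed alleles, one entry per
    chromosome sampled. A site with a single allele has MAF 0.0."""
    counts = Counter(alleles)
    if len(counts) > 2:
        raise ValueError("this sketch handles biallelic sites only")
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / len(alleles)


def classify_variant(maf, threshold=0.01):
    """Label a variant as common or rare using the 1% MAF convention."""
    return "common (polymorphism)" if maf >= threshold else "rare"


# Hypothetical samples: 200 and 2000 chromosomes, respectively
site_a = ["C"] * 180 + ["T"] * 20
site_b = ["G"] * 1995 + ["A"] * 5

for name, site in (("site_a", site_a), ("site_b", site_b)):
    maf = minor_allele_frequency(site)
    print(f"{name}: MAF = {maf:.4f}, {classify_variant(maf)}")
```

Because MAF is estimated from a sample, a variant that appears rare in one population may be common in another, so the classification depends on which population is sampled.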

Single Nucleotide Polymorphisms

Single nucleotide polymorphisms (SNPs) are the simplest and most common type of polymorphism and are estimated to represent approximately 90% of all human DNA polymorphisms. SNPs differ from somatic mutations in that they are present in the germ line and therefore are heritable. SNPs are defined as nucleotide base pair differences in the primary sequence of DNA and can be single base pair insertions, deletions, or substitutions of one base pair for another. Nucleotide substitutions are the most common polymorphism; insertion or deletion mutations occur at one tenth the frequency (4).

The density of SNPs in the human genome varies within and among human chromosomes, and it ranges from 1 in 1000 bases to 1 in 100 to 300 bases. Investigators have estimated that approximately 10 to 15 million SNPs exist in human genomes (39, 42). Nucleotide substitutions within protein coding regions of a gene can be classified as either nonsynonymous substitutions, which result in an amino acid replacement substitution within a protein, or synonymous (silent) substitutions, which do not change amino acid sequence resulting from degeneracy in the genetic code. Nonsynonymous SNPs in coding regions are more functionally relevant because they change the amino acid sequence of the encoded proteins, and they subsequently have the potential to affect virtually every aspect of protein function, including protein folding and stability, enzymatic functions, allosteric regulation, and posttranslational modification. However, synonymous substitution can also have important functional consequences by altering mRNA splicing and protein translation efficiency. SNPs in introns, promoters, and intergenic regions may also be involved in regulating gene expression.
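
The distinction between synonymous and nonsynonymous substitutions follows directly from the degeneracy of the genetic code, and it can be checked mechanically. The Python sketch below builds the standard genetic code table and classifies a single-codon change. The function name is hypothetical, the table covers the standard nuclear code only, and substitutions that create or remove a stop codon are simply reported as nonsynonymous.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Return (kind, ref_aa, alt_aa) for a codon change, where kind is
    'synonymous' if the encoded amino acid is unchanged and
    'nonsynonymous' otherwise ('*' denotes a stop codon)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return kind, ref_aa, alt_aa


# GAG -> GTG replaces glutamate (E) with valine (V); GAG -> GAA does not
# change the amino acid because both codons encode glutamate.
print(classify_substitution("GAG", "GTG"))
print(classify_substitution("GAG", "GAA"))
```

The first example corresponds to a glutamate-to-valine replacement of the kind seen in the E6V β-globin variant discussed in this chapter. A classification like this says nothing about splicing or translation efficiency, which is why synonymous changes can still be functional.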

SNPs contribute to susceptibility for common diseases and developmental anomalies, and polymorphic alleles have been identified that increase the risk of common disorders including neural tube defects, cardiovascular disease, cancers, hypertension, and obesity (39). SNPs also influence physiologic responses to environmental exposures including diet (43), pharmaceuticals (44), pathogens, and toxins (25), and therefore many SNPs have diagnostic value. High-density human SNP maps facilitate the identification of disease risk alleles through gene mapping studies of complex disease, including low-penetrant alleles that make relatively small contributions to the initiation and/or progression of the disorder.

Posted Jul 27, 2016 in PUBLIC HEALTH AND EPIDEMIOLOGY