Genetic influences on addictive substance use vary across developmental stages of life. When an individual initiates substance use (i.e., experiments with drugs), environmental factors have a greater impact on his or her substance use patterns. Access to drugs, peer pressure, and socioeconomic factors are all crucial determinants of an individual’s substance use patterns. Environmental factors are especially important in adolescents, as more of their activities are monitored by authority figures (e.g., parents). As an individual moves along the trajectory of continued heavy use, genetic influences become more prominent, and individual differences can be explained by the unique environmental conditions that interact with genetic factors. Unlike other complex psychiatric disorders, substance use disorders (SUDs) require exposure to and ingestion of the substance as an obligatory environmental component for the development of the disorder and related phenotypes. Since the observation that a family history of substance use is a crucial component in developing SUDs, identifying the specific genetic loci that contribute to their moderate to high heritability has become the goal of many analyses in the addiction genetics field. This chapter discusses basic concepts utilized in the quest to find genes for SUD vulnerability and the most replicated findings to date, with an emphasis on biological relevance and dynamic changes of gene expression on exposure to the abused substances.
Heritability (h²) of SUDs
Heritability Based on Family, Adoption, and Twin Genetic Studies
The first empirical evidence for a genetic basis of SUDs comes from family, adoption, and twin studies carried out in the pre-genomic era. The traditional family studies were observational studies that reported familial aggregation of addictive phenotypes. First-degree relatives of individuals with an SUD were reported to be at twofold or greater risk of developing a substance use problem compared with relatives of non–drug-dependent individuals. These observational studies were, however, not designed to examine whether the familial clustering of addictive phenotypes was due to the environment, genes, or their interaction. Adoption and twin study designs were employed to tease apart genetic from environmental effects on addiction vulnerability.
The adoption study design compares the similarity in SUDs or substance use patterns between adopted children and their biological parents with the similarity between adopted children and their adoptive parents, or compares adopted sibling pairs with biological sibling pairs. Several well-powered studies conducted in the United States and Europe in the 1970s and 1980s showed that the risk for developing an alcohol use disorder (AUD) was much greater in the offspring of alcoholics, even when the children were raised by nonalcoholic foster parents. The risk of developing an AUD was shown to be about four times greater in sons of alcoholics than in sons of nonalcoholics when both were adopted by nonalcoholic foster parents. Furthermore, rearing by an alcoholic parent had a greater influence on alcohol abuse but not on alcohol dependence in the offspring, which suggested a strong genetic influence on dependence as a phenotype. Similarly, concordance for nicotine dependence was reported to be greater between biological siblings than between adopted siblings, and between sons and their nicotine-dependent biological mothers. The age at adoption is an important confounder of genetic effects in adoption studies, as early environmental exposures can lead to overestimation of genetic influences.
Twin studies compare agreement in behavior between monozygotic (MZ, or identical) twins, who are genetically identical, and dizygotic (DZ, or fraternal) twins, who share on average 50% of their genetic makeup. The term concordant is used if both twins engage in the same behavior (e.g., they both drink heavily). A higher rate of concordance in MZ than in DZ twins suggests that genetics likely contributes to addiction vulnerability in addition to environmental factors. Heritability of a phenotype is estimated statistically by modeling the percentage of variation in the phenotype that is explained by genes (heritability), experiences shared by family members (shared environment), and experiences unique to the individual (nonshared environment). Twin studies assume that MZ and DZ twin pairs are exposed to equally similar environmental influences on their substance use behavior. If MZ twins are exposed to more similar environments than DZ twins are, twin studies provide inflated estimates of genetic influences on the phenotype.
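The variance partition described above can be illustrated with Falconer's classical shortcut, which estimates the three components directly from the MZ and DZ twin correlations. This is only a minimal sketch: the studies discussed in this chapter generally fit formal statistical models rather than this shortcut, and the function name and the correlations used below are hypothetical values chosen for illustration.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Falconer's approximation of the variance components of a trait.

    r_mz, r_dz: phenotypic correlations within MZ and DZ twin pairs.
    Assumes additive genetic effects and equal environments for MZ and DZ pairs.
    """
    h2 = 2.0 * (r_mz - r_dz)   # heritability (additive genetic variance)
    c2 = 2.0 * r_dz - r_mz     # shared (common) environment
    e2 = 1.0 - r_mz            # nonshared environment plus measurement error
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical twin correlations, not taken from any study cited here.
estimates = falconer_ace(r_mz=0.60, r_dz=0.35)
for component, value in estimates.items():
    print(f"{component} = {value:.2f}")
```

If MZ pairs share more similar environments than DZ pairs, r_mz is inflated relative to r_dz, and the formula for h2 shows directly why the heritability estimate is inflated as well.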
The heritability estimates from family, adoption, and twin studies for addictive substances range from 40% to 70%, with the lowest estimates for hallucinogens and the highest for cocaine addiction. In addition to addiction, substance initiation and heaviness of use also are heritable phenotypes. Some studies have suggested gender and race differences in heritability of SUDs, but these findings have not replicated consistently across studies. Notably, adoption and early twin studies are suggestive of genetic effects but do not implicate any specific genetic loci for addiction vulnerability. Furthermore, data from these large twin registries also can be used to compare how one twin’s dependence on a substance influences his/her co-twin becoming dependent on a different class of substances. In addition, there is only a modest amount of family data available to compare concordance in first- versus second-degree relatives. The existing evidence does not, however, support less concordance in second-degree relatives than we would anticipate based on the observed concordance in first-degree relatives.
SNP-Based Heritability (h²SNP)
Estimation of heritability in traditional family, adoption, and twin studies relied on data from closely related individuals. Recent developments in the molecular genetics field allow estimation of heritability in unrelated individuals by using the variance explained by all single nucleotide polymorphisms (SNPs) on a genome-wide association (GWA) array, defined as h²SNP. The definition of h²SNP is now extended to include variance explained by any set of SNPs, whether they are a set of candidate SNPs or all SNPs from whole-genome sequencing (WGS). Generally, heritability estimates captured by common SNPs (frequency >1%) for many substances are much lower than their initial estimates from classical genetic studies. A comparison of published twin study–based and SNP-based estimates of heritability for the five most widely used addictive substances in the United States (excluding prescription drugs) and related phenotypes is presented in Table 12.1. Possible reasons for the discrepancies between heritability estimates derived from twin and SNP-based methodologies are discussed further in the section Missing Heritability.
|SUDs and Related Phenotypes|h² [references]|h²SNP [references]|
|Alcohol use disorders| | |
|Nicotine use disorders| | |
|Number of cigarettes per day (CPD)| | |
|Cannabis use disorders| | |
|Opiate use disorders| | |
|Cocaine use disorders| | |
|>80 a ,|
Genetic Architecture of Addiction Vulnerability
Based on molecular genetic studies from the past few decades, we now know that the genetic architecture of vulnerability to developing an SUD, whether the addictive substance is legal or illegal, is polygenic: it is influenced by variants in many individual genes, each of which contributes a modest amount to the overall phenotypic variability. Most of the known genetic variations increase the risk for development, progression, and severity of SUDs. Although a few of the risk or protective genetic influences are specific to one class of substance, most of them influence neurobiological mechanisms common to SUDs regardless of the class of abused substance.
Mapping Genetic Loci Influencing Vulnerability to SUDs
Mapping of specific genetic loci that correlate with phenotypes/traits began with mapping DNA markers to chromosomes in affected family members of extensive pedigrees. This locus-driven linkage mapping approach identified chromosomal regions or loci containing many genes that cosegregated with several SUDs.
The first linkage analysis in AUD was carried out by the Collaborative Study on the Genetics of Alcoholism (COGA). This multisite study initially enrolled 105 nuclear and multigenerational families with 987 individuals. Findings of the first genome-wide linkage scan identified chromosomes 1 and 7 as conferring risk and chromosome 2 as being suggestive of protective effects for developing Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) Alcohol Dependence. A follow-up study confirmed these loci and provided evidence of additional linkage to chromosome 3. The linkage locus on chromosome 4p harbors the GABRA2/GABRB1 cluster and was detected in early linkage analyses of both Southwestern American Indian and Caucasian populations. Similarly, strong linkage signals were detected for nicotine addiction on chromosome 9q22, which harbors GABBR2. Many reviews are available on findings from other well-powered linkage analyses to identify chromosomal locations associated with vulnerability to develop nicotine, opioid, cocaine, and cannabis use disorders. a
a References 1, 22, 31, 43, 68, 73, 86.
The next stage of genetic mapping is highlighted by the search to identify specific alleles associated with SUDs. Genetic association studies are a form of linkage analysis based on alleles, rather than on loci that consist of multiple genes. Four different association approaches have been used to identify specific alleles: (1) fine mapping of chromosomal loci from linkage analyses; (2) hypothesis-driven candidate gene approaches; (3) genome-wide association studies (GWAS) using SNP arrays; and (4) GWAS based on DNA sequencing. A limited number of large studies were performed with samples collected from individuals with SUDs and their family members. The majority of association studies were performed with unrelated individuals from racially diverse populations with and without a diagnosis of SUDs. The statistical power to detect an association between a specific genetic variant and a complex phenotype such as SUD depends on: (1) the effect size of the genetic variant (mutation) affecting some aspect of SUD; (2) the frequency with which the phenotype-affecting variant segregates in the studied population; (3) the number of such phenotype-affecting variants detected in the studied population; (4) the studied sample size; (5) heterogeneity of the trait being studied; and (6) coverage of the genetic panel used in the GWAS or candidate-based analyses to screen for variants.
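How effect size, allele frequency, and sample size jointly determine power can be made concrete with a rough calculation. The sketch below assumes a standardized quantitative trait (for example, a heaviness-of-use measure), a single additive variant, and a normal approximation to the test statistic; the function name and all numbers are hypothetical and are not drawn from any study discussed here.

```python
from statistics import NormalDist


def gwas_power(n: int, maf: float, beta: float, alpha: float = 5e-8) -> float:
    """Approximate power of a two-sided additive test for one variant.

    n:     number of unrelated individuals
    maf:   minor allele frequency of the variant
    beta:  effect per allele, in standard deviations of the trait
    alpha: significance threshold
    """
    # Fraction of trait variance explained by the variant (additive model).
    variance_explained = 2.0 * maf * (1.0 - maf) * beta ** 2
    # Expected value of the z statistic under the alternative hypothesis.
    expected_z = (n * variance_explained) ** 0.5
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    return (standard_normal.cdf(expected_z - z_crit)
            + standard_normal.cdf(-expected_z - z_crit))


# Hypothetical scenarios: same effect size, varying sample size and frequency.
for n, maf in [(2_000, 0.30), (20_000, 0.30), (20_000, 0.01)]:
    print(f"n={n:>6}, maf={maf:.2f}: power={gwas_power(n, maf, beta=0.05):.4f}")
```

Under these assumptions, a tenfold increase in sample size raises power substantially, while the same per-allele effect carried by a rare variant remains almost undetectable, which is one way to see why items (1), (2), and (4) above interact.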
Candidate Gene Analyses
Hypothesis-driven small-to-medium scale candidate gene analyses have been reported on a wide range of SUD-related phenotypes and analyzed in populations of different racial and geographic origins. Candidate gene analyses have now covered genes within almost all of the neurotransmitter systems, including γ-aminobutyric acid (GABA), serotonin, dopamine, and glutamate. Other candidate analyses have looked at substance-specific metabolic pathways, and neuroimmune, neuroendocrine, and cell-adhesion pathways.
Genome-Wide Association Results for Addiction
All analyses exploring genotype-phenotype associations with genetic data covering the entire genome can essentially be termed genome-wide association studies (GWAS), irrespective of whether the genotypes were acquired by sequencing or by array-based technology. The application of GWAS to SUDs is now at least 10 years old. GWAS was initially designed as an experimental method to identify SNPs spanning the entire genome that contribute to complex disorders, especially those with a polygenic basis, that is, those arising from effects at many gene loci, each with modest effects exerted through interactions with environmental elements. To date, nearly 10,000 robust SNP-trait associations that reach the genome-wide statistical significance level of 5×10⁻⁸ have been discovered by GWAS for complex traits across all areas of biomedical science.
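The genome-wide significance level is conventionally motivated as a Bonferroni correction of a 0.05 family-wise error rate for roughly one million independent common-variant tests. The short sketch below reproduces that arithmetic; the count of one million tests is the usual rule-of-thumb assumption rather than a figure taken from this chapter, and the function names are illustrative.

```python
import math


def bonferroni_threshold(family_wise_alpha: float, n_independent_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return family_wise_alpha / n_independent_tests


def is_genome_wide_significant(p_value: float, threshold: float) -> bool:
    """True when a single SNP's p-value passes the corrected threshold."""
    return p_value < threshold


# Assumed: about one million independent common variants in the genome.
threshold = bonferroni_threshold(0.05, 1_000_000)
print(f"threshold = {threshold:.1e}")
print(is_genome_wide_significant(3e-9, threshold))
print(is_genome_wide_significant(1e-6, threshold))
```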
SNP Array-Based GWAS
Early GWAS in SUD focused mainly on SNP associations with the DSM-IV diagnosis of substance dependence as a trait. These early array-based GWAS compared SNP allele frequencies between individuals who met any three of the seven DSM-IV criteria for substance dependence and those who did not. The lack of precision of this phenotype is now viewed as one of the reasons for the disappointing results of the early GWAS era in addiction research. As the field of addiction genetics evolved, the focus has shifted toward quantitative and qualitative intermediate phenotypes, or endophenotypes, that constitute the broader phenotype of dependence. These secondary endophenotypes include the age at initiation of substance use and the degree of substance use, such as cigarettes per day or standard drinks per day. Individual studies have analyzed arrays consisting of about 2 million SNPs in samples of up to 20,000 individuals from Asian, African, and European populations.
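The case-control comparison used in these early studies can be sketched as a basic allelic association test on a 2×2 table of allele counts. This is a minimal illustration, not the pipeline of any study cited here: it ignores covariates and population structure, and the counts are invented.

```python
import math


def allelic_association(case_risk: int, case_other: int,
                        control_risk: int, control_other: int):
    """Chi-square test (1 df) and odds ratio for a 2x2 table of allele counts."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    total = a + b + c + d
    chi2 = total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts: 1,000 cases and 1,000 controls (2,000 alleles each).
chi2, p_value, odds_ratio = allelic_association(600, 1400, 500, 1500)
print(f"chi2={chi2:.2f}, p={p_value:.2e}, OR={odds_ratio:.2f}")
```

With these made-up counts the association is nominally significant but far from the genome-wide threshold, which mirrors the situation of many modest-effect variants in early, smaller samples.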
To date, GWAS, as well as smaller candidate genotyping analyses, rely on the correlation structure, that is, the linkage disequilibrium (LD), that exists between the variants in the human genome that possess a causal effect on the trait and the genotyped variants in an experimental panel. This is one of the reasons that commercially available SNP arrays are not yet powerful enough to capture the effects of causal rare variants. Apart from these conceptual issues, sample sizes too small to detect rare genetic effects and other technical limitations of SNP arrays have contributed to a lack of variant discovery in these studies. A number of excellent articles discuss the limitations of early GWAS in depth. A commonly used strategy to recover some of the information missed by SNP array genotyping is to statistically infer (i.e., impute) the ungenotyped variants from haplotypes observed in a fully sequenced reference panel. Reference panels for European, African, and Asian ancestry human genomes are publicly available through large-scale population genome sequencing projects such as the 1000 Genomes Project, the International HapMap Project, and the Personal Genome Project Korea (opengenome.net/).
The main difference between the relatively cheaper SNP array–based analyses and whole-genome sequencing (WGS)–based analyses is that WGS provides much denser coverage of variation in the genome, from very rare to common SNPs, and also captures other DNA variations that are several base pairs in length. Sequencing of large-scale samples is new to the addiction field, and the resulting data may help to account for the missing heritability.
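The correlation structure that array-based studies depend on is usually summarized by the LD coefficient D and the squared correlation r² between two biallelic variants. The sketch below computes both from phased haplotype counts; the function name and the counts are hypothetical and serve only to show the two extremes.

```python
def linkage_disequilibrium(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, r_squared) for two biallelic loci from haplotype counts.

    n_AB counts haplotypes carrying allele A at locus 1 and allele B at locus 2,
    and so on for the other three combinations.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B
    denominator = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    if denominator == 0.0:
        raise ValueError("both loci must be polymorphic in the sample")
    return d, d * d / denominator


# A genotyped SNP that perfectly tags a causal variant (hypothetical counts).
print(linkage_disequilibrium(50, 0, 0, 50))
# Two variants that segregate independently (hypothetical counts).
print(linkage_disequilibrium(25, 25, 25, 25))
```

When r² between a genotyped SNP and a causal variant is low, as is typical for rare causal variants tagged by common array SNPs, the association signal at the genotyped SNP is correspondingly weakened.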
Interactions Between Genetic and Environmental Factors
Analyses of twin studies for vulnerability to SUDs do not account for the possibility of large interactions between genetic and environmental effects (G×E interactions). Such large interactions may reduce the additive genetic and environmental contributions to addiction vulnerabilities. There are several types of G×E effects described in the literature. One classification groups gene-environment influences into G×E correlations versus G×E interactions. G×E correlations (r) occur when the genotype correlates with the probability of exposure to an environmental factor that influences a disease phenotype. G×E interactions, in contrast, are defined as occurring when the effect of the environmental exposure on an outcome is modified by genotype. The serotonin transporter (SLC6A4) variants that contribute to interindividual differences in stress resilience are an example of a G×E interaction.
Another classification is passive versus active versus reactive G×E correlation. A passive G×E correlation occurs when parents transmit both genetic and environmental influences on a trait. Active G×E correlation occurs when subjects of a certain genotype actively select environments that are correlated with that genotype. Reactive G×E correlation occurs when an individual’s genotype evokes different reactions from the environment. Small values for c², the influence of common environments shared by members of sib pairs, appear to provide evidence against passive G×E correlations. Active and reactive G×E correlations remain possible.
Large interactions between genetic and environmental components would likely lead to differences in estimates of heritability from samples obtained in different environments and to differences in molecular genetic findings in individuals from different environments. As we have noted, data from studies of twins who were sampled from a number of different environments are nevertheless similar. Such convergence supports, at most, relatively modest G×E interactions in addiction vulnerability. Modest G×E influences also are consistent with GWA molecular genetic results that identify substantial overlaps between the molecular genetics of vulnerability to dependence on illegal substances in samples from substantially different environments, such as the United States and Asia (see subsequent text).
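The definition of a G×E interaction, in which genotype modifies the effect of an exposure, can be expressed as a simple contrast of group means: the exposure effect in carriers minus the exposure effect in noncarriers. The sketch below assumes a binary genotype group and a binary exposure; the outcome values are invented and do not come from any study.

```python
def interaction_contrast(cell_means: dict) -> float:
    """Difference between the exposure effect in carriers and in noncarriers.

    cell_means maps (genotype, exposure) to the mean outcome in that group,
    with genotype and exposure each coded 0 (absent) or 1 (present).
    A value of 0 means the exposure effect does not depend on genotype.
    """
    effect_in_carriers = cell_means[(1, 1)] - cell_means[(1, 0)]
    effect_in_noncarriers = cell_means[(0, 1)] - cell_means[(0, 0)]
    return effect_in_carriers - effect_in_noncarriers


# Hypothetical mean outcome scores by genotype group and stress exposure.
with_interaction = {(0, 0): 2.0, (0, 1): 3.0, (1, 0): 2.5, (1, 1): 5.0}
purely_additive = {(0, 0): 2.0, (0, 1): 3.0, (1, 0): 2.5, (1, 1): 3.5}

print(interaction_contrast(with_interaction))
print(interaction_contrast(purely_additive))
```

A G×E correlation, by contrast, would appear as unequal numbers of carriers and noncarriers among the exposed, not as a nonzero value of this contrast.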
Missing Heritability
The concept of missing heritability implies that closer scrutiny of DNA sequence variation in thousands more people is required to unearth more associations with addiction phenotypes. The common variant associations detected in large-scale genotype-phenotype association studies have failed to explain all of the heritability of addiction phenotypes. Many studies in the post-GWAS era have embarked on exploring the missing heritability using various methodologies. These include (1) screening for rare variations of the DNA sequence; (2) polygenic risk score assessments; and (3) identifying epistatic effects.
Rare Variations of the DNA Sequence
Several rare variants also explain significant fractions of the genetic vulnerability to addiction and can modulate its effects. Advancements in genome sequencing have enabled the discovery of thousands of rare variations. Although only a few individuals are necessary to discover novel genetic variations, thousands of individuals are required to establish associations between rare variants and complex phenotypes such as addiction, which present with high interindividual variability. In fact, many more individuals are required to establish associations for rare variants than for common genetic variations. Even with the falling costs of genome sequencing, it is still not cost-effective to sequence samples large enough to detect rare variants associated with common phenotypes. Rare-variant genotyping chips are available as cheaper alternatives. A recent approach adopted in other fields that study common phenotypes is to deconstruct the phenotypes into relatively homogeneous subgroups and explore rare variations within these extreme phenotype subcategories.
Assessing Polygenic Risk Scores (PRS)
Polygenic risk scores (PRS) sum the allelic effects of all variants within a gene or a genetic locus for a given trait, typically weighted by their estimated effect sizes and including variants whose individual effects do not reach statistical significance. PRS is a particularly strong method for identifying unique environmental conditions under which the collective genetic effects are stronger. Currently, PRS has been applied to only a few SUD studies. One of these studies showed that both smoking and PRS increased with an increasing number of traumatic events. Another study showed that lower parental knowledge was associated with low PRS and higher alcohol consumption.
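In its simplest form, a polygenic risk score is a weighted sum: each variant's risk-allele count (0, 1, or 2) is multiplied by an effect-size weight taken from a discovery study, and the products are added up. The sketch below assumes this basic additive form; the variant identifiers, weights, and genotypes are hypothetical placeholders, and real analyses add steps such as LD pruning and p-value thresholding that are not shown.

```python
def polygenic_risk_score(weights: dict, dosages: dict) -> float:
    """Weighted sum of risk-allele dosages for one individual.

    weights: variant id -> per-allele effect size from a discovery sample
    dosages: variant id -> number of risk alleles carried (0, 1, or 2)
    Variants without a genotype for this individual are skipped.
    """
    score = 0.0
    for variant, weight in weights.items():
        if variant in dosages:
            score += weight * dosages[variant]
    return score


# Hypothetical effect sizes and genotypes (placeholder identifiers).
weights = {"variant_A": 0.10, "variant_B": -0.05, "variant_C": 0.02}
person = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

print(f"PRS = {polygenic_risk_score(weights, person):.2f}")
```

Because the score aggregates many small effects into a single number per person, it can then be tested for interaction with a measured environmental exposure in the same way as a single genotype.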
Findings From SUD Genetic Studies
Genetic Contributions Specific to Substances
Notably, the effect sizes of genetic variants on phenotypic variation are modest, as would be expected for any complex disorder. Several large-scale consortia have performed meta-analyses of GWAS data on the vulnerability to develop SUD and related substance use behaviors.
Genetics of alcohol use: Variations within genes for the enzymes involved with alcohol’s metabolism are, by far, the most consistently replicated genetic associations with alcoholism phenotypes. Indeed, they have shown the largest effect size for the genetic contribution to alcohol addiction and use. Large-scale early linkage studies first reported a risk locus on chromosome 4q. Successive analyses found this region to harbor the alcohol dehydrogenase ( ADH ) gene cluster on chromosome 4q23.
Genetics of smoking-related traits: The strongest signals for smoking-related traits are detected in genes coding for the nicotinic receptor subunits. Meta-analyses of GWAS have identified SNPs in the CHRNA3, CHRNA5, CHRNA6, CHRNB3, and CHRNB4 genes. Variants in the cytochrome P450 (CYP) genes CYP2A6–CYP2B6, which code for enzymes involved in the nicotine metabolism pathway, also were reported to confer risk for smoking.
Genetics of cannabis use–related phenotypes: Cannabis is the most widely used illicit substance worldwide. Frequent use can progress to addiction, with adverse physical, psychological, and social outcomes. The frequency of cannabis use is a heritable trait. The International Cannabis Consortium was established with the goal of identifying specific genetic variants that conferred risk for increased use. That Consortium performed meta-analyses of genome-wide data from over 30,000 individuals. Even with the large sample size, researchers failed to identify any single SNP associated with cannabis use at a genome-wide significance level. Gene-based tests did, however, reveal four genes (NCAM1, CADM2, KCNT2, and SCOC) that were associated with lifetime cannabis use. These genes are not unique to cannabis use. Previous studies have found that they also are associated with alcohol and nicotine use disorders as well as conduct disorders. For example, the association between neural cell adhesion molecule 1 (NCAM1) and nicotine dependence has been reported by many research groups.
Many of the genetic influences on addiction vulnerability appear common to dependence on multiple different substances, although others appear to be substance-specific. These features suggest that many of the genetic influences on vulnerability to addiction are more likely to be related to underlying brain mechanisms that are common to addictions, and that fewer may be specific to the primary pharmacological properties of specific drugs, such as aspects of absorption, distribution, metabolism, or excretion.
Elsewhere we have suggested levels of analysis for pharmacogenomics and pharmacogenetics: (1) primary pharmacogenomics that describe the genetics of individual differences in the absorption, distribution, metabolism, and/or excretion of a drug; (2) secondary pharmacogenomics that describe individual differences in drug targets, such as the G-protein-coupled receptors, transporters, and ligand-gated ion channels that are the primary targets of opiates, psychostimulants, and barbiturates, respectively; and (3) higher order pharmacogenomics that describe individual differences in postreceptor drug responses. Such postreceptor drug responses are more likely to be common to the actions of abused substances that come from several different chemical classes and act at distinct primary receptor or transporter sites in the brain. Based on the currently available twin data, we posit that much of the human genetics of addiction vulnerability represents higher order pharmacogenomics.
Genetic Contributions Common to Behavioral Phenotypes Underlying SUDs
There are a few careful studies of the ways in which most human addiction vulnerabilities move through families (e.g., segregation analyses). No such study indicates a major gene effect on addiction vulnerability in most current populations. There is an exception: the flushing syndrome, whereby variants at the ALDH loci in Asian individuals do provide genes of major effect in this population. Individuals with these gene variants are at lower risk of becoming dependent on alcohol compared with individuals with other genotypes in the Chinese, Korean, Japanese, and other populations. Homozygous ALDH2 ∗ 2 individuals are strongly protected from alcohol dependence. Thus, this locus provides a good example of primary pharmacogenomics, although in a restricted population.
Quantity-frequency data for smoking also provide evidence for a replicable secondary pharmacogenomic effect of moderate magnitude. Markers in the chromosome 15 gene cluster that encodes the α3, α5, and β4 nicotinic acetylcholine receptors display different allelic frequencies in heavy versus light smokers in each of several studies. This chromosome 15 locus is likely to provide a good example of secondary pharmacogenomics, since it has not been associated reproducibly with dependence on other substances.
Linkage-based analyses for addiction vulnerabilities would be expected to reproducibly identify many of the genes whose variants exerted major influences on human addiction vulnerability. Existing linkage data for human dependence on alcohol, nicotine, and a number of other substances do, however, fail to provide any highly reproducible results that would support any major gene locus. These results appear to point to a negative conclusion: that no locus individually contributes a large fraction of the vulnerability to dependence on any addictive substance in most individuals. There are caveats. Many of these data come from subjects with largely European ethnic/racial backgrounds. Rare variants might well contribute disproportionate amounts to the vulnerability of individuals within a relatively few pedigrees. Nevertheless, as with many complex human disorders in which initial hopes for an easier (e.g., oligogenic, caused by variants in only a few genes) underlying genetic architecture supported the use of linkage approaches, the linkage peaks that are identified in each individual study may be more likely to arise on other bases when the underlying architecture is, in fact, polygenic.
Phenotypes That Might Have Contributed to Balancing Selection of Addiction-Related Alleles
It is tempting to speculate about the phenotypes that may have provided the basis for balancing or other selective processes for the common allelic variants that are observed in several current populations and influence vulnerability to substance dependence in current environments. Heritable, interrelated influences on cognitive abilities and brain volumes, especially of the frontal lobe, provide interesting examples of such phenotypes. Both of these phenotypes are substantially heritable in data from twin studies. The heritability of both of these phenotypes is substantially correlated in twin study data. Samples of substance dependent individuals, although of modest size, reproducibly display smaller frontal lobes and poorer performance on tests of cognitive function. It is easy to see how cognitive function might have provided a selective pressure. When we consider the substantial mortality that cephalopelvic disproportion is likely to have caused in the environments in which our distant ancestors lived, it is easy to develop a plausible balancing selection hypothesis.
We have identified substantial, reproducible data for both of these phenotypes from GWA datasets, and identified large overlaps between the genes identified on the basis of cognitive abilities versus the genes identified on the basis of frontal lobe brain volumes, as expected.
Of interest, there also is significant overlap, more than expected by chance, between these sets of genes and those identified in comparing addicted versus control samples.
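One generic way to ask whether two gene sets overlap more than expected by chance is a hypergeometric test: given the total number of genes considered and the sizes of the two sets, it gives the probability of seeing at least the observed number of shared genes if the sets were drawn independently. The sketch below illustrates that calculation only; it is not necessarily the method used to obtain the overlaps described here, and every number in it is hypothetical.

```python
from math import comb


def overlap_p_value(n_genes: int, size_a: int, size_b: int, overlap: int) -> float:
    """Probability of at least `overlap` shared genes between two random sets.

    n_genes: total number of genes that could have been identified
    size_a, size_b: sizes of the two gene sets being compared
    overlap: observed number of genes present in both sets
    """
    largest_possible = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(n_genes - size_a, size_b - k)
        for k in range(overlap, largest_possible + 1)
    )
    return tail / comb(n_genes, size_b)


# Hypothetical gene-set sizes; not taken from any dataset described here.
n_genes, size_a, size_b, observed = 20_000, 400, 300, 20
expected = size_a * size_b / n_genes
print(f"expected overlap by chance: {expected:.1f}")
print(f"p-value for observing {observed} or more: "
      f"{overlap_p_value(n_genes, size_a, size_b, observed):.2e}")
```

This simple test treats all genes as equally likely to be identified; gene size and marker density affect that probability in GWA data, so permutation or Monte Carlo approaches are often preferred in practice.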
Personality traits that display substantial evidence for heritability are also found in substance-dependent individuals at rates different from those in the general population. A GWA dataset for the most addiction associated personality feature, neuroticism, displays highly significant overlap with data for substance dependence as well.
Psychiatric and Neurologic Comorbidity
Data for the highly heritable psychiatric diagnosis, bipolar disorder, are now available from four largely independent samples of European ancestry. Our clustering analyses for these datasets provide ample evidence of overlap among the results for bipolar disorder across samples. Of interest, these data also overlap with the molecular genetic results for substance dependence to extents greater than chance.
Success in Smoking Cessation
Studies involving twins support the idea that the ability to successfully quit use of at least one of the major addictive substances, tobacco, is substantially heritable. Much of this heritability apparently is not the same as the heritability for vulnerability to substance dependence, although some does overlap. We have recently reported GWA analyses of three datasets of smokers who were successful versus unsuccessful in quitting smoking in the context of a clinical trial. These results display gratifying convergence with each other and more modest, but still significant, overlap with results for vulnerability to becoming substance dependent, as would have been predicted by the results of classical genetic studies.
Failure of Control Experiments to Support Alternative Hypotheses for the Observed Genome-Wide Association Results
There also is no evidence that many of the clustered, reproducibly positive SNPs identified in the data cited earlier arise from the alternative explanations tested in a number of control comparisons, which included controls for occult racial/ethnic differences and for assay noise within each comparison group.