GWAS for Drug Discovery


| Drug name | Drug indication(s) [a] | Effect allele(s) (Reference) | Method of allele identification |
| --- | --- | --- | --- |
| Abacavir | HIV-1 (U.S. Food and Drug Administration et al. 2013) | Multiorgan clinical syndrome: HLA-B*5701 (Mallal et al. 2002) | Candidate gene |
| ACE inhibitors | Hypertension, acute myocardial infarction, heart failure, myocardial infarction prophylaxis, postmyocardial infarction, reduction of cardiovascular mortality, stroke prophylaxis [b,c] (Clinical Pharmacology Database 2010a, b, 2012, 2013a; Byrd et al. 2008; Pare et al. 2013) | Angioedema risk: MME rs989692 [d] (Pare et al. 2013) | GWAS [e], candidate gene |
| Allopurinol | Gout, hyperuricemia, nephrolithiasis [b] (Clinical Pharmacology Database 2011) | Serious dermatological reaction: HLA-B*5801 (Hung et al. 2005) | Candidate gene |
| Atomoxetine | ADHD (U.S. Food and Drug Administration, Center for Drug Evaluation and Research 2014b) | Increased response (poor metabolizers): CYP2D6 [f] (Ring et al. 2002); Response: NET/SLC6A2 [g] (Ramoz et al. 2009) | NET/SLC6A2: candidate gene [g] |
| Carbamazepine | Epilepsy, trigeminal neuralgia | Serious dermatological reaction: HLA-B*1502 (Man et al. 2007) | Candidate gene |
| HMG-CoA reductase inhibitor | Hypercholesterolemia, hyperlipoproteinemia, hypertriglyceridemia, myocardial infarction prophylaxis and stroke prophylaxis [b,h] (Clinical Pharmacology Database 2013b) | Myalgia risk: SLCO1B1/rs4363657 (Gelissen and McLachlan 2014; The SEARCH Collaborative Group 2008) | GWAS |
| Irinotecan | Colorectal metastatic carcinoma (U.S. Food and Drug Administration, Center for Drug Evaluation and Research 2006) | Neutropenia risk: UGT1A1*28 (Ando et al. 1998) | Candidate gene |
| Phenytoin | Seizures (U.S. Food and Drug Administration, Center for Drug Evaluation and Research 2014a) | Serious dermatological reaction: HLA-B*1502 (Chung et al. 2004; Man et al. 2007) | Candidate gene |
| Warfarin | Coagulation prevention | Bleeding risk: CYP2C9*2 and *3 and VKORC1 1639 (Aithal et al. 1999; D’Andrea et al. 2005) | CYP2C9: PCR genotyping [i]; VKORC1: positional cloning, candidate gene, GWAS |


[a] Indication refers to the FDA label except for the drug classes HMG-CoA reductase inhibitor and ACE inhibitors

[b] Information unavailable from the FDA website. Obtained from the Clinical Pharmacology Database

[c] Based on lisinopril, enalapril, fosinopril, and ramipril. These ACE inhibitors were used by subjects in the Nashville, Tennessee study. Also, according to Pare et al. (2013), ramipril was used in the ONTARGET trial

[d] MME polymorphism was identified via candidate gene analysis

[e] No association of genome-wide significance has been found

[f] Might not be clinically significant because low-affinity enzymes might take over metabolism when 2D6 activity is compromised

[g] Needs further assessment

[h] Indications specific to simvastatin, since the SLCO1B1-simvastatin association is currently the most robust

[i] By the time genotyping efforts were undertaken for CYP2C9, warfarin had been in use for decades. Once alleles were identified for patients in the clinical setting, researchers matched individuals’ alleles to the warfarin dose at which the respective patient had been stabilized





7.2 GWAS Background: Overview of Gene Mapping


Our genes are located in a linear fashion along strands of deoxyribonucleic acid (DNA) that are divided into 23 pairs of chromosomes within the cell nucleus. This linear architecture provides the opportunity to map and identify the genes that contribute to traits such as drug response. To begin, we assess whether a trait such as drug response has a genetic component and we estimate the trait’s heritability. A significant heritability means that a fraction of the inter-individual variation in that trait is the result of variation due to genes (The 1000 Genomes Project Consortium 2010). These heritable traits are excellent candidates for gene mapping, which is designed to identify the specific genes contributing to the trait using information on gene marker locations along the chromosomes. Unfortunately, it is usually difficult to assess whether a particular response to a drug is heritable. This information would require the analysis of panels of related individuals, such as monozygotic and dizygotic twin pairs, who both receive the same pharmacologic agent. Concordance of response in those who share a greater proportion of their genes, the monozygotic twins, should be statistically greater than concordance in the dizygotic twin pairs. Such information is rarely available in an observational setting.
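As a rough illustration of the twin comparison described above, the following sketch computes pairwise concordance of drug response for monozygotic and dizygotic twin pairs. The data are invented for illustration, and the function name `pairwise_concordance` and the choice of the pairwise definition (pairs in which both twins respond, divided by pairs in which at least one responds) are assumptions rather than a method taken from this chapter.

```python
def pairwise_concordance(pairs):
    """Pairwise concordance: among twin pairs in which at least one twin
    responds, the fraction in which both twins respond.

    `pairs` is a list of (twin1_responds, twin2_responds) booleans.
    """
    both = sum(1 for a, b in pairs if a and b)
    at_least_one = sum(1 for a, b in pairs if a or b)
    if at_least_one == 0:
        raise ValueError("no pair contains a responder")
    return both / at_least_one


# Hypothetical data: 10 monozygotic (MZ) and 10 dizygotic (DZ) pairs,
# all of whom received the same drug.
mz_pairs = [(True, True)] * 6 + [(True, False)] * 2 + [(False, False)] * 2
dz_pairs = [(True, True)] * 3 + [(True, False)] * 5 + [(False, False)] * 2

mz = pairwise_concordance(mz_pairs)  # 6 of the 8 pairs with a responder
dz = pairwise_concordance(dz_pairs)  # 3 of the 8 pairs with a responder
print(f"MZ concordance = {mz:.3f}, DZ concordance = {dz:.3f}")
```

A higher concordance in the monozygotic pairs than in the dizygotic pairs is consistent with a genetic component, although a real study would test the difference statistically on a much larger sample.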

Linkage analysis is a well-established analytic tool that was used extensively from the 1960s through the mid-2000s to map the genes for heritable traits to their chromosome locations. Linkage is based on a process that has been referred to as “reverse genetics”. That is, the approach works in the reverse order of the biological model of how genes operate. While genes act in a forward fashion to produce and regulate the proteins that lead to a trait, reverse genetics starts with individuals who have been measured for the trait and uses linkage analysis, GWAS or other approaches to identify the predisposing genes. Here, the genes are identified last.

Reverse genetics became fully feasible in the 1990s, when a very extensive panel of multi-allelic markers, or genetic variations, spanning the human genome was identified. Linkage analysis is a statistical method that follows genotyped marker alleles and measured trait values within family pedigrees to identify chromosome regions where the marker alleles and trait values are aligned. Alignment is assessed statistically: we test whether marker alleles and trait values are aligned in a given chromosome region more often than one would expect by chance alone. If so, we infer that the gene leading to some aspect of the trait value is ‘linked’ to a marker with a known location along the chromosomes. If the whole genome is analyzed for linkage with the trait, the approach is referred to as a full genome linkage scan. If only specific genes are tested, they are referred to as “candidate genes”. In summary, in the regions exhibiting significant linkage, we infer that the gene affecting the trait is close to the linked marker, and reverse genetics is achieved. With linkage, the resolution at the locus is usually quite poor, as many genes can reside within a linked region. Nevertheless, a statistically significant linkage result limits the search for the predisposing gene to those in the linked region. Using the advances made by the Human Genome Project, reverse genetics has been very effective in identifying the genes behind rare single-gene traits, which result from mutations in a single gene or a few different genes.

A major change in gene mapping occurred during the last 10 years, with the development of the essential tools for successful GWAS. “Reverse genetics” is also used to conduct GWAS; however, the genotyped markers are bi-allelic, having only two versions, and are referred to as single nucleotide polymorphisms (SNPs). These markers occur more frequently and are spaced much more closely than linkage markers. The collection of SNPs on our chromosomes developed over the course of evolution. SNPs are random changes to the DNA that are passed down among humans over time. Most SNPs have no detectable effects on those who inherit them, but their proximity on the chromosomes to the changes that do predispose to traits of interest makes them valuable. This chapter is devoted to presenting the current study designs and methods for testing the GWAS SNPs to identify predisposing genes through association, which can ultimately inform pharmacogenetics.


7.3 Concepts, Designs and Statistical Methods for GWAS


We begin with a brief overview of the GWAS approach, assuming the trait under analysis is binary. For example, an individual either exhibits a particular side effect when given the drug, or does not. Individuals who exhibit this trait form the sample of cases, and a group matched for age, sex, ethnicity, and other relevant factors forms the control group. Both samples are genotyped using very dense SNP marker panels that have become available on commercial arrays. These genotyping arrays have evolved over time to contain an increasing number of SNPs, and most studies now use about one million SNPs, each having a known location along the chromosomes. Genes that contribute to the trait are identified by statistically testing the individual SNPs for an association with the trait, in this case by comparing the cases and controls. It should be emphasized that the SNPs themselves are just landmarks along the chromosomes and are not likely to predispose to developing the trait. They mark, by their proximity, the alleles that do contribute to the trait. That is, finding that a SNP is associated with the trait indicates that an allele of a gene contributing to the risk of developing the trait is likely to lie within a small chromosomal region close to that SNP. This is because the commercially available GWAS SNP panels have been designed to capitalize on an important genetic feature referred to as linkage disequilibrium (LD).
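To make the case-versus-control comparison at a single SNP concrete, here is a minimal sketch of one common choice of test, a Pearson chi-square test on a 2x2 table of allele counts. The function name `allelic_chi_square` and the counts are hypothetical, and the chapter does not state that this particular test is the one the authors use.

```python
import math


def allelic_chi_square(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of
    allele counts. Returns (chi-square statistic, p-value)."""
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    row_case, row_control = a + b, c + d
    col_minor, col_major = a + c, b + d
    if 0 in (row_case, row_control, col_minor, col_major):
        raise ValueError("every row and column needs a nonzero total")
    chi2 = n * (a * d - b * c) ** 2 / (row_case * row_control * col_minor * col_major)
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical SNP: 1000 cases and 1000 controls, so 2000 alleles per group.
chi2, p = allelic_chi_square(600, 1400, 400, 1600)
print(f"chi-square = {chi2:.2f}, p = {p:.2e}")
```

In a GWAS this test would be repeated for every SNP on the array, which is why the multiple-testing threshold matters so much.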


7.3.1 Linkage Disequilibrium Among SNPs that Are in Close Proximity


LD is the genetic feature that allows us to conduct GWAS with arrays that provide the genotypes of one million SNPs when there are an estimated 30 million across the whole population. It is ubiquitous along the chromosomes and reflects the genetic history of a population. Thus, we expect LD between the trait allele and a nearby SNP allele if the two have traveled together in close proximity on the same chromosome over time, with only a few crossovers between them. Crossovers occur when gametes are formed, and the probability of a crossover between two SNPs increases with the distance between them. If LD exists between two SNPs, we can use the genotype at one to predict the genotype at the other. The mechanism by which fewer genotyped SNPs can capture the variation of other SNPs in close proximity is referred to as “tagging.” Tagging is illustrated in Fig. 7.1, where the genotype sequence for 16 SNPs is given for 6 people, with each person’s DNA sequence represented by a row. It is clear that the SNPs are not distributed independently within individuals. There are patterns that reflect LD. For example, the first and second SNPs are in LD and they ‘tag’ each other. Therefore, only one of them has to be genotyped. If a person has an “A” at the first SNP, they will have a “G” at the second. Likewise, a “T” at the first indicates there is a “T” at the second. Similarly, SNPs 4, 5 and 6 tag each other.

Fig. 7.1 Illustration of tagging SNPs
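The tagging idea can be sketched numerically with the squared correlation r² between two bi-allelic SNPs, a widely used LD measure that the text above does not itself name. The haplotypes below are hypothetical and are not copied from Fig. 7.1. They only mimic the pattern described in the text, where an “A” at the first SNP always travels with a “G” at the second and a “T” with a “T”. The function name `ld_r_squared` is likewise an assumption.

```python
def ld_r_squared(haplotypes, i, j):
    """Squared correlation (r^2) between bi-allelic SNPs at positions i and j.

    `haplotypes` is a list of strings, one allele per SNP position.
    """
    n = len(haplotypes)
    ref_i = haplotypes[0][i]  # arbitrary reference allele at SNP i
    ref_j = haplotypes[0][j]  # arbitrary reference allele at SNP j
    p_i = sum(h[i] == ref_i for h in haplotypes) / n
    p_j = sum(h[j] == ref_j for h in haplotypes) / n
    p_ij = sum(h[i] == ref_i and h[j] == ref_j for h in haplotypes) / n
    d = p_ij - p_i * p_j  # LD coefficient D
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        raise ValueError("at least one SNP is monomorphic in this sample")
    return d * d / denom


# Hypothetical haplotypes over three SNPs: SNP 0 and SNP 1 always travel
# together (A with G, T with T); SNP 2 varies independently of SNP 0.
haplotypes = ["AGC", "AGG", "TTC", "TTG", "AGC", "AGG", "TTC", "TTG"]

print(ld_r_squared(haplotypes, 0, 1))  # perfect tagging
print(ld_r_squared(haplotypes, 0, 2))  # no LD
```

An r² of 1 means that genotyping either SNP fully determines the other, so only one of the pair needs to be on the array. An r² near 0 means the untyped SNP cannot be recovered from the typed one.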

In the next sections we describe the biological concepts applied in conducting GWAS and explain the appropriate statistical methods to identify important associations leading to enhancement of our pharmacogenetics knowledge.


7.3.2 Alleles and Minor Allele Frequencies


Alleles are the key elements in genetic variation. They differ from each other in a small genetic element and result in alternate versions of the same gene. Some alleles will change the protein product of a gene, some will affect the amount of protein product produced, and some will be neutral and have no obvious effect. Population geneticists study the frequency of alleles and make inferences about population history from them. Those interested in personalized medicine use the frequencies to identify the genes contributing to traits. One way to describe a bi-allelic locus in a population is to estimate its minor allele frequency (MAF), the frequency of the less common of the two alleles. The MAF is estimated by drawing a sample of individuals, counting the number of copies of the minor allele in the sample, and dividing by the total number of alleles at that locus. That total is two times the number of individuals in the sample because each individual’s genotype is composed of two alleles, one from the mother and one from the father.
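The MAF calculation described above can be written out as a short sketch. The genotype encoding (two-character strings such as "AG") and the function name `minor_allele_frequency` are assumptions made for illustration.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Estimate the MAF at one bi-allelic locus.

    `genotypes` is a list of two-character strings, one per individual,
    e.g. "AG" for a heterozygote.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    allele_counts = Counter(allele for g in genotypes for allele in g)
    if len(allele_counts) > 2:
        raise ValueError("locus is not bi-allelic")
    total_alleles = 2 * len(genotypes)  # two alleles per individual
    if len(allele_counts) == 1:
        return 0.0  # only one allele observed in this sample
    return min(allele_counts.values()) / total_alleles


# Hypothetical sample of five individuals: 6 copies of A, 4 copies of G.
sample = ["AA", "AG", "GG", "AA", "AG"]
print(minor_allele_frequency(sample))  # 4 minor alleles out of 10
```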


7.3.3 GWAS Study Designs and Statistical Analyses


Unlike other methods of searching for genes associated with a trait, GWAS are noteworthy for being genome-wide, producing an unbiased search that does not rely on candidate genes. To begin, the study sample is collected. Then, a microarray of SNPs is genotyped for each individual. The microarray is purchased from a company that specializes in genotyping chips, and the genotyping is usually done at a facility that specializes in that work. Most genetic centers have core facilities with the technology and expertise to generate genotypes. The current commercial genotyping chips have about one million SNPs on them. Sometimes there are specialized chips that have been designed for specific classes of disorders. Regardless of the microarray used, the methodology to clean and process the data will be the same. This is described in Sect. 7.5. Currently, there is preliminary work generating SNP genotypes using exome or whole genome sequencing, but the high cost and other data-processing considerations mean that this approach is not yet fully feasible.


7.3.3.1 Case and Control Designs for GWAS


A case and control study design is used when the outcome of interest is dichotomous. For example, a dichotomous trait in pharmacogenetics might be whether an individual responds to a drug. In a study sample of those who take the drug, those who have a positive response might be categorized as “cases”, and those who do not might be categorized as “controls”. All individuals are genotyped using the same microarray platform. There will be one million genotypes per individual, so special software that can handle this volume of data is required. The PLINK software accommodates extensive genotype data (Plink 2009). The files are compressed into a binary format, and the data cleaning and statistical analyses programmed into PLINK are run on the compressed files. The case and control samples are tested for differences in MAF over the one million SNPs, each SNP undergoing a separate statistical test. Multiple testing is a major concern, and the accepted GWAS approach is discussed in Sect. 7.4.1. The cases and controls should all be from the same ethnic group, because different ethnicities have different allele frequencies, which could lead to a large number of type I statistical errors (false positives).
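The warning about mixing ethnic groups can be illustrated with a small numerical sketch. All counts below are invented, and the population labels and the function name `pooled_maf` are hypothetical. Within each population the cases and controls have exactly the same MAF, so the SNP has no real association with the trait, yet pooling the two populations makes the case and control MAFs look very different because the cases come mostly from one population and the controls mostly from the other.

```python
# Hypothetical allele counts: (population, status) -> (minor alleles, all alleles).
counts = {
    ("pop1", "case"): (160, 1600),     # MAF 0.10
    ("pop1", "control"): (40, 400),    # MAF 0.10
    ("pop2", "case"): (200, 400),      # MAF 0.50
    ("pop2", "control"): (800, 1600),  # MAF 0.50
}


def pooled_maf(counts, status):
    """MAF for one status after pooling allele counts across populations."""
    minor = sum(m for (pop, s), (m, t) in counts.items() if s == status)
    total = sum(t for (pop, s), (m, t) in counts.items() if s == status)
    return minor / total


case_maf = pooled_maf(counts, "case")        # 360 of 2000 alleles
control_maf = pooled_maf(counts, "control")  # 840 of 2000 alleles
print(f"pooled case MAF = {case_maf:.2f}, pooled control MAF = {control_maf:.2f}")
```

The apparent difference is produced entirely by the sample composition, which is the kind of false positive that drawing cases and controls from the same ethnic group is meant to avoid.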
Posted Jul 22, 2016 in PHARMACY
