In this example of AMD, a complex disease, GWAS identified strongly associated, common SNPs that were in turn in LD with a common coding SNP in the gene; this coding SNP appears to be the functional variant involved in the disease. This discovery in turn led to the identification of other SNPs in the complement cascade and elsewhere that can also predispose to or protect against the disease. Taken together, these results give important clues to the pathogenesis of AMD and suggest that the complement pathway might be a fruitful target for novel therapies. Equally interesting is that GWAS revealed that a novel gene of unknown function, ARMS2, is also involved, thereby opening up an entirely new line of research into the pathogenesis of AMD.
Importance of Associations Discovered with GWAS
There is vigorous debate regarding the interpretation of GWAS results and their value as a tool for human genetic studies. The debate arises primarily from a misunderstanding of what an OR or RR means. It is true that many properly executed GWAS yield significant associations, but of very modest effect size (similar to the OR of 1.1 just mentioned for AMD). In fact, significant associations of ever smaller effect size have become more common because ever larger sample sizes allow detection of statistically significant genome-wide associations with ever smaller ORs or RRs. This has led to the suggestion that GWAS are of little value because the effect size of the association, as measured by OR or RR, is too small for the gene and pathway implicated by that variant to be important in the pathogenesis of the disease. This is faulty reasoning on two counts.
First, ORs are a measure of the impact of a specific allele (e.g., the CFH Tyr402His allele for AMD) on complex pathogenetic pathways, such as the alternative complement pathway of which CFH is a component. The subtlety of that impact is determined by how that allele perturbs the biological function of the gene in which it is located, and not by whether the gene harboring that allele might be important in disease pathogenesis. For example, studies of patients with several different autoimmune disorders, such as rheumatoid arthritis, systemic lupus erythematosus, and Crohn disease, reveal modest associations, but with some of the same variants. This suggests that there are common pathways leading to these distinct but related diseases and that these pathways will likely be quite illuminating in studies of their pathogenesis (see Box).
Second, even if the effect size of any one variant is small, GWAS demonstrate that many of these disorders are indeed extremely polygenic, even more polygenic than previously suspected, with thousands of variants, most of which contribute only a little (ORs between 1.01 and 1.1) to disease susceptibility by themselves but, in the aggregate, account for a substantial fraction of the observed clustering of these diseases within certain families (see Chapter 8).
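How many small effects can add up is easiest to see with a little arithmetic. The sketch below assumes a simple multiplicative (log-additive) model in which variants act independently. That model, the function name, and the numbers are illustrative assumptions and are not taken from any particular study.

```python
import math

def combined_odds_ratio(per_allele_ors, extra_risk_alleles):
    """Combine per-allele ORs under an assumed multiplicative model.

    per_allele_ors     -- OR conferred by one copy of each risk allele
    extra_risk_alleles -- how many more copies of each risk allele one
                          person carries than another (hypothetical values)
    """
    # Under the assumed model, log odds add across independent variants.
    log_odds = sum(
        count * math.log(odds_ratio)
        for odds_ratio, count in zip(per_allele_ors, extra_risk_alleles)
    )
    return math.exp(log_odds)

# Hypothetical comparison: one extra risk allele at each of 40 loci,
# each with a per-allele OR of 1.05.
example = combined_odds_ratio([1.05] * 40, [1] * 40)
print(round(example, 2))
```

Under these assumptions the combined value equals 1.05 raised to the 40th power, so alleles that are individually almost negligible can together separate individuals with markedly different odds of disease. Real variants need not combine this simply.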
Although the observation of modest effect size for most alleles found by GWAS is correct, it misses a critical and perhaps most fundamental finding of GWAS: the genetic architecture of some of the most common complex diseases studied to date may involve hundreds to thousands of loci harboring variants of small effect in many genes and pathways. These genes and pathways are important to our understanding of how complex diseases occur, even if each allele exerts only subtle effects on gene regulation or protein function and has only a modest effect on disease susceptibility on a per allele basis.
Thus GWAS remain an important human genetics research tool for dissecting the many contributions to complex disease, regardless of whether or not the individual variants found to be associated with the disease substantially raise the risk for the disease in individuals carrying those alleles (see Chapter 16). We expect that many more genetic variants responsible for complex diseases will be successfully identified by genome-wide association and that deep sequencing of the regions showing disease associations should uncover the variants or collections of variants functionally responsible for disease associations. Such findings should provide us with powerful insights and potential therapeutic targets for many of the common diseases that cause so much morbidity and mortality in the population.
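The earlier point that larger samples bring smaller ORs within reach of statistical significance can be sketched numerically. The example uses expected (not observed) allele counts, a Wald test on the log OR, an assumed risk-allele frequency of 0.30 in controls, and the conventional genome-wide threshold of 5 × 10⁻⁸. All of these choices are assumptions made for illustration.

```python
import math

# Conventional genome-wide significance threshold (an assumption here;
# the text does not state a threshold).
GENOME_WIDE_ALPHA = 5e-8

def expected_p_value(odds_ratio, control_freq, n_per_group):
    """Two-sided Wald p-value for an allelic OR, using expected counts.

    Assumes equal numbers of cases and controls, two alleles per person,
    and that the counts equal their expectations (no sampling noise).
    """
    control_odds = control_freq / (1 - control_freq)
    case_odds = odds_ratio * control_odds
    case_freq = case_odds / (1 + case_odds)

    alleles = 2 * n_per_group
    a = alleles * case_freq           # risk alleles in cases
    b = alleles * (1 - case_freq)     # other alleles in cases
    c = alleles * control_freq        # risk alleles in controls
    d = alleles * (1 - control_freq)  # other alleles in controls

    log_or = math.log((a * d) / (b * c))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = abs(log_or) / standard_error
    return math.erfc(z / math.sqrt(2))  # two-sided normal tail area

for n in (1_000, 10_000, 100_000):
    p = expected_p_value(odds_ratio=1.1, control_freq=0.30, n_per_group=n)
    print(n, p, p < GENOME_WIDE_ALPHA)
```

With the same OR of 1.1, the p-value falls steadily as the number of cases and controls grows. An association that is undetectable in a small study therefore becomes genome-wide significant in a large one, even though the underlying effect size has not changed.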
From GWAS to PheWAS
In genome-wide association studies (GWAS), one explores the genetic basis for a given phenotype, disease, or trait by searching for associations with large, unbiased collections of DNA markers from the entire genome. But can one do the reverse? Can one uncover the potential phenotypic links associated with genome variants by searching for associations with large, unbiased collections of phenotypes from the entire “phenome”? Thus far, the results of this approach appear to be highly promising.
In an approach dubbed phenome-wide association studies (PheWAS), genetic variants are tested for association, not just with a particular phenotype of interest (say, rheumatoid arthritis or systolic blood pressure above 160 mm Hg), but with all medically relevant phenotypes and laboratory values found in electronic medical records (EMRs). In this way, one can seek novel and unanticipated associations in an unbiased manner, using search algorithms, billing codes, and open text mining to query all electronic entries, which are fast becoming available for health records in many countries.
As an illustration of this approach, SNPs for a major class II HLA-DRB1 haplotype (as described in Chapter 8) were screened against over 4800 phenotypes in EMRs from over 4000 patients; this PheWAS detected association not only with multiple sclerosis (as expected from previous studies), but also with alcohol-induced cirrhosis of the liver, erythematous conditions such as rosacea, various benign neoplasms, and several dozen other phenotypes.
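The logic of such a scan, in which one variant is tested against many phenotypes with a correction for the number of tests, can be sketched as follows. The phenotype names and counts are hypothetical. The Wald test on the log OR and the Bonferroni correction are assumptions chosen for simplicity and are not necessarily the methods used in the study described above.

```python
import math

def wald_p_value(a, b, c, d):
    """Two-sided Wald p-value for the OR of a 2 x 2 table.

    a: carriers with the phenotype      b: carriers without it
    c: non-carriers with the phenotype  d: non-carriers without it
    Assumes every cell is nonzero.
    """
    log_or = math.log((a * d) / (b * c))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.erfc(abs(log_or) / standard_error / math.sqrt(2))

def phewas_scan(counts_by_phenotype, alpha=0.05):
    """Return phenotypes whose p-value passes a Bonferroni threshold."""
    threshold = alpha / len(counts_by_phenotype)  # one test per phenotype
    hits = {}
    for phenotype, (a, b, c, d) in counts_by_phenotype.items():
        p = wald_p_value(a, b, c, d)
        if p < threshold:
            hits[phenotype] = p
    return hits

# Hypothetical EMR-derived counts for 1000 carriers and 3000 non-carriers.
toy_counts = {
    "phenotype_A": (60, 940, 90, 2910),   # enriched among carriers
    "phenotype_B": (50, 950, 150, 2850),  # same frequency in both groups
    "phenotype_C": (12, 988, 35, 2965),   # nearly the same frequency
}

print(sorted(phewas_scan(toy_counts)))
```

In a real PheWAS, the threshold would be divided by the thousands of phenotypes tested rather than three, which is one reason large EMR collections are needed to detect anything but strong associations.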
Although the potential of PheWAS is just being realized, such unbiased interrogation of vast clinical data sets may allow discovery of previously unappreciated comorbidities and/or less common side effects or drug-drug interactions in patients receiving prescribed drugs.