Genes Responsible for Disease by Genome Sequencing


Figure 10-12 Representative filtering scheme for reducing the millions of variants detected by whole-genome sequencing of a family (two unaffected parents and an affected child) to a small number that can be assessed for biological and disease relevance. The initial enormous collection of variants is winnowed into smaller and smaller bins by filters that remove variants unlikely to be causative, on the assumption that variants of interest are located near a gene, disrupt its function, and are rare. Each remaining candidate gene is then assessed on three criteria: whether the variants in that gene are inherited in a manner that fits the most likely inheritance pattern of the disease, whether the gene makes biological sense given the phenotype in the affected child, and whether other affected individuals also have mutations in that gene. AR, Autosomal recessive; mRNA, messenger RNA.


In the end, millions of variants can be filtered down to a handful occurring in a small number of genes. Once filtering has reduced the genes and alleles to a manageable number, each can be assessed for other characteristics. First, does any of the genes have a known function or tissue expression pattern that would be expected of the disease gene? Second, is the gene involved in other disease phenotypes, or does it have a role in pathways with other genes in which mutations can cause similar or different phenotypes? Finally, is this same gene mutated in other patients with the disease? Finding mutations in one of these genes in other patients would then confirm that it is the responsible gene in the original trio.
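
The filtering logic described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the scheme used in any particular study: the `Variant` record, the set of "disruptive" effect labels, and the 1% allele-frequency cutoff are all hypothetical choices made for the example. Genotypes are coded as the number of copies of the variant allele (0, 1, or 2) in the child, mother, and father.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    gene: Optional[str]  # nearest gene, or None if the variant is not near any gene
    effect: str          # predicted effect label (hypothetical vocabulary)
    pop_freq: float      # allele frequency in a reference population
    child: int           # copies of the variant allele: 0, 1, or 2
    mother: int
    father: int


# Assumption: effect labels treated as likely to disrupt gene function.
DISRUPTIVE = {"missense", "nonsense", "frameshift", "splice_site"}


def passes_basic_filters(v, max_freq=0.01):
    """Keep variants that are near a gene, likely disruptive, and rare."""
    return v.gene is not None and v.effect in DISRUPTIVE and v.pop_freq < max_freq


def candidate_genes(variants, max_freq=0.01):
    """Map each surviving gene to the inheritance models its variants fit."""
    by_gene = defaultdict(list)
    for v in variants:
        if v.child > 0 and passes_basic_filters(v, max_freq):
            by_gene[v.gene].append(v)

    result = {}
    for gene, vs in by_gene.items():
        models = set()
        # De novo: heterozygous in the child, absent from both parents.
        if any(v.child == 1 and v.mother == 0 and v.father == 0 for v in vs):
            models.add("de novo")
        # AR, homozygous: two copies in the child, one in each parent.
        if any(v.child == 2 and v.mother == 1 and v.father == 1 for v in vs):
            models.add("AR homozygous")
        # AR, compound heterozygous: one variant from each parent in the same gene.
        from_mother = any(v.child == 1 and v.mother == 1 and v.father == 0 for v in vs)
        from_father = any(v.child == 1 and v.mother == 0 and v.father == 1 for v in vs)
        if from_mother and from_father:
            models.add("AR compound heterozygous")
        if models:
            result[gene] = models
    return result
```

Calling `candidate_genes` on a list of `Variant` records returns only those genes whose rare, disruptive variants fit a de novo or autosomal recessive pattern in the trio. A gene in which the child carries a single rare variant inherited from one unaffected parent is dropped, because that pattern fits neither model.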


In some cases, one gene from the list in step 4 may rise to the top as a candidate because its involvement makes biological or genetic sense or because it is known to be mutated in other affected individuals. In other cases, however, the gene responsible may turn out to be entirely unanticipated on biological grounds or may not be mutated in other affected individuals because of locus heterogeneity (i.e., mutations in other, as-yet-undiscovered genes can cause a similar disease).


Such variant assessments require extensive use of public genomic databases and software tools. These include the human genome reference sequence, databases of allele frequencies, software that predicts how deleterious an amino acid substitution might be to gene function, collections of known disease-causing mutations, and databases of functional networks and biological pathways. The enormous expansion of this information over the past few years has played a crucial role in facilitating gene discovery for rare mendelian disorders.



Example: Identification of the Gene Mutated in Postaxial Acrofacial Dysostosis



From an initial list of more than 4 million variants, and assuming autosomal recessive inheritance of the disorder in both affected children, a filtering scheme similar to that described earlier yielded only four possible genes. One of these, DHODH, was also shown to be mutated in two other unrelated patients with postaxial acrofacial dysostosis (POAD), thereby confirming that this gene was responsible for the disorder in these families. DHODH encodes dihydroorotate dehydrogenase, a mitochondrial enzyme involved in pyrimidine biosynthesis, and was not suspected on biological grounds to be the gene responsible for this malformation syndrome.
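
The two steps in this example (keeping only genes that pass filtering in both affected children, then checking the survivors in unrelated patients) amount to a set intersection followed by a count. The sketch below is illustrative only. The function names are hypothetical, and apart from DHODH the gene names and patient labels are placeholders, because the text does not name the other three candidates.

```python
from collections import Counter


def shared_candidates(per_child_genes):
    """Genes that pass filtering in every affected child of one family."""
    sets = [set(genes) for genes in per_child_genes]
    return set.intersection(*sets) if sets else set()


def recurrence(candidates, unrelated_patients):
    """For each candidate gene, count the unrelated patients who carry
    qualifying variants in it."""
    counts = Counter()
    for genes in unrelated_patients.values():
        counts.update(candidates & set(genes))
    return {gene: counts[gene] for gene in sorted(candidates)}


if __name__ == "__main__":
    # Placeholder data: only DHODH is named in the text.
    child_1 = {"DHODH", "GENE_B", "GENE_C", "GENE_D", "GENE_X"}
    child_2 = {"DHODH", "GENE_B", "GENE_C", "GENE_D", "GENE_Y"}
    family = shared_candidates([child_1, child_2])

    others = {"patient_1": {"DHODH", "GENE_Z"}, "patient_2": {"DHODH"}}
    print(recurrence(family, others))
```

With this placeholder input, four genes survive the sibling intersection, and only DHODH recurs in both unrelated patients. A count of zero for a gene does not exclude it, because locus heterogeneity can leave a true disease gene unmutated in other affected individuals.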


Nov 27, 2016 | Posted in GENERAL & FAMILY MEDICINE
