Open Source Drug Discovery for Neglected Diseases
Sciclips, Madison, WI, USA
In recent years, there has been a steady decline in the number of drugs approved by the U.S. FDA, which has been a major concern in the drug discovery research industry. Pharmaceutical companies are facing major challenges such as a lack of innovation in the research and development (R&D) sector and pressure to produce cheaper medicines due to global competition. Although pharmaceutical companies are increasing their R&D budgets every year to overcome this problem, they are losing billions in cash flow as many drugs become generic. Today, pharma companies are constantly looking for innovative ways to reduce R&D costs by adopting new approaches like open source, collaboration, and IP sharing in their business models. Open source and open science are new avenues through which companies look for talent, resources, tools, and technologies from outside the company. In biomedical science, the open source model was first used in bioinformatics to produce tools to understand the genome and the proteome. The Human Genome Project (HGP) was the first initiative run on an open source model, in which thousands of researchers worldwide worked together with all the necessary tools and resources to sequence the entire human genome and make it openly accessible to researchers everywhere. This opened up new avenues to accelerate drug discovery research. Because of its closed and secretive nature, the traditional drug development process could not embrace the open source model. The entire drug development process, from the early discovery phase to the actual manufacturing of the drug, is covered by intellectual property (IP) rights. These rights give pharmaceutical companies a legal monopoly and the ability to exclude others from using the invention. IP protection is one of the main causes of the high cost of medicines. Millions of people die every year in developing countries because they cannot afford the high drug prices set by pharmaceutical companies.
For the last 8–9 years, there have been collaborative efforts by various nonprofit initiatives, public–private partnerships, product development partnerships, government institutions, and philanthropic consortia to adopt the open source model through the voluntary sharing of knowledge, data, tools, and infrastructure, with the aim of developing cheaper drugs for poor people in developing countries who are suffering from neglected tropical diseases (Figure 23.1). Recently, major pharmaceutical companies like GlaxoSmithKline (GSK), Novartis, Merck, Lupin, Bayer, Sanofi-Aventis, and Pfizer have shown interest in the open source model by donating compounds for target drug candidates and by sharing IP rights and high-throughput screening technologies in the “research pool” to speed up the drug development process for neglected diseases. This chapter reviews the open source initiatives in drug discovery, with special emphasis on neglected diseases.
Open Source
The term “open source” came into the limelight in the early 1990s with the development of Linux, a computer operating system like Microsoft Windows or Apple Mac. Unlike Windows or Mac, however, Linux is assembled under the free and open source software development and distribution model [1]. Linus Torvalds, a Finnish American software engineer, followed the path of free software advocates of the early 1980s like Richard Stallman (mastermind behind the GNU Project), who believed that people have the right to have full control over information and the freedom to share and change software. In 1991, Torvalds launched the Linux kernel, which is the most prominent example of free and open source software collaboration to date [2]. Since the source code was made available to the public, Torvalds’ approach became known as the “open source” model. Free and open source software gives anybody the freedom to study, use, change, participate in the development of, and even distribute the software. This open source model of product development contrasts with the proprietary or “closed” product development model. Advocates of the free software model came up with a new copyright license, known as the General Public License or “copyleft,” which allowed them to make the source code available to the public. In the open system, the transparency of the source code created a platform where skilled people around the world could scrutinize the code and make improvements, using their imagination and creativity to come up with new innovative products [3]. The impact of the open source model adopted in the software industry in the early 1980s is huge. Currently, there are over 30 million users of Linux, which powers over 85% of the top 500 supercomputers in the world and runs on one quarter of all new smartphones. Popular web browsers like Mozilla Firefox, Chrome, and Konqueror are examples of open source software.
In fact, any time we use Google, Yahoo, YouTube, Facebook, Wikipedia, or any other website, we are communicating with computers running on free and open source software [4].
Open Source Model in Biomedical Science
The unequivocal success of the open source model in the software industry caused it to gain momentum over the years and successfully fill the niche neglected by the private sector. The HGP is an example of the emergence of an early open source model in the biopharmaceutical sector. It was the collaborative effort of thousands of scientists all over the world (although primarily in the United States, United Kingdom, Japan, France, and Germany) to map and sequence the entire human genome. The project began in 1990 with the help of the U.S. Department of Energy and National Institutes of Health, and the Wellcome Trust in the United Kingdom; later the National Human Genome Research Institute also joined the venture. The main objective of the project was to identify disease-causing genes, determine the proteins encoded by those genes and their functions, and use that information for drug development. In 1996, at the summit in Bermuda, the HGP advocates (scientists as well as policy makers) agreed on a set of rules and principles, now known as the Bermuda Principles, which stated that the DNA sequencing information generated from the project should be open to the public within 24 hours of its generation. This rapid public release of genomic data from the HGP revolutionized the scientific approach to data transparency in the biopharmaceutical sector [5]. The Bermuda Principles supported minimizing patent rights for the data generated through the HGP. The “First Draft” of the human genome sequence was completed in 2000, and the sequence was published in the public GenBank database in 2001: a 90% complete sequence of all 3 billion base pairs in the human genome.
The open source model of the HGP had a huge impact on biomedical research in subsequent years, especially in clinical medicine. Genomic mapping helped scientists understand the genetic basis of common hereditary diseases and assisted with the development of a wide variety of genetic tests for many common diseases. Related fields such as bioinformatics (a relatively new discipline at the convergence of biology and computing that is focused on capturing, storing, analyzing, and integrating biological and genetic information) and proteomics (the study of protein expression throughout an organism) evolved and benefited from the technological and scientific advances made possible by the HGP [6]. It is hardly surprising that in the post-HGP era, more and more biological software, bioinformatics programs, and databases surfaced with an open source model: this may be because they are primarily written in code and can therefore be covered by copyright [7]. Examples of open source projects include BioJava [8], Biopython [9], Bio-SPICE [10], BioRuby [11], Simple Molecular Mechanics for Proteins [12], and Generic components for Model Organism Databases [13]. Another major example of open source after the HGP is the HapMap Project. Researchers from six countries combined their efforts to identify and catalog genetic differences in humans and create a haplotype map of the human genome. Data from the HapMap Project, which include information about sites known as single nucleotide polymorphisms (where DNA sequences vary among individuals), are open to the public [14].
In 2002, the Swiss Institute of Bioinformatics, the European Bioinformatics Institute, and the Protein Information Resource joined forces to create the UniProt (Universal Protein Resource) consortium [15] to provide a high-quality database of protein sequences and functional information (much of it derived from the genome-sequencing project and freely accessible to the scientific community). The Protein Drug Target Database, a freely accessible protein database for in silico (computer simulated) drug target identification, currently contains more than 1100 protein entries with 3-D structures presented in the Protein Data Bank [16]. The open access database launched by Sciclips in 2010, ddTargets (Drug Targets/Disease Target) [17], covers more than 4500 drug targets from U.S. patents, U.S. and international patent applications, and published research articles. Detailed information regarding drug target type, assays, and methods to characterize these drug targets is freely accessible to researchers.
Various biological software programs that evolved after the HGP helped immensely in the identification of lead compounds through in silico drug design [18]. Currently, various methods are used to identify lead compounds, including computational techniques to study the binding of different drugs or ligands to the target, and computational chemistry, cheminformatics, and bioinformatics techniques for identifying drug targets from genome sequences and databases. The major difference between software development and drug development is that in drug development these computational techniques can only speed up the process of identifying the lead compound or target molecule, which is not the final product of drug development. However, early identification of lead compounds and target molecules may provide a stepping-stone in the process of open source drug discovery.
Why Is the Progress of Open Source Drug Discovery in Biomedical Research So Slow?
It has been more than a decade since the completion of the HGP, and despite the resultant breakthrough technologies and resources there has been very little progress in open source drug discovery research. There are huge differences between software and health care products, including differences in IP protection, the complexity of the development process and regulatory requirements, the total cost of the development process, and the time and resources required.
With respect to IP protection, one of the most significant differences between software development and drug development is the licensing system. Software is written in “source code” and can be protected by copyright licenses, whereas scientific ideas (e.g., new assay development or a new method for a diagnostic test needed for the drug development process) are covered by patents. Patents, which exclude everyone other than the inventor from practicing the invention, are very expensive to obtain and also carry high maintenance costs [19].
The drug development process is also very complex and is subject to government regulatory requirements. The process of bringing a new drug to market begins with the identification of thousands of chemical compounds that have the potential to alter the action of the drug target. Of these, about 250 will enter preclinical testing, which evaluates each compound’s toxic and pharmacologic effects through various in vitro assays and in vivo studies in mice and other test animals. Compounds that pass these rigorous safety tests are then characterized for absorption, distribution, metabolism, excretion, and toxicology, and are sometimes chemically modified to improve their tolerability in the human body. Typically, about 10 of these compounds will be selected for testing in humans, and ultimately only one will be approved for sale after the rigorous process of clinical studies and review by regulatory bodies like the U.S. FDA or the European Medicines Agency [20]. The approvals required by government regulatory bodies for biomedical products in development stand in stark contrast to the development process for software products.
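The attrition described above can be made concrete with a short calculation. The following Python sketch is illustrative only: the counts of 250, 10, and 1 come from the text, whereas the starting figure of 10,000 compounds is an assumed stand-in for "thousands" and the stage labels are hypothetical names chosen for this example.

```python
# Stage counts: 250, 10 and 1 are taken from the text; 10_000 is an ASSUMED
# placeholder for the "thousands" of compounds identified initially.
FUNNEL = [
    ("initial compounds (assumed)", 10_000),
    ("preclinical testing", 250),
    ("testing in humans", 10),
    ("approved drug", 1),
]


def stage_survival(funnel):
    """Return (from_stage, to_stage, fraction surviving) for each step."""
    return [
        (src_name, dst_name, dst_count / src_count)
        for (src_name, src_count), (dst_name, dst_count) in zip(funnel, funnel[1:])
    ]


def overall_survival(funnel):
    """Fraction of the initial compounds that reach the final stage."""
    return funnel[-1][1] / funnel[0][1]


if __name__ == "__main__":
    for src, dst, frac in stage_survival(FUNNEL):
        print(f"{src} -> {dst}: {frac:.2%} survive")
    print(f"overall: {overall_survival(FUNNEL):.4%} survive")
```

Under the assumed starting count, only one compound in ten thousand reaches the market. The step from preclinical testing to human testing (10 of 250) and the step from human testing to approval (1 of 10) do not depend on that assumption.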
The full cost of bringing a new drug to the market, from discovery through clinical trials to approval, is enormous. The estimated total cost is somewhat controversial, as different studies have calculated different numbers. In 2003, DiMasi et al. reported that bringing a new drug to the market may cost approximately $800 million [21]. Some health economists estimate the cost between $500 million and $2 billion, depending on the therapy or the developing firm [22]. In 2010, in a study published in the journal Health Economics, one of the authors who criticized the methods used by DiMasi estimated the final cost to be ∼$1.2 billion [23]. Critic Marcia Angell, MD, a former editor of the New England Journal of Medicine, has called that number grossly inflated and estimates that the total is closer to $100 million. A 2011 study, also critical of the DiMasi methods, puts average costs at $55 million [24].
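To show how far apart these estimates are, the figures quoted above can be compared directly. This Python sketch is an illustration rather than an analysis from the cited studies: the dollar values are those given in the text, expressed in millions, and the dictionary keys are informal labels introduced here.

```python
# Cost estimates quoted in the text, in millions of U.S. dollars.
# The labels are informal names chosen for this example.
ESTIMATES_MUSD = {
    "DiMasi et al. (2003)": 800,
    "health economists, low end": 500,
    "health economists, high end": 2000,
    "Health Economics study (2010)": 1200,
    "Angell": 100,
    "2011 study": 55,
}


def spread(estimates):
    """Return (lowest label, highest label, highest / lowest ratio)."""
    low = min(estimates, key=estimates.get)
    high = max(estimates, key=estimates.get)
    return low, high, estimates[high] / estimates[low]


if __name__ == "__main__":
    low, high, ratio = spread(ESTIMATES_MUSD)
    print(f"lowest:  {low} (${ESTIMATES_MUSD[low]} million)")
    print(f"highest: {high} (${ESTIMATES_MUSD[high]} million)")
    print(f"highest estimate is about {ratio:.0f} times the lowest")
```

The highest quoted figure is roughly 36 times the lowest, which shows how strongly the result depends on the method used to calculate it.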
It takes anywhere between 8 and 14 years to complete the entire drug development process from start to finish. Unlike software products, health-care products require specialized equipment, laboratory space, access to biological reagents, and research tools. All the work at various stages of drug development must be conducted according to stringent Good Laboratory Practices, Good Manufacturing Practices, and Good Clinical Practices [19]. Pharmaceutical companies, like any other for-profit organization, manufacture products that must be sold for a profit in order for the company to grow. Pharma companies have to invest heavily at various phases of drug development, including R&D and preclinical and clinical trials, as well as to protect IP, comply with various organizational requirements, gain approval from regulatory agencies, market the drug, and conduct postmarket monitoring to ensure that IP rights are protected. In the conventional drug discovery model, drug companies usually protect the active ingredients of the drug, the nature and the source of the compound or biological substance, the manufacturing process, and other knowledge-based information through securing patents, which provide protection for a particular exclusivity period during which no one can market the drug without infringing the patent. Because of patent exclusivity, pharmaceutical companies can earn a significant return on their investment.
By contrast, open source drug development has features that will not allow pharmaceutical companies to impose a monopoly through patent exclusivity. One of the major goals of open sourcing is to promote the free flow of information to the public; this free use of and access to information while the drug is in development will create further innovation and competition. Innovation and competition will, in turn, significantly lower the price of the drug. It is very unlikely that pharmaceutical companies will ever adopt an open source model for future drug development endeavors, since they could not recover the money invested in R&D and other drug development processes.
There is no doubt that in the coming years biomedical science will continue to make breakthrough innovations, and pharmaceutical companies will protect these inventions with IP rights and bring new or “me-too” drugs (those that are nearly identical to current medications) to the market. But the question is whether these expensive drugs or high-end treatments can reach the billions of people in developing countries who need them the most. Will pharmaceutical companies invest time and money in conducting R&D to bring new drugs for neglected tropical diseases to the market? The open source model is well suited to drug development for neglected diseases, and it could play a vital role in filling this huge unprofitable niche. Philanthropic consortia, public–private partnerships, and other nonprofit organizations all over the world can come forward and promote the open source model with the common goal of developing and providing low-cost drugs to the millions of people around the world who suffer from these diseases.