The general public may be able to follow the details of scientific research to only a modest degree; but it can register at least one great and important gain: confidence that human thought is dependable and natural law is universal.
–Albert Einstein. From Science and Society.
If you didn’t document it, it wasn’t done.
–Anonymous
This chapter discusses data management and systems at a general level and does not present many details of such topics as cleaning data, locking studies, tracking case report form (CRF) pages and queries, and change control. Readers who are interested in a more detailed level of data management are referred primarily to the literature, to discussions with contract research organizations (CROs), and to the book by Prokscha (2007) as a next step to explore the world of data management.
THE WORLD OF DATA MANAGEMENT
Data management is a general phrase that has many meanings within a pharmaceutical company. A number of years ago, fewer functional areas were concerned with data management than today, and most issues were less complex. Today, large quantities of data are generated by all functions within a company. Data must be processed, combined, interpreted, and used in ways that are clear, efficient, and designed to achieve one’s goals. The term data management refers to the processes whereby some or all of these steps are handled, monitored, and controlled.
Functions in a Company and Their Relationship with Data Management
This section describes major areas in research and development where large amounts of data are generated.
Research
Drug discovery activities increasingly use computers to generate, store, and analyze large quantities of data. This category includes many preclinical activities in chemistry, pharmacology, and other biological sciences. Results obtained on each of a company’s compounds are stored and made readily available to scientists. Molecular modeling, combinatorial chemistry, and high throughput screening are a few of the areas requiring sophisticated data handling (see Chapter 114).
Toxicology
The flow of toxicological information and data into formal reports suitable for registration dossiers is becoming progressively more automated. In many individual studies, a large number of animals are involved; a large number of blood chemistry, toxicokinetic, behavioral, gross pathology, microscopic pathology, and other parameters are measured in each animal; and evaluations are made at many scheduled time points throughout the study. This means that efficient systems must be developed and utilized to handle an enormous quantity of data. Good Laboratory Practices regulations require high standards of data management in addition to high standards of animal care and treatment (see Chapters 13, 103, and 104).
Clinical Trials
Even a single clinical trial generates a huge amount of data and requires sophisticated data management systems. Data management involves many aspects, including data collection, data transmittal to the sponsor (done by remote entry methods in some trials), data editing, data entry into computers, data verification, data analysis by statisticians, and data interpretation. In an increasing number of trials, investigators or their staff enter data directly into computers at the sponsor’s site. For large-scale multicenter trials, special groups and monitors are often required in addition to the sponsor’s staff at both a central data management site (which may or may not be a CRO) and the individual trial sites (see Section 6).
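The text does not say how data verification is carried out. One widely used approach is double data entry, in which two people key the same form independently and the two versions are compared field by field. The following is a minimal sketch of that comparison under stated assumptions: the field names, the sample values, and the function compare_entries are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# A keyed form is modeled as a mapping from field name to entered text.
Record = Dict[str, str]


def compare_entries(first: Record, second: Record) -> List[Tuple[str, str, str]]:
    """Return (field, first_value, second_value) for every field that disagrees.

    Hypothetical helper: a field missing from one entry is treated as blank.
    """
    discrepancies = []
    for field_name in sorted(set(first) | set(second)):
        a = first.get(field_name, "").strip()
        b = second.get(field_name, "").strip()
        if a != b:
            discrepancies.append((field_name, a, b))
    return discrepancies


# Illustrative data only: the second operator transposed two digits.
entry_a = {"subject_id": "001", "visit": "2", "systolic_bp": "128", "weight_kg": "71.5"}
entry_b = {"subject_id": "001", "visit": "2", "systolic_bp": "182", "weight_kg": "71.5"}

queries = compare_entries(entry_a, entry_b)
for field_name, a, b in queries:
    print(f"{field_name}: first entry {a!r}, second entry {b!r}")
```

Each discrepancy would normally be resolved against the source document rather than by picking one of the two keyed values.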
Regulatory Affairs
Compiling regulatory dossiers involves numerous data management techniques. These include collating and completing reports, numbering pages, transferring some or all data to another medium (e.g., electronic formats for an electronic submission, such as an electronic Common Technical Document [also referred to as an e-CTD], or an electronic Investigational New Drug Application), and submitting different subsets of data to numerous regulatory agencies around the world in a variety of formats and languages (see Section 7). The Food and Drug Administration has issued specific guidances on electronic submissions and related aspects of data management; readers are referred to its website (www.fda.hhs.gov).
Technical Development
Numerous aspects of technical development generate large amounts of data (e.g., stability tests conducted under various conditions, with samples obtained and analyzed at numerous time points). Capsules are weighed individually and their quality assured. Complex clinical trial drug labels and bottles may be prepared and the bottles filled with capsules or tablets using sophisticated systems requiring computer-assisted data management. Chapter 107 describes other examples in technical development departments where large amounts of data are generated.
Data and Information Storage
Keeping up with all of the company’s reports, minutes, and other documents is a major effort requiring a variety of systems to keep these records in order. Documents must be indexed to allow for rapid retrieval. Published literature on the company’s drugs and other topics of interest must also be systematically analyzed and referenced for easy retrieval. Libraries rely less and less on hard copy books, magazines, and other printed materials and are rapidly transitioning to electronic records (Cullen and Mason 1995). Search engines such as Google are becoming an increasingly important method of searching within large website databases, as well as searching across websites for key terms (see Chapter 100).
Postmarketing Surveillance
Reports of adverse events associated with marketed drugs reach companies from many sources and countries. The quality and relevance of those data vary enormously. The data must nonetheless be handled rapidly and accurately by pharmacovigilance groups, who must deal with regulatory requirements as well as data management issues (see Chapters 66 and 67).
Quality Assurance Procedures
Quality assurance procedures are used in many areas of a company in relation to drug development, including toxicology and drug manufacture. The large volumes of data generated require automated computer-assisted systems to help in this substantial effort (see Chapters 70, 107, and 108).
General Approach to Data Management
An important principle of data management is that it must be a planned process in which a great deal of forethought has gone into the creation of systems, the flow of information, checks and balances, and other aspects of plans that are subsequently implemented. Rushing any part of this process may easily cause complex problems that can take years and significant effort and money to unravel and rectify. The data management process cannot be allowed to develop on its own in a haphazard manner. Many data management issues involve reassessing an area where the company has fallen behind the state of the art or believes it can handle data in a better way. To address those issues, new systems (and often new technologies) must be evaluated, chosen, and then implemented.
The use of computers in data management has undergone dramatic change through advances in both hardware and software. Few areas of data management are not intimately connected with computers and computer use. Some of the roles that computers can play in managing data are shown in Tables 102.1 and 102.2. Issues relating to computer hardware or software are not discussed here, nor are the myriad of more technical computer issues.
Data Management Plan
A data management plan is an important document with two purposes. Before the trial is initiated, it outlines what will be done with the data; later, it documents what was actually done at each step, both during the trial and after it. It stays with the database so that, in future reviews of the data, it can answer many important questions about the methodologies followed, the issues and problems that arose, and how they were addressed.
Before the clinical trial is initiated, the data management plan should indicate the activities, the responsible people, and the documents that will need to be completed, as well as the standard operating procedures (SOPs) and other policies that are to be followed for the various elements of the plan [see Appendix B of Practical Guide to Clinical Data Management by Prokscha (2007) for a listing of SOPs that need to be created]. A good principle is to document everything that seems relevant, because questions invariably arise at some point and a more complete plan will help in answering them. The contents of a typical data management plan are shown in Table 102.3. The book by Prokscha is an excellent text for those wishing to delve into topics of data management more deeply, as it is logically organized and clearly written.
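As an illustration only, the four items the plan should record for each element (the activity, the responsible people, the required documents, and the governing SOPs) can be held in a simple structure that makes gaps visible before the trial starts. This is a hypothetical sketch, not the format of Table 102.3; the class PlanElement, its field names, and the sample entries (including the SOP identifier) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanElement:
    """One element of a data management plan (hypothetical structure)."""
    activity: str
    responsible: str = ""
    required_documents: List[str] = field(default_factory=list)
    sops: List[str] = field(default_factory=list)

    def missing_items(self) -> List[str]:
        """List which of the expected items have not yet been filled in."""
        missing = []
        if not self.responsible:
            missing.append("responsible person")
        if not self.required_documents:
            missing.append("required documents")
        if not self.sops:
            missing.append("SOPs")
        return missing


# Illustrative entries only; names and identifiers are invented.
plan = [
    PlanElement("Data entry", "Data manager", ["Entry guidelines"], ["SOP-DM-01"]),
    PlanElement("Database lock", "", ["Lock checklist"], []),
]

# Map each incomplete activity to the items still to be documented.
gaps = {e.activity: e.missing_items() for e in plan if e.missing_items()}
print(gaps)
```

A check of this kind supports the principle of documenting everything relevant up front, since an incomplete element is flagged while it is still easy to correct.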
Specific Data Management Issues
Some of the specific data management issues currently being explored within the pharmaceutical industry include the following: