Using the Personal Computer for Healthcare Epidemiology
John A. Sellick Jr.
Keith F. Woeltje
Rebecca Wurtz
We live in the information age. The effective and comprehensive use of digital information in healthcare epidemiology and infection control is both desirable and necessary. This chapter discusses computer systems, networks, and the Internet from a high-level perspective. Because of space limitations, it is not possible to discuss each computer system, software package, network configuration, or troubleshooting program or to refer to specific products. Rather, this chapter provides a conceptual framework for discussing information services (ISs) in a healthcare setting.
THE ROLE OF THE HEALTHCARE EPIDEMIOLOGIST IN HEALTH INFORMATION MANAGEMENT
For the purposes of this chapter, health information management will be defined as the storage, exchange, and analysis of data generated by healthcare. The term “health information system” encompasses the digital hardware and software applications, architecture and network structure, interoperability standards, and policies governing the generation and use of health data.
The healthcare epidemiologist (HE) and quality officer obviously have a fundamental need for the data generated in electronic health systems. Almost no one else in the hospital and ambulatory setting routinely and systematically analyzes aggregated clinical data and data patterns. A common complaint following the implementation of enterprise-wide health information systems is that “data goes in, but I can’t get anything out.” It is essential for the HE to be knowledgeable about and involved in discussions about data elements, extraction, analysis, and visualization during the implementation and updating of information systems. Ideally, the HE develops the skills and permissions necessary for querying databases and extracting data without requiring a middleman. Healthcare organization IT departments are often overburdened and may not respond in a timely way to requests for query construction and data.
COMPUTER SYSTEMS AND NETWORKS
Historical Perspective
The Era of the Mainframe Historically, electronic data processing in large organizations, including hospitals, was done on mainframe computer systems. These monolithic systems were developed to support the fiscal and demographic data needs of healthcare organizations and became known as management information systems (MISs). Healthcare MISs have matured to include clinical components such as laboratory, pharmacy, and radiology information systems and clinician order entry, making important patient data more readily available to care providers. Data management systems for a myriad of other discrete purposes, such as surgical, obstetrical, and emergency department resource and inventory management, are available. Data input generally is from a keyboard terminal or directly from laboratory equipment. Printing may be done at central, high-speed printers or distributed network printers.
Mainframe computer software operating systems (OSs) and applications software typically were cryptic and complicated, requiring extensive expertise and time for setup and maintenance. Access for end users usually was through a hard-wired (directly connected) dumb terminal, essentially a monitor with an attached keyboard. User interaction was restricted to a set of defined keyboard commands and functions, often in the setting of predesigned menus or screens.
Attempts were made to develop customized mainframe infection control software to capture relevant demographic and clinical data, and then merge them with surveillance data collected and entered for individual healthcare-associated events (1,2). Standard reporting templates allowed the production of a wide variety of summary reports at set intervals, and provisions were made for the retrieval of data via ad hoc queries when the standard reports were insufficient (3). These custom-developed mainframe infection control data management systems typically were developed through cooperative efforts between infection control and the IS departments of a few larger hospitals. They were usually computer OS specific and thus difficult to adapt to other institutions with different computing configurations. Their high development (1) and maintenance costs and their lack of adaptability to other computing environments made them impractical.
The Rise of Personal Computers In the 1970s and 1980s, desktop microcomputers, often called personal computers (PCs), were developed by Apple, Inc. (Cupertino, CA) and International Business Machines Corp. (Armonk, NY). They have become so ubiquitous that most people associate the term computer with desktop microcomputers. The terms personal computer, desktop computer, and microcomputer all describe the same machine and are used interchangeably. These user-friendly machines sport a graphical user interface (GUI) and a control device such as a mouse or trackpad along with the ability to connect to a network and local printer. External input devices such as scanners and bar-code readers are inexpensive and easily attached, as are removable media (hard drive, disk, cartridge, or tape) storage and backup devices. (The standards for such connections are set by the Information Technology Working Group of the Institute of Electrical and Electronics Engineers [IEEE] and often are referred to by number.)
Stand-alone desktop computer hardware and software tools allow for a large measure of autonomy in developing and maintaining data management systems independent of the hospital’s information system. As a result, they have significantly altered the practice of surveillance, data management, and data analysis in healthcare epidemiology. Both infection control-specific software programs and generic database management programs are available to develop customized infection control databases. Word processing, statistical analysis, charting, communications, and presentation programs fulfill the remaining needs for most healthcare epidemiology and infection control providers.
Unfortunately, the learning curve for computers and software may be steep, and even software designed specifically for infection control may be difficult to use (4,5). Data often must be manually entered, which is time-consuming and leads to errors. Moreover, the distributed nature of personal computing has resulted in both duplicative and fragmented data sets throughout organizations. Similarly, the keepers of individual databases may be unwilling to share data with others and may not take the necessary steps to ensure the integrity and safety of their data.
The Advent of Client/Server Computing The desire for both the flexibility of a microcomputer and access to the data archives and computing power of a mainframe computer system led to the development of client/server computing. In this system, the desktop computer is a “client” that can connect to a “server” of data and software applications via a network. A network is a series of hardware devices and wiring that connect any number of client or server computers, printers, storage devices, etc. The network also requires its own software or protocol in order for the various devices to communicate and function with one another. Ethernet is the commonest network hardware specification and communications protocol for local area networks (LANs) in use today. There are a variety of network operating system (NOS) software products available to control the services, for example, access to files and printers, provided by the LAN.
A server may simply store files, deliver messages, and queue print jobs, or, in the true sense of client/server computing, it may provide for interaction and shared computing capability between the client and the server. The potential benefits for healthcare epidemiology and infection control services of having a computer attached to an information-laden server on a LAN are readily apparent. With proper client software, the organization’s demographic, clinical, laboratory, and financial databases can be searched, data analyzed, and items of interest moved to the client computer for further use or analysis. Other tasks such as providing backup copies of files or sharing documents and mail with other clients on the network are easily accomplished.
The appeal and potential benefits of client/server networks are considerable, but so are the potential problems of implementation. The databases on large servers generally are built using proprietary systems that require additional software and training for appropriate use. Intermediary programs, collectively called middleware, may be needed to create an interface between the client and the database to extract the information desired by the healthcare epidemiology service. However, if this can be accomplished, the rewards can be great once the desired data are identified, collected, and analyzed.
Recommendations for Personal Computer Systems
It is not possible to recommend a “one-size-fits-all” approach to computerizing a healthcare epidemiology or infection control service. As with purchasing a car, the users must evaluate their requirements and budget. Most organizations have moved to a client/server model of computing, making it feasible for healthcare epidemiology team members to have PCs that may fill several roles and satisfy multiple needs. The selection of hardware and software should take into consideration the requirements of connecting to a LAN and accessing a mainframe or other server(s). Consultation with the appropriate university, hospital, or other organization’s IS department can provide guidance in these areas.
Hardware If choosing a PC, the first decision to be made is whether to purchase a desktop/minitower or a laptop. The former traditionally have offered the potential of greater power and expandability; however, current laptop models have more than adequate power for almost all healthcare epidemiology functions, plus the advantage and potential problems of portability. Likewise, the availability of universal serial bus (USB), IEEE 1394 (“FireWire”), and ExpressCard or CardBus ports makes laptops widely expandable as well. Laptops typically come with active matrix (thin film transistor) liquid crystal displays (LCDs), but a display is a separate item for desktop units. The ultimate decision regarding the choice of desktop vs. laptop often will be a financial one, but the issues of ergonomics and security of the equipment also must be considered.
Regardless of whether a desktop or laptop unit is chosen, it is important that it have adequate random access memory (RAM) for anticipated tasks, an adequately sized hard disk drive, and an optical drive. The latter may be a CD-ROM reader or a drive that both reads and writes compact disc media (CD-RW). Drives that combine CD-RW with the ability to read and write DVD (digital versatile disc) media also are widely available as either internal or external drives. These will accommodate the use of the many software titles, educational programs, and databases available on optical media. Appropriate networking connections such as Ethernet cable ports or adapters or wireless networking receivers (see below) also are necessary. If the computer is not connected to a network, a telephone modem will be needed to connect to the Internet.
Information “output” is a critical part of healthcare epidemiology and infection control, and a variety of options are available. Often, a printer will be used, and the two types in widespread use are the laser and inkjet varieties. The laser printer fuses microscopic plastic toner particles to a page (paper or transparency) the same way a photocopier does, while inkjet printers deposit droplets of ink on the pages. Laser printers usually are monochrome/grayscale, while almost all inkjet printers produce a full spectrum of colors, including the ability to produce photographic quality prints. Laser printers generally cost more to purchase, but inkjet printers usually have a higher cost of “consumables,” namely, ink cartridges and specially coated photographic paper and transparencies. It is important to compare per page costs and to weigh individual needs: for example, frequent presentations might argue for an inkjet printer that can produce color transparencies, while predominantly paper reports might favor a laser printer. It is worthwhile to investigate whether a workgroup printer, shared by several individuals or departments, is available, since it may allow costs to be shared or even avoided.
Presentations may be sent directly to electronic multimedia projectors, but these devices generally are too expensive for individual or even departmental purchase. Most hospitals and universities have them installed in conference/lecture rooms or available for use and/or loan. Information also can be “published” on an intranet or the Internet (see below) using one of the many simple software programs available.
A method to provide backup or duplicate copies of important data must be provided, whether through a local device (such as an external hard drive) or a network file server, so that important or unique information may be retrieved in the event of equipment failure or theft. Many organizations provide each user with “space” on a network file server for copying important files, though this may not be large enough to copy an entire hard disk drive. A DVD-RW with appropriate software is a convenient and inexpensive way to make duplicate copies of important files or even an entire hard disk drive. External hard disk drives now are very inexpensive and much faster than optical drives. They may be disconnected and stored in a safe place.
Operating Systems An OS is the core software that enables a computer to function and to communicate with the hard disk drive and other devices. Several such systems are in widespread use. Microsoft Windows (Microsoft Corporation, Redmond, WA) is most widely used. However, Apple MacOS and a number of UNIX variants such as the freeware Linux also are popular. OS software often is hardware specific in either type or speed, sometimes both. Other than applications written in Sun’s Java programming language, software written for one OS does not run on computers that use a different OS. Also, application software may be specific for different versions of the same OS. For example, software written for Windows 7 may not run on prior Windows versions. It is important to thoroughly understand the processor, memory, and hard disk space requirements of any OS or application software prior to purchasing it.
Basic Software Aside from the OS software, application software programs (often called applications, software, or programs for short) are needed to make a PC more than an expensive Tetris machine. Basic software tools include a word processor for making reports and writing correspondence; a spreadsheet for making calculations and graphing data; a database for storing information, such as surveillance data; and a presentation program for making electronic slideshows and other visual reports. These basic tools often come as part of an “office” suite of software, which may be proprietary (e.g., Microsoft Office) or open source (e.g., OpenOffice). Price, required RAM, hard drive storage space, and the expertise required for efficient use vary. Before purchasing software, attempt to evaluate it with a demonstration version (often available from publisher Web sites) or on a colleague’s PC. Software user reviews are widely available online. Universities, hospitals, and other large organizations often have site license agreements with software publishers that will provide basic software at little or no cost to employees or affiliated professionals.
Basic software also may include statistical analysis software, which no longer requires a mainframe computer to use. Many commercial packages as well as the freeware Epi Info from the Centers for Disease Control and Prevention (http://www.cdc.gov/epiinfo) are available for PCs.
Modern software, running on a GUI-based computer, has the ability to change typefaces (fonts), manipulate typeface styles, organize materials, and add tables, charts, or images to a document. A report of epidemiologic activity, therefore, could include text with bold headings, a table of key data, and several salient charts that result in a clear, concise, and compelling document. However, the very features that allow for this flexibility also can make the report garishly unattractive if used in excess. In general, multiple typefaces should not be used in a single document, and script or other specialty typefaces should be avoided since they are difficult to read. Likewise, underlining and ALL CAPITALS are distracting and difficult to read in a body of text; emphasis may be added with boldface or italics. Some of the most egregious violations of publishing taste occur with newsletters and information sheets which, along with excessive typeface manipulation, often contain excessive amounts of clip art and other nontext items. Many publications are available to provide guidance for creating attractive documents and compelling visual displays of quantitative information (6,7).
Personal Digital Assistants and Smartphones
The last decade has witnessed the explosive growth of shirt-pocket-size devices known as personal digital assistants (PDAs). Originally developed to be date books and a place to store names and contact information, these small computers have increased in speed and memory and now can provide data retrieval, basic word processing, statistical and database functions. Newer models include wireless access to the Internet or to local networks using the IEEE 802.11 standard. As with PCs, there are competing product platforms available: the original PDAs were produced to run the Palm computing platform (Palm, Inc., Sunnyvale, CA) or a “mobile” version of Microsoft Windows.
Many popular cellular telephone network devices, such as the BlackBerry devices of Research In Motion (RIM, Waterloo, ON, Canada), have emerged in the past few years. Dubbed “smartphones,” they offer PDA functions along with Internet/electronic mail access as well as voice telephony. Palm also has transitioned completely away from standalone PDAs to smartphones.
In early 2007, Apple, Inc., introduced the first iPhone and has since released several updated versions. Along with PDA, audio playback, and telephone functions, this device runs small applications, now commonly called “apps,” which give it computer-like functionality. A nearly identical device, the iPod Touch, and the recently released iPad tablet have the same functions without cellular telephone capability. These devices offer 802.11 wireless Internet communications as well.
Apps for the iPhone, iPod Touch, and iPad have been developed for many purposes including medical applications and can be purchased and/or downloaded from the iTunes store. These include drug databases, medical calculators, and a hand hygiene assistant called iScrub that was developed at the University of Iowa (8). Through the use of third-party software, it also is possible to read Microsoft Office and Adobe portable document format (pdf; Adobe Systems, San Jose, CA) documents. Audio programs called “podcasts” also may be downloaded, and there are many medical lectures, news summaries, journal commentaries, etc., available. In response, some smartphone device manufacturers such as RIM have updated their OSs to offer similar apps and services.
In a recurring theme, the software made for one platform will not run on another and not all titles are available for all platforms. The potential user must evaluate these devices based on intended use, availability of apps and cellular network. Note that the iPhone and smartphone cellular contracts mandate an additional network data service charge beyond cellular voice service charges. Since PDAs and smartphones do not have built-in hard disk drives, it may be necessary to “synchronize” them with a PC to ensure data retention, availability, and security.
This group of devices holds promise in the healthcare setting though there are few data describing specific uses in healthcare epidemiology and infection control (8,9). Most uses appear to be for schedules, calculators (10), and pharmaceutical databases, so cost, personal preference, etc., will determine purchasing decisions. One theoretical “downside” of these devices is the potential to be fomites (11,12).
CONVERTING DATA TO INFORMATION
Data
Data for infection prevention can come from a wide variety of sources. Manually collected data (e.g., device days) can be entered by hand. Many data can be obtained electronically from hospital systems. Chapter 16 discusses the use of the electronic medical record for infection prevention and enterprise-level surveillance support. But even those who use their desktop PCs for supporting infection prevention activities can benefit greatly by receiving reports electronically in formats that can be imported into the analytic software they use. Many hospital departments may be able to generate reports (e.g., lists of surgical patients, patient days by ward) in standardized formats (see below) that many PC programs can import. This is certainly faster than reentering the data, but more importantly, avoids the possibility of transcription errors.
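As a minimal sketch of importing an electronic report rather than retyping it, the Python example below reads a comma-separated report of patient days by ward. The comma-separated layout, the column names, and the counts are assumptions invented for illustration; an actual departmental report will differ.

```python
import csv
import io

# Hypothetical report text, as a department might export it.
# Column names and values are invented for illustration only.
REPORT = """ward,month,patient_days
MICU,2010-01,412
SICU,2010-01,388
MICU,2010-02,390
"""

def load_patient_days(text):
    """Parse a comma-separated report into a list of dicts, one per row."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        # Convert the count from text to an integer so it can be used in calculations.
        row["patient_days"] = int(row["patient_days"])
        rows.append(row)
    return rows

rows = load_patient_days(REPORT)
total_micu = sum(r["patient_days"] for r in rows if r["ward"] == "MICU")
print(total_micu)
```

Because the counts are read directly from the report, no value is retyped, which is the point made above about avoiding transcription errors.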
When trying to use data from multiple sources, it is important to ensure that the same terminology is used consistently. A wide variety of standardized terminologies exist for medical purposes (e.g., ICD-9, SNOMED). While each terminology is usually used for specific purposes—such as ICD-9 for billing—these often overlap, and even within a single institution different departments or computer systems may use different terms for the same fundamental concept. In such cases, there will be a need to settle on a standard terminology for infection prevention, and translate or “normalize” other codes or terms when necessary.
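One simple way to normalize terms can be sketched as a lookup table that maps each department's wording to a single agreed-upon term. The local terms and the chosen standard labels below are hypothetical; a real mapping would be built from the institution's own systems and its chosen terminology.

```python
# Hypothetical mapping from department-specific wording to one agreed term.
LOCAL_TO_STANDARD = {
    "picc": "PICC",
    "picc line": "PICC",
    "peripherally inserted central catheter": "PICC",
    "triple lumen": "Triple-lumen catheter",
    "triple-lumen catheter": "Triple-lumen catheter",
}

def normalize_term(raw):
    """Return the standard term for a raw entry, or None if it is unmapped."""
    # Ignore differences in case and spacing before looking the term up.
    key = " ".join(raw.strip().lower().split())
    return LOCAL_TO_STANDARD.get(key)

entries = ["PICC line", "  Triple  Lumen ", "midline"]
normalized = [normalize_term(e) for e in entries]
# Unmapped entries come back as None so that they can be reviewed by hand.
unmapped = [e for e, n in zip(entries, normalized) if n is None]
```

Returning None for unknown terms, rather than guessing, is a deliberate choice in this sketch: new or unexpected terms are flagged for review instead of being silently miscoded.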
Once systems are in place to obtain data, provisions should be made for ensuring data integrity and completeness: that all relevant data are transmitted intact, that missing data are detected and replaced, and that new systems and terms are handled appropriately.
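A basic completeness check of this kind might look like the following sketch. The field names, the sample records, and the idea of comparing against an expected record count supplied by the sending system are assumptions for illustration. The code only detects problems; replacing missing values still requires going back to the source.

```python
# Hypothetical list of fields that every transmitted record must contain.
REQUIRED_FIELDS = ("mrn", "admit_date", "ward")

def find_incomplete(records, required=REQUIRED_FIELDS):
    """Return (index, missing_fields) for each record lacking a required value."""
    problems = []
    for index, record in enumerate(records):
        missing = []
        for field in required:
            value = record.get(field)
            # Treat absent fields and blank strings as missing.
            if value is None or str(value).strip() == "":
                missing.append(field)
        if missing:
            problems.append((index, missing))
    return problems

def check_batch(records, expected_count):
    """Compare the number of records received with the number expected."""
    return {
        "count_ok": len(records) == expected_count,
        "incomplete": find_incomplete(records),
    }

batch = [
    {"mrn": "0001", "admit_date": "2010-01-04", "ward": "MICU"},
    {"mrn": "0002", "admit_date": "", "ward": "SICU"},
    {"mrn": "0003", "ward": "MICU"},
]
report = check_batch(batch, expected_count=4)
```

Here the report would show both that a record is missing from the batch and that two of the received records lack an admission date.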
Spreadsheets
Even more so than word processing software, spreadsheet software ushered in the PC era. Spreadsheet software remains one of the most versatile tools at the disposal of the HE. Spreadsheet software is any application that allows data to be stored in the familiar row and column layout; Excel (Microsoft, Redmond, WA) is one example. All infection control personnel should know how to use an electronic spreadsheet.
Flat-file Database The column and row design of computer spreadsheets lends itself well to use as a “flat-file” database. This is a database where all of the information can be contained on a single page (see the discussion of databases below for other types of databases). Typically, the first row is used to enter headings for the information to be collected (i.e., the database “field”; e.g., Name, Age, etc.). Each subsequent row is then used to store information for one subject (i.e., a database “record”).
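The field and record layout described above can be sketched in Python as a list of rows whose first row holds the field names. The Name and Age fields come from the text; the Ward field and all of the values are hypothetical.

```python
# A flat-file "sheet": the first row holds field names, each later row is one record.
sheet = [
    ["Name", "Age", "Ward"],
    ["Patient A", 64, "MICU"],
    ["Patient B", 71, "SICU"],
    ["Patient C", 58, "MICU"],
]

# Split the heading row from the data rows.
fields, *rows = sheet

# Pair each value with its field name to get one dict per record.
records = [dict(zip(fields, row)) for row in rows]

# With one row per patient, counts and means are straightforward.
patient_count = len(records)
mean_age = sum(r["Age"] for r in records) / patient_count
```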
Although database programs can also be used for such flat-file databases, using a spreadsheet for this purpose has a number of advantages. For many users, a spreadsheet is almost always included with their office suite of software, whereas a database program will probably have to be purchased separately. Some users familiar with the use of a spreadsheet for calculations can extend that familiarity more easily than learning an entirely new program. Some statistical software may be able to import data from a spreadsheet table, making the spreadsheet file format a “common denominator” for sharing data between programs.
One of the handiest features of spreadsheets is the “filtering” function. This allows the user to see only records that meet a certain criterion. Although the same information may be obtained from a true database program, it typically involves more work. For example, an ICU may want to keep track of the intravenous devices used on various patients. In a spreadsheet, there may be a column for patient name, and others for medical record number, date of admission to the ICU, etc. A column could then be made for “IV device used.” Then, “peripheral catheter,” “peripherally inserted central catheter (PICC),” or “triple-lumen catheter” could be entered into this column. By using the filter function, one could readily see all of the patients who had a PICC line. But typically, an ICU patient will have multiple types of IV catheters. One could have a column “IV catheter 1” and another column “IV catheter 2,” and so on. But consider what would be necessary to find all of the patients who had a triple-lumen catheter. First one would have to filter for “triple-lumen catheter” in the “IV catheter 1” column, then in the “IV catheter 2” column, etc. One way to get around this is to have only one column for device and then enter a new row for each type of catheter the patient had. However, this would mean duplicating the patient demographic data for each row and would make a simple count of the patients or calculations like the mean patient age difficult. A more effective option would be to have a column for “triple-lumen catheter” with either “Yes” or “No” listed for each patient. A search for patients with such a catheter would only require that “Yes” be filtered for in that column. Obviously, such a system would become unwieldy if there were a large number of options, but for limited numbers of choices, looking up data can be very efficient.
Many spreadsheets will filter on multiple columns, so that, for example, finding patients with both a PICC line and a triple-lumen catheter would require only filtering for “Yes” in both of these columns.
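The one-column-per-device layout and the multiple-column filter can be illustrated with a short Python sketch. The patient rows and the helper function filter_rows are hypothetical; the function simply mimics what a spreadsheet filter does.

```python
# One row per patient, one Yes/No column per device type (hypothetical data).
patients = [
    {"Name": "Patient A", "PICC": "Yes", "Triple-lumen catheter": "No"},
    {"Name": "Patient B", "PICC": "Yes", "Triple-lumen catheter": "Yes"},
    {"Name": "Patient C", "PICC": "No", "Triple-lumen catheter": "Yes"},
]

def filter_rows(rows, criteria):
    """Keep only rows whose value matches in every column named in criteria."""
    return [
        row for row in rows
        if all(row.get(column) == wanted for column, wanted in criteria.items())
    ]

# Filter on a single column.
with_picc = filter_rows(patients, {"PICC": "Yes"})

# Filter on two columns at once: patients with both devices.
with_both = filter_rows(patients, {"PICC": "Yes", "Triple-lumen catheter": "Yes"})

print([p["Name"] for p in with_both])
```

Because each patient still occupies exactly one row, the number of rows remains a valid count of patients, which is the advantage this layout has over entering one row per catheter.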
Simple Calculations Although spreadsheets can be used for simple database functions, they were designed primarily to do mathematical calculations. Spreadsheet software can easily handle nonstatistical data needs of an HE. Users can enter very complex formulas, although for most purposes only relatively simple formulas are necessary. Rates, the fundamental calculation of the epidemiologist, are trivial calculations for these programs. Tables can be created to show rates over time such as monthly. As we’ll see, such time series also lend themselves to graphing the data.
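The rate calculation itself is simple enough to write out. The sketch below, in Python, computes a monthly rate per 100 procedures; the monthly counts are invented for illustration, and returning None for a month with no procedures is one possible convention rather than a standard.

```python
def rate_per(events, denominator, per=100):
    """Return events per `per` units of the denominator, or None if undefined."""
    if denominator == 0:
        # A rate cannot be calculated when there is no denominator.
        return None
    return events * per / denominator

# Hypothetical monthly data: (number of infections, number of procedures).
monthly = {
    "Jan": (3, 150),
    "Feb": (2, 160),
    "Mar": (0, 0),
}

# Build the time series of rates, one per month.
rates = {
    month: rate_per(infections, procedures)
    for month, (infections, procedures) in monthly.items()
}
```

The same function serves for device-associated rates by passing per=1000 with device days as the denominator.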
A great deal of the power of spreadsheets comes from the ability to program formulas that refer to other cells, even cells that are on sheets other than the one with the formulas. This can be used to great advantage. A user can, for example, use one sheet for entering National Healthcare Safety Network (NHSN) rates into designated cells. That way all other pages are automatically updated for the latest rates without having to enter them into each page or formula separately.
Another powerful capability of spreadsheets is the ability to copy formulas from one cell to another. Thus, once a formula is written, say for a rate calculation, it need not be reentered from scratch over and over again, but can simply be copied. One must be careful to ensure that formulas are copied correctly. Each cell in a spreadsheet has a unique “address,” typically formed by the column letter and row number of that particular cell. Thus, the cell in column “C” on row “22” is designated “C22.” Sometimes, when copying formulas, the user wants the column letter or row number to change. Consider a column D with January surgical site infection (SSI) data. Row 8 has the number of infections and row 9 the number of procedures. Row 10 is designed to have the rates. The user can enter a formula like (D8/D9)*100 in cell D10 to calculate the SSI rate per 100 surgeries (the exact method to designate that there is a formula within a cell, as opposed to just text, will vary depending on the spreadsheet software used). Column E then will represent February data. Rather than retyping the formula in cell E10, the user can copy the formula from cell D10. But the formula must then read (E8/E9)*100, so that the February data, not the January data, are used. In other cases, for example, when calculating standardized infection ratios, the user will likely want to use the NHSN rate in multiple calculations. If the spreadsheet is set up as previously mentioned, with NHSN rates entered into designated cells, then the user must be sure that that cell reference stays the same even when formulas are copied. Spreadsheet software has different ways to designate whether the cell references can be changed when a formula is copied.
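The difference between references that change and references that stay fixed can be illustrated with a toy Python function that mimics copying a formula one column to the right. It assumes the dollar-sign convention used by Excel and many other spreadsheet programs, in which $B$2 marks a reference that must not change; other software may use a different notation. The function copy_right, the benchmark cell B2, and the simplified observed-to-expected formula are hypothetical and only illustrate the idea.

```python
import re

# Matches a cell reference such as D8, $B2, B$2, or $B$2.
# Toy limitation: only single-letter columns (A to Z) are recognized.
CELL_REF = re.compile(r"(\$?)([A-Z])(\$?)(\d+)")

def copy_right(formula, columns=1):
    """Return the formula as it would read after being copied to the right."""
    def shift(match):
        col_lock, col, row_lock, row = match.groups()
        if not col_lock:
            # Relative column: move it along with the copy.
            # (No handling for going past column Z in this sketch.)
            col = chr(ord(col) + columns)
        # A column preceded by "$" is left unchanged.
        return f"{col_lock}{col}{row_lock}{row}"
    return CELL_REF.sub(shift, formula)

# The January rate formula becomes the February formula when copied.
february_rate = copy_right("(D8/D9)*100")

# Simplified ratio: observed infections divided by the number expected from
# a benchmark rate per 100 procedures held in the fixed cell $B$2.
february_ratio = copy_right("D8/(D9*$B$2/100)")

print(february_rate)
print(february_ratio)
```

In the second formula the infection and procedure references move to column E, while the benchmark reference stays on cell B2, which is the behavior needed when one designated cell feeds many calculations.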