Cognition and Human Computer Interaction in Health and Biomedicine



Fig. 2.1
Partial space of frameworks and cognitive theories





2.2 Human Information Processing


A computational theory of mind provides the fundamental underpinning for most contemporary cognitive theories. The basic premise is that much of human cognition can be characterized as a series of operations that reflect computations on mental representations. Early theories and models of human performance were often described in terms of perceptual and motor activities and of assumptions about their structural components (e.g., limits of short-term memory). These were primarily derived from the stimulus-response paradigm, and considered the human as an “information processor.” In other words, within this paradigm the human was an information controller, perceiving and responding to activities (Anderson 2005). This approach led to the development of several commonly used models, such as Fitts’ Law (Mackenzie 1992) and the theory of bimanual control (Mackenzie 2003), that predict human performance in a variety of tasks (e.g., task acquisition, flight controls, and air traffic control). Detailed descriptions of the use of these theories can be found in Chap. 5 of this volume.
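To make the predictive character of such models concrete, the following minimal sketch computes movement time with Fitts' Law in its widely used Shannon formulation, MT = a + b log2(D/W + 1), where D is the distance to the target and W is its width. The function name and the coefficients a and b are illustrative assumptions; in practice the coefficients are fitted empirically for a given device and user population.

```python
import math

def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time in seconds: MT = a + b * log2(D / W + 1).

    a (intercept) and b (slope) are placeholder values chosen for
    illustration only; they are not taken from the chapter.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    index_of_difficulty = math.log2(distance / width + 1)  # in bits
    return a + b * index_of_difficulty

# A far, small target is predicted to take longer than a near, large one.
small_far_target = fitts_movement_time(distance=70, width=10)
large_near_target = fitts_movement_time(distance=30, width=30)
```

Under these assumed coefficients, shrinking a button or moving it farther from the cursor raises the predicted acquisition time, which is the kind of comparison a designer might make between two layouts.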

With the advent of computers, and more recently of highly interactive environments, there was a need for more integrated information-processing models that accounted for human-computer interaction (HCI). There were two important requirements: first, the models needed to account for the sequential and integrated actions that evolve during human-computer interactions; second, in addition to the layout and format of the interface, the models also needed to account for the content that was presented on the interfaces (John 2003). In its most general form, the human information processor consists of input, processing and output components (see Fig. 2.2). The input to the processor involves perception of stimuli from the external world; these stimuli are then handled by a processor in a series of processing stages. Typically, these stages include encoding the perceived stimuli, comparing and matching them to known mental representations in memory, and selecting and executing an appropriate response. The response is realized through motor actions. For example, consider a clinician’s interaction with an EHR interface, where he/she has to select a medication from a dropdown menu. The input component would perceive the dropdown menu on the interface, which would be matched in memory, and a click action response would be triggered. This click action would be relayed to the motor components (output), which execute the action by clicking the dropdown menu item. This cycle repeats until the entire task of selecting the medication is completed. In the next sections, we consider core constructs associated with this approach including the model human processor, Norman’s theory of action, and mental models.



Fig. 2.2
Input-output model of human information processing. STM refers to short-term memory and LTM is an abbreviation for long-term memory


2.2.1 Model Human Processor


One of the earliest and most commonly described instantiations of a theoretical human information processing system is the Model Human Processor (MHP). MHP can be described as a set of processors, memories and their interactions that operate based on a set of principles (Card et al. 1983). According to MHP, the human mind consists of three interacting processors: perceptual, cognitive and motor. These processors can operate in serial (e.g., pressing a key) or in parallel (e.g., driving a car and listening to the radio). Information processing in MHP occurs in cycles. First, the perceptual processor retrieves sensory (visual or auditory) information from the external world and transmits it to working memory (WM). Once the information is in WM, it is processed using the recognize-act cycle of the cognitive processor. During each cycle, the contents of WM are connected to actions that are linked to them in long-term memory. These actions, in turn, modify the contents of WM, resulting in a new cycle of actions. MHP can be used to develop an integrated description of the psychological effects of human-computer interaction performance. While it is considered a significant oversimplification for general users (see applications of the MHP using the GOMS model in Chap. 5), it provided a preliminary mechanism on which much of the human performance modeling research was developed. MHP is useful for predicting and comparing different interface designs, task performance and learnability of user interfaces. It can be used to develop guidelines for interface design such as spatial layout, response rates and recall. It also provides a significant advantage, as these human performance measures can be determined even without a functional prototype or actual users.
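The way MHP yields performance estimates without a prototype can be sketched as a sum of processor cycles. The sketch below assumes the nominal cycle times commonly cited from Card et al. (1983), roughly 100 ms for the perceptual processor and 70 ms each for the cognitive and motor processors, and it assumes strictly serial operation. The function name and the two example tasks are hypothetical.

```python
# Nominal cycle times in milliseconds, as commonly cited from Card et al. (1983).
# They are rough middle-of-range estimates, not precise constants.
CYCLE_MS = {"perceptual": 100, "cognitive": 70, "motor": 70}

def estimate_response_ms(perceptual=1, cognitive=1, motor=1):
    """Sum processor cycles, assuming the processors run strictly in series."""
    counts = {"perceptual": perceptual, "cognitive": cognitive, "motor": motor}
    if any(n < 0 for n in counts.values()):
        raise ValueError("cycle counts must be non-negative")
    return sum(CYCLE_MS[name] * n for name, n in counts.items())

# Simple reaction: perceive a stimulus, decide, press a key.
simple_reaction = estimate_response_ms()
# Hypothetical matching task that needs one extra recognize-act cycle.
menu_match = estimate_response_ms(cognitive=2)
```

Because the model permits parallel operation of the processors, a serial sum like this is best read as an upper-bound style estimate for simple tasks rather than a general prediction.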

Although the MHP approach has not commonly been applied in healthcare contexts, there have been a few noteworthy studies. For example, Saitwal et al. (2010) used the keystroke level model (KLM, an instantiation of the GOMS approach) to compute the time taken and the number of steps required to complete a set of 14 EHR-based tasks. Using this approach, they characterized the challenges of the user interface and identified opportunities for improvement. A detailed description of this study and the use of the GOMS approach can be found in Chap. 5.
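A KLM estimate of this kind can be sketched by summing standard operator times over a sequence of steps. The operator durations below are the commonly cited textbook values (K, keystroke or button press; P, pointing; H, homing the hand on a device; M, mental preparation); the task sequence is a hypothetical dropdown selection and is not taken from Saitwal et al. (2010).

```python
# Commonly cited KLM operator times in seconds (assumed textbook values).
OPERATOR_SECONDS = {
    "K": 0.2,   # keystroke or button press (skilled typist)
    "P": 1.1,   # point with a mouse to a target
    "H": 0.4,   # home hands on keyboard or mouse
    "M": 1.35,  # mental preparation
}

def klm_estimate(operators):
    """Return (number of steps, total predicted time in seconds)."""
    total = sum(OPERATOR_SECONDS[op] for op in operators)
    return len(operators), total

# Hypothetical task: open a medication dropdown, then pick an item.
select_medication = ["M", "H", "P", "K", "M", "P", "K"]
steps, seconds = klm_estimate(select_medication)
```

Comparing such totals across alternative designs (for example, a dropdown versus a type-ahead field) is one way the step counts and times described above can point to opportunities for improvement.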


2.2.2 Norman’s Theory of Action


In the mid-1980s, cognitive science was beginning to flourish as a discipline and HCI was viewed as both a test bed for these theories and as a domain of practice. The MHP work was indicative of those efforts. At the same time, microcomputers were becoming increasingly common in homes, workplaces and schools. As a result, computers were transitioning from being a tool used exclusively by experts (i.e., computer scientists and those with high degrees of technical expertise) to one that was used broadly by individuals in all walks of life. Systems at that point in time were particularly unwieldy and often extremely difficult to learn. In a seminal paper on cognitive engineering (Norman 1986), Norman sought to craft a theory “to understand the fundamental principles behind human action and performance that are relevant for the development of engineering principles of design” (p 32). A second objective was to devise systems that are “pleasant to use.”

A critical insight of the theory is the discrepancy between psychologically expressed goals, and the physical controls and variables of a system. For example, a goal may be to scroll down towards the bottom of a document, and a scroll bar embodies the physical controls to realize such a goal. Shneiderman presented a similar analysis in his theory of direct manipulation (Shneiderman 1982). The key question is how an individual’s goals and intentions get expressed as a set of physical actions that transform a virtual system and result in the desired change of state (e.g., reaching the intended section of the document). The Norman model draws on many of the same basic cognitive concepts as the MHP model, but embodies them in a seven-stage model of action (Norman 1986), illustrated in Fig. 2.3.



Fig. 2.3
Norman’s action cycle

The action cycle begins with a goal, for example, retrieving a patient’s surgical history. The goal is a generic one, independent of any system. In this context, let us presuppose that the clinician has access to paper records as well as those in an EHR. The second stage involves the formation of an intention, which in this case might be to retrieve the patient record in an EHR. The intention leads to the specification of an action sequence, which may include signing on to the system (which in itself may necessitate several actions), engaging a component system or simply a field that can be used to locate a patient in the database, and entering the patient’s identifying information (e.g., last name or medical record number, if it is known). The specification results in the execution of the action sequence, which may itself involve several actions. The system responds in some way or, in the case of a failed attempt, may not respond at all. A change in system state may or may not provide a clear indication of the new state, and the system may fail to provide feedback as to why the desired state has not appeared (e.g., the system provides no indicators of a wait state or why no response is forthcoming). The perceived system response must then be interpreted and evaluated to determine whether the goal has been achieved. If the response provided by the system is “record not found,” that could mean a number of things including that a name was mistyped or the number was incorrectly listed. On the basis of this determination, a next action will be chosen.
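The seven stages in this example can be laid out as a simple trace. The sketch below is only an illustration of the ordering described above: the stage names paraphrase the text, and the toy record store, the identifiers and the function name are all hypothetical.

```python
from enum import Enum

class Stage(Enum):
    GOAL = 1
    INTENTION = 2
    ACTION_SPECIFICATION = 3
    EXECUTION = 4
    PERCEPTION = 5
    INTERPRETATION = 6
    EVALUATION = 7

def run_action_cycle(records, patient_id):
    """Walk once through the seven stages against a toy record store."""
    trace = [
        (Stage.GOAL, "retrieve the patient's surgical history"),
        (Stage.INTENTION, "retrieve the record in the EHR"),
        (Stage.ACTION_SPECIFICATION,
         ["sign on", "open patient search", f"enter {patient_id}"]),
    ]
    # Execution: the toy system either returns a record or an error message.
    response = records.get(patient_id, "record not found")
    trace.append((Stage.EXECUTION, "action sequence carried out"))
    trace.append((Stage.PERCEPTION, response))
    found = response != "record not found"
    meaning = "record displayed" if found else "possible mistyped identifier"
    trace.append((Stage.INTERPRETATION, meaning))
    trace.append((Stage.EVALUATION, found))
    return found, trace

toy_records = {"MRN001": "appendectomy"}
goal_met, trace = run_action_cycle(toy_records, "MRN001")
goal_missed, _ = run_action_cycle(toy_records, "MRN999")
```

When the evaluation fails, as in the second call, the user would choose a next action and begin a new cycle; the sketch stops after a single pass to keep the stages visible.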

Any task of moderate complexity will involve substantial nesting of sub-goals, requiring a series of actions. To an experienced user, the action cycle may appear as a completely transparent and seamless process. However, to a less experienced user, the process may break down at any of the seven stages. Norman (1986) describes two primary ways in which the action cycle can break down. The gulf of execution reflects the difference between the goals and intentions of the user and the kinds of actions enabled by the system. For example, a user may not know the appropriate action sequence or the interface may not provide discernible clues to make such sequences transparent. For instance, a transaction may appear to be complete, but further action is needed to execute the selection process (e.g., pressing enter to accept a transaction).

The gulf of evaluation reflects the degree to which the user can make sense of the state of a system and determine how well their expectations have been met. For example, it is sometimes difficult to interpret a state transition and to know whether one has arrived at the correct state or has chosen an incorrect path. Goals that necessitate multiple state or screen transitions are more likely to present difficulties for users, especially as they learn the system. Bridging gulfs involves both bringing about changes to the system design and training users to become better attuned to the affordances offered by a system’s resources. Gulfs can be partially explained by differences in the designer’s models and the users’ mental models, as discussed in the next section. The designer’s model is the conceptual model to be built, based on analysis of the task, requirements, and an understanding of the users’ capabilities (Norman 1986). The users’ mental models of system behavior are developed through interacting with similar systems and gaining an understanding of how actions (e.g., selecting an item from a menu) will produce predictable and desired outcomes. Graphical user interfaces that involve direct manipulation of screen objects and widgets represent an attempt to reduce the distance between a designer’s and user’s model (Shneiderman 1982). The distance is likely to be more difficult to bridge in a system like an EHR, which incorporates a wide range of functions and components that may provide different layouts and forms of interaction.

Norman’s theory of action has given rise to, or in some cases reinforced, the need for sound design principles. For example, the state of a system should be plainly visible to the user and feedback should be transparent. To illustrate, dialog boxes or alert messages can remind users of what is possible or needed to complete the task. There is a need to provide good mappings between the actions (e.g., clicking on a tab) and the results of the action as reflected in the state of the system (e.g., providing access to the expected display).

Norman’s theory of action informed a great deal of research and design across domains. The seven-stage action theory was used to good effect by Zhang and colleagues in their development of a taxonomy of errors (Zhang et al. 2004). The taxonomy also draws on Reason’s categorization of errors as either slips or mistakes (Reason 1992). Slips result from the incorrect execution of a correct action sequence and mistakes are the product of the correct completion of an incorrect action sequence. Slips and mistakes are further categorized into execution errors and evaluation errors, and then into descriptors that correspond to Norman’s seven stages (e.g., goals, intentions). Zhang et al. (2004) provide the following example of an intention slip: “A nurse intended to enter the rate of infusion using the up–down arrow keys, because this is the technique on the pump she most frequently uses; however, on this pump the arrow keys move the selection region instead of changing the selected number” (p 98). An example of an evaluation/intention slip is a nurse who interprets a yellow flashing light on a device analogically (based on prior knowledge of yellow as a warning) and judges it to be noncritical when it is in fact signaling a critical event. Norman’s seven-stage action theory proved to be a useful model for characterizing a wide range of medical error types.

Although the theory of action has been very influential in the world of design and research, it also has shortcomings (Sharp et al. 2007). The theory proposes that stages are followed sequentially. However, users do not necessarily proceed in such a sequential manner, especially in a domain such as medicine, which is constituted by numerous and complex nonlinear tasks. Contemporary GUIs, such as web-based or app-based systems, provide users greater flexibility in achieving the desired state or accessing the needed information. As discussed in subsequent sections, external representations (e.g., as expressed in text displays or visualizations) offer guidance to the user or even structure their interactions in such a way that a planned action sequence may not be necessary.


2.2.3 Mental Models


Mental models are an important construct in cognitive science and have been widely used in HCI research (Van der Veer and Melguizo 2003). Mental models are an analog-based construct for describing how individuals form internal models of systems. They are employed to answer questions such as “how does it work?” or “what will happen if I make the following move?” “Analog” suggests that the representation explicitly shares some aspect of the structure of the world it represents. For example, one can envision in the mind’s eye a set of connected visual images of the succession of ATM screens one has to negotiate to get $200 out of one’s checking account, or of the buildings one passes on the way home from a local grocery store. This is in contrast to an abstraction-based form such as propositions or schemas, in which the mental structure consists of either the gist or a summary representation, for example, the procedures needed to complete an ATM transaction. Like other forms of mental representation, mental models are invariably incomplete, imperfect and subject to the processing limitations of the cognitive system (Norman 1983). Mental models can be derived from perception, language or from one’s imagination (Payne 2003). Running a model corresponds to a process of mental simulation that generates possible future states of a system from an observed or hypothetical state.

The constructs discussed in the prior sections emphasize how the general limits of the human information-processing system (e.g., limits in perception, attention and retrieval from memory) influence performance on a given task in a particular context (Payne 2003). On the other hand, mental models emphasize mental content, namely, knowledge and beliefs. An individual’s mental model provides predictive and explanatory capabilities regarding the functions of a particular system. The construct has been used to characterize differences in expertise in a range of knowledge domains such as physics (Payne 2003). Experts have richer and more robust models of a range of phenomena, whereas novices are more prone to imprecision and errors. Mental models have been used to characterize models that have a spatial and temporal context, as is the case in reasoning about the behavior of electrical circuits (White and Frederiksen 1990). The model can be used to simulate a process (e.g., predict the effects of network interruptions on downloading a movie from www.amazon.com).

Kaufman et al. (1996) characterized clinicians’ mental models of the human cardiovascular system (specifically, cardiac output). The study characterized progressions in understanding of the system as a function of expertise. The research also documented various conceptual flaws in subjects’ mental models and how these flaws impacted subjects’ predictions and explanations of physiological manifestations (e.g., changes in blood flow in the venous system). In general, mental models are a useful explanatory construct for characterizing errors that are due to problems in understanding and not ones associated with flawed execution of procedures.

Mental models are a particularly useful explanatory device in understanding human-computer interaction (Staggers and Norcio 1993). The premise is that by exploring what users can understand and how they reason about systems, it is possible to design systems in a way that supports the acquisition of an appropriate mental model and reduces errors during use. It is also useful to distinguish between a designer’s conceptual model of a given system and a user’s mental model (Staggers and Norcio 1993). The wider the gap, the more difficulties individuals will experience in using the system. For example, Kaufman and colleagues (2003) evaluated the usability of a home-based telemedicine system targeting older adults with diabetes. The study documented a substantial gulf between patients’ mental models of the system and the designer’s intent of how the system should be used. Although most of the participants had a shallow understanding of how such systems worked, there were some who possessed more elaborate mental models, and were better able to negotiate the system to perform a range of tasks including uploading blood glucose values and monitoring their condition over time.

It is believed that novice users of a system can benefit from instruction that imparts a conceptual model or supports a mental simulation process (i.e., helping the users mentally step through problem states) (Payne 2003). Diagrammatic models of the device or system are often used to support such a learning process. For example, Halasz and Moran (1983) found that such a model was particularly beneficial to students learning to use a programmable calculator. Kieras and Bovair (1984) demonstrated a similar benefit for students learning to master a simple control panel device. They conducted a series of studies contrasting two groups learning to use a device. One group was trained to operate the device through learning the procedures by rote. The second group was trained using a model of how the device works. The model group learned the procedures faster, executed them more rapidly and improvised when necessary, e.g., replacing inefficient procedures with simpler ones. The study provides an illustration of how having a more robust mental model of a system can impact performance. A more advanced model can enable a user to discover alternative ways to achieve the same goal and overcome obstacles.

The construct of mental models fell into disuse in the last couple of decades as theories that emphasized interaction and externalization of representations flourished. However, the construct has resurfaced in recent years as a means to characterize how individuals’ conceptualizations differ from representations in systems. For example, Smith and Koppel (2014) take the approach a step further in that they conceptualize three models: the patient’s reality, that reality as represented in an EHR, and that reality as reflected in a clinician’s understanding or mental model of the problem. Drawing on data from a wide range of sources (e.g., observations and log files) and findings, they constructed “scenarios of misalignment” or misrepresentation including categories such as “IT data too broadly focused” (i.e., lacking precise descriptions). For example, medical problem lists that do not permit sufficient qualification or classification illustrate IT data that are too broad or coarse. For instance, clinicians were not able to specify that a stroke resulted from a left-sided cerebrovascular accident. The typology provides a useful basis for IT designers to potentially reduce the gaps, better support users and diminish the potential for unintended consequences.

Shared mental models (SMM) represent an extension of the concept of mental models. The construct is rooted in research on teamwork in areas such as aviation (Orasanu 1990). Clinical care is recognized as a highly collaborative practice and there is a need for team members to develop a shared understanding of the processes involved in patient care as well as the evolving conditions of the patients currently under their care. Breaks in communication among team members are known to be significant contributors to medical errors (Coiera 2000). There are only a few studies that demonstrate a relationship between SMM and clinical performance (Custer et al. 2012). Mamykina and colleagues (2014) investigated the development of SMM in an intensive care unit. The data included observations and audio-recorded transcripts of patient handoffs (i.e., transfer of patients during shift change) and rounds. In a recent paper, the analysis focused on a single care team including an attending physician, residents, nurses, medical students and physician assistants. The results indicated that the team initially had rather divergent perspectives on how well patients were doing, and the relative success of the treatment. Rounds served as an important coordinating event and the team endeavored to construct shared mental models (i.e., achieving a shared understanding) through an iterative process of resolving discrepancies. There was substantial evidence of change in SMM and in the coordination of patient care over a 3-day period. Whereas conversations on the first day focused on creating basic alignment and making immediate modifications to the care, discussions on the third day focused on understanding the underlying reasons for the situation, and developing a long-term plan more consistent with this collective causal understanding (Mamykina et al. 2014).

As mentioned previously, the concept of mental models has diminished as a construct employed by HCI researchers. One of the reasons is that mental models are not observable and can only be inferred indirectly. However, we believe that it has enduring value as an explanatory device for characterizing how individuals understand a system. The construct is too often used as a synonym for understanding, or for generic mental representation (i.e., with no commitment to the form of the representation). We favor the more specific instantiation of it as a model that can be used to simulate a process and project forward to predict events or outcomes or to explain why a particular outcome occurred. This enables us to develop theories or models for a given domain and then be able to predict and explain variation in performance. This should apply to a wide range of contexts whether the goal is to teach patients with diabetes to understand the basic physiology of their disease or for clinicians to use a newly implemented EHR. There is also evidence that a model-centric approach to teaching, in which an effort is made to foster an understanding of how a system works, confers some advantages over rote learning approaches to acquire the procedures needed to complete a task (Payne 2003; Gott and Lesgold 2000).


2.3 External Cognition


Internal representations reflect mental states that correspond to the external world. The term external representation refers to any object in the external world that has the potential to be internalized or to be used to augment cognitive processes (without internalizing). External representations such as images, graphs, icons, audible sounds, texts with symbols (e.g., letters and numbers), shapes and textures are vital sources of knowledge, means of communication and cultural transmission. The classical model of information-processing cognition viewed external representations as mere inputs to the mind that were processed and then internalized (Zhang 1997). The landscape began to change in the early 1990s when new cognitive theories focused on interactivity rather than solely modeling what was assumed to happen inside the head. Rogers (2012) cites Larkin and Simon’s (1987) classic paper on “why a diagram may be worth a thousand words” as seminal to researchers in HCI. It offered the first alternative empirical account that focused on how people interact with external representations. The core idea was that cognition can be viewed as the interplay between internal and external representations, rather than only as a matter of an individual’s mental states and processes. Similar ideas had been put forth by others (Hutchins et al. 1985), but Larkin and Simon provided an explicit computational account that inspired the HCI community (Rogers 2012). Larkin and Simon (1987) made an important distinction between two kinds of external representation: diagrammatic and sentential representations. Although they are informationally equivalent, they are considered to be computationally different. That is, they contain the same information about the problem but the amount of cognitive effort required to come to the solution differs. For example, effective displays facilitate problem solving by allowing users to substitute perceptual operations (i.e., recognition) for effortful cognitive operations (e.g., memory retrieval and computationally intensive reasoning), and effective displays can reduce the amount of time spent searching for critical information (Patel and Kaufman 2014). On the other hand, cluttered or poorly organized displays may increase the burden.
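The distinction between informational equivalence and computational difference can be illustrated with a loose analogy from data structures. In the sketch below, the same hypothetical lab values are stored once as an ordered list of statements that must be scanned (standing in for a sentential form) and once as an index keyed by name (standing in for a display where the reader can go straight to a location). The data, names and step counts are assumptions made for illustration and are not a model from Larkin and Simon (1987).

```python
def sentential_lookup(statements, test_name):
    """Scan statements in order; return (value, number of items inspected)."""
    for inspected, (name, value) in enumerate(statements, start=1):
        if name == test_name:
            return value, inspected
    return None, len(statements)

def indexed_lookup(table, test_name):
    """Go directly to the entry; count it as a single inspection."""
    return table.get(test_name), 1

# Same information, two representations (hypothetical values).
statements = [("sodium", 140), ("potassium", 4.1), ("glucose", 98)]
table = dict(statements)

scan_result = sentential_lookup(statements, "glucose")
direct_result = indexed_lookup(table, "glucose")
```

Both lookups return the same value, but the scan inspects more items, which mirrors the point that two representations can carry identical information while demanding different amounts of search effort.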

In the next two sections, we consider two extensions of external cognition, namely, the representational effect and the theory of intelligent spaces.


2.3.1 Representational Effect


The representational effect can be construed as a generalization of Larkin and Simon’s (1987) conceptualization of the cognitive impact of external representations (Zhang and Norman 1994). It is well known that different representations of a common abstract structure can have a significant impact on cognition (Zhang and Norman 1994; Kahneman 2011). For example, different forms of displaying patients’ lab values can be more or less efficient for particular tasks. A display may be oriented to support a quick readout of discrete values or, alternatively, to allow clinicians to discern trends over a period of time. A simple illustration of the effect is that Arabic numerals are more efficient for arithmetic calculations (e.g., 26 × 92) than Roman numerals (XXVI × XCII) even though the representations are identical in meaning. Similarly, a digital clock provides a quick readout for precisely determining the time at a glance (Norman 1993). On the other hand, an analog clock enables one to more easily determine time intervals (e.g., elapsed or remaining time) without recourse to mental calculations. Norman (1993) proposed that external representations play a critical role in enhancing cognition and intelligent behavior. These durable representations (at least those that are visible) persist in the external world and are continuously available to augment memory, reasoning, and computation. Imagine the cognitive burden of having to do multi-digit multiplication without the use of external aids. Even a pencil and paper will allow you to hold partial results (interim calculations) externally. Calculations can be extremely computationally intensive without recourse to external representations (or memory aids).
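The numeral example can be made concrete in code. In the minimal sketch below, multiplying the Roman numerals from the text is done by first translating them into a positional (Arabic) representation, multiplying there, and translating back, which hints at how much of the work the positional notation absorbs. The function names are illustrative, and the converters assume well-formed numerals in standard subtractive notation.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"), (90, "XC"),
    (50, "L"), (40, "XL"), (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]

def roman_to_int(numeral):
    """Convert a well-formed Roman numeral to an integer."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        # A smaller symbol before a larger one is subtracted (e.g., XC = 90).
        if i + 1 < len(numeral) and value < ROMAN_VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total

def int_to_roman(number):
    """Convert a positive integer to a Roman numeral."""
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)

product = roman_to_int("XXVI") * roman_to_int("XCII")  # 26 * 92
product_in_roman = int_to_roman(product)
```

The two representations denote the same quantities, yet the multiplication itself is a single familiar operation only in the positional form.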


