Addiction to drugs of abuse is a chronically relapsing disorder characterized by a compulsion to take the drug, loss of control of intake, and the development of a negative emotional state when drug access is withheld. Initiation of drug use typically includes a variety of influencing factors, including (but not limited to) social, biological, and emotional factors. It is widely held that drug use is initiated because of the ability of these substances to produce feelings of pleasure and well-being (i.e., euphoria). Over time, however, tolerance develops to the euphorigenic properties of many drugs of abuse, which perpetuates drug-seeking behavior by leading the user to increase the dose and/or frequency of drug use in order to obtain the euphoria that was previously experienced (so-called chasing the dragon). With repeated drug use, the user begins to form associations between the subjective effects of the drug and environmental stimuli that are associated with the drug. These associations are formed by classical (Pavlovian) conditioning processes, and the types of stimuli or cues that become paired with drug use can be spatial, visual, auditory, tactile, olfactory, temporal, or interoceptive in nature. Of interest, and as discussed in more detail later, tolerance can develop that is conditioned to the stimuli associated with drug administration, and these stimuli become quite powerful in exerting control over the biological expression of tolerance. Examples of such stimuli include drug paraphernalia, the location in which the drug is repeatedly taken, the smell of alcohol or tobacco smoke, and the time of day. Because drug addicts do not typically live under conditions in which they are isolated from drug-associated cues (possible exceptions being an addict who has been incarcerated or placed in a residential-treatment program), active drug addicts typically encounter these drug-associated environmental stimuli on a daily basis. 
This repeated exposure to drug-associated stimuli can elicit expectation of drug availability or memories of previous euphoric experiences under the influence of a particular drug, which may in turn result in drug craving and drug-seeking behavior, leading ultimately to the perpetuation of drug self-administration and the addiction cycle.
Most drugs of abuse are consumed in cyclic patterns consisting of active drug self-administration followed by abstinence. During the abstinence phase, the repeated emergence of withdrawal symptoms may result in conditioned associations between environmental stimuli and the negative affective state (e.g., depression, anxiety, and irritability) that typically manifests during withdrawal. As a result, withdrawal-associated environmental stimuli may also trigger drug-seeking behavior to alleviate the evoked negative affect via negative reinforcement processes (i.e., removal of withdrawal-induced dysphoria).
The neurobiological basis of conditioning in drug addiction has been advanced significantly by (1) the development of various animal models of drug-environment conditioning and (2) human imaging studies in which brain activity is monitored during exposure of an addict to drug-associated stimuli. In this chapter, we discuss the most widely used animal models of drug conditioning: the conditioned place preference paradigm, cue-induced enhancement of drug self-administration, second-order schedules of reinforcement, and cue- and context-induced reinstatement of drug-seeking behavior. We also discuss additional processes of drug conditioning including incentive salience attribution and Pavlovian-instrumental transfer. We then summarize key findings from studies using these paradigms on the neural substrates of drug conditioning, in addition to results from human brain imaging studies. Finally, we highlight several recent studies using newer neurobiological methods to reveal novel neural substrates of drug conditioning and the mechanisms underlying cue-evoked relapse-related behaviors.
Methods for Assessing the Conditioned Effects of Drugs of Abuse in Laboratory Animals
Like human beings, laboratory animals including rats, mice, dogs, and nonhuman primates are able to form associations between environmental stimuli and appetitive rewards such as food, sweetened substances such as sucrose, and drugs of abuse. These species are also able to form similar associations between environmental stimuli and aversive events such as the presentation of an electric shock or the experience of drug withdrawal symptoms. The most notable experimental studies on this type of conditioning were conducted in the late 19th and early 20th centuries by the Russian physiologist Ivan Pavlov. Pavlov observed that experimental dogs began to salivate in anticipation of the presentation of food. Eventually Pavlov was able to elicit salivation in these dogs by presentation of a discrete environmental stimulus (the sounding of a bell) immediately before the presentation of food. These landmark studies by Pavlov, who received the 1904 Nobel Prize in Physiology or Medicine for his work on the physiology of digestion, were the first to describe the phenomenon of classical or Pavlovian conditioning, where a previously neutral stimulus (i.e., the sound of a bell, serving as the conditioned stimulus) becomes associated with a naturally appetitive stimulus (i.e., food, the unconditioned stimulus). Eventually, with repeated conditioning, the organism learns to predict the availability of the unconditioned stimulus upon presentation of the conditioned stimulus, and thus the conditioned stimulus becomes motivationally salient.
In the context of drug addiction, classical conditioning is a widely prevalent phenomenon, such that during the course of repeated drug-taking behavior, environmental stimuli associated with the drug (i.e., the conditioned stimulus, such as the smell of tobacco smoke or the sight of a hypodermic syringe) become associated with and eventually predict the availability of the drug (i.e., the unconditioned stimulus). The chronic nature of drug addiction allows for numerous pairings of the conditioned stimulus and unconditioned stimulus, to the point that the conditioned stimulus becomes motivationally salient to the addicted individual. In the case of an individual attempting to abstain from drug use, encountering a conditioned stimulus can provoke intense drug craving, which leads to drug-seeking behavior and greatly increases the propensity for relapse.
The neural basis of classical conditioning has been studied for decades at the cellular and molecular levels from in vitro preparations to the behavioral analysis of animals and humans. Here, we briefly summarize four of the most commonly used behavioral paradigms in laboratory rodents that are designed to investigate the phenomenon of conditioning factors in drug addiction. These include the conditioned place preference paradigm, cue-induced enhancement of drug self-administration, second-order schedules of reinforcement, and cue-induced reinstatement of drug-seeking behavior. Although preclinical models of addiction processes are frequently used to examine the neurobiology of drug use and relapse, it is important to critically examine their predictive validity. In addition, it is important to examine the translational value of the models, as effective treatment of drug use is the goal. However, it should be noted that relapse rates remain high, and thus although animal models have led to important advancements in our knowledge of the neurobehavioral underpinnings of addiction, there is more to uncover in our understanding of these processes.
Conditioned Place Preference
In the conditioned place preference paradigm, an animal learns to associate the effects of a passively administered substance with the environment in which the drug was received. A typical conditioned place preference apparatus is shown in Fig. 8.1, and consists of two compartments with unique tactile and visual characteristics (e.g., striped walls and mesh flooring in one compartment versus transparent or solid walls and metal bar flooring in the other). Occasionally, distinct olfactory cues are used in each compartment. These two conditioning compartments are connected by a neutral center start compartment. Each compartment is typically equipped with photobeams located just above the floor that can detect the presence of the animal and concurrent locomotor activity and record them via an interfaced computer.
In a typical conditioned place preference experiment, an animal undergoes baseline preference testing and habituation, whereby it is placed in the center start compartment and allowed free access to both conditioning chambers for a set amount of time (e.g., 30 min). This allows the animal to habituate to the testing environment and allows the experimenter to determine whether the animal exhibits any innate bias toward one of the two conditioning compartments. (An ideal conditioned place preference apparatus would produce no innate preferences for either compartment.) This first period of access to the conditioning compartments also serves as a preconditioning test, and the time spent in either compartment can later be compared against the same variable after conditioning with the drug. Following this habituation and preconditioning test, the animal is injected with a neutral substance (e.g., saline) and is then confined to one of the two conditioning compartments (using automated or manual guillotine-type doors) for a fixed period of time. On the following day, the animal is injected with the conditioning drug (e.g., morphine, cocaine, or amphetamine) and confined to the other conditioning compartment for the same amount of time. These conditioning trials are repeated in an alternating fashion (i.e., saline-drug-saline-drug-…) a number of times so that the animal learns to associate the unique physical characteristics of the drug-paired compartment with the subjective effects of the conditioning drug. Finally, on the test day, the animal is placed back in the center compartment in a drug-free state and is allowed free access to both conditioning compartments for the same amount of time as during the preconditioning test. 
If the animal spends significantly more time in the drug-paired compartment than in the saline-paired compartment, conditioned place preference has been established, reflecting the animal’s association of the drug compartment with the subjective (presumably pleasurable or rewarding) effects of the drug. Conditioned place preference has been demonstrated in rodents for all drugs of abuse, although the experimental procedures may vary by the drug and its individual pharmacokinetic properties. Conditioned place aversion is observed if the animal spends significantly less time in the drug-paired compartment than in the saline-paired compartment. Withdrawal from chronic drug exposure reliably produces conditioned place aversion. In addition, some drugs such as ethanol can also produce conditioned place aversion if confinement in the drug-paired compartment is not timed to coincide with the peak positive subjective effects of the drug.
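The comparison described above can be made concrete with a small scoring sketch. It is an illustration under stated assumptions, not a standard analysis pipeline: the function names, the use of seconds, and the sample values are all hypothetical, and the statistical test needed to call a group difference "significant" is deliberately left out.

```python
from statistics import mean


def preference_score(drug_paired_s, saline_paired_s):
    """Time in the drug-paired compartment minus time in the saline-paired one (seconds)."""
    return drug_paired_s - saline_paired_s


def conditioning_shift(pre, post):
    """Change in preference score from the preconditioning test to the test day.

    `pre` and `post` are (drug_paired_s, saline_paired_s) tuples.
    A positive shift points toward place preference, a negative one toward aversion.
    """
    return preference_score(*post) - preference_score(*pre)


# Hypothetical data: one (pre, post) pair per animal, times in seconds.
animals = [
    ((610, 590), (900, 420)),
    ((580, 640), (850, 500)),
    ((600, 600), (700, 650)),
]

shifts = [conditioning_shift(pre, post) for pre, post in animals]
mean_shift = mean(shifts)
```

Subtracting the preconditioning score is one way to account for any innate compartment bias measured during the baseline test; whether to use a difference score, a ratio, or raw time in the drug-paired compartment is a design choice that varies between laboratories.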
One advantage of the conditioned place preference paradigm is that the experiments are relatively simple, inexpensive, and less time-consuming to conduct than more involved procedures such as intravenous drug self-administration. In addition, conditioned place preference paradigms can be used to simulate various aspects of relapse. This is accomplished in one of two ways: (1) extinguishing an established conditioned place preference by repeatedly pairing the previously drug-paired compartment with saline, or (2) allowing the conditioned place preference to dissipate over a period of several weeks by repeated testing of place preference. Then drug priming or stress can be introduced to the animal to reinstate the original conditioned place preference, a phenomenon that has been hypothesized to model drug-seeking behavior.
Despite its simplicity and ease of use, there are several disadvantages of the conditioned place preference paradigm. First and foremost, the animal subjects do not actively self-administer the drug; it is passively administered as a bolus injection by the experimenter. In addition to potential pharmacokinetic differences in plasma and brain levels of the drug between passive and active self-administration, a substantial amount of evidence has accumulated indicating that active versus passive drug administration produces significant differences in neurochemical, endocrine, and other responses to drugs of abuse. These differences may underlie some of the discordant findings between studies using pharmacological or other experimental manipulations in the conditioned place preference paradigm and those utilizing active self-administration. In addition, the primary dependent variable measured in the conditioned place preference paradigm does not directly measure drug-seeking behavior but, rather, the motivation for drug-associated environments. Despite these limitations, the conditioned place preference paradigm undoubtedly has provided useful information on the neural substrates that underlie drug-environment conditioning and their contribution to addictive behaviors, as discussed later in this chapter.
Another important question is what exactly is learned in conditioned place preference. We know that temporal contiguity is necessary and sufficient for learning. In conditioned place preference experiments, there tends to be a perfect predictive relationship between the context conditioned stimulus (CS) and administration of the drug. In addition, the drug unconditioned stimulus (US) and context CS always co-occur. Thus, the rules of temporal contiguity are met and learning occurs in this model. However, the CS is a complex, multimodal stimulus (a context) that includes distinct olfactory, visual, tactile, auditory, and spatial elements. Does each element of the stimulus enter into an independent association with the drug US, or do all of the distinct elements combine to form a single, configural stimulus that then becomes associated with the US? These relationships are somewhat unclear; thus we really do not know how the context CS is neurally encoded. This is problematic for interpretation of the neurobiological underpinnings of drug conditioning in this model. In addition, drug self-administration is a model typically used to identify the reinforcing efficacy of drugs of abuse (see subsequent text). Is conditioned place preference isomorphic with self-administration? In some cases, drugs of abuse elicit place preference and are self-administered, but this is not always the case. Thus it appears that these two models might tap into different neurobiological systems that govern the ability of stimuli to elicit drug-motivated behavior.
It has long been known that exposure to drugs of abuse can lead to biological changes that are governed by environmental stimuli. Indeed, environmental stimuli that are contiguously and consistently paired with administration of drugs of abuse begin to take on value and become quite powerful in their ability to modulate the biological and behavioral effects of the drug. When individuals with substance use disorders are asked to consider factors that contribute to relapse, environmental stimuli are identified as being equally or more powerful than other influences, including mood or impulsive choices. One biological phenomenon that occurs is tolerance, in which repeated use of a drug over time results in a decreasing effect of the drug, and drug-associated stimuli elicit conditioned responses that attenuate the drug effect. For example, early studies have shown that tolerance to heroin can be conditioned to the environment in which the drug is normally consumed. If the drug is taken in a novel environment, this conditioned tolerance will not protect against a high dose of the drug, and overdose is likely to occur at a dose that would be tolerated in the conditioned environment. This phenomenon can be modeled in animals. Specifically, rats were exposed to either heroin or placebo in a colony room or a noisy room, repeatedly for 30 days. Following this exposure, rats were given a high dose of heroin (15 mg/kg) on a test day either in the same room in which they had previously received the injections, or in a novel room in which they had never received heroin (either the colony or noisy room). Mortality rates of rats that received the high dose of heroin in the room in which they had previously received placebo were quite high (96%), whereas rats that received the high dose of heroin in the room in which heroin was previously administered had a relatively low mortality rate (32%). 
This experiment demonstrates the power of environmental stimuli in governing the biological responses underlying drug use.
Conditioned Cue Enhancement of Drug Self-Administration
One of the most widely used paradigms to study drug addiction in animals is the intravenous self-administration paradigm (Fig. 8.2). In the case of rodents, a rat or mouse is surgically implanted with an indwelling intravenous catheter into the jugular or femoral vein, which exits the skin on the dorsal side of the animal and is connected to a vascular access port. Following recovery from surgery, the animal is placed in a self-administration apparatus chamber equipped with one or two levers that are interfaced with a computer and a syringe pump. In lieu of levers, some investigators utilize a nose-poke hole on the wall of the self-administration apparatus, whereby a nose-poke into the correct hole triggers the delivery of a reinforcer. A positive reinforcer is defined as a stimulus whose presentation increases the likelihood that the response will occur again in the future (e.g., an addictive drug), whereas a negative reinforcer is an aversive stimulus whose removal increases the likelihood of the response that removes it (e.g., termination of an electric shock). A stimulus that decreases the likelihood of the response it follows is instead termed a punisher. To learn the operant task (i.e., lever-press or nose-poke), the animal is often initially trained to perform the task in order to receive a natural reinforcer such as a food or sucrose pellet. (The animal is mildly food-restricted to increase its motivation to seek food during initial training.) However, not all investigators use this initial food restriction and training, since it changes the nutritional and metabolic state of the animal. Instead, some investigators may choose to capitalize on the intrinsic exploratory nature of rodents, since over time the animal will eventually exert the correct operant response, receive an intravenous drug infusion, and, with repeated training sessions, learn that this correct response results consistently in the delivery of the drug solution.
The drug solution is delivered by a computer-controlled syringe pump located outside the self-administration apparatus. The pump contains a drug solution that is connected to a single-channel liquid swivel, which allows free rotation of the animal while maintaining a continuous flow of fluid. Plastic tubing is then housed in a stainless steel spring tether and is attached to the animal via a vascular access port implanted on the dorsal side of the animal, which is connected to the indwelling venous catheter.
In the case of alcohol, intravenous self-administration procedures are used less frequently, since this method lacks the face validity and pharmacokinetics of human oral alcohol consumption, and the ability of intravenous ethanol to function as a reinforcer is less reliable. Thus most animal models of alcohol self-administration utilize an experimental apparatus by which—instead of a syringe pump delivering the drug solution intravenously—a dilute ethanol solution (usually 8%–12% v/v) is delivered into a receptacle located near the lever or nose-poke orifice, where the animal can consume it orally. However, because of the aversive orosensory nature of ethanol, many researchers often initially train animals to consume alcohol solutions sweetened with sucrose or saccharin to increase its palatability. Then, slowly over a period of weeks, the concentration of the sweetener is gradually reduced until eventually the animal performs the operant task to consume an unsweetened ethanol solution.
There are many advantages of the operant self-administration paradigm as a model for human drug-taking behavior, including: (1) the drug is administered voluntarily by the animal (as opposed to passive administration by an experimenter); (2) the drug-taking behavior can be temporally examined within and between self-administration sessions; (3) candidate therapeutic pharmacological compounds or other experimental manipulations can be administered to determine their effects on drug self-administration; (4) the number of responses that must be exerted by the animal to receive the drug can be gradually increased (called a “progressive ratio”) until the animal gives up and no longer performs the operant task (called the “breakpoint”), a method used to measure the level of motivation to self-administer the drug as well as the efficacy of the reinforcer; and, finally, (5) the procedure is amenable to the study of relapse-like behavior (see “Cue- and Context-Induced Reinstatement of Drug-Seeking Behavior”).
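The progressive-ratio idea in point (4) can be sketched in a few lines. This is a simplified illustration: the doubling sequence below is hypothetical rather than a published schedule, and the breakpoint is taken here to be the last response requirement the animal completed, which is only one of the definitions in use.

```python
def breakpoint_from_responses(ratios, total_responses):
    """Return (breakpoint, infusions_earned) for one session.

    `ratios` is the ordered list of response requirements; each completed
    requirement earns one infusion. The animal is assumed to have emitted
    `total_responses` responses before it stopped responding.
    """
    remaining = total_responses
    last_completed = 0
    infusions = 0
    for ratio in ratios:
        if remaining < ratio:
            break  # the animal gave up partway through this requirement
        remaining -= ratio
        last_completed = ratio
        infusions += 1
    return last_completed, infusions


# Hypothetical schedule: the requirement doubles after every infusion.
hypothetical_ratios = [1, 2, 4, 8, 16, 32, 64]

result = breakpoint_from_responses(hypothetical_ratios, total_responses=40)
```

With 40 responses the animal completes the requirements 1, 2, 4, 8, and 16 (31 responses in total) and falls short of the next requirement of 32, so `result` is `(16, 5)`. A higher breakpoint under the same schedule is read as greater motivation to obtain the drug.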
One additional advantage of operant self-administration procedures is their amenability to the study of the role of conditioned cues in the reinforcing effects of drugs of abuse. In addition to delivery of the drug, many researchers also use environmental cues such as the presentation of a stimulus light, an auditory tone, an olfactory cue, or combinations thereof that are simultaneously paired with the intravenous delivery of the drug solution. Over successive self-administration sessions, the animal learns to associate these cues with the availability of the drug and its pharmacological effects. It should be noted that these cues act as conditioned reinforcers that are typically delivered contingent upon a lever press. Indeed, noncontingent conditioned reinforcers do not elicit an increase in motivated behavior. Stimuli can also act as discriminative stimuli or occasion setters, which are noncontingent and involve the array of environmental stimuli that occur in conjunction with drug use. These types of stimuli modulate the response-eliciting ability of discrete conditioned stimuli paired with drug self-administration or serve a discriminatory function that predicts the availability of a drug of abuse upon the completion of a particular emitted response. Studies have shown that, for most drugs of abuse, the presence of drug-associated cues greatly increases the number of operant responses exerted per test session, compared with when the drug is self-administered in the absence of such cues (see Fig. 8.2; references 17, 23, 38, 48, 54, 62, 73, 74, 80, 113).
These findings suggest that in addition to the primary reinforcing effects of the drug itself, drug-associated stimuli (also termed secondary reinforcers or conditioned stimuli) regulate drug self-administration behavior, a phenomenon referred to by experimental psychologists as stimulus control of behavior. This stimulus control has also been demonstrated in human cocaine users in a laboratory setting. In the case of psychostimulants, this enhancement of drug reinforcement by drug-associated cues has been hypothesized to be a result of the augmentation of the impact of sensory information caused by this class of drugs.
The power of conditioned cues can also be seen in their ability to modulate intake of the drug itself in their presence or absence. Stimulus control, in which instrumental behavior comes under the control of a particular stimulus (e.g., a discrete conditioned reinforcer such as a light or tone), is evident in the ability of these stimuli to drive motivated behavior. Animals will respond one way in the presence of one stimulus and in a different way in the presence of another stimulus. This demonstrates the power of these stimuli to control behavior. In the self-administration model, animals are typically given short access (∼1–2 hours/day) to take the drug. This is believed to model maintenance of drug use rather than the dysregulated intake that occurs in human addicts. Ahmed and colleagues thus developed a model of escalated drug use that was designed to capture this dysregulation in the spiral of addiction that occurs in humans. Although it is thought that this models dysregulated drug use behavior, these data typically show a plateau of intake within the sessions. If this truly models dysregulation, in theory it should never plateau. Thus, Beckmann and colleagues designed experiments to test whether it truly is dysregulation of intake, or rather a form of learning when animals are switched from short-access (1 hour/day) to long-access (6 hours/day) sessions. In these experiments, cocaine intake came under stimulus control when a house light was illuminated on 6-hour sessions, and not illuminated on 1-hour sessions, which alternated every other day. On long-access (6-hour) days, animals showed escalation of intake, and on short-access (1-hour) days, animals showed consistent levels of intake. 
These experiments showed not only that animals were simply acquiring a new form of temporal discrimination learning (thus modifying intake based on acquisition of the length of session), but also that escalation (dysregulated drug intake) of drug self-administration can come under stimulus control. How can animals be dysregulated and regulated in their intake simultaneously? Likewise, how can dysregulated intake come under stimulus control if there is no learning mechanism involved?
Second-Order Schedules of Reinforcement
Another experimental paradigm that exemplifies the ability of drug-associated cues to exert stimulus control over behavior is the second-order schedule of reinforcement. In this paradigm, animals are initially trained to self-administer a drug of abuse intravenously (or orally, in the case of alcohol) as described in the previous section; each operant response results in drug delivery and the simultaneous presentation of a discrete cue (e.g., a light, tone, and/or olfactory stimulus). After successful training of the animal under this primary schedule of reinforcement, the contingency of drug delivery upon completion of the operant task is removed, such that only the drug-associated stimulus is presented following each operant response. Thus, each lever press or nose-poke results in presentation of the drug-associated cue stimulus (secondary reinforcer) but no drug delivery (primary reinforcer). The primary advantage of this paradigm is that it allows the investigator to examine drug-seeking behavior in the absence of drug delivery, similar to the cue- and context-induced reinstatement discussed in the next section. Thus, the effect of pharmacological or neurobiological manipulations on responding for the secondary reinforcer can be examined without the potential confound of the psychoactive effects of the primary reinforcer. Acquisition of responding on a second-order schedule can be enhanced by non–response-contingent exposure to a sensitizing regimen of the drug (e.g., cocaine) following the primary reinforcement phase (Fig. 8.3).
However, in order to avoid the extinction of drug-seeking behavior due to the absence of primary reinforcement, a response-contingent delivery of the drug solution must be given at a fixed time interval (e.g., every 30 or 60 min) after the completion of a certain number of operant responses, or at the end of the test session. This allows the animal to receive the primary reinforcer and thus maintain the associations between the drug and responding for drug-associated cues.
Further evidence for the motivational salience of drug-associated cues lies in the fact that when animals are subject to extinction procedures (i.e., when the primary drug reinforcer is withheld in subsequent test sessions following responding under a second-order schedule of reinforcement), response-contingent presentation of the light/tone/olfactory stimulus during extinction trials results in enhanced responding and a slowing of the rate of extinction in rats trained to self-administer cocaine, suggesting that the drug-associated cues maintain their motivational salience despite the fact that the primary drug reinforcer is no longer available. This phenomenon has also been demonstrated during extinction following primary drug reinforcement. However, slowing of the rate of extinction following second-order heroin reinforcement by response-contingent presentation of the drug-associated cues during extinction trials has not been observed, suggesting that discrete heroin-associated cues exert a lesser degree of stimulus control over behavior than those associated with cocaine.
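The arrangement described above, in which every response produces the cue and the drug is delivered only intermittently, can be sketched as a small event labeler. This is a deliberately simplified reading of the text and not a specification of any published schedule: the function name and sample times are hypothetical, and the code assumes that the drug follows the first response made once a fixed interval has elapsed, with the interval restarting at each drug delivery.

```python
def second_order_session(response_times_min, interval_min=30.0):
    """Label each operant response in one session.

    Every response earns the drug-associated cue (secondary reinforcer).
    The first response at or after the end of each fixed interval also
    earns the drug (primary reinforcer), and the interval then restarts.
    """
    events = []
    next_drug_time = interval_min
    for t in sorted(response_times_min):
        if t >= next_drug_time:
            events.append((t, "cue+drug"))
            next_drug_time = t + interval_min  # restart the interval
        else:
            events.append((t, "cue"))
    return events


# Hypothetical response times, in minutes from session start.
events = second_order_session([5, 12, 31, 40, 58, 62, 95], interval_min=30.0)
labels = [label for _, label in events]
```

Most responses here are maintained by the cue alone, which is what lets the investigator measure drug seeking before the first drug delivery of a session has had any pharmacological effect.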
Cue- and Context-Induced Reinstatement of Drug-Seeking Behavior
Relapse is one of the most problematic aspects in the treatment of drug addiction, as it can occur months or years following the last episode of drug intake. Fortunately animal models have been developed that appear to mimic the phenomenon of relapse in humans. The most widely used animal model of relapse is the reinstatement paradigm (references 10, 11, 14, 35, 87, 106).
In this paradigm, animals are trained to self-administer a particular drug of abuse as described in the section “Conditioned Cue Enhancement of Drug Self-Administration.” Following stabilization of patterns of self-administration, animals are then subject to extinction training, where the operant response that previously resulted in drug delivery either has no consequences or results in the delivery of a non-reinforcing substance such as saline. During extinction training, the animal learns that the operant response no longer results in drug delivery and subsequently decreases the number of operant responses exerted. Once specific extinction criteria have been reached (for example, the number of operant responses performed during an extinction trial is less than 20% of those that were observed prior to the commencement of extinction training), the animal is then exposed to one of three types of stimuli that are known to trigger relapse in human addicts: brief exposure to the drug (drug priming), exposure to drug-associated cues, or stressors. The animal then exhibits a significant increase in the number of operant responses that previously resulted in drug delivery; in other words, drug-seeking behavior has been reinstated. It should be noted, however, that in the reinstatement model, performing the operant task does not actually result in drug delivery; the behavior is not reinforced by the drug, and, therefore, the reinstatement of drug seeking is relatively short-lived. Herein lies one of the fundamental (and often criticized) aspects of the reinstatement paradigm where it diverges from the human condition of relapse, since in humans drug-seeking behavior is usually followed by drug self-administration. In the reinstatement paradigm, execution of the operant response does not result in drug availability and self-administration. 
Nevertheless, the reinstatement paradigm offers a unique method for studying the neural basis of relapse, since drug-seeking behavior is inherently parsed out from actual drug self-administration behavior, and the behavior of the animal can be observed and recorded in the absence of psychomotor-altering effects of the drug itself.
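The example extinction criterion mentioned above (responding below 20% of the pre-extinction level) lends itself to a short worked sketch. The function name and the session counts are hypothetical, and the code assumes the criterion is met by a single session; laboratories that require several consecutive sessions below threshold would need a stricter check.

```python
def sessions_to_criterion(baseline_responses, extinction_sessions, fraction=0.20):
    """Return the 1-based index of the first extinction session below criterion.

    `baseline_responses` is the response count before extinction training began.
    `extinction_sessions` lists the response count for each extinction session.
    Returns None if the criterion is never reached.
    """
    threshold = fraction * baseline_responses
    for index, responses in enumerate(extinction_sessions, start=1):
        if responses < threshold:
            return index
    return None


# Hypothetical animal: 100 responses per session before extinction training.
extinction_counts = [85, 60, 41, 25, 18, 12]
first_session_below = sessions_to_criterion(100, extinction_counts)
```

Here the threshold is 20 responses, so the criterion is first met in the fifth extinction session (18 responses). The reinstatement test would then compare responding after drug priming, cue exposure, or a stressor against this extinguished level.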
With regard to the study of the influence of conditioned cues on drug-seeking behavior, the reinstatement paradigm offers the possibility of studying two distinct phenomena. First, if the discrete conditioned reinforcers (e.g., a tone, light, or olfactory stimulus) that were presented to the animal during each drug delivery prior to extinction procedures are reintroduced to the animal in a response-contingent manner, presumably the animal expects that the drug is now available and exerts a significant increase in the number of operant responses that previously resulted in drug delivery. Alternatively, some investigators present the drug-associated cues in a non–response-contingent manner (although this does not reinstate behavior). Regardless, this phenomenon is known as cue-induced reinstatement, and has been used extensively to study the role of discrete drug-associated cues in the control over drug-seeking behavior (Fig. 8.4).