
Chapter 11
Simulation Basics


11.1 Introduction


The use of simulation is a natural extension of model development: simulation allows us to use our models to ask relevant questions about study design features or conditions (i.e., simulation scenarios) that we have yet to study, and to determine the likely impact of those scenarios on study outcomes and subsequent clinical or regulatory decision making. If model development provides a quantification of the parameters that govern a (perhaps) mechanistic explanation for a system and the impact of drug on that system, then simulation allows us to ask what-if questions to more fully explore that system. This use of simulation implies the existence of a predictive model and is the main focus of the remainder of this chapter. There are, however, several other important uses of simulation, including the demonstration and visualization of various model features and the assessment of model performance. The area of simulation-based diagnostics, which uses simulation techniques to evaluate model performance, has expanded rapidly, starting with the use of simulations in assessing goodness of fit for models of noncontinuous endpoints and extending dramatically with the concept of visual predictive check (VPC)-type procedures and their numerous variations for model evaluation (see Chapter 8 for more on this topic) (Karlsson and Savic 2007). Given the typical historical increases in computing capacity (a critical factor in the successful implementation of simulation strategies) and the ever-increasing complexity of pharmacometric and systems pharmacology models, it is logical to expect continued expansion of the use of simulation as an integral element of model-based drug development.


11.2 The Simulation Plan


The implementation of a simulation effort should be treated much like the implementation of a model development or data analysis effort as previously described, with careful planning and forethought. As such, an important first step is the preparation of a formal simulation plan, with review and approval by appropriate team members. The process of thinking through each facet of the simulation effort, from the technical aspects of how the simulation will be performed to the manner in which the results will be illustrated for interpretation, is a necessary initial step, and one which often eliminates wasted effort (and rework) later.


Furthermore, simulation planning is an ideal opportunity for team members with diverse backgrounds to collaborate in providing valuable and necessary input to make the simulations relevant and useful. Clinicians knowledgeable in the treatment of the disease under consideration should be consulted to ensure that the characteristics of the patient population to be simulated match those of the typical patients for whom the drug is intended. Team leaders and clinical research associates should be consulted to ensure that the assumed study design and execution features, including the data to be collected and the sampling schedule, will accurately reflect what can be implemented and expected if a trial like the one to be simulated were actually conducted. Statisticians should be consulted to ensure that the planned study design features such as sample size, endpoint calculations, and the primary statistical analyses can be similarly implemented in the simulation scenarios, so that the estimated probabilities of success with various strategies can be utilized with confidence to facilitate real decision making during a drug development program.


11.2.1 Simulation Components


There are several main components to be considered and specified in the use of simulations in a modeling and simulation framework. These components concern not only the model to be used and the extent of variability to be introduced (often referred to as the input–output model), but also how the model will be used in terms of assumptions about the population to be simulated (the covariate distribution model), the features of the trial design and expected execution characteristics (the trial execution model), the number of replicates to be generated, and how the simulated data will be analyzed for decision-making purposes (Kimko and Duffull 2007). These components will be discussed sequentially in the subsections that follow.


11.2.2 The Input–Output Model


When simulation is performed following the development of a pharmacometric model, the specification of the input–output model may be one of the easiest steps. The estimated model, including its associated fixed- and random-effect parameter estimates, can be used directly as the input–output model of the simulation. When all components are known to the simulation scientist, and the model code is available, this is a trivial step (see the example in Section 11.2.2.1). However, when a model described in the literature is to be used in a simulation scenario, portions of the information necessary for complete specification of the model are often missing or uncertain. In this case, assumptions will need to be made in order to execute the simulation strategy. When assumptions are required about model features that are critical to the results and the interpretation of findings, those assumptions must be clearly and carefully stated alongside the presentation of the results. Furthermore, consideration should be given to varying such settings or testing differing assumptions in order to protect against invalid or irrelevant results. For example, if a pharmacokinetic/pharmacodynamic (PK/PD) model from the literature was developed in a different patient population than the one of interest for the simulations, certain baseline features critical to the model-predicted outcomes may not be known for the population of interest. Various scenarios could then be considered: the baseline could be set equal to the baseline distribution of the other patient population; the mean or median of the baseline distribution could be set equal to that of the other population, with the variability around that mean increased in the new population to be simulated; or the features of the baseline distribution could be derived from other publications of similar populations or from publicly available databases of similar patients.
Testing of various scenarios with regard to key assumptions is, therefore, recommended in many situations. The results of the various scenarios can then be presented separately, or weighted and pooled with appropriate input from key team members.
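As a minimal sketch of this kind of scenario testing (all distributional values below are assumed for illustration, not taken from any real population), one might draw candidate baseline distributions under each assumption and compare their summaries:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 5000

# Three hedged baseline scenarios for a population whose baseline
# distribution is not directly known (values assumed for illustration):
scenarios = {
    # (a) reuse the other population's baseline distribution as-is
    "reuse": rng.normal(140.0, 15.0, n),
    # (b) keep that mean but widen the variability in the new population
    "wider": rng.normal(140.0, 25.0, n),
    # (c) shift the mean based on an external source of similar patients
    "shifted": rng.normal(150.0, 15.0, n),
}

for name, base in scenarios.items():
    print(name, round(base.mean()), round(base.std()))
```

Simulated outcomes under each scenario can then be reported side by side, or pooled with scenario weights agreed upon by the team.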


If a pharmacometric model is intended for use in simulations, the fixed-effect parameters are generally fixed to the final estimates obtained when the model was fit to the original model development dataset, and the variances of the interindividual variability (IIV) in the parameters and of the residual variability are fixed to the corresponding variance estimates. The assumption regarding the variability estimates (elements of omega and sigma, or etas and epsilons) is that the random-effect model terms are normally distributed with mean 0 and variance as specified by the user. When the simulation step is implemented, new realizations are generated for the eta and epsilon terms for each virtual subject (etas for each parameter with IIV; level 1 error) and for each sample from that subject (epsilons for each endpoint or dependent variable (DV) to be simulated according to the specified $ERROR model; level 2 error) based on these assumptions and the parameter estimates specified. As a result, the simulated parameter distributions should roughly mimic the original estimated parameter distributions.
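The sampling scheme just described, which NONMEM performs internally at the simulation step, can be sketched in Python; the parameter values below mirror those of the example model later in this section, while the subject and sample counts are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(9784275)  # seed, analogous to the $SIMULATION seed

# Final estimates from a previous fit (values taken from the example model)
TVCL, TVV = 3.07, 60.2          # fixed effects (theta)
OM_CL, OM_V = 0.0817, 0.0623    # IIV variances (omega diagonal)
SG = 0.0477                     # residual variance (sigma)

n_subjects, n_samples = 100, 6

# Level 1: one eta per subject per parameter with IIV, eta ~ N(0, omega)
eta_cl = rng.normal(0.0, np.sqrt(OM_CL), n_subjects)
eta_v = rng.normal(0.0, np.sqrt(OM_V), n_subjects)
CL = TVCL * np.exp(eta_cl)      # individual clearances
V = TVV * np.exp(eta_v)         # individual volumes

# Level 2: one epsilon per observation record, eps ~ N(0, sigma)
eps = rng.normal(0.0, np.sqrt(SG), (n_subjects, n_samples))

# The simulated medians should roughly recover the typical values
print(round(np.median(CL), 1), round(np.median(V), 1))
```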


The variability structure is a critically important feature of models to be used as input–output models in simulations. The specification of the random effects in a nonlinear mixed effects model for simulation can play a fundamental role in the output obtained and, therefore, the simulation findings. When pharmacometric models intended for use in simulations are developed using a diagonal structure for the omega matrix, the inherent underlying assumption is that no covariances (correlations) exist between the individual-specific discrepancies (random effects) for various pairs of parameters in the model. While many so-called final pharmacometric models rely on this assumption, simple pair-wise scatterplots of the individual eta estimates will reveal that this is often not so. This is not to say that such models are necessarily invalid, however, as the inclusion and precise estimation of such covariance terms in an already complex model without causing overparameterization may be a difficult and elusive task, particularly for the classical estimation methods. One advantage of some of the newer estimation methods is that the full omega matrix is more easily obtained.


If a full omega block structure can be estimated with successful model convergence, then even if some of the off-diagonal terms of the omega matrix are estimated with very poor precision, the use of this more complete model in a simulation scenario may be preferable, as it should provide a more realistic outcome than the model with a diagonal omega structure. Even if the covariance between some pairs of IIV terms is estimated to be moderate or small, incorporating it rather than assuming zero covariance will preclude or limit the simulation of pairs of estimates that would be highly implausible. When the covariance between two parameters (say CL and V) is moderate to high, the corresponding IIV in a related parameter (such as t1/2, whose calculation is based on the other two) will be appropriately lower when the covariance is accounted for in the simulation than when it is ignored and the two parameters are assumed independent. For these reasons, the use of a full omega block structure in models to be used for simulations is recommended whenever possible.
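A quick numerical sketch illustrates the point about t1/2 (the variances and covariance below are assumed for illustration): with lognormal IIV, log(t1/2) = log(ln 2) + eta_V - eta_CL, so a positive CL–V covariance partially cancels in the ratio and shrinks the IIV in half-life.

```python
import numpy as np

rng = np.random.default_rng(42)

OM_CL, OM_V = 0.09, 0.0625           # IIV variances (assumed)
COV = 0.06                            # positive CL-V covariance (assumed)
n = 50_000

# Full omega block: jointly sample correlated etas
full = rng.multivariate_normal([0, 0], [[OM_CL, COV], [COV, OM_V]], n)
# Diagonal omega: sample the same variances independently
diag = np.column_stack([rng.normal(0, np.sqrt(OM_CL), n),
                        rng.normal(0, np.sqrt(OM_V), n)])

def half_life_sd(etas):
    # t1/2 = ln(2) * V / CL, so log(t1/2) inherits eta_V - eta_CL
    log_thalf = etas[:, 1] - etas[:, 0]
    return log_thalf.std()

# Positive covariance cancels in V/CL, shrinking the IIV in t1/2
print(half_life_sd(full) < half_life_sd(diag))  # True
```

Analytically, the variance of log(t1/2) is OM_V + OM_CL - 2*COV with the full block, versus OM_V + OM_CL under independence, which is why ignoring a positive covariance inflates the simulated half-life variability.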


11.2.2.1 Coding a Simulation


In the following example, a simple PK model is to be used for simulation.

$PROBLEM sim-example
$DATA /home/user/data/dataset.csv IGNORE=#
$INPUT ID DATE=DROP TIME AMT ADDL II DV LGDV CMT EVID MDV TSLD NUM
$SUBROUTINES ADVAN2 TRANS2
$PK

TVKA=THETA(1)
KA=TVKA
TVCL=THETA(2)
CL=TVCL*EXP(ETA(1))
TVV=THETA(3)
V=TVV*EXP(ETA(2))
S2=V/1000

$ERROR

EP1=EPS(1)
IPRED=F
IRES=DV-IPRED
W=F
IWRES=IRES/W
Y=IPRED+EPS(1)*W

$THETA (9.77476E-01) ;--th1- KA: absorption rate (1/hr)
    (3.06802E+00) ;--th2- CL: clearance (L/hr)
    (6.01654E+01) ;--th3- V: volume (L)

$OMEGA (8.17127E-02) ;--eta1- IIV in CL [exp]
    (6.22633E-02) ;--eta2- IIV in V [exp]

$SIGMA (4.77067E-02) ;--eps1- Constant CV [ccv]

$SIMULATION (9784275) ONLYSIM

$TABLE ID TIME CL V KA ETA1 ETA2 EP1
     FILE=sim-example.tbl NOPRINT

Note that the $PK and $ERROR blocks in this simulation control stream are largely the same as those in a typical control stream intended for estimation. Furthermore, if the purpose of the simulation is to better understand the features or behavior of a particular model given the features (e.g., number of subjects, doses, and sample collection events) of the analysis dataset, then the analysis dataset may also be used without modification. Using this method, simulated observations, based on the model and parameter estimates, will be generated for each real observation in the original analysis dataset. The elements of this control stream that are unique to simulation are the EP1=EPS(1) assignment in $ERROR and the $SIMULATION record.


The $ERROR block contains the line "EP1 = EPS(1)." In order to output the simulated epsilon values, if they are of interest, the EPS(1) variable must be renamed; by doing so, the new variable EP1 may be output to a table file (whereas EPS(1) may not). Also note that the $THETA record contains no lower or upper bounds for any of the parameters. While adding bounds would not change the behavior of this code, bounds are unnecessary because only the initial estimate of each parameter is used for the simulation. The use of the FIXED option with each estimate is similarly unnecessary. In this example, the values used for $THETA, $OMEGA, and $SIGMA are the final parameter estimates obtained from a previous estimation with this dataset, although they need not necessarily have been obtained this way.


The power of simulation, and of its implementation in a modeling and simulation framework, lies in the ability to ask questions of a model and consider what-if scenarios; some of these scenarios may involve alternative parameter estimates for some or all parameters. Alternative values would be implemented via the $THETA, $OMEGA, and $SIGMA records shown in the example. Other changes in scenario may require defining a new dataset structure to change the number of subjects, dose amounts or frequency, timing of sample collections, or other elements of the dataset that describe events different from those in the original dataset.


In order to invoke the simulation process, the $SIMULATION record is included, in this case in place of the $ESTIMATION and $COVARIANCE records that would appear in a more typical control stream for estimation. Two options are specified with $SIMULATION in this example: a required seed value provided in parentheses, and ONLYSIM. The seed value initiates the pseudo-random number generator; its usefulness is discussed in more detail in Section 11.3.1. The ONLYSIM option indicates to NM-TRAN that only simulation is to be performed with this control stream, that is, no estimation is to be performed. In addition, by specifying ONLYSIM, the values of the random variables CL, V, ETA1, ETA2, and EP1 that are output to the table file will be the simulated values (those most likely of interest in this situation) and not the typical values, which would be output without the ONLYSIM option (Beal et al. 1989–2011).


While the NONMEM report file resulting from a simulation is not generally of particular interest, the table file output being the primary source of results, the following lines illustrate the information reported in the NONMEM report file from a simulation, in the place where the monitoring of the search would appear for an estimation run:

 SIMULATION STEP PERFORMED
    SOURCE 1:
    SEED1: 747272046 SEED2:    0
NONMEM(R) Run Time: 0:00.48

As previously discussed, the table file contains the simulated values of the random parameters that are requested for display, such that CL, V, ETA1, and ETA2, in this example, will contain unique simulated values for each subject in the dataset. Similarly, EP1 will contain a unique simulated value for each record in the dataset. Typically of most interest, the simulated values for each observation may be found in the DV column of the table file. When DV, PRED, RES, and WRES are appended to the table file output, by default, the DV column will contain the simulated values in place of the original observed values. The PRED column will contain the population predicted values for each subject and sampling timepoint, based on the model and typical value parameters provided. The RES and WRES columns, although provided by default, are generally ignored in a simulation run: RES contains the difference between the simulated observation (in DV) and the population prediction (in PRED), and the WRES column consists of only 0 values. Appendix 11.1 provides a portion of the original data file and the corresponding portion of the table file output from the simulation discussed earlier.
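As a sketch of postprocessing, the table file can be read with pandas by skipping the "TABLE NO." banner line that NONMEM writes ahead of the column headers; the miniature table below is a hypothetical mock with made-up values, not actual NONMEM output:

```python
import io
import pandas as pd

# A miniature mock of a NONMEM table file (a real file would be written
# by $TABLE ... FILE=sim-example.tbl); values here are invented
mock = """TABLE NO.  1
 ID          TIME        CL          V           KA          ETA1        ETA2        EP1         DV          PRED        RES         WRES
  1.0000E+00  0.0000E+00  2.8100E+00  5.9300E+01  9.7700E-01 -8.8000E-02 -1.5000E-02  0.0000E+00  0.0000E+00  0.0000E+00  0.0000E+00  0.0000E+00
  1.0000E+00  1.0000E+00  2.8100E+00  5.9300E+01  9.7700E-01 -8.8000E-02 -1.5000E-02  3.1000E-02  8.1200E+00  7.9000E+00  2.2000E-01  0.0000E+00
  2.0000E+00  1.0000E+00  3.4200E+00  6.1100E+01  9.7700E-01  1.0800E-01  1.5000E-02 -2.0000E-02  7.5500E+00  7.9000E+00 -3.5000E-01  0.0000E+00
"""

# Skip the banner; the next line holds the column names
tbl = pd.read_csv(io.StringIO(mock), skiprows=1, sep=r"\s+")

# CL, V, ETA1, ETA2 repeat within a subject; EP1 and DV vary by record
per_subject = tbl.groupby("ID")[["CL", "V", "ETA1", "ETA2"]].first()
print(per_subject)
```

Replacing the mock string with the path to the actual table file (and keeping `skiprows=1` and the whitespace separator) gives the same result for real output.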


With most simulation applications, it is of interest to simulate with replication, that is, to simulate several hundred datasets and explore behavior across the more sizable sample. To accomplish this type of replication, another option is available on the $SIMULATION record: SUBPROBLEMS. The SUBPROBLEMS option, as shown below, indicates that the entire simulation process should be repeated a specified number of times (in this case, 500 times). The table file output then contains the concatenated output of all of the subproblems, listed one after another (Beal et al. 1989–2011).

 $SIMULATION (9784275) ONLYSIM SUBPROBLEMS=500 

As discussed in Section 8.8.1, when describing the VPC procedure for model evaluation, one may wish to keep track of the particular replication of the simulation. To do so, simply add the code:

 REP = IREP 

to the $PK block and request the variable REP in the table file to provide the replication number, which may be useful in postprocessing the output.
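With the REP column in hand, a VPC-style summary reduces to computing percentiles across replicates at each timepoint. The sketch below postprocesses a mock of the concatenated SUBPROBLEMS output (the simulated values are random placeholders, not real model output):

```python
import numpy as np

rng = np.random.default_rng(7)

# Mock of concatenated SUBPROBLEMS output: 500 replicates of the same
# design, tracked by a REP column as described above
n_rep, n_obs = 500, 12
rep = np.repeat(np.arange(1, n_rep + 1), n_obs)      # REP column
time = np.tile(np.arange(n_obs), n_rep)              # TIME column
dv = rng.lognormal(mean=2.0, sigma=0.3, size=n_rep * n_obs)  # placeholder DV

# Per-timepoint 5th/50th/95th percentiles across replicates (VPC-style)
bands = {t: np.percentile(dv[time == t], [5, 50, 95]) for t in range(n_obs)}
lo, med, hi = bands[0]
print(lo < med < hi)  # True
```

In practice the DV, TIME, and REP columns would be read from the table file rather than simulated here, and the percentile bands would be overlaid on the observed data.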


11.2.3 The Covariate Distribution Model


Another critical component in any simulation scenario is the covariate distribution model, which defines the characteristics of the virtual patient population to be simulated. The characteristics that need to be simulated are only those that have an impact on the input–output model(s) in some way. Thus, any patient characteristic that is included in either the PK or PK/PD model as a covariate effect will need to be assigned for each virtual patient in the simulation population. If the PK/PD model includes or relies on a baseline effect that is related to a subject characteristic, this will need to be assigned as well. Very careful consideration should be given to the simulation of patient characteristics in order to make the simulation results as meaningful as possible. There are various methods that could be implemented to define the covariate distribution model, including sampling patient characteristics for each virtual patient from available distributional information, resampling patient characteristics from a set of characteristics or patient covariate vectors in an available dataset, and using information from public databases of patient characteristics. Each method has certain advantages and disadvantages, which are discussed in turn in the sections that follow.


11.2.3.1 Sampling of Covariate Values from a Distribution(s)


With a limited set of patient characteristics to be simulated, one could consider simply using the distributional characteristics (mean, standard deviation, range) of the covariates in the model development dataset or some other known and appropriate population or set of data to randomly assign values to the virtual population to be simulated. Many software packages, such as SAS® and R, have functions that permit random sampling from a specified distribution. Similar to the discussion in Section 11.2.2
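A minimal sketch of this approach, assuming illustrative summary statistics (a mean body weight of 75 kg with SD 12, truncated to a plausible observed range, and a 55% male population; all values are assumptions for illustration), might look like:

```python
import numpy as np

rng = np.random.default_rng(2024)

n = 1000  # size of the virtual population

# Body weight ~ N(75, 12^2) kg, crudely truncated to an assumed
# observed range so that implausible weights are not simulated
wt = rng.normal(75.0, 12.0, n)
wt = np.clip(wt, 45.0, 120.0)

# Sex sampled as a Bernoulli draw with an assumed 55% male fraction
sexm = rng.random(n) < 0.55

print(round(wt.mean()), round(sexm.mean(), 2))
```

The sampled covariate vectors would then be merged into the simulation dataset so that each virtual subject carries the characteristics the input–output model requires.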
