Assessment components and determinants of training effectiveness

Pre-training assessment
Establish whether trainees can reasonably be expected to benefit from the training
Indicate whether remediation should occur prior to training

Reactions
Determine whether trainees find the training enjoyable
Determine whether trainees believe that the training has value for their development
Determine whether trainees display high degrees of self-efficacy

Motivation
Determine whether the system has intrinsic motivational value

Learning
Determine whether appropriate learning has occurred

Training performance (behavior)
Determine whether appropriate skills have been mastered and can be demonstrated, including psychomotor skills, psychomotor fluency, judgment/decision making, and stress inoculation

Transfer of training
Determine whether learning in training has the desired effect on real-world performance, using a near-transfer paradigm

Results
Determine whether the investment in training was justified in terms of achieved outcomes (i.e., elimination or drastic reduction of live tissue use)
Pre-training Assessment
Pre-training assessment determines whether students have the cognitive or affective prerequisites for training: aptitudes (cognitive and physical ability), prerequisites (prior knowledge and experience), and attitudes (motivation for, and perceived instrumentality of, the training). This type of assessment can indicate the need for remedial instruction before training begins, or for tailoring the training to specific needs (e.g., by starting with less challenging scenarios).
Reactions/Motivation
Reactions and motivation speak to students’ beliefs about the value or utility of the training, and are predictors of performance and transfer of training, as is self-efficacy, or confidence in one’s own abilities. Self-efficacy can be measured using self-efficacy scales [33], which quantify confidence along an ascending scale (from 0 – cannot do, to 100 – can almost certainly do) [34].
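To make the scale concrete, the following minimal Python sketch scores a set of item ratings on the 0–100 confidence scale described above; the task items and values are illustrative assumptions, not items drawn from [33] or [34].

```python
# Minimal sketch of scoring a self-efficacy scale with 0-100 anchors
# ("cannot do" to "can almost certainly do"). All items and ratings
# below are hypothetical illustrations.

def self_efficacy_score(ratings):
    """Return the mean confidence rating across all task items."""
    if not ratings:
        raise ValueError("at least one item rating is required")
    for r in ratings:
        if not 0 <= r <= 100:
            raise ValueError("ratings must lie on the 0-100 scale")
    return sum(ratings) / len(ratings)

# Example: confidence ratings for four laparoscopic task items (hypothetical).
ratings = {"grasp tissue": 80, "cut suture": 70, "tie knot": 40, "dissect": 55}
print(f"Composite self-efficacy: {self_efficacy_score(list(ratings.values())):.1f}")
```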
Learning and Training Behavior
Learning reflects cognitive change and is typically assessed via recognition or recall tests. For complex higher-order skills such as those required in MAS, these tests are not sufficient to capture whether appropriate learning has occurred. Rather, declarative and procedural knowledge assessments are needed to ensure that trainees establish a knowledge base sufficient to support higher-order performance. Training behavior, or demonstration of skill mastery, is more challenging to measure than learning because it usually involves a concrete demonstration, which is not amenable to written tests. In MAS, critical psychomotor skills can be assessed using metrics obtained from digital or sensor technologies embedded in simulations. Psychomotor fluency, or automaticity, can also be assessed with instrumented simulations and video recording of performance. Assessment of judgment/decision making can be accomplished by presenting students with case scenarios in which they are required to rapidly recognize a situation and take appropriate action under a variety of stressful conditions (e.g., requiring them to perform at faster than normal speeds). These methods have been developed in military medical simulation environments.
Transfer of Training
Transfer of newly learned material to the operational environment is a complex phenomenon that encompasses a variety of factors [35]. A consistent finding from past work is that even when trainees have mastered the competencies for effective performance, they will not necessarily use those skills in practice. These findings complicate the assessment of transfer because it is often not clear why trainees fail to transfer what they learned (i.e., it could be because the training was inadequate or due to an environmental factor). One way to make transfer of training more likely is to build trainees’ self-efficacy in a highly realistic, authentic (e.g., VR) environment.
Organizational Results
The final level in Kirkpatrick’s hierarchy involves determining whether the organization’s goals (e.g., safety, productivity, reduced costs, reduction of errors, quality/quantity of performance, profit, job satisfaction, personal growth) were served by the training. An organizational goal of the US military, for example, is to reduce the use of live animals in medical training. An organizational goal shared by all is improved patient safety, which is facilitated by improved training.
Formative and Summative Assessment
Training and assessment are often viewed as activities that should be integrated if either is to be effective. In their paper on the assessment of clinical competence, Wass et al. [36] discuss the need for both formative and summative assessment in medical education. In formative assessments, students gain knowledge from tests and are given immediate feedback that they can use to correct their mistakes and refine their knowledge and skills. Summative assessments judge student performance for the purpose of evaluating clinical competence, and necessarily include definitions of failure and success. Wass and colleagues stress the importance of using both types of assessment; “if assessment focuses only on certification and exclusion,” they assert, “the all-important influence on the learning process will be lost” [36].
This sentiment is echoed by Handfield-Jones et al. [37], who describe formative assessment as a way to enhance the learning process and summative assessment as a tool to use once intermediate or target endpoints have been identified, and by others [38]. Frequent assessments, they maintain, provide students with many points at which they can correct and enhance their performance, and they advocate the use of both formative and summative assessments. “If a resident does not receive or accurately process faculty formative or summative evaluations,” state Peyre et al., “then they [sic] can develop an inaccurate perception of their [sic] abilities” (more on self-assessment below) [39].
Curriculum-Based Context for Assessment
Surgical training development ideally begins with definition of a set of desired outcome measures that describe what the learner should be able to accomplish upon completing the course. Next, each procedural step and its assessment mechanism should be defined (see list below). Developing competency benchmarks for each outcome measure is critical to ensuring that learners attain a minimum level of competency based upon established, quantifiable definitions of success. These benchmarks can be objectively defined as follows (a computational sketch follows the list):
Have experts in the procedure perform the tasks until their learning curves plateau
Determine the mean and standard deviation of the average score of the experts
Define the benchmark for novices as one standard deviation below the expert mean
Define the benchmark for experienced practitioners as the expert mean
Define competence as scoring at or above the benchmark on two consecutive trials
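The following minimal Python sketch implements the benchmark definitions above, assuming plateau-phase expert scores are already available; the score values themselves are hypothetical.

```python
import statistics

# Hypothetical plateau-phase scores: one average score per expert.
expert_scores = [91.0, 88.5, 93.0, 90.0, 89.5]

expert_mean = statistics.mean(expert_scores)
expert_sd = statistics.stdev(expert_scores)

novice_benchmark = expert_mean - expert_sd    # one SD below the expert mean
experienced_benchmark = expert_mean           # the expert mean itself

def is_competent(trial_scores, benchmark):
    """Competence = scoring at or above the benchmark on two consecutive trials."""
    return any(a >= benchmark and b >= benchmark
               for a, b in zip(trial_scores, trial_scores[1:]))

print(f"Novice benchmark: {novice_benchmark:.1f}")
print(is_competent([85.0, 87.5, 90.2, 91.0], novice_benchmark))  # True
```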
Surgical training necessarily consists of both cognitive and technical components. As suggested above, effectiveness of cognitive training can be assessed by (1) an initial pretest of requisite principles; (2) didactic instruction, including error identification; and (3) a post-test. Ideally, a perfect score should be attained on the post-test before the technical/psychomotor skills training portion begins; this allows errors made during the skills component to be attributed to a need for more practice rather than to a need to revisit principles.
Assessment of the skills component should likewise begin with a pretest, followed by an initial run-through of the task by the learner and then as many repetitions of the task as needed until the agreed-upon level of competency is attained. If training to competency is not possible, a post-test can be given once the learner has completed a predetermined number of trials, to ascertain whether he or she has successfully attained the desired skills.
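A minimal sketch of that loop, assuming a hypothetical run_trial function that returns the score for one repetition on the simulator and reusing the two-consecutive-trials competency rule from the benchmark list above:

```python
import random

def train_to_competency(run_trial, benchmark, max_trials=20):
    """Repeat the task until competency is reached or the trial cap is hit."""
    scores, consecutive = [], 0
    for _ in range(max_trials):
        score = run_trial()
        scores.append(score)
        consecutive = consecutive + 1 if score >= benchmark else 0
        if consecutive == 2:   # benchmark met on two consecutive trials
            return True, scores
    return False, scores       # competency not reached; administer a post-test

# Hypothetical usage: simulated trial scores in place of real simulator output.
competent, history = train_to_competency(lambda: random.uniform(70, 95), benchmark=88.7)
print(competent, len(history))
```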
Technical Skills Assessment in MAS
As discussed above, the technical skills needed for performing MAS are unique to MAS. In a 2002 study, Payandeh et al. explored the definition of metrics for assessing laparoscopic surgical skills [40]. Novice and expert surgical residents were videotaped while performing four MAS tasks (suturing, tying knots [41], cutting suture, and dissecting tissue) in an animal lab, and their performance was analyzed [42]. Five basic motions were identified: (1) reach and orient, (2) grasp and hold/cut, (3) push, (4) pull, and (5) release. Time to complete these motions was used to isolate and compare subtasks; dissecting tissue, for example, comprised the two subtasks of pulling tissue taut and snipping tissue. From this analysis, the time to complete the pull-tissue-taut subtask emerged as an important differentiator between novice and expert performance. A similar conclusion was reached when the times to complete the seven subtasks of the suture task (position needle, bite tissue, pull needle through tissue, reposition, re-bite tissue, re-pull needle, and pull suture through) were analyzed. “Hence, any training system,” the authors concluded, “should have components that train novices in these subtasks with the aim of reducing the time taken during the subtasks.”
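The sketch below illustrates this kind of subtask-timing comparison; the subtasks come from the dissection example above, but all timing values are hypothetical stand-ins for the annotated video data.

```python
import statistics

# Compare mean completion times (seconds) per subtask across groups.
# Timings are hypothetical illustrations, not data from [42].
timings = {
    "pull tissue taut": {"novice": [9.8, 11.2, 10.5], "expert": [3.1, 2.8, 3.4]},
    "snip tissue":      {"novice": [2.0, 2.3, 1.9],   "expert": [1.7, 1.8, 1.6]},
}

for subtask, groups in timings.items():
    novice_mean = statistics.mean(groups["novice"])
    expert_mean = statistics.mean(groups["expert"])
    ratio = novice_mean / expert_mean
    label = "strong differentiator" if ratio > 2 else "weak differentiator"
    print(f"{subtask}: novice {novice_mean:.1f}s vs expert {expert_mean:.1f}s "
          f"(x{ratio:.1f}) -> {label}")
```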
In addition to time being a key metric for MAS assessment, hand-eye coordination (dexterity) tasks were shown to be important in MAS training. Dexterity was measured by the total deviation of the traced path from the nominal trajectory, the force profile exerted on the object by the novice, and the total number of contact discontinuities created when tracing the nominal path.
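As an illustration of the first of these metrics, the sketch below computes total path deviation as the summed distance from each sampled tool-tip point to its nearest point on the nominal trajectory; the discretization and the example paths are assumptions, not the authors’ exact formulation.

```python
import math

def total_path_deviation(traced, nominal):
    """Sum each traced point's distance to its nearest nominal-path point."""
    return sum(min(math.dist(p, q) for q in nominal) for p in traced)

# Hypothetical 2D paths; real data would come from instrument tracking.
nominal_path = [(x / 10.0, 0.0) for x in range(11)]                # straight reference line
traced_path = [(x / 10.0, 0.05 * math.sin(x)) for x in range(11)]  # wobbly novice trace
print(f"Total deviation: {total_path_deviation(traced_path, nominal_path):.3f}")
```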
In Liang and Shi’s evaluation of a virtual surgery training system (not specific to MAS), three categories of surgical skill evaluation criteria emerged: (1) efficiency, (2) safety, and (3) quality [45]. Efficiency refers to the metrics of time to completion and redundancy of movement; safety refers to the ability to avoid “unwanted collisions” between instruments and anatomy; and quality refers to measurement of trainee performance relative to standards of success and failure.
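A minimal sketch of how these three criteria might be operationalized in a simulator report follows; the field names and pass/fail thresholds are hypothetical illustrations, not values from [45].

```python
from dataclasses import dataclass

@dataclass
class TrialMetrics:
    completion_time_s: float   # efficiency: time to completion
    path_redundancy: float     # efficiency: actual / ideal path length (>= 1.0)
    collisions: int            # safety: unwanted instrument-anatomy contacts
    task_score: float          # quality: score against success/failure standard

def evaluate(m: TrialMetrics) -> dict:
    """Report pass/fail on each criterion; thresholds are hypothetical."""
    return {
        "efficiency": m.completion_time_s <= 120 and m.path_redundancy <= 1.5,
        "safety": m.collisions == 0,
        "quality": m.task_score >= 80.0,
    }

print(evaluate(TrialMetrics(95.0, 1.3, 0, 86.0)))
```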
Learning from Mistakes
Another important factor in learning is identifying and remedying errors. As has been noted by others [49], errors may stem from failures of perception [50], knowledge, decision making, and execution. A common serious error in MAS is injury of the bile duct during laparoscopic cholecystectomy [51], where perception errors in identifying the cystic and common ducts have had lethal consequences [52, 53]. Medical errors, as touched on briefly above, are a worldwide problem of epidemic proportions. Among surgical errors specifically, more than half are due to technical mistakes [54]. The literature overwhelmingly concludes that as many as half of all surgical adverse events are preventable [55–61].
The ability of simulation-based training to reduce MAS errors has been recognized for some time. In 2002, a study by Seymour et al. [17] showed that, compared to controls, residents trained on a virtual reality (VR) laparoscopic cholecystectomy simulator performed gallbladder dissection 29 % faster, were nine times less likely to transiently falter (0.25 vs. 2.19 “lack of progress” errors), were five times less likely to injure the gallbladder or burn nontarget tissue (1 vs. 5 errors), and made six times fewer errors overall (84 % fewer; mean 1.19 vs. 7.38) (Fig. 11.1). These findings were confirmed in 2007 by Ahlberg et al. [62]. In 2008, significantly improved performance in laparoscopic suturing after simulation-based training was observed in senior surgery residents, who made 31 % fewer errors (25.6 vs. 37.1) than controls, performed 34 % faster (8.8 vs. 13.2 min), and made 35 % fewer excess needle manipulations (18.5 vs. 27.3) [63]. More recently, a 2013 assessment indicated that VR simulation-based laparoscopic skills training improved performance and was especially effective in reducing procedural errors [64]. A systematic review of the literature on simulation-based laparoscopic surgery training through May 2014 concluded “that simulation-based training can… increase translation of laparoscopic surgery skills to the OR”; specific improvements included enhanced economy of movement, more accurate suturing, reduced time, and fewer errors [65].
Fig. 11.1
Error reduction correlated with simulation-based training (Reprinted with permission from Seymour et al. [17])
MAS Self-Assessment
The above discussion gives a framework for thinking about self-assessment in MAS. As stated by Peyre et al., “Accurate self assessment of laparoscopic technical skill in development can accelerate a trainee’s learning growth and guide life-long development for surgeons” [39]. The accuracy of self-assessment, however, is an open question. In one study, for example, surgeons’ self-assessment scores after performing a laparoscopic colectomy were significantly higher than those of trained raters [66]. In another study, the opposite was true: trainees were more self-critical [39]. In other studies, surgeons’ self-assessments were similar to those of raters [67, 68]. These disparities may be due to factors such as level of experience; more senior residents, for example, may be less accurate in self-assessment than junior residents [39].
Use of global rating scales can assist in self-assessment by introducing objectivity. In the Peyre et al. study, the Objective Structured Assessment of Technical Skills (OSATS) global rating scale [69] was used for self- and faculty assessments of performance of laparoscopic obstetrical and gynecology procedures [39]. The 10 technical skills in the OSATS scale were (1) preparedness, (2) respect for tissue, (3) time and motion, (4) knot tying, (5) instrument handling, (6) knowledge of instruments, (7) use of assistants, (8) anatomy, (9) knowledge of procedure, and (10) overall performance. In a 2011 study using OSATS, surgeons performing a laparoscopic cholecystectomy on a simulator were found to be able to accurately self-assess their technical, but not their nontechnical, skills [70].
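To illustrate how such a scale supports self-assessment, the sketch below compares self- and faculty ratings item by item; the ten items mirror the OSATS list above, but the 1–5 ratings are invented for illustration.

```python
# Compare hypothetical self- vs faculty OSATS ratings (1-5 per item).
OSATS_ITEMS = ["preparedness", "respect for tissue", "time and motion",
               "knot tying", "instrument handling", "knowledge of instruments",
               "use of assistants", "anatomy", "knowledge of procedure",
               "overall performance"]

self_ratings = dict(zip(OSATS_ITEMS, [4, 4, 3, 4, 4, 5, 3, 4, 4, 4]))
faculty_ratings = dict(zip(OSATS_ITEMS, [3, 4, 3, 3, 3, 4, 3, 4, 3, 3]))

gap = {item: self_ratings[item] - faculty_ratings[item] for item in OSATS_ITEMS}
overestimated = [item for item, d in gap.items() if d > 0]
print(f"Mean self-assessment gap: {sum(gap.values()) / len(gap):+.1f}")
print("Items rated higher by trainee than by faculty:", overestimated)
```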
Revalidation
In the UK and other European countries, a process of revalidation is already in place. Revalidation is the process by which all licensed doctors must demonstrate to the regulatory authority (the General Medical Council (GMC) in the UK) that they are up to date and fit to practise. Since its launch in the UK in December 2012, revalidation has been a legal requirement for all doctors, underpinned by dedicated legislation, the Medical Profession (Responsible Officers) (Amendment) Regulations 2013 [71].
The purpose of revalidation is to promote patient safety and improve the quality of patient care. It is also intended to strengthen continuing professional development and reinforce systems that identify doctors who encounter difficulties and require support. In surgery, it is intended to identify surgeons who are underperforming; surgeons are expected to provide a core set of supporting information at appraisal over each revalidation cycle. A responsible officer assesses the information from appraisal and then makes a revalidation recommendation to the GMC, normally every five years. For full details on the revalidation process, see the UK Revalidation Guide for Surgery [71].
Supporting Information for Surgical Revalidation
Continuing Professional Development (CPD)
The surgical Royal Colleges and specialty associations in the UK have written a CPD Summary Guide for surgery, which includes a CPD checklist that can be used as an aid during appraisal discussions [72]. Surgeons should record their CPD in a consistent and structured way and meet the following minimum requirements (a quick compliance check is sketched after the list):
Collect a minimum of 50 credits per year = 250 credits every 5 years
1 credit = 1 h of CPD
CPD programme should be set and reviewed at appraisal.
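A quick sketch of checking those minimums against a log of CPD hours; the logged hours are hypothetical.

```python
# CPD minimums from the list above: 1 credit = 1 h, at least 50 credits
# per year and 250 per 5-year cycle. Logged hours are hypothetical.
hours_per_year = [52, 55, 50, 60, 51]  # CPD hours logged in each year of the cycle

yearly_ok = all(h >= 50 for h in hours_per_year)
cycle_ok = sum(hours_per_year) >= 250
print(f"Yearly minimum met: {yearly_ok}; cycle total {sum(hours_per_year)} "
      f"credits (>= 250: {cycle_ok})")
```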
Measurement of Clinical Outcomes
The measurement of clinical outcomes of care is complex, with several different methods available:
National clinical audits specifying personal outcomes
Outcomes derived from routinely collected data, e.g. Hospital Episode Statistics
National clinical audits specifying the surgical team/unit’s outcomes
Local audit of outcomes
Structured peer review of outcomes
Recertification
In the USA, the concept of time-unlimited licensure was challenged as the public demanded reassurance that specialists were keeping abreast of developments within their field. The American Board of Colon and Rectal Surgery (ABCRS) issued its first series of time-limited certifications in 1990, which were eventually standardized to last 10 years. The first recertification examination was administered in 1991, and by 2011, 924 candidates had completed the examination process, with a 97 % pass rate [73].