Development and validation of a simulation-based assessment of operative competence for higher specialist trainees in general surgery

Toale, Conor; Morris, Marie; Roche, Adam; Voborsky, Miroslav; Traynor, Oscar; Kavanagh, Dara

doi:10.1007/s00464-024-11024-1

Open access
Published: 17 July 2024

Surgical Endoscopy

Conor Toale ORCID: orcid.org/0000-0002-4858-4813¹,
Marie Morris¹,
Adam Roche²,
Miroslav Voborsky²,
Oscar Traynor¹ &
…
Dara Kavanagh¹

Abstract

Background

Simulation is increasingly being explored as an assessment modality. This study sought to develop and collate validity evidence for a novel simulation-based assessment of operative competence. We describe the approach to assessment design, development, pilot testing, and validity investigation.

Methods

Eight procedural stations were generated using both virtual reality and bio-hybrid models. Content was identified from a previously conducted Delphi consensus study of trainers. Trainee performance was scored using an equally weighted Objective Structured Assessment of Technical Skills (OSATS) tool and a modified Procedure-Based Assessment (PBA) tool. Validity evidence was analyzed in accordance with Messick’s validity framework. Both ‘junior’ (ST2–ST4) and ‘senior’ trainees (ST5–ST8) were included to allow for comparative analysis.

Results

Thirteen trainees were assessed by ten assessors across eight stations. Inter-station reliability was high (α = 0.81), and inter-rater reliability was acceptable (intra-class correlation coefficient 0.77). A significant difference in mean station score was observed between junior and senior trainees (44.82 vs 58.18, p = 0.004), while overall mean scores were moderately correlated with increasing training year (rs = 0.74, p = 0.004, Kendall’s tau-b 0.57, p = 0.009). A pass-fail score generated using borderline regression methodology resulted in all ‘senior’ trainees passing and 4/6 ‘junior’ trainees failing the assessment.

Conclusion

This study reports validity evidence for a novel simulation-based assessment, designed to assess the operative competence of higher specialist trainees in general surgery.

Introduction

The traditional Halstedian apprenticeship training model in surgery has undergone significant revolution in recent decades [1]. Evolving patient expectations regarding the role of surgical trainees in their care [2], an increased emphasis on theater efficiency [3], and well-explored concerns regarding the perceived operative competence and confidence of graduating trainees [4,5,6] have led to a re-evaluation of this training paradigm. ‘Competency-based’ approaches to outcome-driven training and assessment are well established across training jurisdictions [7, 8], leading to the development of nominally time-independent postgraduate surgical programs [9]. The surgical specialties pose a unique challenge in this regard, due to the requirement for robust and reliable methods of teaching and assessing competence in operative skill [10].

Since August 2021, surgical training in Ireland and the United Kingdom (UK) has been explicitly outcomes based [11]. Trainees are measured against high-level outcomes (i.e., ‘Capabilities in Practice’). Some trainees who demonstrate accelerated development may complete training more rapidly than the indicative time [11]. A somewhat similar system of high-level ‘milestones’ and derived ‘Entrustable Professional Activity’ assessments now operates in the USA context [8]. In order to transition to a truly competency-based, time-independent training paradigm, operative skill will need to be assessed using objective, standardized, and validated approaches. The primary tool for measuring operative competence in the UK and Ireland context remains the Procedure-Based Assessment, a workplace-based procedure-specific assessment tool [12]. Though this tool has accrued substantial validity evidence, concerns exist regarding the opportunistic way in which these assessments are undertaken by trainees, as well as their potential for rater subjectivity [13]. Operative experience targets, though a traditional proxy measure of operative competence, have been de-emphasized in the updated curriculum and lack validity evidence in the UK and Irish context [13].

The role of simulation-based training and assessment in surgery has evolved in tandem with competency-based education [14]. Simulation is commonplace in surgical training curricula [15] and is increasingly being deployed as an assessment modality [16, 17]. Simulation has most notably been used in high-stakes assessment through the Colorectal Objective Structured Assessment of Technical Skill (COSATS) by the American Board of Colon and Rectal Surgery [16, 18]. Such simulation-based assessments have not yet been used in a high-stakes setting for higher specialist training in Ireland or the UK. We hypothesize that simulation could be used as a reliable, valid, objective, and standardizable method of summative operative competence assessment for trainees undertaking higher specialist training in general surgery.

This study sought to develop and validate a pilot simulation-based assessment of operative competence based on the common training curriculum of the UK and Ireland: the Joint Committee on Surgical Training (JCST)/Intercollegiate Surgical Curriculum Program (ISCP) curriculum in general surgery. Herein, we describe the approach to assessment design, development, and pilot testing and report on the validity of this assessment in line with Messick’s validity framework [19].

Methods

Initial development of the examination framework

The approach to assessment development and validation was derived from a primer for the validation of simulation-based educational assessment published by Cook and Hatala [20]. The primary interpretation of the intended construct was defined as “learners have operative skill sufficient to operate safely as an independent practitioner in General Surgery.” With reference to the UK and Ireland surgery curriculum, this construct relates to the following Capabilities in Practice (CiPs), Generic Professional Capabilities (GPCs), and syllabus requirements: ‘manages an operating list’ (CiP), ‘professional skills’ (GPC), and both ‘technical skills—general’ and ‘technical skills—index procedures’ (syllabus). The intended decision related to this examination was derived through an iterative process of literature review [21], semi-structured interviews with key stakeholder representatives [22], and a survey of surgical trainees regarding their perceptions and experience of simulation-based assessment (unpublished data). This pilot assessment was designed to explore whether scores achieved could inform, as part of a multi-faceted assessment framework, whether a given trainee has sufficient procedural competence to proceed to the next stage of training or independent practice.

This assessment was designed based on the well-established Objective Structured Assessment of Technical Skill (OSATS) examination [23] as followed by other high-stakes simulation-based assessments [16, 18]. Therefore, the initial design of this assessment was that of a time-limited eight-station framework [23]. Operative procedures for use in this assessment were selected using a modified Copenhagen Academy for Medical Education and Simulation Needs Assessment Framework (CAMESNAF) [24]. This resulted in a prioritized list of procedures suitable for simulation-based assessment of general surgery trainees at the end of phase 2 (ST6), from which eight were chosen for inclusion in the assessment model [25]. Future, varying assessment iterations could be created using the procedural lists generated from this Delphi process.

Assessment instruments

Performance was assessed at each station using a modified Procedure-Based Assessment (PBA) tool (consisting of a task-specific checklist and a global rating scale) [26] and the OSATS tool [27]. The OSATS tool is the most commonly used tool in surgical assessment research [28]. The PBA tool consists of a procedure-specific checklist containing pre-, intra-, and post-operative domains, along with a global rating from 1a to 4b. A combination of an equally weighted checklist tool and OSATS tool is currently used by RCSI’s ‘Operative Surgical Skill’ assessments, which inform decisions on progression from Core to Higher specialist training [29]. These assessments undergo rigorous annual standard setting and validity testing, although they focus on core basic surgical skills and tasks rather than procedures. Due to time and feasibility constraints, key component tasks, rather than entire procedures, were simulated in some stations of this assessment. For this reason, the intra-operative domain of the PBA tool was adapted by a steering group of simulation and surgical education researchers (CT, MM, DOK). Items on the modified checklist were scored as ‘done,’ ‘partially done,’ or ‘not done,’ with total scores standardized and weighted equally to OSATS scores to calculate the total final score. An example assessment tool is available in the supplemental material (Fig. S1, supplemental data). The modified PBA checklists were trialed by a consultant surgeon and a surgical trainee to ensure that they could be completed on the proposed stations in the indicated time frame. Data from this trial were not included in subsequent analysis. As modifications to assessment tools can affect their validity, this study also seeks to generate validity data for these modified tools.
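The equal weighting of the two tools can be illustrated with a minimal sketch. The point values assigned to checklist responses (2, 1, 0) and the 1 to 5 scale assumed for each OSATS domain are illustrative assumptions, as are all function names; the conversion actually used in the study is not reported in this section.

```python
# Hypothetical point values: the study does not state how checklist
# responses were converted to numbers.
CHECKLIST_POINTS = {"done": 2, "partially done": 1, "not done": 0}


def checklist_percent(responses):
    """Standardize a modified PBA checklist to a 0-100 scale."""
    earned = sum(CHECKLIST_POINTS[r] for r in responses)
    possible = max(CHECKLIST_POINTS.values()) * len(responses)
    return 100.0 * earned / possible


def osats_percent(domain_scores, max_per_domain=5):
    """Standardize OSATS domain ratings to a 0-100 scale (assumed 1-5 per domain)."""
    return 100.0 * sum(domain_scores) / (max_per_domain * len(domain_scores))


def station_score(responses, domain_scores):
    """Equally weighted mean of the standardized checklist and OSATS scores."""
    return 0.5 * checklist_percent(responses) + 0.5 * osats_percent(domain_scores)


# Invented responses for a single candidate at a single station.
score = station_score(
    ["done", "done", "partially done", "not done"],
    [4, 3, 4, 3, 3, 4, 4],
)
print(round(score, 1))
```

Standardizing each tool to a common scale before averaging prevents the tool with the larger raw maximum from dominating the station total.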

The LapSim™ simulator was used to assess performance in three of the eight stations. This simulator reports a number of computer-measured metrics (Table S1, supplementary data). LapSim™ recorded metrics did not contribute to candidate scores, although statistical analyses are reported herein.

Participants

Previous literature has suggested that eight stations provide a reliable indicator of performance in a multi-station OSCE [23]. While this pilot assessment may inform future sample size considerations in larger iterations, initial sample size was determined based on the number of participants required to demonstrate a difference in outcomes between ‘senior’ and ‘junior’ general surgery trainees (trainees in their last or first four years of surgical training, respectively). Studies reporting mean PBA scores awarded with standard deviations are lacking since the introduction of a revised global rating scale in 2017 [13]. Furthermore, a novel, modified PBA tool for use in simulation-based assessment is being used in this study. For these reasons, power calculations could only be conducted using the OSATS tool. Sample size was determined based on anticipated differences in overall scores across the eight-station OSCE format and not at the individual station level. Mean OSATS scores and standard deviations were derived from a prior study using a simulation-based assessment to assess surgeons at the end of training (mean score 0.75, SD 0.06) [30]. Assuming that a score difference of 20% or more was a relevant difference between groups, a sample size of six participants per assessment group was considered sufficient to minimize the risk of a type II error at a significance level of α = 0.05. Again, it is important to emphasize the pilot nature of this assessment and the lack of data reporting mean scores across simulated PBA assessment tools, let alone the modified checklists used in this study.
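The reasoning behind the sample size can be approximated as follows. This is a rough sketch, not the authors' calculation: it assumes the 20% difference is relative to the reported mean of 0.75, assumes equal standard deviations in both groups, and uses a normal approximation in place of the t distribution, which is optimistic for groups this small. The function name and parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(mean, sd, relative_difference, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Normal approximation; the negligible probability of rejecting in the
    wrong direction is ignored.
    """
    delta = relative_difference * mean           # absolute difference to detect
    effect_size = delta / sd                     # standardized difference (Cohen's d)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * sqrt(n_per_group / 2)
    return NormalDist().cdf(noncentrality - z_crit)


# Values from the text: mean 0.75, SD 0.06, 20% difference, six per group.
power = approx_power(mean=0.75, sd=0.06, relative_difference=0.20, n_per_group=6)
print(f"approximate power: {power:.3f}")
```

Under these assumptions the standardized difference is large (0.15 / 0.06 = 2.5), which is why a small group size yields high approximate power.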

Procedure

Ethical approval was granted by the University of Medicine and Health Sciences at the Royal College of Surgeons in Ireland. The assessment took place at the National Surgical Training Center, Royal College of Surgeons in Ireland, which contains a purpose-built ‘wet lab’ for simulation-based technical skills training and assessment. Examiners were familiarized with the format, tools, and models prior to the assessment, and pre-assessment standard setting was conducted using a modified Angoff method [31]. Candidates were informed of the assessment format in a 30-min briefing session prior to commencement. This orientation session also allowed for introduction to the virtual reality models used (LapSim, Surgical Science, Sweden). Examiners were not informed of the trainee’s level of training. Stations utilized a combination of bio-hybrid models (ileostomy reversal, operative management of a fistula-in-ano, pilonidal sinus excision, ventral hernia repair, emergency laparotomy, and management of a blunt liver injury) and virtual reality models (laparoscopic appendicectomy, laparoscopic cholecystectomy, and right hemi-colectomy—vessel ligation). The LapSim simulator was chosen for use in the virtual reality stations given the substantial validity evidence published with respect to its use in training and assessment [32]. Instructions regarding the procedure or task to be completed were given to participants before each station. Candidates rotated through the stations and were assessed by a single assessor using the modified PBA and OSATS tools. Two stations (ileostomy reversal and emergency laparotomy) were also scored by a second, independent rater. Performance in each station was video recorded in an anonymized fashion for quality assurance purposes. Scores were entered immediately into a secure database (QPERCOM). Rater training was conducted before the assessment. All assessors were registered consultant surgeon trainers with the Royal College of Surgeons in Ireland.

Statistical analysis

Validity evidence was collated according to Messick’s validity framework [19]. Internal consistency was assessed using Cronbach’s alpha. Inter-rater reliability was calculated using intra-class correlation coefficients. Reliability was further explored using generalizability theory [33]; variance component analysis was used to determine the relative contribution of trainees, assessors, and stations to observed score variance, allowing for the estimation of the ‘true’ or ‘intended’ variance, i.e., the proportion of score variance attributable to individual trainees. For the purpose of this analysis, trainees, assessors, and stations were treated as samples from an infinite universe of potential trainees, assessors, and stations [34]. Consequential validity evidence was explored by generating a pass-fail cut-off score using borderline regression methodology [35]. A linear regression model was used to plot the global competency score against both OSATS and PBA checklist scores. The global score representing the ‘borderline’ candidate was inserted into the linear equation, and the corresponding total OSATS and PBA checklist scores were extrapolated, representing the true pass score for each station. The pass score for the assessment as a whole was the sum of the pass scores for each station. OSATS and modified PBA scores were equally weighted. Relationships with other variables (training year) were assessed using Pearson correlation coefficients. Findings are reported according to the domains of Messick’s unified validity framework; response process and content domains are outlined above, and no further supporting evidence for these domains is presented.
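The borderline regression step described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' analysis code: the global ratings are mapped to hypothetical numeric values (1 to 4), the rating treated as ‘borderline’ (2) is an assumption, and all data are invented.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def borderline_pass_mark(global_ratings, station_scores, borderline_rating):
    """Station score predicted by the regression line at the borderline rating."""
    intercept, slope = fit_line(global_ratings, station_scores)
    return intercept + slope * borderline_rating


# Invented data for one station: numeric global ratings and total station scores.
global_ratings = [1, 2, 2, 3, 3, 4]
station_scores = [30.0, 45.0, 50.0, 60.0, 62.0, 80.0]

station_pass = borderline_pass_mark(global_ratings, station_scores, borderline_rating=2)
print(round(station_pass, 2))

# The overall pass mark is the sum of the station pass marks
# (a hypothetical eight identical stations, purely for illustration).
overall_pass = sum([station_pass] * 8)
```

Because the pass mark is read off a line fitted to all candidates, every score contributes to the standard, not only those of candidates rated as borderline.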

Results

Thirteen candidates undertook the assessment across eight 15-min stations. Candidates were divided into ‘junior’ trainees in their first 4 years of surgical training (ST1–ST4, N = 6) and ‘senior’ trainees in their last 4 years of training (ST5–ST8, N = 7). Mean scores awarded are outlined in Table 1.

Table 1 Mean scores awarded across assessment stations

Internal structure

Inter-station reliability for the total score awarded was assessed using Cronbach’s alpha (α), calculated as 0.81 for the assessment overall. Removing station 1 (management of a fistula-in-ano) would result in a higher α (Table 2). Internal consistency of individual assessment tools is outlined in Table S2, supplementary data. Correlations between individual station scores and the total score, and between scores awarded at each station, are reported in Tables S3 and S4 (supplementary data).
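For readers unfamiliar with the statistic, Cronbach’s alpha and the ‘alpha if station removed’ check can be computed as in the sketch below. It uses the standard formula with sample variances; the score matrix is invented and far smaller than the study’s 13 candidates by 8 stations, and the function names are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a candidates-by-stations score matrix.

    scores: list of rows, one per candidate, each holding one score per station.
    """
    k = len(scores[0])                                   # number of stations
    station_vars = [variance(col) for col in zip(*scores)]
    total_var = variance([sum(row) for row in scores])   # variance of total scores
    return (k / (k - 1)) * (1 - sum(station_vars) / total_var)


def alpha_if_station_removed(scores, station_index):
    """Alpha recomputed after dropping one station (column)."""
    reduced = [row[:station_index] + row[station_index + 1:] for row in scores]
    return cronbach_alpha(reduced)


# Invented scores: four candidates, three stations.
scores = [
    [40, 45, 50],
    [50, 55, 52],
    [60, 58, 65],
    [70, 72, 68],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
print([round(alpha_if_station_removed(scores, i), 3) for i in range(3)])
```

A station whose removal raises alpha is one whose scores track the other stations less closely than the rest.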

Table 2 Assessment metrics for each station

Inter-rater reliability was assessed for two stations: emergency laparotomy (Station 7) and ileostomy reversal (Station 2). On intra-class correlation coefficient analysis, inter-rater reliability was 0.77. Individual station scores variably correlated with the total score awarded. Using Generalizability Theory, a reliability coefficient (G) of 0.7 was calculated for the overall 8-station assessment. Fourteen hypothetical stations would be required to increase the reliability coefficient to > 0.8. Further data relating to the model construction and variance component analysis are outlined in Table 3.
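The projection to a larger number of stations can be reproduced approximately from the summary coefficient alone. The sketch below is a simplification, not the authors' decision study: it treats stations as the only facet, assumes stations are exchangeable, and uses the Spearman-Brown form, in which error variance shrinks in proportion to the number of stations. The full analysis in Table 3 is based on estimated variance components.

```python
def projected_reliability(current_reliability, current_stations, new_stations):
    """Project a reliability coefficient to a different number of stations."""
    ratio = new_stations / current_stations
    return ratio * current_reliability / (1 + (ratio - 1) * current_reliability)


def stations_needed(current_reliability, current_stations, target):
    """Smallest number of stations whose projected reliability exceeds the target."""
    if not (0 < current_reliability < 1 and 0 < target < 1):
        raise ValueError("reliability values must lie strictly between 0 and 1")
    n = current_stations
    while projected_reliability(current_reliability, current_stations, n) <= target:
        n += 1
    return n


# Reported values: G = 0.7 with 8 stations; target reliability above 0.8.
print(stations_needed(0.7, 8, 0.8))
```

With the reported values this simplified projection also arrives at 14 stations, consistent with the figure above, although the reported coefficient of 0.7 may itself be rounded.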

Table 3 Generalizability theory analysis

Relationships with other variables—differentiating between junior and more senior trainees

A significant difference in mean station score was observed between ‘Junior’ and ‘Senior’ trainees (44.82 vs 58.18, p = 0.004) (Fig. 1). Mean scores were moderately correlated with increasing training year (rs = 0.74, p = 0.004, Kendall’s tau-b 0.57, p = 0.009) (Fig. 2).

At an individual station level, five of eight stations could differentiate between ‘Junior’ and ‘Senior’ trainee candidates (Table 4).

Table 4 Mean score differences for ‘Junior’ vs. ‘Senior’ trainees across individual stations

A number of simulator-measured metrics were further recorded for the three laparoscopic skills (LapSim™) stations (Table S1, supplementary data). Correlations between simulator- and assessor-recorded metrics are outlined in Tables S5, S6, and S7 (supplementary data).

Consequences

A pass-fail standard for each station was generated using borderline regression methodology. Passing scores and passing rates for individual stations are outlined in Table 5. There was no difference in the pass rate (9/13) when using a compensatory model (where the candidate can fail an unlimited number of stations once the overall pass mark is achieved) or a conjunctive model (where the candidate is required to achieve the overall pass mark and pass at least 50% of stations). Either method resulted in all ‘senior’ trainees passing and 2/6 of ‘junior’ trainees passing. Only one trainee passed all eight stations.
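The two decision rules compared above can be expressed compactly. This is a sketch with invented scores and pass marks; treating a score equal to the pass mark as a pass is an assumption, and the function names are hypothetical.

```python
def passes_compensatory(station_scores, station_pass_marks):
    """Pass if the total score meets the overall pass mark (sum of station marks)."""
    return sum(station_scores) >= sum(station_pass_marks)


def passes_conjunctive(station_scores, station_pass_marks, min_fraction=0.5):
    """Pass if the overall mark is met and at least min_fraction of stations are passed."""
    stations_passed = sum(
        score >= mark for score, mark in zip(station_scores, station_pass_marks)
    )
    enough_stations = stations_passed >= min_fraction * len(station_pass_marks)
    return passes_compensatory(station_scores, station_pass_marks) and enough_stations


# Invented example: eight stations, each with a pass mark of 50.
marks = [50] * 8
candidate = [90, 90, 90, 45, 45, 45, 45, 45]   # strong in three stations only

print(passes_compensatory(candidate, marks))    # total 495 vs overall mark 400
print(passes_conjunctive(candidate, marks))     # only 3 of 8 stations passed
```

In this invented case the rules disagree, whereas in the study both rules produced the same pass rate; raising min_fraction to 1.0 gives a fully non-compensatory rule in which every station must be passed.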

Table 5 Borderline regression passing scores and passing rates

Feasibility

Two rounds of assessment were conducted. Each session was conducted over 2 h. Models and required equipment for open stations cost an estimated €4680, although much of the reusable equipment was already available within the simulation laboratory at RCSI. Rental and support costs for the LapSim laparoscopic simulators were €4305. Staffing costs were estimated at an additional €4000, bringing the estimated cost of the assessment to €12,985.

Discussion

This study outlines validity evidence for a novel pilot simulation-based assessment of operative competence, designed for assessment of higher specialist trainees in general surgery and centered around operative competency expectations derived from the intercollegiate surgical curriculum program of Ireland and the UK. The principles of assessment design and validity testing are not specific to this jurisdictional context and can be used to generate simulation-based assessments within competency-based training curricula internationally. High internal consistency metrics were generated, with acceptable inter-rater reliability. While a sufficient number of observations were made to reliably rank candidates (G > 0.8 with 6 stations), 14 stations would be required in order to reliably determine whether a candidate met the desired level of operative competence across procedures, due to the observed absolute error variance [34]. The assessment can differentiate between trainees according to level of training, with more senior trainees outperforming their junior counterparts across the management of a fistula-in-ano, ileostomy closure, right hemi-colectomy, laparoscopic appendicectomy, and emergency laparotomy stations. Mean station scores were moderately and significantly correlated with increasing training year (R² = 0.6). Using borderline regression methodology, a pass-fail mark was generated which resulted in all trainees in their last 4 years of training passing the assessment and 4/6 more junior trainees failing. The estimated cost of delivering this pilot assessment was €12,985.

Only one participant passed all eight stations. Given the absolute curricular (and indeed, clinical) requirements for competence across several key index procedures, it is arguable that an alternate approach to standard setting and pass/fail cut-off is required. In particular, it could be argued that candidates should be required to pass all stations in order to pass the entire assessment, without the capacity for inter-station score compensation. The ‘patient safety’ method of standard setting [36, 37] is an approach to developing defensible minimum passing standards whereby incorrect performance in critical items or procedures leads to failure without recourse to compensation.

The implementation of competency-based education principles across surgical training programs requires robust methods of trainee assessment. The newly updated general surgery curriculum of the UK and Ireland emphasizes two critical stages during higher specialist training with corresponding operative competency expectations: phase 2, outlining expectations in core elective and emergency procedures, and phase 3, outlining further competency expectations in a given trainee’s sub-specialty of interest. Assessment across these procedures is primarily conducted in the workplace. In other jurisdictions, high-stakes simulation-based assessments have been designed to inform end-of-training certification decisions [16,17,18, 38]. This study suggests that a similar assessment modality has sufficient validity evidence to support its use in the assessment of trainees in Ireland. However, it is important to emphasize that this study reports pilot data with a limited number of trainee participants and does not yet support the implementation of such a modality in informing high-stakes assessment outcomes. The role, if any, of simulation in informing high-stakes training decisions such as progression or end-of-training certification remains to be fully elucidated, particularly in the context of an existing longitudinal multi-faceted assessment program. Furthermore, this study does not explore the potential value of this assessment, or similar assessments, in a purely formative context.

In a 2012 Delphi study of members of the Association for Surgical Education, clarifying the role of simulation for the certification of residents and practicing surgeons was identified as a high priority for surgical education research [39]. Simulation-based assessments (SBA) currently play varying, though often significant, roles in the training and certification process across jurisdictions. For example, passing the fundamentals of laparoscopic surgery assessment is a certification requirement in certain specialties in the USA [40,41,42,43,44]. For practicing surgeons, the General Medical Council of the UK use simulated procedures, with cadaveric models, to inform performance assessments to support fitness-to-practice decisions [45]. At end-of-training certification level, studies to date suggest that SBA assesses a different construct to knowledge-based assessments and may therefore add validity to the certification process [18, 46]; a low positive correlation only (r = 0.25) was observed between scores awarded to trainees in a simulation-based certification exam and oral examination scores by de Montbrun et al. [18]. Despite published validity evidence for such assessments according to modern concepts of construct validity, the acceptability of such assessments remains a barrier to their widespread implementation. Stakeholders report lack of evidence and financial considerations as factors contributing to lack of SBA implementation [47].

These concerns, particularly with respect to cost, are not unfounded. The estimated costs of the reported assessment approached €13,000. This assessment took place in a well-equipped national training center with purpose-designed facilities and trained personnel. The implementation of such assessments across programs and local contexts will therefore likely vary. Furthermore, studies have yet to determine the appropriate role and weighting of simulation-based assessments in the context of programmatic assessment. Potential drawbacks of SBA when compared to workplace-based assessment include insufficient model fidelity, difficulty in simulating case complexity or variation, and further challenges associated with conducting longitudinal, repeated assessments. Finally, the acceptability of such assessments to key stakeholders in surgical education should be further explored by future studies prior to the implementation of any high-stakes assessment.

Limitations

This study is limited by its small sample size. In particular, pass rates calculated by borderline regression methodology are limited due to low numbers of ‘borderline’ performance scores, particularly in some individual stations. Initial power calculations were derived using published data on the OSATS tool, with limited published data available on mean PBA scores across procedures. The ultimate scoring tool used an equally weighted OSATS and PBA tool. Derived procedures, assessment tools, and competency expectations may differ across jurisdictions, training stages, and local contexts. For example, it may be inappropriate to assess the competence of breast surgery trainees to consultant-ready standard in colorectal procedures, such as ileostomy reversal or fistula-in-ano surgery. During standard setting, examiners were instructed to assess candidates to the level of a day-1 general surgery consultant regardless of sub-specialty interest. In future iterations it may be necessary to outline competency expectations for each procedure in more detail, and in particular to emphasize the need to assess to the minimum standard of competence required for all trainees regardless of intended or declared sub-specialty interest. The somewhat unique, highly centralized, nationally delivered, and relatively well-resourced simulation-based training curriculum in Ireland may mean that further implementation challenges would be encountered by other jurisdictions. A further potential limitation is the use of a compensatory scoring method over a non-compensatory method. Given that trainees at the end of phase 2 are expected to display competent performance in a series of core general and emergency procedures, it may be more appropriate to design an assessment that ensures a baseline procedure-specific competence in all included procedural stations.

Conclusion

This study reports validity evidence for a novel simulation-based assessment of operative competence, designed for assessment of Irish higher specialist trainees in general surgery. Further larger-scale validation studies, along with studies exploring the role of such assessments in a longitudinal training and evaluation program, will be required prior to the implementation of SBA in general surgery training.

Data availability

Data is available upon request.

References

Hurreiz H (2019) The evolution of surgical training in the UK. Adv Med Educ Pract 10:163–168
Article PubMed PubMed Central Google Scholar
Brown E, Choi J, Sairi T (2020) Resident involvement in plastic surgery: divergence of patient expectations and experiences with surgeon’s attitudes and actions. J Surg Educ 77:291–299
Article PubMed Google Scholar
Aggarwal R, Darzi A (2006) Training in the operating theatre: is it safe? Thorax 61:278–279
Article CAS PubMed PubMed Central Google Scholar
Elfenbein DM (2016) Confidence crisis among general surgery residents: a systematic review and qualitative discourse analysis. JAMA Surg 151:1166–1175
Article PubMed PubMed Central Google Scholar
Poudel S, Hirano S, Kurashima Y, Stefanidis D, Akiyama H, Eguchi S, Fukui T, Hagiwara M, Hashimoto D, Hida K, Izaki T, Iwase H, Kawamoto S, Otomo Y, Nagai E, Saito M, Takami H, Takeda Y, Toi M, Yamaue H, Yoshida M, Yoshida S, Kodera Y et al (2020) Are graduating residents sufficiently competent? Results of a national gap analysis survey of program directors and graduating residents in Japan. Surg Today 50:995–1001
Article PubMed Google Scholar
Vu JV, George BC, Clark M, Rivard SJ, Regenbogen SE, Kwakye G (2021) Readiness of graduating general surgery residents to perform colorectal procedures. J Surg Educ. https://doi.org/10.1016/j.jsurg.2020.12.015
Article PubMed PubMed Central Google Scholar
Harris KA, Nousiainen MT, Reznick R (2020) Competency-based resident education: the Canadian perspective. Surgery 167:681–684
Article PubMed Google Scholar
(2014) The general surgery milestone project. J Grad Med Educ 6:320–328
James HK, Gregory RJH (2021) The dawn of a new competency-based training era. Bone Joint Open 2:181–190
Article PubMed PubMed Central Google Scholar
Schijven MP, Bemelman WA (2011) Problems and pitfalls in modern competency-based laparoscopic training. Surg Endosc 25:2159–2163
Article CAS PubMed PubMed Central Google Scholar
Cook T, Lund J (2021) General surgery curriculum—the Intercollegiate Surgical Curriculum Programme. https://www.iscp.ac.uk/media/1103/general-surgery-curriculum-aug-2021-approved-oct-20v3.pdf
Marriott J, Purdie H, Crossley J, Beard JD (2011) Evaluation of procedure-based assessment for assessing trainees’ skills in the operating theatre. Br J Surg 98:450–457
Article CAS PubMed Google Scholar
Toale C, Morris M, Kavanagh DO (2022) Assessing operative skill in the competency-based education era: lessons from the UK and Ireland. Ann Surg 275(4):e615–e625. https://doi.org/10.1097/SLA.0000000000005242
Article PubMed Google Scholar
Agha RA, Fowler AJ (2015) The role and validity of surgical simulation. Int Surg 100:350–357
Article PubMed PubMed Central Google Scholar
Milburn JA, Khera G, Hornby ST, Malone PSC, Fitzgerald JEF (2012) Introduction, availability and role of simulation in surgical education and training: review of current evidence and recommendations from the association of surgeons in training. Int J Surg 10:393–398
Article CAS PubMed Google Scholar
de Montbrun SL, Roberts PL, Lowry AC, Ault GT, Burnstein MJ, Cataldo PA, Dozois EJ, Dunn GD, Fleshman J, Isenberg GA, Mahmoud NN, Reznick RK, Satterthwaite L, Schoetz D Jr, Trudel JL, Weiss EG, Wexner SD, MacRae H (2013) A novel approach to assessing technical competence of colorectal surgery residents: the development and evaluation of the Colorectal Objective Structured Assessment of Technical Skill (COSATS). Ann Surg 258:1001–1006
Article PubMed Google Scholar
Halwani Y, Sachdeva AK, Satterthwaite L, de Montbrun S (2019) Development and evaluation of the General Surgery Objective Structured Assessment of Technical Skill (GOSATS). Br J Surg 106:1617–1622
de Montbrun S, Roberts PL, Satterthwaite L, MacRae H (2016) Implementing and evaluating a national certification technical skills examination: the colorectal objective structured assessment of technical skill. Ann Surg 264:1–6
American Educational Research Association, American Psychological Association, National Council on Measurement in Education (1999) Standards for educational and psychological testing
Cook DA, Hatala R (2016) Validation of educational assessments: a primer for simulation and beyond. Adv Simul 1:31
Toale C, Morris M, Kavanagh D (2021) Perceptions and experiences of simulation-based assessment of technical skill in surgery: a scoping review. Am J Surg 222:723–730
Toale C, Morris M, Kavanagh DO (2022) Perspectives on simulation-based assessment of operative skill in surgical training. Med Teach. https://doi.org/10.1080/0142159X.2022.2134001
Martin JA, Regehr G, Reznick R, MacRae H, Murnaghan J, Hutchison C, Brown M (1997) Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg 84:273–278
Nayahangan LJ, Clementsen PF, Paltved C, Lindorff-Larsen KG, Nielsen BU, Konge L (2016) Identifying technical procedures in pulmonary medicine that should be integrated in a simulation-based curriculum: a national general needs assessment. Respiration 91:517–522
Toale C, Morris M, Konge L, Nayahangan LJ, Roche A, Heskin L, Kavanagh DO (2024) Generating a prioritized list of operative procedures for simulation-based assessment of general surgery trainees through consensus. Ann Surg 279:900–905
Mayne A, Wilson L, Kennedy N (2020) The usefulness of procedure-based assessments in postgraduate surgical training within the intercollegiate surgical curriculum programme; a scoping review. J Surg Educ 77:1227–1235
Moorthy K, Munz Y, Sarker SK, Darzi A (2003) Objective assessment of technical skills in surgery. BMJ 327:1032–1037
Vaidya A, Aydin A, Ridgley J, Raison N, Dasgupta P, Ahmed K (2020) Current status of technical skills assessment tools in surgery: a systematic review. J Surg Res 246:342–378
de Blacam C, O’Keeffe DA, Nugent E, Doherty E, Traynor O (2012) Are residents accurate in their assessments of their own surgical skills? Am J Surg 204:724–731
Schijven MP, Reznick RK, ten Cate OTJ, Grantcharov TP, Regehr G, Satterthwaite L, Thijssen AS, MacRae HM (2010) Transatlantic comparison of the competence of surgeons at the start of their professional career. Br J Surg 97:443–449
Dwyer T, Wright S, Kulasegaram KM, Theodoropoulos J, Chahal J, Wasserstein D, Ringsted C, Hodges B, Ogilvie-Harris D (2016) How to set the bar in competency-based medical education: standard setting after an Objective Structured Clinical Examination (OSCE). BMC Med Educ 16:1
Toale C, Morris M, Kavanagh DO (2022) Training and assessment using the LapSim laparoscopic simulator: a scoping review of validity evidence. Surg Endosc. https://doi.org/10.1007/s00464-022-09593-0
Andersen SAW, Nayahangan LJ, Park YS, Konge L (2021) Use of generalizability theory for exploring reliability of and sources of variance in assessment of technical skills: a systematic review and meta-analysis. Acad Med 96:1609–1619
Yudkowsky R, Park YS, Downing SM (2019) Assessment in health professions education. Routledge
Kramer A, Muijtjens A, Jansen K, Düsman H, Tan L, van der Vleuten C (2003) Comparison of a rational and an empirical standard setting procedure for an OSCE: objective structured clinical examinations. Med Educ 37:132–139
Yudkowsky R, Tumuluru S, Casey P, Herlich N, Ledonne C (2014) A patient safety approach to setting pass/fail standards for basic procedural skills checklists. Simul Healthc 9:277–282
Barsuk JH, Cohen ER, Wayne DB, McGaghie WC, Yudkowsky R (2018) A comparison of approaches for mastery learning standard setting. Acad Med 93:1079–1084
Sousa J, Mansilha A (2019) European panorama on vascular surgery: results from 5 years of FEBVS examinations. Angiologia e Cirurgia Vascular 15:171–175
Stefanidis D, Arora S, Parrack DM, Hamad GG, Capella J, Grantcharov T, Urbach DR, Scott DJ, Jones DB (2012) Research priorities in surgical simulation for the 21st century. Am J Surg 203:49–53
Seaman SJ, Jorgensen EM, Tramontano AC, Jones DB, Mendiola ML, Ricciotti HA, Hur HC (2021) Use of fundamentals of laparoscopic surgery testing to assess gynecologic surgeons: a retrospective cohort study of 10-years experience. J Minim Invasive Gynecol 28:794–800
Bilgic E, Kaneva P, Okrainec A, Ritter EM, Schwaitzberg SD, Vassiliou MC (2018) Trends in the Fundamentals of Laparoscopic Surgery® (FLS) certification exam over the past 9 years. Surg Endosc 32:2101–2105
Rooney DM, Brissman IC, Gauger PG (2015) Ongoing evaluation of video-based assessment of proctors’ scoring of the fundamentals of laparoscopic surgery manual skills examination. J Surg Educ 72:471–476
Ritter EM, Brissman IC (2016) Systematic development of a proctor certification examination for the Fundamentals of Laparoscopic Surgery testing program. Am J Surg 211:458–463
Xeroulis G, Dubrowski A, Leslie K (2009) Simulation in laparoscopic surgery: a concurrent validity study for FLS. Surg Endosc 23:161–165
General Medical Council UK (2024) Performance assessments as part of an investigation or hearing. GMC-UK. www.gmc-uk.org
Sousa J, Mansilha A (2021) European panorama on vascular surgery: results from five years of FEBVS examinations. Eur J Vasc Endovasc Surg 61:S41–S42
Louridas M, Szasz P, de Montbrun S, Harris KA, Grantcharov TP (2016) International assessment practices along the continuum of surgical training. Am J Surg 212:354–360

Funding

Open Access funding provided by the IReL Consortium. This work is supported by the Royal College of Surgeons/Hermitage Medical Clinic Strategic Academic Recruitment (StAR MD) program.

Author information

Authors and Affiliations

Department of Surgical Affairs, Royal College of Surgeons in Ireland, 121 St. Stephen’s Green, Dublin, Ireland
Conor Toale, Marie Morris, Oscar Traynor & Dara Kavanagh
SIM Centre for Simulation Education and Research, Royal College of Surgeons in Ireland, 123 St. Stephen’s Green, Dublin, Ireland
Adam Roche & Miroslav Voborsky

Corresponding author

Correspondence to Conor Toale.

Ethics declarations

Disclosures

Conor Toale received grant funding, as part of undertaking a higher degree, from the Royal College of Surgeons in Ireland/Hermitage Medical Clinic Strategic Academic Recruitment (StAR MD) program. Marie Morris, Adam Roche, Miroslav Voborsky, Oscar Traynor, and Dara Kavanagh have no conflicts of interest to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Data from this study were presented at the Annual Congress of the Association of Surgeons of Great Britain and Ireland, 17–19 May 2023, Harrogate, UK.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 64 KB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Toale, C., Morris, M., Roche, A. et al. Development and validation of a simulation-based assessment of operative competence for higher specialist trainees in general surgery. Surg Endosc (2024). https://doi.org/10.1007/s00464-024-11024-1

Received: 25 March 2024
Accepted: 30 June 2024
Published: 17 July 2024
DOI: https://doi.org/10.1007/s00464-024-11024-1
