Abstract
Introduction
Surgical rating scales (SRSs) enable the surgeon to uniformly quantify surgical working conditions. They are increasingly used as a primary outcome in studies evaluating the effect of anaesthesia or surgery-related interventions on the quality of the surgical work field. SRSs are especially used in laparoscopic surgery due to a renewed interest in deep neuromuscular block. There are, however, no guidelines regarding the uniform use of SRSs and the uniform reporting of results.
Methods
A systematic search was conducted in the databases of PubMed, Web of Science and Embase for studies that reported the use of an SRS to evaluate surgical conditions in laparoscopic surgery. Only original human research in the English language with full text availability through the Leiden University library was considered for this review. The full texts of eligible abstracts were independently reviewed by the first and second author. The quality of SRSs and methodology of rating were systematically reviewed.
Results
The search yielded 2830 reports, of which 17 were identified as using a surgical rating scale (SRS) in laparoscopic surgery. Ten of these reports used a unique SRS; these scales were systematically appraised for their quality. The overall quality of the SRSs was low: the majority of the scales were poorly described and lacked assessment of inter- and intra-rater reliability. In addition, considerable differences exist in the methodology of rating and the reporting of results.
Conclusion
There is substantial inconsistency in SRS quality, methodology, and results reporting. The uniform use of high-quality surgical rating scales is needed to improve the quality and reproducibility of future research.
Surgical rating scales (SRSs) are increasingly used to rate the quality of surgical working conditions. An SRS enables the surgeon to translate his or her experienced but subjective impression of the quality of the operative conditions into a standardised rating. The use of SRSs has potential benefits in daily practice and research. First, it offers a uniform platform for the surgeon to negotiate with the anaesthetist whether or not to improve or consolidate surgical working conditions induced by the anaesthetic. Second, surgical rating scales may be used in research to evaluate interventions and new techniques aimed at improving the surgical working conditions. Recent developments in the reversal of neuromuscular block by sugammadex have renewed the interest in the effect of deeper levels of neuromuscular block on surgical working conditions in laparoscopic surgery. In these studies, surgical rating scales are often used as the primary outcome [1,2,3,4,5,6,7,8]. However, guidelines on the use of surgical rating scales do not yet exist. This systematic review gives an overview of the use of SRSs in laparoscopic surgery and proposes guidance for future research.
Methods
The first author conducted a literature search assisted by the librarian of the Leiden University Medical Centre. The following query was used to search the PubMed database: (“rating scale” [tw] OR “rating scales” [tw] OR “Visual Analog Scale” [Mesh] OR Visual Analogue Scale* [tw] OR Visual Analog Scale* [tw] OR “scale” [tw] OR “scales” [tw] OR scaling* [tw] OR rating* [tw] OR scoring* [tw] OR “score” [tw] OR “scores” [tw] OR “scored” [tw] OR “grading” [tw] OR “grade” [tw] OR “graded” [tw]) AND (“surgical conditions” [tw] OR “surgical condition” [tw] OR “operating conditions” [tw] OR “operating condition” [tw] OR “surgical quality” [tw] OR “surgery quality” [tw] OR “surgical field” [tw]). Embase and Web of Science were searched with a similar query containing the following terms: “rating scale”, “visual analogue scale” (including the MeSH term), “scale”, “rating”, “scoring”, “score”, “grading”, “surgical conditions”, “operating conditions”, “surgical quality”, and “surgical field”. The databases were searched on 20 May 2017, without a date range limit. The results were screened on title and abstract by the first author. Relevant full text articles were retrieved and the reference lists of these articles were screened for any additional missed papers (snowball method). After this first selection, the full texts of the selected articles were reviewed by the first and second author for inclusion in the review.
Study inclusion criteria
Studies included in this systematic review were limited to original randomised controlled trials published in the English language with full text available through the Leiden University full text access service. Articles were included if the study (1) described a method to evaluate a surgical working condition or operating field or (2) applied a surgical rating scale for evaluation of surgical conditions in laparoscopic surgery. Included publications were assessed for the following items: type of rating scale, description of the scale items, number of raters, scoring moments, validation methods, and reporting of results.
Exclusion criteria
Reports that did not score surgical conditions as a whole, but only specific subparts such as “satisfaction of the surgeon”, were excluded.
Quality assessment of the rating scales
In general, the quality of a measurement instrument is critically dependent on its construct validity and reliability [9, 10]. Construct validity refers to the quality of the data based on the scores from a measurement instrument and whether it adequately represents the underlying construct (i.e. the surgical working conditions) [9]. For construct validity, the following domains are considered important: scale content, internal structure, response process, correlation to other variables, and clinical consequences [9,10,11]. These domains reflect both the internal quality of the rating instrument (scale content, internal structure, correlation to other variables) and how the rating instrument is used in practice (scoring methodology; response/rating process). To uniformly assess the quality of the identified SRSs in this review, an appraisal score was constructed. We are not aware of any pre-existing scores for the appraisal of surgical rating scales. In the appraisal score, relevant previously mentioned domains were translated into the following psychometric items: (1) scale length, (2) description of the scale items, (3) test–retest reliability, and (4) correlation with other variables (see Table 1). The appraisal score only assesses internal SRS quality; the scoring methodology is discussed separately. All SRSs were independently reviewed by the first and second author with the use of the appraisal score. Discrepancies were resolved by consensus. We will briefly explain the separate items of the appraisal score.
Length of the SRS
An SRS length of 5–7 items is considered optimal. Test–retest reliability, internal consistency, and discriminating power of scales with 5–7 items are generally superior to those of short scales (2–4 item points) or very large scales (> 10 item points) [12, 13]. In our appraisal score, scales with a length of 5–7 items received one point. Scales that contained fewer than 5 or more than 7 items were not granted any points.
Description of scale items
In a well-described scale, each item in the scale has a grade (e.g. moderate or excellent) plus a detailed description of the specific aspects of the surgical working field for that grade. An example of an SRS with an adequate scale item description is the Leiden-surgical rating scale. This scale is presented in Table 2 [1]. Scales that have an adequate description of the scale items were granted one point in the appraisal score. An inadequate or absent description of the scale items resulted in zero points in the appraisal score.
Test–retest reliability
Test–retest reliability assesses the reproducibility of ratings by one rater (intra-observer reliability) or between two (or more) raters (inter-observer reliability). Ideally, an SRS is assessed for both. The appraisal score grants one point for intra-observer and one point for inter-observer reliability verification. Hence, the maximum score in the appraisal score for this domain was two points.
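As an illustration of how agreement between two raters (or between two rating sessions of the same rater) could be quantified, the sketch below computes an unweighted Cohen's kappa. The choice of statistic is an assumption made for this example: the review does not prescribe a particular reliability measure, and for ordinal scales a weighted kappa or an intraclass correlation coefficient may be more appropriate. The function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of ratings.

    Pass the scores of two raters for inter-observer agreement, or two
    sessions of the same rater for intra-observer agreement.
    Illustrative sketch only; not the method of any reviewed study.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Proportion of cases in which both ratings are identical.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each list's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both lists contain one and the same category only
    return (observed - expected) / (1 - expected)


# Hypothetical ratings on a 5-point scale for eight procedures.
surgeon_1 = [5, 4, 4, 3, 5, 2, 4, 5]
surgeon_2 = [5, 4, 3, 3, 5, 2, 4, 4]
kappa = cohen_kappa(surgeon_1, surgeon_2)
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.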
Correlation with other variables
According to the domains of construct validity, a measurement instrument should be compared to another measurement instrument or variable that reflects the same underlying construct. In the appraisal score, if an SRS was compared with another scoring instrument or variable, it received one point. The absence of such a comparison would result in zero points.
The appraisal scoring system is given in Table 1. The maximum score that an SRS could receive was five points (excellent quality) and the lowest score was zero points (very poor quality).
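The scoring rules described above can be sketched in a few lines of code. This is a minimal illustration of the four appraisal items as stated in the text; the class name, field names, and the example scales are hypothetical and do not correspond to any scale in Table 3.

```python
from dataclasses import dataclass


@dataclass
class ScaleAppraisal:
    """Hypothetical record of what a report states about its SRS."""
    n_points: int                # scale length, e.g. 5 for a 5-point scale
    items_described: bool        # each grade has a detailed description
    intra_rater_assessed: bool   # intra-observer reliability verified
    inter_rater_assessed: bool   # inter-observer reliability verified
    compared_with_other: bool    # compared with another instrument/variable


def appraisal_score(scale: ScaleAppraisal) -> int:
    """Return the appraisal score (0 = very poor, 5 = excellent)."""
    score = 0
    # (1) Scale length: one point for 5-7 items, otherwise none.
    if 5 <= scale.n_points <= 7:
        score += 1
    # (2) Adequate description of the scale items: one point.
    if scale.items_described:
        score += 1
    # (3) Test-retest reliability: one point each for intra- and
    #     inter-observer verification (maximum two points).
    score += int(scale.intra_rater_assessed)
    score += int(scale.inter_rater_assessed)
    # (4) Correlation with other variables: one point.
    if scale.compared_with_other:
        score += 1
    return score


# Hypothetical examples, not taken from the reviewed studies.
undescribed_4_point = ScaleAppraisal(4, False, False, False, False)
fully_validated_5_point = ScaleAppraisal(5, True, True, True, True)
low = appraisal_score(undescribed_4_point)
high = appraisal_score(fully_validated_5_point)
```

Keeping each item as a separate field makes it easy to report which domain cost a scale its points, in addition to the total.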
Results
Included articles
The initial search yielded 2830 publications. After removing duplicates, non-English language and non-human research, we screened 873 abstracts, of which 763 non-relevant publications were discarded. The full texts of 110 reports were reviewed. The snowball method yielded 14 additional relevant studies. After full text review of the 124 selected articles, 15 reports were excluded because (1) the SRS was not used for assessment of surgical conditions, or (2) surgical conditions were not scored. Another 92 reports were excluded because (3) they concerned non-laparoscopic surgery. In total, 17 publications were included in this review. Figure 1 outlines the selection process. The unique SRSs were systematically judged for their quality with the use of the appraisal score. Overall, the quality of the majority of the SRSs was low (see Table 3).
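The counts in the selection process can be checked with simple arithmetic. The sketch below only restates the numbers given in the text; the function and key names are hypothetical.

```python
def selection_flow():
    """Recompute the study selection counts reported in the text."""
    abstracts_screened = 873
    discarded_on_abstract = 763
    # Full texts retrieved from the database search.
    full_texts_from_search = abstracts_screened - discarded_on_abstract
    # Additional studies found through the reference lists.
    from_snowball = 14
    full_texts_reviewed = full_texts_from_search + from_snowball
    # Exclusion reasons (1) and (2), then reason (3).
    excluded_srs_not_used_or_not_scored = 15
    excluded_non_laparoscopic = 92
    included = (
        full_texts_reviewed
        - excluded_srs_not_used_or_not_scored
        - excluded_non_laparoscopic
    )
    return {
        "full_texts_from_search": full_texts_from_search,
        "full_texts_reviewed": full_texts_reviewed,
        "included": included,
    }


flow = selection_flow()
```

The result reproduces the 110 full texts from the search, the 124 articles reviewed in full, and the 17 included publications.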
Surgical rating scales used in laparoscopic surgery
Seventeen studies used an SRS for evaluation of surgical conditions in laparoscopic surgery [1,2,3,4,5,6,7, 15,16,17,18,19,20,21,22,23,24]. The length of the individual scales varied: 3-, 4-, 5-, 6-, 11-, and 100-point scales were used. Most surgical rating scales were 4- or 5-point scales (see Table 4).
Four-point scales are commonly used for evaluation of surgical conditions, predominantly in laparoscopic gynaecologic surgery [2, 15, 16, 20]. However, in the quality appraisal, these 4-point scales were rated as poor-quality scales, as the length of 4-point scales was considered suboptimal (< 5 items) and all lacked test–retest reliability assessment.
Taylor et al. used a 5-point SRS to assess surgical conditions during cholecystectomy in relation to bowel distension and the use of nitrous oxide [19]. This scale also lacked test–retest reliability assessment. Martini et al. developed their 5-point Leiden-surgical rating scale (L-SRS) for use in laparoscopic retroperitoneal urologic surgery (see Table 2) [1]. The scale was later successfully used in bariatric surgery [4]. The scale items are well described and incorporate visibility of critical structures, working space, and muscle contractions as determinants of the surgical working field [1]. The 5-point L-SRS was assessed for inter-rater reliability by the original research group [1, 4]. In addition, Nervil et al. assessed both inter- and intra-rater reliability of a modified version of the 5-point L-SRS and an 11-point SRS [13]. Both the 5-point and 11-point SRS showed excellent intra-rater reliability and fair inter-rater reliability. Due to the lower inter-rater variability, the 5-point scale was considered superior [13]. The L-SRS has also been used by other research groups, including in laparoscopic donor nephrectomy [5, 21,22,23]. This supports the utility of this scale. In laparoscopic donor nephrectomy, the L-SRS is used to titrate insufflation pressures to the lowest possible level while maintaining good operating conditions.
Methodology and results reporting
Most studies reported a mean SRS score and a distribution of the scores (see Table 4). Some only reported the percentage of unacceptable surgical conditions, which was generally the frequency of scores on the lower half of the surgical rating scale [3, 17, 20]. In addition, the number and moments of scoring differed considerably, with some studies scoring every 10 or 15 min [1, 2, 4, 7, 19, 20], while others recorded one overall score at the end of surgery [3, 5, 8, 15, 17, 18, 21]. Some reports did not mention a scoring interval at all [16, 24]. In addition to the surgical rating scale, some studies assessed other outcomes as well, such as intra-abdominal space and the effect on insufflation pressures (see Table 4) [6, 15, 17, 20].
Discussion
Surgical rating scales (SRSs) are increasingly used in clinical research. These scales are used to translate the subjective perception of the surgical field by the surgeon into a more objective and reproducible integer on a fixed scale. Surgical rating scales are a useful tool to investigate the effect of surgery- or anaesthesia-related interventions on surgical working conditions. To get an indication of the variety of SRSs in use and their quality, we retrieved 17 relevant studies from the literature and identified 10 unique scales that are used in laparoscopic surgery. Since the introduction of sugammadex (a novel selective neuromuscular reversal agent), there has been a renewed interest in the application of deep neuromuscular block (NMB) in these types of surgery. This type of research relies heavily on the use of an SRS.
Based on our results, it is evident that the large number of rating scales in the literature comes with significant heterogeneity. First, there is ample difference in the quality of the rating scales. Second, there is no uniformity in the method of rating and reporting of the results. In general, the quality of the rating scales was low. The most frequently encountered problems were absence of test–retest reliability assessment, absence of a comparison with a different scoring instrument, and poor definition of the scale items. Only the Leiden-surgical rating scale received the highest quality score (see Table 3).
The methods of rating (rating methodology) and the reporting of the results of each study were also reviewed and revealed significant differences (see Table 4). For example, the moment of rating (at fixed time points versus at the end of surgery) and the number of raters (one vs. multiple) differed per study or were not detailed in the “Methods” section. This methodologic heterogeneity may impact results considerably. For instance, a surgical rating that is obtained at fixed time points during a procedure, e.g. every 15 min, may give a completely different result compared to one “overall rating” at the end of a procedure [4, 5]. Furthermore, the reporting of the SRS results varied considerably, with some reporting means or medians of the SRS, and others only the distribution of the SRS.
In this review, we aimed to uniformly appraise the quality of the identified SRSs. To be useful instruments, SRSs should display good psychometric properties, such as reliability and validity, and also be easy to use [9,10,11]. To this end, we created an appraisal score that was used to review these aspects of each SRS (see Table 1). The appraisal score allowed us to uniformly assess the quality of each SRS. Note, however, that the appraisal score is not evidence for validity of the results obtained with the SRS. Validity and reliability are not inherent properties of the rating instrument; rather, they reflect the interaction of the scale with the measure being tested. We are aware that our appraisal score may possess shortcomings and lacks formal validation. Therefore, others may judge the quality of the SRSs differently. Finally, it is important to realise that only English language literature was searched and that useful, high-quality rating scales may exist in non-English literature. In addition, high-quality surgical rating scales may exist in non-laparoscopic surgery; however, this is beyond the scope of this review.
The use of poor-quality SRSs combined with poor rating methodology for research is undesirable and reduces the validity of the results. While we do not intend to recommend a preferred SRS for specific procedures, we do propose some guidance in the use of SRSs. If a good-quality SRS in the field of interest is available, researchers should strongly consider using that scale. The use of existing SRSs increases the comparability of research. If validated SRSs are unavailable for specific surgical procedures, investigators can either choose to validate a pre-existing non-validated scale, or develop and validate a new scale. Any newly developed scale should be of high quality. The items mentioned in the appraisal score can act as a guideline for this. The validation procedure should assess both inter- and intra-rater reliability of a scale. In addition, the scale should be compared with other variables to increase its validity. See Table 5 for an overview of recommendations.
Finally, ratings should be obtained at predefined moments and researchers should report the following in their methods and results: number of individuals involved in the scoring and their surgical experience, time-stamp of scoring, mean and/or median SRS values, mean/median scorings at each time-stamp, and the distribution of the scorings. Uniformity of these aspects will improve comparability and reproducibility of this type of research.
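The reporting items listed above can be derived from a simple record of ratings. The sketch below is a minimal illustration, assuming each rating is stored as a (minute, rater, score) tuple; this data layout, the function name, and the example values are hypothetical. Surgical experience of the raters is not captured here and would need to be reported separately.

```python
from collections import Counter, defaultdict
from statistics import mean, median


def summarise_ratings(ratings):
    """Summarise SRS ratings given as (minute, rater, score) tuples."""
    scores = [score for _, _, score in ratings]
    # Group the scores by the time-stamp at which they were obtained.
    by_time = defaultdict(list)
    for minute, _, score in ratings:
        by_time[minute].append(score)
    return {
        "n_raters": len({rater for _, rater, _ in ratings}),
        "mean": mean(scores),
        "median": median(scores),
        # (mean, median) of the scores at each time-stamp, in time order.
        "per_time": {
            t: (mean(v), median(v)) for t, v in sorted(by_time.items())
        },
        # Number of times each score was given.
        "distribution": dict(sorted(Counter(scores).items())),
    }


# Hypothetical ratings on a 5-point scale: two raters, every 15 min.
example = [
    (15, "A", 4), (15, "B", 5),
    (30, "A", 3), (30, "B", 4),
    (45, "A", 5), (45, "B", 5),
]
summary = summarise_ratings(example)
```

Reporting the per-time-stamp values alongside the overall summary keeps fixed-interval ratings comparable with studies that only report a single overall rating.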
In conclusion, this review found that multiple surgical rating scales have been used in laparoscopic surgery to assess the quality of the surgical field. The majority of the scales are of low quality and the method of rating and reporting of results differed considerably. The uniform use of high-quality surgical rating scales is needed to improve the quality and reproducibility of future research.
References
Martini CH, Boon M, Bevers RF, Aarts LP, Dahan A (2014) Evaluation of surgical conditions during laparoscopic surgery in patients with moderate vs deep neuromuscular block. Br J Anaesth 112:498–505
Dubois PE, Putz L, Jamart J, Marotta ML, Gourdin M, Donnez O (2014) Deep neuromuscular block improves surgical conditions during laparoscopic hysterectomy: a randomised controlled trial. Eur J Anaesthesiol 31:430–436
Blobner M, Frick CG, Stauble RB, Feussner H, Schaller SJ, Unterbuchner C, Lingg C, Geisler M, Fink H (2015) Neuromuscular blockade improves surgical conditions (NISCO). Surg Endosc 29:627–636
Torensma B, Martini CH, Boon M, Olofsen E, Liem RS, Knook MT, Swank DJ, Dahan A (2016) Deep neuromuscular block improves surgical conditions during bariatric surgery and reduces postoperative pain: a randomized double blind controlled trial. PLoS ONE 11:e0167907
Baete S, Vercruysse G, Vander Laenen M, De Vooght P, Van Melkebeek J, Dylst D, Beran M, Van Zundert J, Heylen R, Boer W, Van Boxstael S, Fret T, Verhelst H, De Deyne C, Jans F, Vanelderen P (2017) The effect of deep versus moderate neuromuscular block on surgical conditions and postoperative respiratory function in bariatric laparoscopic surgery: a randomized, double blind clinical trial. Anesth Analg 124:1469–1475
Kim MH, Lee KY, Lee KY, Min BS, Yoo YC (2016) Maintaining optimal surgical conditions with low insufflation pressures is possible with deep neuromuscular blockade during laparoscopic colorectal surgery: a prospective, randomized, double-blind, parallel-group clinical trial. Medicine 95:e2920
Boon M, Martini C, Hellinga M, Bevers R, Aarts L, Dahan A (2016) Influence of variations in arterial PCO2 on surgical conditions during laparoscopic retroperitoneal surgery. Br J Anaesth 117:59–65
King M, Sujirattanawimol N, Danielson DR, Hall BA, Schroeder DR, Warner DO (2000) Requirements for muscle relaxants during radical retropubic prostatectomy. Anesthesiology 93:1392–1397
Cook DA, Beckman TJ (2006) Current concepts in validity and reliability for psychometric instruments: theory and application. Am J Med 119:166.e7–166.e16
Messick S (1989) Validity. In: Linn RL (ed) Educational measurement. American Council on Education and Macmillan, New York
Keszei AP, Novak M, Streiner DL (2010) Introduction to health measurement scales. J Psychosom Res 68:319–323
Preston CC, Colman AM (2000) Optimal number of response categories in rating scales: reliability, validity, discriminating power, and respondent preferences. Acta Psychol (Amst) 104:1–15
Nervil GG, Medici R, Thomsen JLD, Staehr-Rye AK, Asadzadeh S, Rosenberg J, Gatke MR, Madsen MV (2017) Validation of subjective rating scales for assessment of surgical workspace during laparoscopy. Acta Anaesthesiol Scand 61:1270–1277
Karlsten R, Kristensen JD (1993) Nitrous oxide does not influence the surgeon’s rating of operating conditions in lower abdominal surgery. Eur J Anaesthesiol 10:215–217
Madsen MV, Gatke MR, Springborg HH, Rosenberg J, Lund J, Istre O (2015) Optimising abdominal space with deep neuromuscular blockade in gynaecologic laparoscopy—a randomised, blinded crossover study. Acta Anaesthesiol Scand 59:441–447
Williams MT, Rice I, Ewen SP, Elliott SM (2003) A comparison of the effect of two anaesthetic techniques on surgical conditions during gynaecological laparoscopy. Anaesthesia 58:574–578
Koo BW, Oh AY, Seo KS, Han JW, Han HS, Yoon YS (2016) Randomized clinical trial of moderate versus deep neuromuscular block for low-pressure pneumoperitoneum during laparoscopic cholecystectomy. World J Surg 40:2898–2903
Rosenberg J, Herring WJ, Blobner M, Mulier JP, Rahe-Meyer N, Woo T, Li MK, Grobara P, Assaid CA, Fennema H, Szegedi A (2017) Deep neuromuscular blockade improves laparoscopic surgical conditions: a randomized, controlled study. Adv Ther 34:925–936
Taylor E, Feinstein R, White PF, Soper N (1992) Anesthesia for laparoscopic cholecystectomy. Is nitrous oxide contraindicated? Anesthesiology 76:541–543
Staehr-Rye AK, Rasmussen LS, Rosenberg J, Juul P, Lindekaer AL, Riber C, Gatke MR (2014) Surgical space conditions during low-pressure laparoscopic cholecystectomy with deep versus moderate neuromuscular blockade: a randomized clinical study. Anesth Analg 119:1084–1092
Yoo YC, Kim NY, Shin S, Choi YD, Hong JH, Kim CY, Park H, Bai SJ (2015) The intraocular pressure under deep versus moderate neuromuscular blockade during low-pressure robot assisted laparoscopic radical prostatectomy in a randomized trial. PLoS ONE 10:e0135412
Ozdemir-van Brunschot DMD, Braat AE, van der Jagt MFP, Scheffer GJ, Martini CH, Langenhuijsen JF, Dam RE, Huurman VA, Lam D, d’Ancona FC, Dahan A, Warle MC (2018) Deep neuromuscular blockade improves surgical conditions during low-pressure pneumoperitoneum laparoscopic donor nephrectomy. Surg Endosc 32:245–251
Ozdemir-van Brunschot DMD, Scheffer GJ, van der Jagt M, Langenhuijsen H, Dahan A, Mulder J, Willems S, Hilbrands LB, Donders R, van Laarhoven C, d’Ancona FA, Warle MC (2017) Quality of recovery after low-pressure laparoscopic donor nephrectomy facilitated by deep neuromuscular blockade: a randomized controlled study. World J Surg 41:2950–2958
Caldwell JE, Braidwood JM, Simpson DS (1985) Vecuronium bromide in anaesthesia for laparoscopic sterilization. Br J Anaesth 57:765–769
Kim HJ, Lee K, Park WK, Lee BR, Joo HM, Koh YW, Seo YW, Kim WS, Yoo YC (2015) Deep neuromuscular block improves the surgical conditions for laryngeal microsurgery. Br J Anaesth 115:867–872
Disclosures
Martijn Boon, Christian H. Martini, Leon P. H. J. Aarts, Albert Dahan received speaker and/or consultancy fees from MSD Nederland BV.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Boon, M., Martini, C.H., Aarts, L.P.H.J. et al. The use of surgical rating scales for the evaluation of surgical working conditions during laparoscopic surgery: a scoping review. Surg Endosc 33, 19–25 (2019). https://doi.org/10.1007/s00464-018-6424-5
DOI: https://doi.org/10.1007/s00464-018-6424-5