Ruling out Appendicitis in Children: Can We Use Clinical Prediction Rules?

Purpose To identify available clinical prediction rules (CPRs) and investigate their ability to rule out appendicitis in children presenting with abdominal pain at the emergency department, and accordingly select CPRs that could be useful in a future prospective cohort study.
Methods A literature search was conducted to identify available CPRs. These were subsequently tested in a historical cohort from a general teaching hospital, comprising all children (< 18 years) who visited the emergency department between 2012 and 2015 with abdominal pain. Data were extracted from the electronic patient files and scores of the identified CPRs were calculated for each patient. Negative likelihood ratios were calculated only for those CPRs whose score could be determined for at least 50% of patients.
Results Twelve CPRs were tested in a cohort of 291 patients, of whom 87 (29.9%) suffered from acute appendicitis. The Ohmann score, Alvarado score, modified Alvarado score, Pediatric Appendicitis score, Low-Risk Appendicitis Rule Refinement, Christian score, and Low Risk Appendicitis Rule had a negative likelihood ratio < 0.1. The Modified Alvarado Scoring System and Lintula score had a negative likelihood ratio > 0.1. Three CPRs were excluded because the score could not be calculated for at least 50% of patients.
Conclusion This study identified seven CPRs that could be used in a prospective cohort study to compare their ability to rule out appendicitis in children and to investigate whether clinical monitoring and re-evaluation, instead of additional investigations (i.e., ultrasound), is a safe strategy when suspicion of appendicitis is low.


Introduction
The diagnosis of acute appendicitis in children remains challenging, as symptoms can vary from mild abdominal pain to generalized peritonitis and septicemia. Historically, the diagnosis of appendicitis was mainly based upon clinical examination in combination with biochemical variables indicative of inflammation. A disadvantage of this diagnostic strategy was the relatively high negative appendectomy rate of 12.3-19%. 1,2 To reduce this, an evidence-based guideline was proposed in 2010 by the Association of Surgeons of the Netherlands, which makes preoperative imaging mandatory in patients with suspected appendicitis. 3 Ultrasound is the preferred initial diagnostic imaging modality in both the adult and pediatric population. 3 Implementation of this guideline resulted in a significant decrease of negative appendectomies to 2.2-5%. 2,4 Currently in the Netherlands, preoperative imaging studies are performed in 99.7% of adult patients. 4 A consequence of the abovementioned policy is that the threshold to perform additional imaging studies is low in children presenting at the emergency department, especially since ultrasonography (US) can be performed quickly with minimal burden and harm for the patient. The downside of this lower threshold is the risk of inconclusive ultrasound results, which may lead to exposure of children to harmful and expensive diagnostic procedures, such as CT scans, MRIs, or even diagnostic laparoscopies. [5][6][7] Instead of these invasive diagnostic procedures, literature suggests that watchful waiting could be considered after non-visualization of the appendix on ultrasound. 8 Selection of patients with high probability of acute appendicitis would help to reduce exposure to the abovementioned invasive diagnostic procedures. Clinical prediction rules (CPRs), such as the Alvarado score, 9 were initially designed to diagnose appendicitis, but may also be used to rule out appendicitis.
CPRs mostly consist of variables from medical history, physical examination, and biochemical testing. Large heterogeneity exists between CPRs in terms of included variables and cutoff values. Several studies showed that the value of these CPRs to diagnose appendicitis is low, reflected by positive likelihood ratios ranging from 1.7 to 8.5. [10][11][12][13][14][15] Data regarding their value in ruling out appendicitis in the pediatric population are scarce. 10,16 The first objective of this study was to identify commonly applied CPRs through a literature search. The second objective was to investigate the value of the identified CPRs in ruling out appendicitis in the pediatric population in the Netherlands based on the negative likelihood ratios, and thereby select CPRs that could potentially be used in a future prospective cohort study. Additionally, in order to determine whether the use of imaging modalities could be reduced by adopting CPRs to rule out appendicitis, we determined the number of imaging procedures performed in patients who were classified as low risk for the disease according to these CPRs.

Identification of the CPRs: Literature Review
Initially, a literature search (according to the PRISMA guidelines) was performed in the PubMed database to identify potentially usable CPRs (Appendix 1). 17 Studies were screened by title and abstract and subsequently assessed in full text by two independent reviewers. Disagreements were resolved by consensus. In addition, references from the included articles were screened to identify other CPRs. No databases other than PubMed were screened for potential CPRs. Studies about CPRs that were developed to diagnose or exclude appendicitis were included in this review. A CPR was excluded if it contained variables only applicable to the adult population (e.g., points attributed to age > 50 years). CPRs consisting of more than 15 variables or variables that needed multiplication were considered impractical in an emergency department and were therefore excluded. CPRs described in languages other than English and CPRs containing variables that were not routinely determined in our hospital, such as rectal-axillary temperature difference, were also excluded.

Study Design and Selection of Participants
A single-center historical cohort study was conducted in a general teaching hospital. All children younger than 18 years presenting at the emergency department between January 1st 2012 and December 31st 2014 with abdominal pain were eligible for inclusion. A consecutive sample of patients with a differential diagnosis of appendicitis, identified using the International Classification of Diseases (ICD) codes for acute abdomen, acute appendicitis, and general abdominal complaints, was used. The treating physician assigned these codes at the time of presentation at the emergency department. Children with abdominal pain due to trauma, those presenting with a main complaint other than abdominal pain, those not cooperating with physical examination, and those referred to another hospital were excluded.

Data Extraction
Data were extracted from electronic patient files using a standardized form (Appendix 2), based upon the variables used in the identified CPRs. One author (PA) performed the data extraction and 10% of the database was randomly reviewed for completeness by another author (RG). Information on the following variables was extracted:
General: gender, age (years), and date of presentation.
Imaging variables: free fluid on ultrasonography (US), appendicolith on US, appendicular wall thickening (wall thickness > 0.7 cm) on US, appendicular abscess or suppuration on US, performance of computed tomography (CT) abdomen, and performance of magnetic resonance imaging (MRI) abdomen.
The following definitions were used in this study.
Appendicitis: Intraoperative diagnosis made by the treating surgeon in combination with pathologically proven inflammation of the appendix was used as the reference standard. Patients with radiographically documented appendicitis who were managed by antibiotics alone did not have pathology reports and were therefore excluded from this study. 18
Non-appendicitis: No recurrence of abdominal pain or diagnosis/treatment for appendicitis by 30 days after initial presentation without any specific treatment for appendicitis. Readmission was checked for the follow-up of all cases of non-appendicitis. Telephone follow-up was not performed and we classified any patient who did not subsequently return to our hospital for re-evaluation as non-appendicitis. Children with negative appendectomy were classified as non-appendicitis as well.
CPR scores were only calculated if all of the required variables were included in the patients' records. CPRs were excluded from the analysis if the score could be calculated in less than 50% of the patients in the cohort. Cutoff values to rule out appendicitis, as presented in the original manuscript, were used to calculate the performance of the CPRs. When several cutoff values were reported, patients with a negative test result according to the lower original cutoff value of the CPRs were classified as low suspicion of appendicitis.
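The completeness rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual procedure: the rule `TOY_RULE`, its variable names, and its point weights are hypothetical and do not correspond to any published CPR, and the function names are introduced here only for the example.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical toy rule (variable name -> points); NOT a published CPR.
TOY_RULE = {"rlq_tenderness": 2, "nausea": 1, "fever": 1}

Record = Mapping[str, Optional[bool]]


def cpr_score(record: Record, weights: Mapping[str, int]) -> Optional[int]:
    """Return the score, or None if any required variable is not recorded."""
    if any(record.get(name) is None for name in weights):
        return None
    return sum(points for name, points in weights.items() if record[name])


def calculable_fraction(records: Sequence[Record],
                        weights: Mapping[str, int]) -> float:
    """Fraction of patients for whom the score can be calculated."""
    scores = [cpr_score(r, weights) for r in records]
    return sum(s is not None for s in scores) / len(scores)


def keep_for_analysis(records: Sequence[Record],
                      weights: Mapping[str, int],
                      threshold: float = 0.5) -> bool:
    """A CPR is kept only if it is calculable for at least `threshold` of patients."""
    return calculable_fraction(records, weights) >= threshold


# Made-up records for illustration only.
records = [
    {"rlq_tenderness": True, "nausea": False, "fever": True},
    {"rlq_tenderness": True, "nausea": None, "fever": False},   # nausea missing
    {"rlq_tenderness": False, "nausea": False, "fever": False},
    {"rlq_tenderness": True},                                   # two variables missing
]
```

Note that a score of zero is a valid result and is distinct from a score that cannot be calculated, which is why missing scores are represented by `None` rather than `0`.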

Data Analysis
IBM SPSS Statistics version 22.0 was used for descriptive analysis of our data. The likelihood ratio of a negative test was calculated for each CPR and is displayed as value with 95% CI. A CPR with a negative likelihood ratio < 0.1 was considered adequate to rule out appendicitis. 19 Secondary outcomes in terms of sensitivity and negative predictive value are displayed as % with 95% CI. Performed imaging studies are displayed as numbers and percentages.
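The outcome measures named above can be computed from a 2x2 table as in the following sketch. The text does not state how the 95% CI of the negative likelihood ratio was obtained, so the log-transformation method used here is an assumption, as are the function and field names. When there are no false negatives the log method is undefined, so the sketch returns no interval in that case.

```python
import math
from typing import NamedTuple, Optional, Tuple


class RuleOutStats(NamedTuple):
    sensitivity: float
    npv: float
    lr_negative: float
    lr_negative_ci: Optional[Tuple[float, float]]


def rule_out_stats(tp: int, fp: int, fn: int, tn: int,
                   z: float = 1.96) -> RuleOutStats:
    """Sensitivity, NPV, and negative likelihood ratio from a 2x2 table.

    tp/fn: patients with appendicitis scoring above/below the cutoff.
    fp/tn: patients without appendicitis scoring above/below the cutoff.
    """
    if tp + fn == 0 or tn == 0:
        raise ValueError("need at least one diseased patient and one true negative")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)
    # LR- = (1 - sensitivity) / specificity
    lr_neg = (fn / (tp + fn)) / specificity

    if fn == 0:
        # ln(0) is undefined, so the log-method interval cannot be computed.
        return RuleOutStats(sensitivity, npv, lr_neg, None)

    # Assumed CI method: normal approximation on the log scale.
    se_log = math.sqrt(1 / fn - 1 / (tp + fn) + 1 / tn - 1 / (tn + fp))
    lower = math.exp(math.log(lr_neg) - z * se_log)
    upper = math.exp(math.log(lr_neg) + z * se_log)
    return RuleOutStats(sensitivity, npv, lr_neg, (lower, upper))


# Made-up counts for illustration; not data from this study.
example = rule_out_stats(tp=45, fp=20, fn=5, tn=80)
```

With small numbers of false negatives the interval becomes wide, so a point estimate below 0.1 can coexist with an upper confidence limit above 0.1.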

Identification of CPRs: Literature Review
In total, 19 CPRs were identified, of which seven were excluded (Fig. 1). Reasons for exclusion were multiplication of variables (n = 2), 20,21 consisting of > 15 variables (n = 1), 22 not applicable in children (n = 1), 23 and included variables not obtained routinely (n = 3). [24][25][26] Variables that were not routinely obtained in our hospital were a priori suspicion of appendicitis (low, intermediate, high), rectal-axillary temperature difference, and classification of rebound tenderness into light, medium, and strong.

Results of the CPRs Retrospectively Tested in our Cohort
In total, 311 patients were identified in the defined time period, of whom 20 were excluded for the following reasons: abdominal pain caused by trauma (13 patients), presentation with a main complaint other than abdominal pain (four patients), transfer to an academic hospital (two patients), and no cooperation with physical examination (one patient).
The general characteristics of the 291 included patients are listed in Table 1. In total, 87 (29.9%) patients were diagnosed with acute appendicitis. Table 2 shows the number of patients (%) for which each CPR could be calculated. The RIPASA, modified Lindberg, and Fenyö score were excluded from further analysis, as scores could be calculated for fewer than 50% of patients, mainly due to missing data. Aggravation with cough, progression of pain, and Rovsing's sign were the variables with the most frequently missing values for the Fenyö score, modified Lindberg, and RIPASA score, respectively. The negative likelihood ratio, sensitivity, and negative predictive value for the CPRs are presented in Table 3, which divides the CPRs into those developed for the pediatric population and those developed for the adult population. The point estimate of the negative likelihood ratio of seven CPRs was < 0.1. These were the Ohmann score (0), Alvarado score (0.03, 95% CI, 0.00-0.20), modified Alvarado score (MAS-Shera) (0.03, 95% CI, 0.00-0.23), Pediatric Appendicitis score (PAS) (0.07, 95% CI, 0.00-0.22), Low-Risk Appendicitis Rule Refinement (LRARR) (0.07, 95% CI, 0.02-0.23), Christian score (0.08, 95% CI, 0.00-0.22), and Low Risk Appendicitis Rule (LRAR) (0.09, 95% CI, 0.04-0.25). Table 4 presents numbers of patients with low suspicion of appendicitis according to each of the seven CPRs with a negative likelihood ratio < 0.1 in whom additional imaging was performed. In 30-46% of these patients, additional imaging studies had been performed during diagnostic work-up to exclude appendicitis. Nine patients had a false negative test result according to at least one of these CPRs and were

Discussion
The aim of this study was to investigate the value of CPRs in ruling out appendicitis in our retrospective cohort in terms of negative likelihood ratio in order to select CPRs that could potentially be included in a future prospective cohort study.
In this study, seven CPRs had a negative likelihood ratio point estimate < 0.1 and could therefore impact clinical decision-making. 19 These CPRs might thus be used in a future prospective cohort study comparing their ability to rule out appendicitis in children presenting with abdominal pain at the emergency department. Depending on the CPR used, appendicitis was diagnosed within 30 days in no more than 4% of the patients with a low suspicion of appendicitis. In 30-46% of patients with a low suspicion of appendicitis, additional imaging studies had been undertaken.
Only a few studies have investigated the value of CPRs in ruling out appendicitis in children, and they mostly expressed this value by sensitivity. In our opinion, the discriminatory power of a diagnostic test is best expressed by likelihood ratios, as these are not influenced by disease prevalence. 38 Recent systematic reviews, comprising 10-12 prospective derivation and validation studies with a total of around 4000 children, investigated the Alvarado score and PAS in the pediatric population and found negative likelihood ratios for these CPRs that were similar to our results; for the Alvarado score, negative likelihood ratios between 0.03 (95% CI, 0-0.36) and 0.38 (95% CI, 0.21-0.70) were found. Regarding the PAS, negative likelihood ratios ranging between 0 and 0.27 (95% CI, 0.20-0.43) have been reported. 10,16 Differences between the published negative likelihood ratios of the Alvarado score might be caused by the different cutoff values that were used in the systematic reviews. 11,39 Furthermore, daily practice concerning the use of additional imaging might differ between countries. 39 Regarding the PAS, modest differences in negative likelihood ratio compared to the results in our study could be explained by the prospective nature of the included studies (versus our retrospective study) and by different inclusion criteria of the included population.
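As a worked illustration of what a negative likelihood ratio of 0.1 implies, the standard odds form of Bayes' theorem can be applied to the overall prevalence in this cohort (87 of 291 patients). This is an illustrative calculation only, assuming a negative likelihood ratio exactly at the 0.1 threshold; it is not a result reported by the study, and each CPR was calculable in a different subset of patients, so the prevalence in those subsets may differ.

```latex
\[
\text{post-test odds} = \text{pre-test odds} \times LR^{-},
\qquad
LR^{-} = \frac{1 - \text{sensitivity}}{\text{specificity}}
\]
\[
\text{pre-test odds} = \frac{87}{291 - 87} = \frac{87}{204} \approx 0.43,
\qquad
\text{post-test odds} \approx 0.43 \times 0.1 = 0.043,
\qquad
\text{post-test probability} \approx \frac{0.043}{1 + 0.043} \approx 0.04
\]
```

Under these assumptions, a negative test result would lower the probability of appendicitis from about 30% to about 4%. The likelihood ratio itself does not depend on prevalence, but the post-test probability does, so the same CPR would yield a different residual risk in a setting with a different pre-test probability.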
To our knowledge, we are the first to present negative likelihood ratios of other CPRs in addition to the Alvarado score and PAS in the same cohort. Furthermore, this study included multiple CPRs that do not incorporate extensive laboratory parameters. Multiple biochemical variables that are included in most CPRs, such as neutrophil count and leukocyte differentiation, are not routinely tested in the Netherlands when a child presents at the emergency department. Because of the identification and inclusion of CPRs both with and without extensive laboratory parameters, we were able to present a complete overview of all CPRs that can potentially be used in future prospective studies comparing their ability to rule out appendicitis. In order to present a complete overview of all potential CPRs, we set a low cutoff value of at least 50% of available data per CPR for inclusion in our analysis. We do realize that this cutoff value is low, but the aim of this study was to identify CPRs, investigate their potential in ruling out appendicitis, and investigate their appropriateness in the current diagnostic work-up as performed in the Netherlands in order to select them for a prospective cohort study. This cutoff value was determined prior to the identification of CPRs, and we realize that the use of a higher cutoff value might have led to a more stringent selection of only those CPRs that are most appropriate in our population. The evidence-based guideline regarding the diagnosis and treatment of appendicitis, introduced by the Association of Surgeons of the Netherlands in 2010, emphasized reduction of the negative appendectomy rate. Imaging procedures are advocated to improve diagnostic accuracy and the consequence of this change has been the increased utilization of ultrasound as the initial imaging modality to evaluate abdominal pain in children in the Netherlands. 40
In 2010, preoperative imaging procedures were performed in 44% of patients presenting at the emergency room with abdominal pain in the Netherlands, compared to only 22% of the patients a decade earlier. 3,41 Currently in the Netherlands, preoperative imaging is performed in 99.7% of patients. 4 A recent study conducted in the USA found that 99.7% of pediatric patients underwent preoperative imaging studies as well. 42 This differs significantly from the performance of preoperative imaging in the UK, where preoperative ultrasound and computed tomography (CT) were performed in 19.9% and 12.9% of patients, respectively. 43 Ultrasound has a high frequency of inconclusive results, reported to range from 37 to 51% in the pediatric population. 7,44 Increased performance of ultrasound therefore results in increased use of costly and potentially harmful imaging studies, such as CT and MRI, in pediatric patients.
In this study, in 30-46% of patients with a low suspicion of appendicitis according to these CPRs, additional imaging studies had been undertaken, whereas in no more than 4% of these patients (depending on the CPR used) acute appendicitis was diagnosed within 30 days. Nonetheless, because of the retrospective nature of this study, it is possible that these additional imaging studies were performed not solely to diagnose appendicitis, but also to exclude other potential diagnoses. Still, it raises the question whether or not watchful waiting should be considered for children with a low suspicion of appendicitis instead of additional imaging studies to rule out appendicitis. Opponents of this less aggressive diagnostic work-up mostly fear perforation of the appendix in case of complicated appendicitis. 45 However, several studies have not found clinical observation or re-evaluation to be associated with a significantly higher incidence of complicated appendicitis and perforation. 46,47 Time to presentation at the emergency department appears to be the main factor associated with perforation in children with appendicitis. 46,48 Furthermore, literature suggests that perforation can rarely be prevented, implying that a correct diagnosis is more important than a rapid treatment strategy. 49
This study has several limitations. First, due to the single-center nature, generalizability might be reduced, although it was performed in a general teaching hospital. Second, the retrospective nature of this study might have led to selection bias and information bias. In case of wrong ICD code classification, patients might have been missed. We do realize that inclusion of 291 patients in 3 years' time seems to be low for a large teaching hospital.
This low number of patients could be explained by the fact that we only used ICD codes of acute appendicitis, acute abdomen, and general abdominal pain, because inclusion of children presenting with, for example, mainly symptoms of urinary tract infection would artificially decrease negative likelihood ratios and overestimate the value of the CPRs. Patient files with missing data were left out of the analysis. As a result, for only 34-90% of the patients the CPRs could be calculated. However, the aim of this study was not only to determine the ability of CPRs to rule out appendicitis in our cohort but also to investigate their appropriateness within the current diagnostic work-up as performed in the Netherlands. As mentioned previously, this study was performed in order to select appropriate and useful CPRs for a future prospective cohort study that can compare their value in ruling out appendicitis in our cohort. Nonetheless, missing data could have led to a selection bias, whereby the results of a CPR may have been inflated, leading to a low negative likelihood ratio. For example, in our population the CPR with the optimal negative likelihood ratio (Ohmann score) could only be calculated for 50.2% of the total population. Third, the small sample size causes wide confidence intervals for the calculated accuracy statistics.
In addition, CPRs are prone to subjective interpretation by the treating physician (e.g., variables of physical examination). In a prospective study by Mandeville, interobserver agreement of 88% and 83.5% was found for the Alvarado score and the PAS, respectively. 48 Another problem is the potential partial verification bias. Patients were classified in the non-appendicitis group if there was no recurrence of abdominal pain during 30 days after initial presentation. Patients who went to another hospital during these 30 days could have been missed. On the other hand, there is no nearby facility that is comparable to the general teaching hospital where this study was conducted. Therefore, it can be expected that patients attend the same emergency department as during their initial presentation. Another issue is that a number of patients were included in the appendicitis group despite the fact that intraoperative or histopathological findings were not obtained, potentially also leading to misclassification. However, these patients were included in the APAC study, in which a radiologically confirmed simple appendicitis was an inclusion criterion.
In conclusion, we identified seven CPRs that could potentially be used in a future prospective cohort study to compare their ability to rule out appendicitis in the pediatric population in the Netherlands and other countries with comparable diagnostic work-up. Further prospective studies are needed to investigate if imaging studies could safely be omitted and be replaced by clinical monitoring or re-evaluation in children with a low CPR score.
Author Contributions PA: Study concept and design; acquisition of the data, analysis, and interpretation of the data; and drafting of the manuscript.
RG: Study concept and design, analysis and interpretation of the data, drafting of the manuscript, and critical revision of the manuscript.
JL: Critical revision of the manuscript for important intellectual content and statistical expertise.
HC: Drafting of the manuscript and critical revision of the manuscript for important intellectual content.
RB: Drafting of the manuscript and critical revision of the manuscript for important intellectual content.
HH: Drafting of the manuscript and critical revision of the manuscript for important intellectual content.

Compliance with Ethical Standards
Conflict of Interest The authors declare that they have no conflicts of interest.

Appendix 1
PubMed database search
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.