Intensive Care Medicine, Volume 33, Issue 4, pp 619–624

Reproducibility of physiological track-and-trigger warning systems for identifying at-risk patients on the ward

  • Christian P. Subbe
  • Haiyan Gao
  • David A. Harrison



Objective

Physiological track-and-trigger warning systems are used to identify, as early as possible, patients on acute wards who are at risk of deterioration. The objective of this study was to assess the inter-rater and intra-rater reliability of the physiological measurements, aggregate scores and triggering events of three such systems.


Design

Prospective cohort study.


Setting

General medical and surgical wards in one non-university acute hospital.

Patients and participants

Unselected ward patients: 114 patients in the inter-rater study and 45 patients in the intra-rater study were examined by four raters.

Measurements and results

Physiological observations obtained at the bedside were evaluated using three systems: the medical emergency team call-out criteria (MET); the modified early warning score (MEWS); and the assessment score of sick-patient identification and step-up in treatment (ASSIST). Inter-rater and intra-rater reliability were assessed by intra-class correlation coefficients, kappa statistics and percentage agreement. There was fair to moderate agreement on most physiological parameters, and fair agreement on the scores, but better levels of agreement on triggers. Reliability was partially a function of simplicity: MET achieved a higher percentage of agreement than ASSIST, and ASSIST higher than MEWS. Intra-rater reliability was better than inter-rater reliability. Using corrected calculations improved the level of inter-rater agreement but not intra-rater agreement.
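To make two of the agreement measures named above concrete, the following is a minimal sketch of percentage agreement and an unweighted two-rater (Cohen's) kappa for a binary trigger decision. It is an illustration only: the abstract does not state which kappa variant or software the authors used, the study involved four raters rather than two, and the function names and the ten example ratings are hypothetical, not study data.

```python
from collections import Counter


def percentage_agreement(rater_a, rater_b):
    """Percentage of patients on whom the two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: observed agreement corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of agreement.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        # Both raters used one identical category throughout: kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: 1 = system triggered, 0 = did not trigger.
rater_a = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
rater_b = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

agreement = percentage_agreement(rater_a, rater_b)
kappa = cohen_kappa(rater_a, rater_b)
print(f"agreement = {agreement:.1f}%, kappa = {kappa:.3f}")
```

In this made-up example the raters agree on 9 of 10 patients, yet kappa is noticeably lower than the raw agreement because most patients do not trigger and much of the agreement would be expected by chance. This is one reason percentage agreement and kappa can rank scoring systems differently.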


Conclusions

There was significant variation in the reproducibility of different track-and-trigger warning systems. The systems examined showed better levels of agreement on triggers than on aggregate scores. Simpler systems had better reliability. Inter-rater agreement might be improved by calculating scores electronically.


Keywords

Observer variation · Reproducibility of results · Critical illness · Scoring systems



Acknowledgements

This study was funded by the UK National Health Service Research and Development Service Delivery and Organisation Programme (SDO/74/2004). The authors thank S. Ameeth, S. Collins, K. Ghosh, C. Rincon and J. Tobler for their help in preparing the study, obtaining consent from patients and collecting the data. We thank A. Pawley for entering data into electronic format and L. Gemmell for advising on the format and facilitating the setup of the study.

Supplementary material

134_2006_516_MOESM1_ESM.doc (148 kb)
Electronic Supplementary Material (DOC 149K)



Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  • Christian P. Subbe (1)
  • Haiyan Gao (2)
  • David A. Harrison (2), email author
  1. Department of Medicine, Wrexham Maelor Hospital, Wrexham LL13 4TX, UK
  2. Intensive Care National Audit and Research Centre, Tavistock House, London WC1H 9HR, UK
