Intensive Care Medicine

Volume 33, Issue 4, pp 619–624

Reproducibility of physiological track-and-trigger warning systems for identifying at-risk patients on the ward

  • Christian P. Subbe
  • Haiyan Gao
  • David A. Harrison

DOI: 10.1007/s00134-006-0516-8

Cite this article as:
Subbe, C.P., Gao, H. & Harrison, D.A. Intensive Care Med (2007) 33: 619. doi:10.1007/s00134-006-0516-8



Objective

Physiological track-and-trigger warning systems are used to identify, as early as possible, patients on acute wards who are at risk of deterioration. The objective of this study was to assess the inter-rater and intra-rater reliability of the physiological measurements, aggregate scores and triggering events of three such systems.


Design

Prospective cohort study.


Setting

General medical and surgical wards in one non-university acute hospital.

Patients and participants

Unselected ward patients: 114 in the inter-rater study and 45 in the intra-rater study, each examined by four raters.

Measurements and results

Physiological observations obtained at the bedside were evaluated using three systems: the medical emergency team call-out criteria (MET); the modified early warning score (MEWS); and the assessment score of sick-patient identification and step-up in treatment (ASSIST). Inter-rater and intra-rater reliability were assessed by intra-class correlation coefficients, kappa statistics and percentage agreement. There was fair to moderate agreement on most physiological parameters and fair agreement on the aggregate scores, but better agreement on triggers. Reliability was partly a function of simplicity: MET achieved higher percentage agreement than ASSIST, and ASSIST higher than MEWS. Intra-rater reliability was better than inter-rater reliability. Using corrected calculations improved inter-rater agreement but not intra-rater agreement.
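
As an illustration of the agreement statistics named above, the following minimal Python sketch computes percentage agreement and Cohen's kappa for two raters. It is not the authors' analysis code: the trigger decisions are hypothetical, and kappa is computed from its textbook definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

    from collections import Counter

    def percentage_agreement(rater_a, rater_b):
        """Proportion of cases on which the two raters give the same rating."""
        assert len(rater_a) == len(rater_b)
        matches = sum(a == b for a, b in zip(rater_a, rater_b))
        return matches / len(rater_a)

    def cohens_kappa(rater_a, rater_b):
        """Cohen's kappa: agreement corrected for chance,
        kappa = (p_o - p_e) / (1 - p_e)."""
        n = len(rater_a)
        p_o = percentage_agreement(rater_a, rater_b)
        counts_a = Counter(rater_a)
        counts_b = Counter(rater_b)
        categories = set(counts_a) | set(counts_b)
        # expected chance agreement from the raters' marginal distributions
        p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
        return (p_o - p_e) / (1 - p_e)

    # Hypothetical trigger decisions (1 = trigger, 0 = no trigger) for 10 patients
    rater_1 = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
    rater_2 = [1, 0, 1, 1, 0, 1, 0, 0, 0, 0]

    print(f"percentage agreement: {percentage_agreement(rater_1, rater_2):.2f}")  # 0.80
    print(f"Cohen's kappa:        {cohens_kappa(rater_1, rater_2):.2f}")          # 0.58

Note how kappa (0.58) is lower than raw percentage agreement (0.80): some of the observed agreement would occur by chance alone, which is why the paper reports chance-corrected statistics alongside percentage agreement.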


Conclusions

There was significant variation in the reproducibility of the different track-and-trigger warning systems. The systems examined showed better agreement on triggers than on aggregate scores, and simpler systems had better reliability. Inter-rater agreement might be improved by electronic calculation of scores.
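
To make the suggested electronic calculation of scores concrete, here is a minimal sketch of an aggregate MEWS calculator in Python. The band boundaries follow commonly published MEWS thresholds but are illustrative assumptions, not taken from this paper; a real implementation would have to use the locally adopted chart.

    def mews(sbp, heart_rate, resp_rate, temp, avpu):
        """Modified Early Warning Score (illustrative thresholds only;
        verify against the locally adopted chart before any clinical use)."""
        score = 0

        # Systolic blood pressure (mmHg)
        if sbp <= 70:          score += 3
        elif sbp <= 80:        score += 2
        elif sbp <= 100:       score += 1
        elif sbp >= 200:       score += 2

        # Heart rate (beats/min)
        if heart_rate < 40:    score += 2
        elif heart_rate <= 50: score += 1
        elif heart_rate <= 100: pass      # normal range scores 0
        elif heart_rate <= 110: score += 1
        elif heart_rate <= 129: score += 2
        else:                  score += 3

        # Respiratory rate (breaths/min)
        if resp_rate < 9:      score += 2
        elif resp_rate <= 14:  pass       # normal range scores 0
        elif resp_rate <= 20:  score += 1
        elif resp_rate <= 29:  score += 2
        else:                  score += 3

        # Temperature (degrees C)
        if temp < 35.0:        score += 2
        elif temp >= 38.5:     score += 2

        # Neurological response on the AVPU scale
        score += {"alert": 0, "voice": 1, "pain": 2, "unresponsive": 3}[avpu]

        return score

    # Hypothetical patient: borderline hypotension, tachycardia and tachypnoea
    print(mews(sbp=95, heart_rate=105, resp_rate=22, temp=37.2, avpu="alert"))
    # -> 1 (BP) + 1 (HR) + 2 (RR) = 4

Computing the aggregate in software removes the arithmetic step from the bedside, which is the source of error the corrected calculations in this study were designed to expose.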


Keywords

Observer variation · Reproducibility of results · Critical illness · Scoring systems

Supplementary material

Electronic Supplementary Material: 134_2006_516_MOESM1_ESM.doc (DOC, 148 kb)

Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  • Christian P. Subbe (1)
  • Haiyan Gao (2)
  • David A. Harrison (2)

  1. Department of Medicine, Wrexham Maelor Hospital, Wrexham LL13 4TX, UK
  2. Intensive Care National Audit and Research Centre, Tavistock House, London WC1H 9HR, UK