Journal of General Internal Medicine, Volume 28, Issue 12, pp 1565–1572

Using Patients Like My Patient for Clinical Decision Support: Institution-Specific Probability of Celiac Disease Diagnosis Using Simplified Near-Neighbor Classification

  • Brian H. Shirts
  • Sterling T. Bennett
  • Brian R. Jackson
Original Research

DOI: 10.1007/s11606-013-2443-z

Cite this article as:
Shirts, B.H., Bennett, S.T. & Jackson, B.R. J GEN INTERN MED (2013) 28: 1565. doi:10.1007/s11606-013-2443-z



Interpreting a diagnostic test result requires knowing what proportion of patients with a “similar” result have the condition in question. This information is often not readily available in the medical literature, or it may be based on clinical populations different enough to make it inapplicable. In settings where correlated screening parameters and diagnostic data are available in electronic medical records, a representation of diagnostic test performance on “patients like my patient” can be obtained.


We sought to integrate patient demographic and physician practice information using a simplified nearest neighbor algorithm. We used this method to illustrate the relationship between the tissue transglutaminase (tTG) IgA test result and duodenal biopsy findings for celiac disease in a local diagnostic context.


We used a data set of 1,461 paired tissue transglutaminase (tTG) IgA and definitive duodenal biopsy results from Intermountain Healthcare with data on patient age and ordering physician specialty. This was split into a discovery set of 1,000 and a validation set of 461 paired results.
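The abstract does not specify the distance measure, the neighborhood size, or the interval method, so the following is only a minimal sketch of how a near-neighbor probability estimate of this general kind could be computed. The record fields, the `near_neighbor_estimate` and `make_synthetic_records` helpers, the log-scaled tTG distance, the 20-year age scale, `k=50`, the Wilson score interval, and the synthetic data are all assumptions made for illustration; they are not the authors' method or data. Only the 1,000/461 split mirrors the text.

```python
import math
import random

# Assumed scaling: 20 years of age difference counts as much as one log-unit of tTG.
AGE_SCALE_YEARS = 20.0


def wilson_interval(positives, n, z=1.96):
    """Wilson score interval (about 95% by default) for a binomial proportion."""
    if n <= 0:
        raise ValueError("need at least one neighbor")
    p = positives / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def near_neighbor_estimate(records, ttg, age, specialty, k=50):
    """Estimate P(positive biopsy) from the k most similar prior patients."""
    # Prefer records from the same ordering specialty; fall back to all
    # records when that pool is smaller than k (an illustrative choice).
    pool = [r for r in records if r["specialty"] == specialty]
    if len(pool) < k:
        pool = list(records)

    def distance(r):
        # Hypothetical similarity measure: log-scaled tTG plus scaled age.
        return (abs(math.log1p(r["ttg"]) - math.log1p(ttg))
                + abs(r["age"] - age) / AGE_SCALE_YEARS)

    neighbors = sorted(pool, key=distance)[:k]
    n = len(neighbors)
    positives = sum(1 for r in neighbors if r["biopsy_positive"])
    low, high = wilson_interval(positives, n)
    return positives / n, low, high, n


def make_synthetic_records(count, seed=0):
    """Generate made-up paired results; NOT the study data."""
    rng = random.Random(seed)
    specialties = ["gastroenterology", "pediatrics", "family medicine"]
    records = []
    for _ in range(count):
        ttg = rng.lognormvariate(2.5, 1.2)
        # Invented relationship: higher tTG makes a positive biopsy more likely.
        p_true = 1.0 / (1.0 + math.exp(-1.5 * (math.log(ttg) - math.log(20.0))))
        records.append({
            "ttg": ttg,
            "age": rng.uniform(1, 80),
            "specialty": rng.choice(specialties),
            "biopsy_positive": rng.random() < p_true,
        })
    return records


records = make_synthetic_records(1461)
discovery, validation = records[:1000], records[1000:]  # split sizes from the text

p, low, high, n = near_neighbor_estimate(
    discovery, ttg=60.0, age=35, specialty="gastroenterology")
print(f"estimated P(positive biopsy) = {p:.2f} "
      f"(95% CI {low:.2f} to {high:.2f}, n = {n})")
```

In a sketch like this, the interval widens when few comparable patients exist, which is one way an estimate can reflect how well the local data cover a particular clinical situation.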


We measured the accuracy of the local discovery data set in predicting the probability of a positive duodenal biopsy, and the confidence intervals around the predicted probability in the test data, and compared these with probabilities of positive biopsy implied by published logistic regression and by published sensitivity and specificity studies.
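The abstract does not state how a probability of positive biopsy was derived from published sensitivity and specificity. One standard route, shown here only as an illustration, is Bayes' theorem applied to a result dichotomized as positive or negative. The symbols $Se$ (sensitivity), $Sp$ (specificity), and $p$ (pre-test probability of disease) are notation assumed for this sketch.

```latex
P(\text{biopsy}^{+} \mid \text{test}^{+})
  = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)\,(1 - p)}
```

Because this expression treats every positive result alike, it cannot distinguish a weakly positive from a strongly positive tTG IgA value, and it depends on a pre-test probability $p$ that may differ between clinical populations.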


The near-neighbor method could estimate probability of clinical outcomes with predictive performance equivalent to other methods while adjusting probability estimates and confidence intervals to fit specific clinical situations.


Data from clinical encounters obtained from electronic medical records can yield prediction estimates that are tailored to the individual patient, local population, and healthcare delivery processes. Local analysis of diagnostic probability may be more clinically meaningful than probabilities inferred from published studies. This local utility may come at the expense of external validity and generalizability.


personalized medicine; multifactorial analysis; gluten sensitive enteropathy; laboratory test information content; evidence based medicine; nearest neighbor

Supplementary material

11606_2013_2443_MOESM1_ESM.docx (138 kb)

Copyright information

© Society of General Internal Medicine 2013

Authors and Affiliations

  • Brian H. Shirts (1, 2)
  • Sterling T. Bennett (1, 3)
  • Brian R. Jackson (1, 4)
  1. Department of Pathology, University of Utah School of Medicine, Salt Lake City, USA
  2. Department of Laboratory Medicine, University of Washington, Seattle, USA
  3. Department of Pathology, Intermountain Medical Center, Murray, USA
  4. ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, USA
