Classifying clinical notes with pain assessment using machine learning

  • Original Article
Medical & Biological Engineering & Computing

Abstract

Pain is a significant public health problem, affecting millions of people in the USA. Evidence has highlighted that patients with chronic pain often suffer from deficits in pain care quality (PCQ), including pain assessment, treatment, and reassessment. Currently, there is no intelligent and reliable approach to identify PCQ indicators in electronic health records (EHR). Here, we used unstructured text narratives in the EHR to derive pain assessment in clinical notes for patients with chronic pain. Our dataset includes patients with documented pain intensity ratings ≥ 4 and initial musculoskeletal diagnoses (MSD) captured by ICD-9-CM codes in fiscal year 2011, a minimum of 1 year of follow-up (3 years maximum), and complete data on key demographic variables. A total of 92 patients with 1058 notes were used. First, we manually annotated qualifiers and descriptors of pain assessment using the annotation schema that we previously developed. Second, we developed a reliable classifier for indicators of pain assessment in clinical notes. Based on our annotation schema, we found variations in documenting the subclasses of pain assessment. In positive notes, providers mostly documented assessment of pain site (67%) and intensity of pain (57%), followed by persistence (32%). In only 27% of positive notes did providers document a presumed etiology for the pain complaint or diagnosis. Documentation of patients' reports of factors that aggravate pain was present in only 11% of positive notes. The random forest classifier achieved the best performance in labeling clinical notes with pain assessment information compared to the other classifiers, reaching 94%, 95%, 94%, and 94% in accuracy, PPV, F1-score, and AUC, respectively. Despite the wide spectrum of research that utilizes machine learning in many clinical applications, none has explored using these methods for pain assessment research. In addition, previous studies using large datasets to detect and analyze characteristics of patients with various types of pain have relied exclusively on billing and coded data as the main source of information. This study, in contrast, harnessed unstructured narrative text data from the EHR to detect clinical notes with pain assessment. We developed a random forest classifier to identify clinical notes with pain assessment information. Compared to the other classifiers, ours achieved the best results in most of the reported metrics.
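The four metrics named in the abstract (accuracy, PPV, F1-score, and AUC) can be illustrated with a minimal sketch in plain Python. The labels, scores, and the 0.5 decision threshold below are made-up toy values, not data or settings from the study, and the function names are hypothetical; the sketch only shows how such metrics are conventionally computed for a binary note classifier (1 = note contains pain assessment).

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of notes labeled correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def ppv(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Positive predictive value (precision): tp / (tp + fp)."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Sensitivity: tp / (tp + fn)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of PPV and recall."""
    p, r = ppv(y_true, y_pred), recall(y_true, y_pred)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive note receives
    a higher score than a randomly chosen negative note (ties count 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, NOT from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("accuracy:", accuracy(y_true, y_pred))
print("PPV:     ", ppv(y_true, y_pred))
print("F1:      ", f1_score(y_true, y_pred))
print("AUC:     ", auc(y_true, scores))
```

Accuracy, PPV, and F1 depend on the chosen decision threshold, whereas AUC is computed from the ranking of scores alone, which is one reason the four figures are usually reported together.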

Framework for detecting pain assessment in clinical notes.

Funding

This study was funded by the NIH National Center for Complementary and Alternative Medicine (grant number 1R01AT008448-01). It was also partially funded by Veterans Affairs.

Author information

Corresponding author

Correspondence to Samah Jamal Fodeh.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Fodeh, S.J., Finch, D., Bouayad, L. et al. Classifying clinical notes with pain assessment using machine learning. Med Biol Eng Comput 56, 1285–1292 (2018). https://doi.org/10.1007/s11517-017-1772-1

  • DOI: https://doi.org/10.1007/s11517-017-1772-1
