HTNSystem: Hypertension Information Extraction System for Unstructured Clinical Notes

  • Jitendra Jonnagaddala
  • Siaw-Teng Liaw
  • Pradeep Ray
  • Manish Kumar
  • Hong-Jie Dai
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8916)


Hypertension (HTN)-relevant information has considerable potential for cohort discovery and for building predictive models for prevention and surveillance. Unfortunately, most of this valuable patient information is buried in unstructured clinical notes. In this study we present an HTN information extraction system, HTNSystem, which extracts mentions of HTN and infers HTN from blood pressure (BP) lab values. HTNSystem is a rule-based system that uses MetaMap as a core component, together with a custom-built BP value extractor and post-processing components. Evaluated on a corpus of 514 clinical notes, it achieved an F-measure of 82.92%. HTNSystem is distributed as an open source command line tool available at .
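To make the idea of inferring HTN from BP values concrete, the following is a minimal Python sketch of a rule-based BP extractor. It is not the HTNSystem implementation (which, per the abstract, is built around MetaMap with custom components); the regular expression, the function names `extract_bp_readings` and `infer_hypertension`, the plausibility bounds, and the 140/90 mmHg cut-offs are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: matches readings such as "BP 150/95" or
# "blood pressure: 128 / 82 mmHg". Up to 15 non-digit characters may
# separate the keyword from the reading.
BP_PATTERN = re.compile(
    r"\b(?:bp|blood\s+pressure)\b[^0-9\n]{0,15}?(\d{2,3})\s*/\s*(\d{2,3})",
    re.IGNORECASE,
)

# Assumed cut-offs in mmHg; the abstract does not state the thresholds used.
SYSTOLIC_CUTOFF = 140
DIASTOLIC_CUTOFF = 90


def extract_bp_readings(text):
    """Return a list of (systolic, diastolic) pairs found in free text."""
    readings = []
    for match in BP_PATTERN.finditer(text):
        systolic, diastolic = int(match.group(1)), int(match.group(2))
        # Drop implausible pairs (an assumed sanity check, not from the paper).
        if 50 <= systolic <= 300 and 30 <= diastolic <= 200 and systolic > diastolic:
            readings.append((systolic, diastolic))
    return readings


def infer_hypertension(text):
    """Flag a note if any extracted reading meets the assumed cut-offs."""
    return any(
        s >= SYSTOLIC_CUTOFF or d >= DIASTOLIC_CUTOFF
        for s, d in extract_bp_readings(text)
    )


if __name__ == "__main__":
    note = "Pt seen today. BP 150/95, HR 72. Denies chest pain."
    print(extract_bp_readings(note))
    print(infer_hypertension(note))
```

A real system would also need to handle negation, section context, and explicit mentions of hypertension, which is where components such as MetaMap and post-processing rules come in.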


Keywords: Hypertension; Blood pressure; Information extraction; Rule based; Apache UIMA; Apache Ruta; Text mining





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Jitendra Jonnagaddala (1, 2, 3)
  • Siaw-Teng Liaw (2)
  • Pradeep Ray (3)
  • Manish Kumar (1)
  • Hong-Jie Dai (4)

  1. Translational Cancer Research Network, UNSW, Australia
  2. School of Public Health and Community Medicine, UNSW, Australia
  3. Asia-Pacific Ubiquitous Healthcare Research Centre, UNSW, Australia
  4. Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan, R.O.C.
