Making Study Populations Visible Through Knowledge Graphs

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11779)


Treatment recommendations within Clinical Practice Guidelines (CPGs) are largely based on findings from clinical trials and case studies, referred to here as research studies, which are often conducted on highly selective clinical populations, referred to here as study cohorts. When medical practitioners apply CPG recommendations, they need to understand how well their patient population matches the characteristics of the study cohort, and are thus confronted with the challenges of locating the study cohort information and making an analytic comparison. To address these challenges, we develop an ontology-enabled prototype system that exposes the population descriptions in research studies in a declarative manner, with the ultimate goal of allowing medical practitioners to better understand the applicability and generalizability of treatment recommendations. We build a Study Cohort Ontology (SCO) to encode the vocabulary of study population descriptions, which are often reported in the first table of a published work and are therefore commonly referred to as Table 1. We leverage the widely used Semanticscience Integrated Ontology (SIO) to define property associations between classes. Further, we model the key components of Table 1s, i.e., collections of study subjects, subject characteristics, and statistical measures, in RDF knowledge graphs. We design scenarios for medical practitioners to perform population analysis, and generate cohort similarity visualizations to determine the applicability of a study population to the clinical population of interest. Our semantic approach makes study populations visible through standardized representations of Table 1s and allows users to quickly derive clinically relevant inferences about study populations.
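
To make the modeling idea concrete, the following is a minimal sketch of how one Table 1 entry (a study arm, one subject characteristic, and its statistical measures) might be held as subject-predicate-object triples and compared against a single patient. It is not the SCO or SIO vocabulary: every identifier with the `ex:` prefix, the helper functions, and the numbers are hypothetical and chosen only for illustration, and the standardized distance stands in for whatever similarity measure the prototype actually uses.

```python
from typing import NamedTuple, Optional, Tuple


class Triple(NamedTuple):
    """One subject-predicate-object statement, as in an RDF graph."""
    subject: str
    predicate: str
    obj: object


# Hypothetical Table 1 entry: mean age 62 (SD 8) years in one study arm.
# All "ex:" names are illustrative placeholders, not real ontology IRIs.
TRIPLES = [
    Triple("ex:StudyArmA", "rdf:type", "ex:StudyArm"),
    Triple("ex:StudyArmA", "ex:hasAttribute", "ex:ArmA_Age"),
    Triple("ex:ArmA_Age", "rdf:type", "ex:Age"),
    Triple("ex:ArmA_Age", "ex:hasMean", 62.0),
    Triple("ex:ArmA_Age", "ex:hasStandardDeviation", 8.0),
    Triple("ex:ArmA_Age", "ex:hasUnit", "ex:Year"),
]


def objects(triples, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


def characteristic_summary(triples, arm, characteristic_type) -> Optional[Tuple[float, float]]:
    """Look up (mean, standard deviation) of one characteristic of a study arm."""
    for attribute in objects(triples, arm, "ex:hasAttribute"):
        if characteristic_type in objects(triples, attribute, "rdf:type"):
            mean = objects(triples, attribute, "ex:hasMean")
            sd = objects(triples, attribute, "ex:hasStandardDeviation")
            if mean and sd:
                return mean[0], sd[0]
    return None


def standardized_distance(patient_value, mean, sd):
    """Distance of a patient value from the cohort mean, in SD units."""
    return abs(patient_value - mean) / sd


summary = characteristic_summary(TRIPLES, "ex:StudyArmA", "ex:Age")
if summary is not None:
    cohort_mean, cohort_sd = summary
    print(standardized_distance(78.0, cohort_mean, cohort_sd))
```

Keeping the characteristic as its own node, rather than attaching the mean directly to the study arm, is what lets several statistical measures and a unit hang off the same Table 1 cell; a real implementation would use an RDF library and the published ontology terms instead of these stand-ins.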

Resource Website:


Keywords: Scientific Study Data Analysis · Knowledge graphs · Modeling Aggregations and Summary Statistics · Ontology Development



This work is partially supported by IBM Research AI through the AI Horizons Network. We thank our colleagues from IBM Research, Dan Gruen, Morgan Foreman and Ching-Hua Chen, and from RPI, John Erickson, Alexander New, and Rebecca Cowan, who greatly assisted the research.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Rensselaer Polytechnic Institute, Troy, USA
  2. IBM Research, Cambridge, USA
