Knowledge-Based Patient Data Generation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8268)

Abstract

The development and investigation of medical applications require patient data from Electronic Health Records (EHR) or Clinical Records (CR). In practice, however, patient data is, and should be, protected and monitored to prevent unauthorized access or disclosure, for reasons including privacy, security, ethics, and confidentiality. As a result, many researchers and developers have difficulty obtaining the patient data they need for their research, or making patient data available, for example to demonstrate the reproducibility of their results. In this paper, we propose a knowledge-based approach to synthesizing large-scale patient data. Our main goal is to make the generated patient data as realistic as possible by using domain knowledge to control the data generation process. Such domain knowledge can be collected from biomedical publications (e.g., those indexed in PubMed), from medical textbooks, or from web resources (e.g., Wikipedia and medical websites). The collected knowledge is formalized in the Patient Data Definition Language (PDDL) and used to drive patient data generation. We have implemented the proposed approach in our Advanced Patient Data Generator (APDG). We have used APDG to generate large-scale data for breast cancer patients in experiments with SemanticCT, a semantically enabled system for clinical trials. The results show that the generated patient data are useful for various tests in the system.
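
The idea of letting domain knowledge control generation can be illustrated with a minimal sketch. It is not PDDL and not the APDG implementation, neither of which is shown on this page. The attribute names, the probability figures, the age range, and the menopause rule are all hypothetical placeholders chosen for illustration, not clinical statistics. The sketch only shows the general pattern: knowledge is stated declaratively as distributions and rules, and the generator samples from it.

```python
import random

# Hypothetical "domain knowledge": each attribute maps to (value, probability)
# pairs. The numbers are illustrative placeholders, not clinical statistics.
KNOWLEDGE = {
    "gender": [("female", 0.99), ("male", 0.01)],
    "stage": [("I", 0.4), ("II", 0.3), ("III", 0.2), ("IV", 0.1)],
}

# Hypothetical numeric constraints (inclusive age range, rule threshold).
AGE_RANGE = (30, 85)
MENOPAUSE_AGE = 51


def draw(distribution, rng):
    """Pick one value according to the weights in a (value, probability) list."""
    values = [value for value, _ in distribution]
    weights = [weight for _, weight in distribution]
    return rng.choices(values, weights=weights, k=1)[0]


def generate_patient(patient_id, rng):
    """Generate one synthetic patient record from the declared knowledge."""
    patient = {"id": patient_id}
    for attribute, distribution in KNOWLEDGE.items():
        patient[attribute] = draw(distribution, rng)
    patient["age"] = rng.randint(*AGE_RANGE)

    # A dependent attribute derived by a simple rule, so that generated
    # records stay internally consistent.
    if patient["gender"] == "male":
        patient["menopausal_status"] = "not applicable"
    elif patient["age"] >= MENOPAUSE_AGE:
        patient["menopausal_status"] = "postmenopausal"
    else:
        patient["menopausal_status"] = "premenopausal"
    return patient


def generate_patients(count, seed=0):
    """Generate `count` patients; a fixed seed makes the data set reproducible."""
    rng = random.Random(seed)
    return [generate_patient(i, rng) for i in range(count)]
```

Calling `generate_patients(1000, seed=42)` would return a list of 1000 dictionaries. Seeding the generator is a deliberate choice here: the same seed regenerates the same data set, so results can be reproduced without sharing real records.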

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Huang, Z., van Harmelen, F., ten Teije, A., Dentler, K. (2013). Knowledge-Based Patient Data Generation. In: Riaño, D., Lenz, R., Miksch, S., Peleg, M., Reichert, M., ten Teije, A. (eds) Process Support and Knowledge Representation in Health Care. ProHealth KR4HC 2013. Lecture Notes in Computer Science, vol 8268. Springer, Cham. https://doi.org/10.1007/978-3-319-03916-9_7

  • DOI: https://doi.org/10.1007/978-3-319-03916-9_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03915-2

  • Online ISBN: 978-3-319-03916-9

  • eBook Packages: Computer Science, Computer Science (R0)
