Approach and Method for Generating Realistic Synthetic Electronic Healthcare Records for Secondary Use

  • Kudakwashe Dube
  • Thomas Gallagher
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8315)


This position paper presents research on the development of a publicly available Realistic Synthetic Electronic Healthcare Record (RS-EHR). It presents PADARSER, a novel approach in which neither the real Electronic Healthcare Record (EHR) nor authorization nor anonymisation is required to generate the synthetic EHR data sets. The GRiSER method is presented for use within PADARSER; it allows the RS-EHR to be synthesized for statistically significant, localised synthetic patients with statistically prevalent medical conditions, based on information found in publicly available data sources. To treat the synthetic patient within the GRiSER method, clinical workflows, or careflows (Cfs), are derived from Clinical Practice Guidelines (CPGs) and the standard local practices of clinicians. The generated Cfs are used together with health statistics, CPGs, and medical coding and terminology systems to generate coded synthetic RS-EHR entries from statistically significant observations, treatments, tests, and procedures. The RS-EHR is thus populated with a complete medical history describing the events that result from treating the medical conditions. The strength of the PADARSER approach is its use of publicly available information. The strengths of the GRiSER method are that (1) it does not require the real EHR to generate the coded RS-EHR entries; and (2) its generic components for obtaining careflows from CPGs and for generating coded RS-EHR entries are applicable in other areas, such as knowledge transfer and EHR user interfaces, respectively.
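As a rough illustration of the pipeline the abstract describes, the sketch below samples a condition for each synthetic patient from a prevalence table, then walks a careflow for that condition to emit coded record entries. It is not the authors' GRiSER implementation: the prevalence weights, the careflow steps, and the names `PREVALENCE`, `CAREFLOWS`, `SyntheticRecord`, and `generate_records` are all hypothetical placeholders. Only the ICD-10 category codes (E11, I10, J45) are real identifiers.

```python
import random
from dataclasses import dataclass, field

# Hypothetical prevalence weights (NOT real health statistics).
# Keys are ICD-10 categories: E11 type 2 diabetes, I10 hypertension, J45 asthma.
PREVALENCE = {"E11": 0.5, "I10": 0.3, "J45": 0.2}

# Hypothetical careflows: placeholder steps standing in for what would be
# derived from clinical practice guidelines and local practice.
CAREFLOWS = {
    "E11": ["observation: fasting glucose", "test: HbA1c", "treatment: medication review"],
    "I10": ["observation: blood pressure", "treatment: lifestyle advice"],
    "J45": ["observation: peak flow", "treatment: inhaler review"],
}


@dataclass
class SyntheticRecord:
    patient_id: int
    condition: str
    entries: list = field(default_factory=list)


def generate_records(n, seed=0):
    """Generate n synthetic records; no real patient data is read at any point."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    codes = list(PREVALENCE)
    weights = [PREVALENCE[c] for c in codes]
    records = []
    for pid in range(n):
        # Draw a condition in proportion to its (assumed) prevalence.
        condition = rng.choices(codes, weights=weights, k=1)[0]
        record = SyntheticRecord(pid, condition)
        # Each careflow step becomes one coded entry in the record.
        for step_no, step in enumerate(CAREFLOWS[condition], start=1):
            kind, _, detail = step.partition(": ")
            record.entries.append(
                {"step": step_no, "code": condition, "type": kind, "detail": detail}
            )
        records.append(record)
    return records


if __name__ == "__main__":
    for rec in generate_records(3, seed=1):
        print(rec.patient_id, rec.condition, [e["type"] for e in rec.entries])
```

A fuller version would replace the two dictionaries with inputs drawn from published health statistics and from careflows modelled on CPGs, which is where the approach described above places its effort.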


synthetic data; healthcare statistics; clinical guidelines; clinical workflow; electronic health record; knowledge modeling; medical terminology; ICD10; SNOMED-CT





Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Kudakwashe Dube (1)
  • Thomas Gallagher (2)
  1. School of Engineering and Advanced Technology, Massey University, New Zealand
  2. Applied Computing and Electronics, University of Montana, Missoula, USA
