Approach and Method for Generating Realistic Synthetic Electronic Healthcare Records for Secondary Use
This position paper presents research on the development of a publicly available Realistic Synthetic Electronic Healthcare Record (RS-EHR). It introduces PADARSER, a novel approach in which neither the real Electronic Healthcare Record (EHR) nor authorization or anonymisation is required to generate the synthetic EHR data sets. The GRiSER method is presented for use within PADARSER; it allows the RS-EHR to be synthesized for statistically significant localised synthetic patients with statistically prevalent medical conditions, based on information from publicly available data sources. To treat the synthetic patient within the GRiSER method, clinical workflows, or careflows (Cfs), are derived from Clinical Practice Guidelines (CPGs) and the standard local practices of clinicians. The generated Cfs are used together with health statistics, CPGs, and medical coding and terminology systems to generate coded synthetic RS-EHR entries from statistically significant observations, treatments, tests, and procedures. The RS-EHR is thus populated with a complete medical history describing the events that result from treating the medical conditions. The strength of the PADARSER approach is its use of publicly available information. The strengths of the GRiSER method are that (1) it does not require the real EHR to generate the coded RS-EHR entries; and (2) its generic components for obtaining careflows from CPGs and for generating coded RS-EHR entries are applicable in other areas, such as knowledge transfer and EHR user interfaces, respectively.
Keywords: synthetic data, healthcare statistics, clinical guidelines, clinical workflow, electronic health record, knowledge modeling, medical terminology, ICD10, SNOMED-CT
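The generation idea outlined in the abstract can be illustrated with a minimal sketch. It is not the GRiSER implementation: the condition weights, the careflow steps, and the names `CONDITION_WEIGHTS`, `CAREFLOWS`, `SyntheticPatient`, `generate_patient`, and `generate_cohort` are all hypothetical, chosen only to show how a condition might be sampled from public statistics and a careflow walked to emit coded entries. The ICD-10 codes E11 (type 2 diabetes mellitus) and I10 (essential hypertension) are used as example conditions.

```python
import random
from dataclasses import dataclass, field

# Illustrative weights only (assumption); real values would come from
# published health statistics for the target population.
CONDITION_WEIGHTS = {"E11": 0.6, "I10": 0.4}  # ICD-10 codes

# Hypothetical careflows (assumption): ordered (entry_type, description)
# steps that would in practice be derived from clinical practice guidelines.
CAREFLOWS = {
    "E11": [
        ("observation", "record presenting symptoms"),
        ("test", "order blood glucose test"),
        ("treatment", "start first-line medication"),
        ("observation", "follow-up review"),
    ],
    "I10": [
        ("observation", "measure blood pressure"),
        ("treatment", "give lifestyle advice"),
        ("observation", "follow-up review"),
    ],
}


@dataclass
class SyntheticPatient:
    patient_id: int
    age: int
    condition: str
    entries: list = field(default_factory=list)


def generate_patient(patient_id, rng):
    """Sample a condition by weight, then walk its careflow to emit entries."""
    codes = list(CONDITION_WEIGHTS)
    weights = [CONDITION_WEIGHTS[code] for code in codes]
    condition = rng.choices(codes, weights=weights, k=1)[0]
    patient = SyntheticPatient(patient_id, rng.randint(30, 90), condition)
    for step, (entry_type, description) in enumerate(CAREFLOWS[condition], start=1):
        patient.entries.append(
            {"step": step, "type": entry_type, "code": condition,
             "description": description}
        )
    return patient


def generate_cohort(size, seed=0):
    """Build a reproducible cohort; no real patient data is read anywhere."""
    rng = random.Random(seed)
    return [generate_patient(i, rng) for i in range(1, size + 1)]


if __name__ == "__main__":
    for patient in generate_cohort(3):
        print(patient.patient_id, patient.age, patient.condition,
              len(patient.entries))
```

Seeding the generator makes a cohort reproducible, which is convenient when a synthetic data set is shared for secondary use. A fuller version would also sample demographics from census data and attach terminology codes to each individual entry.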