Abstract
Services and multimedia processing that are free of the constraints of time and place require going beyond the existing computing paradigm. This study proposes a method for predicting the onset of cardiac disease from health big data, using multimedia extraction to analyze the relationships within those data. Multimedia extraction is roughly divided into two types: extraction of significant items from structured data, and extraction of information from unstructured data. The former is performed with a multivariate analysis algorithm and similarity analysis; the latter is performed by parsing based on medical keywords. Health big data are collected from personal health record (PHR)-based data, and items with significant relationships are selected using logistic regression. Based on proximity in Minkowski distance, a risk group with high similarity to patients with cardiovascular diseases is formed, and risk factors for cardiovascular diseases are evaluated using the similarities between the risk group and the user. Multivariate analysis was used to identify the statistically significant items, and 27 out of 210 items were extracted. Thus only 12.9% of the data are used, and evaluation by mean absolute error (MAE) gave an error of 0.21. These results show that the proposed model could provide more personalized data and can be used as a core technology for constructing an effective, efficient, smart healthcare system.
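The similarity and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the Minkowski order p, the size of the risk group, or how similarity is mapped to a risk score, so the function names, the default p = 2, the choice of the k nearest patients, and all sample values below are assumptions made purely for illustration.

```python
from typing import List, Sequence


def minkowski_distance(a: Sequence[float], b: Sequence[float], p: float = 2.0) -> float:
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if p < 1:
        raise ValueError("p must be >= 1")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def nearest_group(user: Sequence[float],
                  patients: Sequence[Sequence[float]],
                  k: int,
                  p: float = 2.0) -> List[int]:
    """Indices of the k patient records closest to the user.

    Assumption: the risk group is taken as the k nearest records;
    the abstract only says the group is formed by proximity.
    """
    ranked = sorted(range(len(patients)),
                    key=lambda i: minkowski_distance(user, patients[i], p))
    return ranked[:k]


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """MAE: the average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(actual, predicted)) / len(actual)


# Share of items kept after selection, as reported in the abstract (27 of 210).
selected_share = round(27 / 210 * 100, 1)

# Hypothetical, already-normalized feature vectors (not from the paper).
user_vector = [0.0, 0.0]
patient_vectors = [[5.0, 5.0], [1.0, 0.0], [0.0, 2.0]]
risk_group = nearest_group(user_vector, patient_vectors, k=2)
```

In practice the input vectors would hold only the selected items, scaled to comparable ranges so that no single item dominates the distance.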
Acknowledgements
This work was supported by the GRRC program of Gyeonggi province [GRRC KGU 2020-B03, Industry Statistics and Data Mining Research].
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Yoo, H., Chung, K. & Han, S. Prediction of cardiac disease-causing pattern using multimedia extraction in health ontology. Multimed Tools Appl 80, 34713–34729 (2021). https://doi.org/10.1007/s11042-020-09052-9
DOI: https://doi.org/10.1007/s11042-020-09052-9