Data Mining and Analytics for Exploring Bulgarian Diabetic Register

  • Svetla Boytcheva
  • Galia Angelova
  • Zhivko Angelov
  • Dimitar Tcharaktchiev
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 822)


This paper discusses the need to build diabetic registers in order to monitor the development of the disease and to assess prevention and treatment plans. The automatic generation of a nationwide Diabetes Register in Bulgaria is presented, using outpatient records submitted to the National Health Insurance Fund in 2010–2014 and updated with data from outpatient records for 2015–2016. The construction relies on advanced automatic analysis of free clinical texts and on business analytics technologies for storing, maintaining, searching, querying and analyzing data. Original frequent pattern mining algorithms make it possible to discover maximal frequent itemsets of simultaneous diseases for diabetic patients. We show how comorbidities identified for patients in the prediabetes period can help to define alerts about specific risk factors for Diabetes Mellitus type 2, and thus might contribute to prevention. We also claim that the synergy of modern analytics and data mining tools transforms a static archive of clinical patient records into a sophisticated knowledge discovery and prediction environment.
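To make the notion of a maximal frequent itemset concrete, the following is a minimal brute-force sketch in Python. It is not the authors' original algorithm, which the abstract does not describe. The patient data, the use of ICD-10 codes as items, the function names and the support threshold are all illustrative assumptions. An itemset is frequent if at least `min_support` patients have all of its diagnoses, and it is maximal if no proper superset is also frequent.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset that occurs in at
    least min_support transactions. Brute-force, level by level."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = number of patients whose diagnoses include the candidate.
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                frequent[cset] = count
                found_any = True
        if not found_any:
            # Anti-monotonicity: if no itemset of this size is frequent,
            # no larger itemset can be frequent either.
            break
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < other for other in frequent)
    }


# Hypothetical toy data: one set of diagnosis codes per patient.
# Codes are assumed to be ICD-10 (E11 type 2 diabetes, I10 hypertension,
# E78 lipid metabolism disorders, E66 obesity); not taken from the register.
patients = [
    frozenset({"E11", "I10", "E78"}),
    frozenset({"E11", "I10", "E78"}),
    frozenset({"E11", "I10"}),
    frozenset({"E11", "E66"}),
    frozenset({"I10", "E66"}),
]

result = maximal_frequent_itemsets(patients, min_support=2)
for itemset, support in result.items():
    print(sorted(itemset), support)
```

The exhaustive enumeration here is exponential in the number of distinct codes, so it only suits toy inputs; a register-scale collection would need a dedicated mining algorithm.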


Big data analytics; Data mining; Frequent pattern mining; Text mining; Health informatics



This research is partially supported by grant IZIDA 02/4 (SpecialIZed Data MIning MethoDs Based on Semantic Attributes), funded by the Bulgarian National Science Fund in 2017–2019. The authors also acknowledge the support of Medical University – Sofia, the National Health Insurance Fund and the Bulgarian Ministry of Health.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria
  2. ADISS Lab Ltd., Sofia, Bulgaria
  3. University Specialized Hospital for Active Treatment of Endocrinology – Medical University Sofia, Sofia, Bulgaria
