Data Science Approaches to Public Health: Case Studies Using Routine Health Data from India

  • Conference paper
Data Management, Analytics and Innovation (ICDMAI 2023)

Abstract

The promise of data science for social good has not yet percolated to public health, where the need is greatest but the priority remains low. The lack of a data use policy or culture in Indian health information systems could be one of the reasons for this. Learning from global experiences of how routine health data have been used might benefit us as newcomers to the field of digital health. The current study aims to demonstrate the potential of data science in transforming publicly available routine health data from India into evidence for public health decision-making. Four case studies were conducted using expanded data sources to integrate data and link various sources of information. Implementing these data science projects required developing robust algorithms using reproducible research principles to maximize efficiency. They also led to new and incremental challenges that needed to be addressed in novel ways. The paper demonstrates that data science has immense potential for applications in public health. Additionally, a data science approach to public health can ensure transparency and efficiency while also addressing systemic and social issues such as data quality and health equity.

Author information

Corresponding author

Correspondence to Biju Soman.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Mitra, A., Soman, B., Gaitonde, R., Bhatnagar, T., Nieuhas, E., Kumar, S. (2023). Data Science Approaches to Public Health: Case Studies Using Routine Health Data from India. In: Sharma, N., Goje, A., Chakrabarti, A., Bruckstein, A.M. (eds) Data Management, Analytics and Innovation. ICDMAI 2023. Lecture Notes in Networks and Systems, vol 662. Springer, Singapore. https://doi.org/10.1007/978-981-99-1414-2_63
