Linkage of Administrative Datasets: Enhancing Longitudinal Epidemiological Studies in the Era of “Big Data”

  • Invited Commentary

Current Epidemiology Reports

Abstract

“Modern epidemiology” has consolidated the direct collection of individual data as the most valued approach to epidemiological research. An essential feature of powerful epidemiological studies (whatever the design: observational, quasi-experimental, or experimental) is a longitudinal structure, in which data are collected over time and measurements can be repeated for each participant. Notably, the amount and variety of individual health data routinely collected from different sources and available in digital media have increased exponentially. This growing volume of data has forced scientific disciplines to confront essential challenges in the operational (data management, infrastructure, training), methodological (new approaches to analyzing and deriving inferences from “big data”), and epistemological (several authors argue that hypothesis-driven science is outdated and that we now live in a data-driven era) realms. There is no doubt that the use of large administrative databases, particularly when enriched through linkage with other sources of data, is a powerful tool, although still in its infancy, with the potential to bolster longitudinal medical and epidemiological research. Because it is relatively fast and low cost, it can enable the study of essential research questions that were previously unfeasible for budgetary or ethical reasons, among others.

Author information

Corresponding author

Correspondence to Mauricio L. Barreto.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.

About this article

Cite this article

Barreto, M.L., Rodrigues, L.C. Linkage of Administrative Datasets: Enhancing Longitudinal Epidemiological Studies in the Era of “Big Data”. Curr Epidemiol Rep 5, 317–320 (2018). https://doi.org/10.1007/s40471-018-0177-5

  • DOI: https://doi.org/10.1007/s40471-018-0177-5
