Cohort Research in “Omics” and Preventive Medicine

  • Yi Shen
  • Sheng Zhang
  • Jie Zhou
  • Jiajia Chen
Part of the Advances in Experimental Medicine and Biology book series (AEMB, volume 1005)


Cohort studies are observational studies in which the investigator determines the exposure status of subjects and then follows them for subsequent outcomes. The incidence of outcomes in the exposed group is compared with that in a nonexposed group. Recently, new epidemiologic strategies, such as the case-cohort design and the nested case-control study, both of which are applicable to “omics” data, have encouraged information exchange and cooperation in cohort research to improve the understanding of disease etiology. Meanwhile, large-scale cohort studies using a prospective multiple design and long follow-up have explored some of the challenges in preventive medicine. Cohort studies can bridge the gap between micro- and macro-level research.
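The comparison of incidence between exposed and nonexposed groups can be illustrated with a minimal sketch. The function names and the counts below are hypothetical and chosen only for illustration; they are not taken from the chapter. The sketch computes the cumulative incidence in each group, their ratio (the relative risk), and their difference (the risk difference).

```python
def cumulative_incidence(cases: int, total: int) -> float:
    """Proportion of subjects who develop the outcome during follow-up."""
    if total <= 0:
        raise ValueError("total must be a positive number of subjects")
    if not 0 <= cases <= total:
        raise ValueError("cases must lie between 0 and total")
    return cases / total


def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Ratio of incidence in the exposed group to incidence in the nonexposed group."""
    unexposed = cumulative_incidence(unexposed_cases, unexposed_total)
    if unexposed == 0:
        raise ZeroDivisionError("no outcomes in the nonexposed group")
    return cumulative_incidence(exposed_cases, exposed_total) / unexposed


def risk_difference(exposed_cases: int, exposed_total: int,
                    unexposed_cases: int, unexposed_total: int) -> float:
    """Absolute difference in incidence between exposed and nonexposed groups."""
    return (cumulative_incidence(exposed_cases, exposed_total)
            - cumulative_incidence(unexposed_cases, unexposed_total))


# Hypothetical cohort: 40 outcomes among 1000 exposed subjects,
# 20 outcomes among 1000 nonexposed subjects.
rr = relative_risk(40, 1000, 20, 1000)
rd = risk_difference(40, 1000, 20, 1000)
print(f"relative risk = {rr:.2f}, risk difference = {rd:.3f}")
```

A relative risk above 1 suggests that the outcome is more frequent among the exposed. This sketch ignores confounding, censoring, and unequal follow-up time, all of which a real cohort analysis would need to address.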

This chapter is divided into three parts:

  1. Basic knowledge of cohort studies, including the definition and types of cohort study, how to design a cohort study, data analysis, sources of bias, tools and software, and strengths and limitations

  2. Cohort studies for “omics” data analysis, introducing three methodologically distinct study designs: the case-cohort design for genomic cohort studies, the nested case-control design for transcriptomics cohort data, and the population-based design for integrative “omics” cohorts

  3. Perspectives on cohort studies, including data-driven medicine and cohort research, cohort research for healthcare medicine, and cohort research for preventive medicine



Keywords: Cohort study, “Omics” data, Preventive medicine



This work was supported by the National Natural Science Foundation of China (grant 31400712) and the Technology R&D Program of Suzhou (SYN201409).



Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. Department of Epidemiology and Medical Statistics, Nantong University, Nantong, China
  2. School of Chemistry, Biology and Materials Engineering, Suzhou University of Science and Technology, Suzhou, China
