As a causal inference methodology, the inverse probability weighting (IPW) method has been used to address confounding and to account for missing data when subjects with missing data cannot be included in a primary analysis. The transdisciplinary field of molecular pathological epidemiology (MPE) integrates molecular pathological and epidemiological methods, and takes advantage of improved understanding of pathogenesis to generate stronger biological evidence of causality and to optimize strategies for precision medicine and prevention. Disease subtyping based on biomarker analysis of biospecimens is essential in MPE research. However, there are nearly always cases that lack subtype information because biospecimens are unavailable or insufficient. To address this missing subtype data issue, we incorporated inverse probability weights into cause-specific Cox proportional hazards regression. The weight was the inverse of the probability of biomarker data availability, estimated from a model for biomarker data availability status. The strategy was illustrated in two example studies, which assessed alcohol intake and family history of colorectal cancer, respectively, in relation to the risk of developing colorectal carcinoma subtypes classified by tumor microsatellite instability (MSI) status, using a prospective cohort study, the Nurses’ Health Study. Logistic regression was used to estimate the probability of MSI data availability for each cancer case, with clinical features and family history of colorectal cancer as covariates. This application of IPW can reduce selection bias caused by nonrandom variation in biospecimen data availability. The integration of causal inference methods into the MPE approach will likely have substantial potential to advance the field of epidemiology.
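The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are simulated, the two binary covariates (`fam_hist`, `advanced`), the coefficients in `true_beta`, and the helper names `solve` and `fit_logistic` are all hypothetical, and the logistic model is fitted with a plain Newton-Raphson routine rather than a statistical package. Cases with MSI data receive a weight equal to the inverse of their estimated probability of data availability; cases without MSI data receive weight 0.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def predict(beta, xi):
    """Logistic probability for one covariate row (intercept included in xi)."""
    return 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, xi))))


def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]  # observed information (X' W X)
        for xi, yi in zip(X, y):
            p = predict(beta, xi)
            for j in range(k):
                grad[j] += (yi - p) * xi[j]
                for l in range(k):
                    info[j][l] += p * (1.0 - p) * xi[j] * xi[l]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Simulated colorectal cancer cases (hypothetical data, for illustration only).
random.seed(0)
n_cases = 2000
true_beta = [0.2, 0.5, 0.8]  # assumed availability model: intercept, fam_hist, advanced
X, available = [], []
for _ in range(n_cases):
    fam_hist = 1.0 if random.random() < 0.2 else 0.0
    advanced = 1.0 if random.random() < 0.4 else 0.0
    xi = [1.0, fam_hist, advanced]
    X.append(xi)
    available.append(1 if random.random() < predict(true_beta, xi) else 0)

# Step 1: model the probability that MSI data are available for each case.
beta_hat = fit_logistic(X, available)

# Step 2: weight = 1 / estimated probability for cases with MSI data, 0 otherwise.
weights = [1.0 / predict(beta_hat, xi) if a == 1 else 0.0
           for xi, a in zip(X, available)]

print("estimated coefficients:", [round(b, 3) for b in beta_hat])
print("sum of weights:", round(sum(weights), 1), "of", n_cases, "cases")
```

In an actual analysis, these weights would then be supplied to a weighted cause-specific Cox model, so that cases with subtype data also stand in for similar cases whose subtype data are missing. When the availability model is reasonable, the weights among cases with data sum to roughly the total number of cases.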
Area under receiver-operating characteristic curve
Complete case analysis
Directed acyclic graph
Inverse probability weighting
Missing at random
Mean metabolic equivalent task score
Missing completely at random
Molecular pathological epidemiology
Nurses’ Health Study
Receiver-operating characteristic curve
We would like to thank the participants and staff of the Nurses’ Health Study for their valuable contributions as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, WY. The authors assume full responsibility for analyses and interpretation of these data.
This work was supported by U.S. National Institutes of Health (NIH) grants [P01 CA87969 to M.J. Stampfer; UM1 CA186107 to M.J. Stampfer; R01 CA137178 to A.T.C.; K24 DK098311 to A.T.C.; R01 CA151993 to S.O.; R35 CA197735 to S.O.; K07 CA190673 to R.N.]; and a Nodal Award (to S.O.) from the Dana-Farber Harvard Cancer Center. L.L. is supported by a grant from the National Natural Science Foundation of China (No. 81302491), a scholarship grant from the Chinese Scholarship Council, and a fellowship grant from Huazhong University of Science and Technology. The content is solely the responsibility of the authors and does not necessarily represent the official views of NIH. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Conflict of interest
The authors declare that they have no conflicts of interest.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Informed consent was obtained from all individual participants included in the study.
About this article
Cite this article
Liu, L., Nevo, D., Nishihara, R. et al. Utility of inverse probability weighting in molecular pathological epidemiology. Eur J Epidemiol 33, 381–392 (2018). https://doi.org/10.1007/s10654-017-0346-8
- Etiologic heterogeneity
- Marginal structural model
- Missing at random
- Unique disease principle
- Selection bias