Abstract
Big longitudinal observational databases present the opportunity to extract new knowledge in a cost-effective manner. Unfortunately, their usefulness for causal inference is limited because the data are collected passively, which introduces various forms of bias. In this paper we investigate a method that can overcome these limitations and efficiently determine causal contrast set rules from big data. In particular, we present a new methodology for identifying risk factors that increase a patient's likelihood of experiencing renal failure, a known rare side effect, after ingesting aminosalicylates. The results show that the methodology identified previously researched risk factors, such as being prescribed diuretics, and highlighted that patients with a higher than average risk of renal failure may be even more susceptible to experiencing it as a side effect after ingesting aminosalicylates.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Reps, J., Guo, Z., Zhu, H., Aickelin, U. (2015). Identifying Candidate Risk Factors for Prescription Drug Side Effects Using Causal Contrast Set Mining. In: Yin, X., Ho, K., Zeng, D., Aickelin, U., Zhou, R., Wang, H. (eds) Health Information Science. HIS 2015. Lecture Notes in Computer Science(), vol 9085. Springer, Cham. https://doi.org/10.1007/978-3-319-19156-0_6
DOI: https://doi.org/10.1007/978-3-319-19156-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19155-3
Online ISBN: 978-3-319-19156-0
eBook Packages: Computer Science, Computer Science (R0)