Abstract
Models with high predictive performance are often opaque, i.e., they do not allow for direct interpretation, and are hence of limited value when the goal is to understand the reasoning behind predictions. A recently proposed algorithm, GoldenEye, allows detection of groups of interacting variables exploited by a model. We employ this technique in conjunction with random forests generated from data obtained from electronic patient records for the task of detecting adverse drug events (ADEs). We propose a refined version of the GoldenEye algorithm, called GoldenEye++, which uses a more sensitive grouping metric. An empirical investigation comparing the two algorithms on 27 datasets related to detecting ADEs shows that, in several cases, the new version finds groups of medically relevant interacting attributes, corresponding to prescribed drugs, that the previous version does not detect. This suggests that the GoldenEye++ algorithm can be a useful tool for finding novel (adverse) drug interactions.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Henelius, A. et al. (2015). GoldenEye++: A Closer Look into the Black Box. In: Gammerman, A., Vovk, V., Papadopoulos, H. (eds) Statistical Learning and Data Sciences. SLDS 2015. Lecture Notes in Computer Science, vol 9047. Springer, Cham. https://doi.org/10.1007/978-3-319-17091-6_5
DOI: https://doi.org/10.1007/978-3-319-17091-6_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17090-9
Online ISBN: 978-3-319-17091-6
eBook Packages: Computer Science (R0)