GoldenEye++: A Closer Look into the Black Box
Models with high predictive performance are often opaque, i.e., they do not allow for direct interpretation, and are hence of limited value when the goal is to understand the reasoning behind predictions. A recently proposed algorithm, GoldenEye, detects groups of interacting variables that a model exploits. We employ this technique together with random forests built from electronic patient record data for the task of detecting adverse drug events (ADEs), and we propose a refined version of the algorithm, called GoldenEye++, that uses a more sensitive grouping metric. An empirical comparison of the two algorithms on 27 datasets related to detecting ADEs shows that, in several cases, the new version finds groups of medically relevant interacting attributes, corresponding to prescribed drugs, that the previous version does not detect. This suggests that GoldenEye++ can be a useful tool for finding novel (adverse) drug interactions.
Keywords: Classifiers, Randomization, Adverse drug events
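The abstract does not spell out how randomization reveals interacting variables, so the following is only an illustrative sketch and not the authors' implementation. It assumes the scheme commonly associated with GoldenEye: attributes in the same group are shuffled together within each predicted class, and the grouping is scored by fidelity, the fraction of predictions that stay unchanged. The function names `permute_within_class` and `fidelity`, the toy XOR classifier, and the synthetic data are all hypothetical. The more sensitive grouping metric of GoldenEye++ is not shown, because the abstract does not define it.

```python
import numpy as np

def permute_within_class(X, labels, columns, rng):
    """Shuffle the given columns jointly across rows that share a label.

    Values in `columns` that come from the same row stay together, so any
    dependency among those columns is preserved within each class.
    """
    X_perm = X.copy()
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        shuffled = rng.permutation(idx)
        X_perm[np.ix_(idx, columns)] = X[np.ix_(shuffled, columns)]
    return X_perm

def fidelity(predict, X, grouping, rng):
    """Fraction of predictions unchanged after randomizing each group
    independently within the classes predicted on the original data."""
    y_hat = predict(X)
    X_rand = X.copy()
    for group in grouping:
        X_rand[:, group] = permute_within_class(X, y_hat, group, rng)[:, group]
    return float(np.mean(predict(X_rand) == y_hat))

# Hypothetical opaque model: the prediction depends on an interaction
# (XOR) between attributes 0 and 1; attribute 2 is ignored.
def toy_predict(X):
    return X[:, 0] ^ X[:, 1]

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(2000, 3))

# Keeping attributes 0 and 1 in one group preserves their interaction,
# so every prediction is unchanged and fidelity stays at 1.0.
fid_joint = fidelity(toy_predict, X, [[0, 1], [2]], rng)

# Splitting them into singleton groups breaks the interaction, and
# fidelity drops to roughly chance level.
fid_split = fidelity(toy_predict, X, [[0], [1], [2]], rng)

print(fid_joint, fid_split)
```

In this sketch, a large drop in fidelity when a group is split is the signal that the model exploits an interaction among the attributes in that group. In the ADE setting described above, such attributes would correspond to prescribed drugs.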