
Advanced analytics for the automation of medical systematic reviews

Information Systems Frontiers

Abstract

While systematic reviews (SRs) are positioned as an essential element of modern evidence-based medical practice, creating and updating them is resource intensive. In this research, we leverage advanced analytics techniques to automatically classify articles for inclusion in or exclusion from systematic reviews. Specifically, we used a soft-margin polynomial Support Vector Machine (SVM) as the classifier, exploited the Unified Medical Language System (UMLS) for medical term extraction, and examined various techniques for resolving the class imbalance issue. Through an empirical study, we demonstrated that the soft-margin polynomial SVM achieves better classification performance than the existing algorithms used in current research, and that its performance can be further improved by using UMLS to identify medical terms in articles and by applying re-sampling methods to resolve the class imbalance.
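
The abstract mentions re-sampling as a remedy for class imbalance but does not say which method was used. As an illustration only, the sketch below shows random oversampling, one common option, in plain Python. The function name `random_oversample`, the toy citation titles, and the `include`/`exclude` labels are hypothetical and are not taken from the article.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group samples by their class label.
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_label.values())

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw (with replacement) the extra copies this class needs.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical toy data: few included citations, many excluded ones.
citations = [
    "randomized trial of drug A",
    "cohort study of drug B",
    "editorial on drug pricing",
    "letter to the editor",
    "conference news item",
    "animal study of compound C",
]
labels = ["include", "include", "exclude", "exclude", "exclude", "exclude"]

balanced_samples, balanced_labels = random_oversample(citations, labels)
counts = Counter(balanced_labels)
print(counts["include"], counts["exclude"])
```

In practice, re-sampling of this kind is usually applied to the training split only, so that duplicated citations do not leak into the evaluation data.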



Author information

Corresponding author

Correspondence to Prem Timsina.


Cite this article

Timsina, P., Liu, J. & El-Gayar, O. Advanced analytics for the automation of medical systematic reviews. Inf Syst Front 18, 237–252 (2016). https://doi.org/10.1007/s10796-015-9589-7


  • DOI: https://doi.org/10.1007/s10796-015-9589-7
