
Active learning in automated text classification: a case study exploring bias in predicted model performance metrics

Published in Environment Systems and Decisions

Abstract

Machine learning has emerged as a cost-effective innovation to support systematic literature reviews in human health risk assessments and other contexts. Supervised machine learning approaches rely on a training dataset, a relatively small set of documents with human-annotated labels indicating their topic, to build models that automatically classify a larger set of unclassified documents. “Active” machine learning has been proposed as an approach that limits the cost of creating a training dataset by interactively and sequentially focusing training on only the most informative documents. We simulate active learning using a dataset of approximately 7000 abstracts from the scientific literature related to the chemical arsenic. The dataset was previously annotated by subject matter experts for relevance to two topics relating to toxicology and risk assessment. We examine the performance of alternative sampling approaches to sequentially expanding the training dataset, specifically uncertainty-based sampling and probability-based sampling. We find that while such active learning methods can potentially reduce training dataset size compared to random sampling, predictions of model performance in active learning are likely to suffer from statistical bias that negates the method’s potential benefits. We discuss approaches to, and the extent of, compensating for the bias resulting from skewed sampling. We propose a useful role for active learning in contexts in which the accuracy of model performance metrics is not critical and/or where it is beneficial to rapidly create a class-balanced training dataset.
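The uncertainty-based sampling strategy described in the abstract can be sketched as a simple loop: fit a classifier on the current labeled set, score the unlabeled pool, and ask annotators to label the documents the model is least certain about. The following is a minimal illustration using scikit-learn with synthetic data, not the authors' actual pipeline; the dataset, feature dimensions, seed size, and batch size are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for document feature vectors and binary relevance
# labels (the study uses ~7000 expert-annotated arsenic-related abstracts).
X = rng.normal(size=(2000, 20))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 0).astype(int)

# Seed training set of 50 randomly chosen "annotated" documents.
labeled = list(rng.choice(len(X), size=50, replace=False))
pool = [i for i in range(len(X)) if i not in labeled]

for _ in range(10):  # 10 active-learning rounds
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    probs = clf.predict_proba(X[pool])[:, 1]
    # Uncertainty sampling: select the pool documents whose predicted
    # probability of relevance is closest to 0.5 (least model certainty).
    order = np.argsort(np.abs(probs - 0.5))
    picked = [pool[i] for i in order[:25]]  # batch of 25 per round
    labeled.extend(picked)                  # "annotate" them (labels known here)
    pool = [i for i in pool if i not in picked]

print(len(labeled))  # 50 seed + 10 rounds * 25 = 300 labeled documents
```

Note that because the labeled set is deliberately skewed toward hard-to-classify documents, performance metrics estimated from it are not representative of the full corpus; this is the statistical bias the paper examines.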




Acknowledgements

The development of the methods presented here was fully supported by ICF. The results presented here were generated for the purposes of this paper alone. We thank Gregory Carter for review and helpful comments.

Author information


Correspondence to Arun Varghese.

Electronic supplementary material


10669_2019_9717_MOESM1_ESM.docx

The supplementary data include 18 tables that correspond to the results generated in the simulations summarized as trends in Figs. 2–5. In the interests of brevity, these tables present simulation results only up to the point where the actual omission fraction of relevant documents is less than the required threshold of 0.05. Each table is supplied with a proposed interpretation of apparent trends in the context of the theoretical discussions in Section 2. Supplementary material 1 (DOCX 68 KB)
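The "omission fraction" stopping criterion mentioned above can be illustrated with a small helper; here it is assumed to mean the fraction of truly relevant documents that the screening process misses (false negatives over total relevant), which may differ in detail from the paper's exact formulation.

```python
def omission_fraction(true_labels, predicted_labels):
    """Fraction of truly relevant documents (label 1) predicted irrelevant.

    Assumed definition: false negatives / total relevant documents.
    """
    relevant_preds = [p for t, p in zip(true_labels, predicted_labels) if t == 1]
    if not relevant_preds:
        return 0.0
    missed = sum(1 for p in relevant_preds if p == 0)
    return missed / len(relevant_preds)

# 1 of the 4 relevant documents is missed -> 0.25, above the 0.05 threshold
print(omission_fraction([1, 1, 0, 1, 0, 1], [1, 0, 0, 1, 1, 1]))
```

Under this definition, screening would continue until the fraction drops below the 0.05 threshold used in the simulations.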


Cite this article

Varghese, A., Hong, T., Hunter, C. et al. Active learning in automated text classification: a case study exploring bias in predicted model performance metrics. Environ Syst Decis 39, 269–280 (2019). https://doi.org/10.1007/s10669-019-09717-3
