A deep learning classifier for sentence classification in biomedical and computer science abstracts

Gonçalves, Sérgio; Cortez, Paulo; Moro, Sérgio

doi:10.1007/s00521-019-04334-2

Brain inspired Computing & Machine Learning Applied Research-BISMLARE
Published: 10 July 2019

Volume 32, pages 6793–6807, (2020)

Neural Computing and Applications

Abstract

The automatic classification of abstract sentences into their main elements (background, objectives, methods, results, conclusions) is a key tool to support scientific database querying, to summarize relevant literature and to assist in the writing of new abstracts. In this paper, we propose a novel deep learning approach based on a convolutional layer and a bidirectional gated recurrent unit to classify sentences of abstracts. First, the proposed neural network was tested on a publicly available repository containing 20 thousand abstracts from the biomedical domain. Competitive results were achieved, with weight-averaged Precision, Recall and F1-score values around 91%, and an area under the ROC curve (AUC) of 99%, which are higher than those of a state-of-the-art neural network. Then, a crowdsourcing approach using gamification was adopted to create a new comprehensive set of 4111 classified sentences from the computer science domain, focused on social media abstracts. The results of applying the same deep learning modeling technique, trained with 3287 (80%) of the available sentences, were below the ones obtained for the larger biomedical dataset, with weight-averaged Precision, Recall and F1-score values between 73 and 76%, and an AUC of 91%. Considering the dataset size as a likely important factor for this performance decrease, a data augmentation approach was further applied. This involved the use of text mining to translate sentences of the computer science abstract corpus while retaining the same meaning. This approach resulted in slight improvements (around 2 percentage points) for the weight-averaged Recall and F1-score values.
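The weight-averaged Precision, Recall and F1-score reported above can be illustrated with a minimal sketch, assuming the usual definition in which each class score is weighted by its share of the true labels. The function name weighted_prf and the toy labels below are hypothetical and are not taken from the article or its datasets.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Return support-weighted (precision, recall, f1) over the classes in y_true.

    Assumption: "weight-averaged" means each per-class score is weighted by
    the fraction of true sentences that belong to that class.
    """
    support = Counter(y_true)  # number of true sentences per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        # True positives: sentences of this class that were predicted as such.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / n
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Hypothetical toy labels using three of the five sentence classes.
y_true = ["background", "methods", "methods", "results", "results", "results"]
y_pred = ["background", "methods", "results", "results", "results", "methods"]
p, r, f = weighted_prf(y_true, y_pred)
```

Under this definition the weighted Recall equals plain accuracy, because the class weights cancel the per-class denominators.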

Acknowledgements

This work was supported by Fundação para a Ciência e Tecnologia (FCT) within the Project Scope: UID/CEC/00319/2019.

Author information

Authors and Affiliations

ALGORITMI Centre, Department of Information Systems, University of Minho, Guimarães, Portugal
Sérgio Gonçalves & Paulo Cortez
Instituto Universitário de Lisboa (ISCTE-IUL), ISTAR-IUL, Lisboa, Portugal
Sérgio Moro

Corresponding author

Correspondence to Sérgio Moro.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gonçalves, S., Cortez, P. & Moro, S. A deep learning classifier for sentence classification in biomedical and computer science abstracts. Neural Comput & Applic 32, 6793–6807 (2020). https://doi.org/10.1007/s00521-019-04334-2

Received: 27 December 2018
Accepted: 28 June 2019
Published: 10 July 2019
Issue Date: June 2020
DOI: https://doi.org/10.1007/s00521-019-04334-2
