Abstract
Twitter is heavily used during disasters and emergencies, which makes it an important source of essential information. Named entity recognition (NER), the process of identifying the elementary units in a text and classifying them into pre-defined categories, plays a significant role in extracting essential and useful information. However, NER is challenging on Twitter because tweets are informal and contain grammatical errors and nonstandard abbreviations. In this paper, recurrent neural network (RNN)-based approaches, covering a range of activation functions and optimization functions, are combined with NER tools to extract named entities such as organization, person, and location from tweets. Inputs for the RNN models are provided by two different NER tools: the Natural Language Toolkit (NLTK) and the General Architecture for Text Engineering (GATE). The models are then trained on the pre-labeled data using the GloVe word embedding technique, and RNN variants such as LSTM, BLSTM, and GRU are demonstrated. The best-performing models among the RNN variants are presented for predicting named entities. The Yellowbrick interpreter is used to evaluate the proposed method, and the Wilcoxon signed-rank test is applied to the results from two different data sets to demonstrate the consistency of the proposed method. In addition, a comparison is made with existing machine learning methods. Experiments on the Nepal earthquake Twitter data set show that the RNN-based approaches achieve good results in finding named entities. In emergencies, the results of this paper can help reduce the effort of event location detection and support better disaster management.
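The abstract mentions a Wilcoxon signed-rank test on results from two data sets. The following is a minimal sketch of how the signed-rank statistics can be computed for paired scores. It is an illustration, not the authors' evaluation code: the function name `wilcoxon_signed_rank` and the scores in the usage example are hypothetical placeholders, not results from the paper.

```python
from typing import Sequence, Tuple


def wilcoxon_signed_rank(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[float, float, int]:
    """Return (W_plus, W_minus, n) for paired samples x and y.

    Zero differences are dropped, and tied absolute differences
    receive the average of the ranks they span.
    """
    if len(x) != len(y):
        raise ValueError("x and y must be paired (same length)")

    # Round to suppress floating-point noise such as 0.90 - 0.88 != 0.02.
    diffs = [round(a - b, 9) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)

    # Indices of the differences, sorted by absolute value.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over a run of tied absolute differences.
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, n


if __name__ == "__main__":
    # Hypothetical paired scores of the same models on two data sets.
    scores_a = [0.90, 0.85, 0.80, 0.75, 0.70]
    scores_b = [0.88, 0.86, 0.75, 0.75, 0.66]
    print(wilcoxon_signed_rank(scores_a, scores_b))
```

The smaller of the two rank sums is usually taken as the test statistic and compared against a critical value for the given `n`. A small statistic suggests a systematic difference between the paired results, while a large one is consistent with similar behaviour on both data sets.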
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Eligüzel, N., Çetinkaya, C. & Dereli, T. Application of named entity recognition on tweets during earthquake disaster: a deep learning-based approach. Soft Comput 26, 395–421 (2022). https://doi.org/10.1007/s00500-021-06370-4
DOI: https://doi.org/10.1007/s00500-021-06370-4