Abstract
Question classification is an essential first step in automatic question answering systems, and linguistic features play a significant role in building an accurate question classifier. Recently, deep learning systems have achieved remarkable success in various text-mining problems such as sentiment analysis, document classification, spam filtering, document summarization, and web mining. In this study, we investigate several deep learning architectures for question classification in Turkish, a highly inflectional, agglutinative language in which words are formed by adding suffixes (morphemes) to a root word. As a non-Indo-European language, Turkish has some distinctive features that make it challenging for natural language processing; for instance, it has no grammatical gender or noun classes. User questions in Turkish are used to train and test the deep learning architectures, which are compared in terms of test accuracy and 10-fold cross-validation accuracy. We use two major deep learning models, long short-term memory (LSTM) and convolutional neural networks (CNN), and we also implement the combined CNN-LSTM and CNN-SVM structures, along with a number of variants of these architectures obtained by changing the vector sizes and embedding types. In addition, we build word embeddings using the Word2vec method with the CBOW and skip-gram models at different vector sizes on a large corpus composed of user questions. We further investigate the effect of using different pre-trained Word2vec word embeddings on these deep learning architectures. Experimental results show that the choice of Word2vec model has a significant impact on the accuracy of the different deep learning models.
Additionally, since no labeled Turkish question dataset was available, a further contribution of this study is a new Turkish question dataset translated from the UIUC English question dataset. Using these techniques, we reach an accuracy of 94% on the question dataset.
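To make the difference between the two Word2vec training schemes named above concrete, the following minimal sketch builds the training examples each scheme would see for one tokenized question. It is an illustration under stated assumptions, not the pipeline used in this study: the function names `skipgram_pairs` and `cbow_pairs`, the window size, and the sample question are all hypothetical, and no embedding is actually trained here.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram view: each center word is paired with every word in its window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))  # (input word, word to predict)
    return pairs


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[Tuple[str, ...], str]]:
    """CBOW view: the surrounding words jointly predict the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = tuple(tokens[j] for j in range(lo, hi) if j != i)
        if context:  # a one-word sentence has no context to learn from
            pairs.append((context, center))  # (input words, word to predict)
    return pairs


if __name__ == "__main__":
    # Hypothetical tokenized question, not taken from the dataset
    # (roughly: "Which is the largest city of Turkey?").
    question = ["türkiye'nin", "en", "büyük", "şehri", "hangisidir"]
    for pair in skipgram_pairs(question, window=1):
        print("skip-gram:", pair)
    for pair in cbow_pairs(question, window=1):
        print("CBOW:", pair)
```

Skip-gram produces one example per (center, context) pair, so it yields more training examples per sentence than CBOW, which produces one example per position.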
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Yilmaz, S., Toklu, S. A deep learning analysis on question classification task using Word2vec representations. Neural Comput & Applic 32, 2909–2928 (2020). https://doi.org/10.1007/s00521-020-04725-w
DOI: https://doi.org/10.1007/s00521-020-04725-w