Abstract
This paper describes a solution to the question-answering problem in natural language processing using LSTMs. We analyze the effect of the choice of activation function in the final layer of the LSTM cell on accuracy, using Facebook Research’s bAbI dataset for our experiments. We also propose an alternative solution that exploits the structure and word order of the English language: reversing the order of words in a paragraph introduces many short-term dependencies between the text and the initial tokens of a question. This method improves accuracy on more than half of the tasks by more than 30% over the current state of the art. Our contributions are twofold: improving the accuracy of most of the question-answering tasks by reversing the order of words in the query and story sections, and comparing different activation functions and their respective accuracies across all 20 NLP tasks.
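The reversed-input idea described above can be sketched as a simple preprocessing step. The snippet below is a minimal illustration, assuming whitespace tokenization; `reverse_tokens` is a hypothetical helper, not part of the paper's released code.

```python
def reverse_tokens(text):
    """Reverse the word order of a text (hypothetical helper).

    Reversing the story places its final facts closest to the
    question tokens, shortening the dependencies the LSTM must
    carry across time steps.
    """
    return " ".join(text.split()[::-1])


# A toy bAbI-style story/question pair.
story = "Mary moved to the bathroom . John went to the hallway ."
question = "Where is John ?"

reversed_story = reverse_tokens(story)     # ". hallway the to went John ..."
reversed_question = reverse_tokens(question)
```

Both the story and the query would be reversed in this way before being encoded and fed to the LSTM.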
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chenna Keshava, B.S., Sumukha, P.K., Chandrasekaran, K., Usha, D. (2020). Role of Activation Functions and Order of Input Sequences in Question Answering. In: Sharma, N., Chakrabarti, A., Balas, V. (eds) Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol 1016. Springer, Singapore. https://doi.org/10.1007/978-981-13-9364-8_27
DOI: https://doi.org/10.1007/978-981-13-9364-8_27
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9363-1
Online ISBN: 978-981-13-9364-8
eBook Packages: Engineering (R0)