Abstract
This paper describes a solution to the question-answering problem in natural language processing using LSTMs. We analyze the effect of the choice of activation function in the final layer of the LSTM cell on accuracy, using Facebook Research’s bAbI dataset for our experiments. We also propose an alternative solution that exploits the structure and word order of the English language: reversing the order of words in a paragraph introduces many short-term dependencies between the text and the initial tokens of a question. This method improves accuracy on more than half of the tasks by more than 30% over the current state of the art. Our contributions are twofold: improving the accuracy of most of the question-answering tasks by reversing the order of words in the query and story sections, and comparing different activation functions and their respective accuracies across all 20 NLP tasks.
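The reversed-input idea described above can be sketched as a simple preprocessing step. The snippet below is a minimal illustration, assuming whitespace tokenization; `reverse_tokens` is a hypothetical helper, not part of the paper's released code.

```python
def reverse_tokens(text):
    """Reverse the word order of a text (hypothetical helper).

    Reversing the story places its final facts closest to the
    question tokens, shortening the dependencies the LSTM must
    carry across time steps.
    """
    return " ".join(text.split()[::-1])


# A toy bAbI-style story/question pair.
story = "Mary moved to the bathroom . John went to the hallway ."
question = "Where is John ?"

reversed_story = reverse_tokens(story)     # ". hallway the to went John ..."
reversed_question = reverse_tokens(question)
```

Both the story and the query would be reversed in this way before being encoded and fed to the LSTM.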
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chenna Keshava, B.S., Sumukha, P.K., Chandrasekaran, K., Usha, D. (2020). Role of Activation Functions and Order of Input Sequences in Question Answering. In: Sharma, N., Chakrabarti, A., Balas, V. (eds) Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol 1016. Springer, Singapore. https://doi.org/10.1007/978-981-13-9364-8_27
DOI: https://doi.org/10.1007/978-981-13-9364-8_27
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9363-1
Online ISBN: 978-981-13-9364-8
eBook Packages: Engineering (R0)