Role of Activation Functions and Order of Input Sequences in Question Answering

  • Conference paper
Data Management, Analytics and Innovation

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 1016))

Abstract

This paper describes a solution to the question answering problem in natural language processing using LSTMs. We analyse how the choice of activation function in the final layer of the LSTM network affects accuracy, using Facebook Research’s bAbI dataset for our experiments. We also propose an alternative approach that exploits the structure and word order of the English language: reversing the order of the paragraph introduces many short-term dependencies between the text and the initial tokens of a question. This method improves accuracy on more than half of the tasks by more than 30% over the current state of the art. Our contributions are twofold: we improve the accuracy of most of the question answering tasks by reversing the order of words in the query and the story sections, and we provide a comparison of different activation functions and their respective accuracies across all 20 bAbI tasks.
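To make the two ideas concrete, here is a minimal sketch, not the authors’ implementation: it assumes a Keras two-encoder LSTM reader over bAbI-style story/query pairs, and the vocabulary size, sequence lengths, layer widths, and candidate activations are illustrative placeholders. The sketch reverses the token order of both inputs before training and varies the activation of the final dense layer.

    import numpy as np
    from tensorflow.keras import layers, models

    VOCAB = 50        # assumed vocabulary size; bAbI task vocabularies are small
    STORY_LEN = 68    # assumed maximum story length in tokens
    QUERY_LEN = 4     # assumed maximum query length in tokens

    def build_model(final_activation="softmax"):
        """Two-encoder LSTM reader: encode story and query separately,
        concatenate, and classify the answer word with a dense layer whose
        activation is the quantity under comparison."""
        story = layers.Input(shape=(STORY_LEN,), dtype="int32")
        query = layers.Input(shape=(QUERY_LEN,), dtype="int32")
        embed = layers.Embedding(VOCAB, 64)        # shared word embedding
        s = layers.LSTM(64)(embed(story))
        q = layers.LSTM(64)(embed(query))
        merged = layers.concatenate([s, q])
        answer = layers.Dense(VOCAB, activation=final_activation)(merged)
        model = models.Model(inputs=[story, query], outputs=answer)
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model

    def reverse_tokens(batch):
        # The reordering idea: reverse the token order within every sequence,
        # bringing the end of the story close to the start of the query.
        return batch[:, ::-1]

    # Toy integer data standing in for vectorised bAbI stories/queries/answers.
    rng = np.random.default_rng(0)
    stories = rng.integers(1, VOCAB, size=(32, STORY_LEN)).astype("int32")
    queries = rng.integers(1, VOCAB, size=(32, QUERY_LEN)).astype("int32")
    answers = rng.integers(1, VOCAB, size=(32,)).astype("int32")

    # Candidate final-layer activations; activations that can go negative
    # (e.g. tanh) would need a loss other than cross-entropy.
    for act in ("softmax", "sigmoid"):
        model = build_model(final_activation=act)
        model.fit([reverse_tokens(stories), reverse_tokens(queries)],
                  answers, epochs=1, verbose=0)

Reversing both inputs means the last story tokens the encoder sees are the ones nearest the question, which is the short-term-dependency effect the abstract describes.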

Author information

Corresponding author

Correspondence to B. S. Chenna Keshava.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chenna Keshava, B.S., Sumukha, P.K., Chandrasekaran, K., Usha, D. (2020). Role of Activation Functions and Order of Input Sequences in Question Answering. In: Sharma, N., Chakrabarti, A., Balas, V. (eds) Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol 1016. Springer, Singapore. https://doi.org/10.1007/978-981-13-9364-8_27

Download citation

  • DOI: https://doi.org/10.1007/978-981-13-9364-8_27

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-9363-1

  • Online ISBN: 978-981-13-9364-8

  • eBook Packages: Engineering, Engineering (R0)
