
Progress in Natural Language Processing and Language Understanding

Chapter in Bridging Human Intelligence and Artificial Intelligence

Abstract

Computers have long been successful at processing structured data, i.e., data with a standardized format, such as numbers and spreadsheets. Despite this success, a surge in the availability of unstructured data over the last decade has motivated the exploration of new approaches to processing and analyzing it. Natural language processing (NLP) concerns how computers can analyze natural language, a ubiquitous form of unstructured data. With ground-breaking advances in computational capabilities and an ever-expanding supply of data, research in natural language processing has grown rapidly over the last two decades. With a strong focus on natural language understanding, this chapter traces the evolution of the field up to the prominent, state-of-the-art learning techniques in language understanding, toward the overall goal of enabling human-like comprehension in machines.
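To make the contrast concrete, the following minimal Python sketch (standard library only, with invented example values) shows how directly a structured record can be queried, while the same facts expressed as free text require language analysis before they can be used:

```python
import csv
import io

# Structured data: a fixed schema makes the values directly machine-readable.
table = io.StringIO("product,units_sold\nwidget,42\ngadget,17\n")
total = sum(int(row["units_sold"]) for row in csv.DictReader(table))
print(f"Total units sold: {total}")  # 59

# Unstructured data: the same facts in free text must first be analyzed
# (tokenization, entity and relation extraction) before they can be queried.
report = "We sold 42 widgets and 17 gadgets last quarter."
tokens = report.rstrip(".").split()
print(tokens)  # a naive first processing step; real NLP goes much further
```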

Human language is complex and ambiguous, which makes language processing challenging. The complexity of human language allows for infinite ways of expressing the same idea; this diversity, however, hinders the creation of a consistent metric for evaluating performance. The chapter therefore also presents recent developments in NLP benchmarks that have been instrumental in assessing the language understanding ability of machines. Aided by these standardized benchmarks, NLP models have improved dramatically and have made their way into commercial applications. Despite this progress, addressing questions about fairness, ethical consequences, and the essence of “understanding” language becomes crucial, as these considerations play a critical role in developing machines that are fully capable of processing human language.
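As a concrete illustration of how such benchmarks are used, the sketch below scores an off-the-shelf sentiment classifier on SST-2, one of the GLUE tasks. It assumes the Hugging Face datasets and transformers libraries; the model checkpoint is illustrative, not a system discussed in the chapter:

```python
from datasets import load_dataset
from transformers import pipeline

# SST-2 (Stanford Sentiment Treebank) is one of the nine GLUE tasks:
# binary sentiment classification of single sentences.
validation = load_dataset("glue", "sst2", split="validation")

# An illustrative pre-trained checkpoint fine-tuned on SST-2.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

correct = 0
sample = validation.select(range(100))  # a small sample for a quick check
for example in sample:
    prediction = classifier(example["sentence"])[0]["label"]
    predicted = 1 if prediction == "POSITIVE" else 0  # GLUE: 1 = positive
    correct += int(predicted == example["label"])

print(f"Accuracy on {len(sample)} SST-2 examples: {correct / len(sample):.2%}")
```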


Notes

  1. The term crowdworkers refers to a large number of people who each contribute a small amount of labor to execute a given task (https://www.collinsdictionary.com/dictionary/english/crowdworking).


Acknowledgments

We would like to extend our sincere thanks to Sridhar Nandigam, Arvind Ganesh, and Thasina Tabashum for serving as early reviewers and critics of the chapter while the writing was in progress. Their timely feedback on the very first draft provided valuable input and suggestions that helped us tailor the chapter to its intended audience.

Author information


Corresponding author

Correspondence to Phillip Nelson.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Nelson, P., Urs, N.V., Kasicheyanula, T.R. (2022). Progress in Natural Language Processing and Language Understanding. In: Albert, M.V., Lin, L., Spector, M.J., Dunn, L.S. (eds) Bridging Human Intelligence and Artificial Intelligence. Educational Communications and Technology: Issues and Innovations. Springer, Cham. https://doi.org/10.1007/978-3-030-84729-6_6


  • DOI: https://doi.org/10.1007/978-3-030-84729-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-84728-9

  • Online ISBN: 978-3-030-84729-6

  • eBook Packages: Education, Education (R0)
