Processing Natural Language without Natural Language Processing

Part of the Lecture Notes in Computer Science book series (LNCS, volume 2588)

Abstract

We can still only create computer programs that display the most rudimentary natural language processing capabilities. One of the greatest barriers to advanced natural language processing is our inability to overcome the linguistic knowledge acquisition bottleneck. In this paper, we describe recent work in a number of areas, including grammar checker development, automatic question answering, and language modeling, where state-of-the-art accuracy is achieved using very simple methods. The power of these methods comes entirely from the plethora of text currently available to them, rather than from deep linguistic analysis or the application of state-of-the-art machine learning techniques. This suggests that the field of NLP might benefit from concentrating less on technology development and more on data acquisition.
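To make the idea of a "very simple method" concrete, here is a minimal sketch of one data-driven approach to a grammar-checker task: choosing between confusable words (for example "their" and "there") by counting how often each candidate appears in the same context in a corpus. This is an illustration only, not the method described in the paper. The function names `build_counts` and `choose`, the toy corpus, and the bigram tie-breaking rule are all assumptions made for the example.

```python
from collections import Counter


def build_counts(corpus_sentences):
    """Count every word bigram and trigram in a whitespace-tokenized corpus."""
    counts = Counter()
    for sentence in corpus_sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(len(tokens) - 1):
            counts[tuple(tokens[i:i + 2])] += 1
        for i in range(len(tokens) - 2):
            counts[tuple(tokens[i:i + 3])] += 1
    return counts


def choose(counts, left, right, candidates):
    """Pick the candidate seen most often between `left` and `right`."""
    def score(word):
        trigram = counts[(left, word, right)]
        bigrams = counts[(left, word)] + counts[(word, right)]
        # Trigram evidence is compared first; bigram counts only break ties.
        return (trigram, bigrams)
    return max(candidates, key=score)


# Hypothetical toy corpus; a real system would count over far more text.
toy_corpus = [
    "they lost their keys",
    "their house is over there",
    "there is a problem",
    "put it over there",
]
counts = build_counts(toy_corpus)
print(choose(counts, "lost", "keys", ["their", "there"]))
print(choose(counts, "over", "</s>", ["their", "there"]))
```

Nothing in the sketch encodes linguistic knowledge. Any gain in accuracy would have to come from counting over more text, which is the trade-off the abstract points to.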

Keywords

  • Natural Language Processing
  • Latent Semantic Analysis
  • Question Answering
  • Training Corpus
  • British National Corpus

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brill, E. (2003). Processing Natural Language without Natural Language Processing. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2003. Lecture Notes in Computer Science, vol 2588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36456-0_37

  • DOI: https://doi.org/10.1007/3-540-36456-0_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00532-2

  • Online ISBN: 978-3-540-36456-6

  • eBook Packages: Springer Book Archive