
Natural Language Processing, Moving from Rules to Data

  • Conference paper
Theory and Applications of Models of Computation (TAMC 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10185)


Abstract

During the last decade, we have witnessed a major change in the direction of the theoretical models used in natural language processing: a move from rule-based systems to corpus-oriented paradigms. In this paper, we analyze several generative formalisms together with newer statistical and data-oriented linguistic methodologies. We review existing methods, from both deep and shallow learning, applied in various subfields of computational linguistics. The continuous, fast improvements obtained by practical, applied machine learning techniques may lead to new theoretical developments in the classic models as well. We discuss several scenarios for future approaches.

This work was partially supported by the Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) through the project UID/MAT/00297/2013 (Centro de Matemática e Aplicações).


Notes

  1. Angluin et al. [5] use the term “automaton with output”. In formal language textbooks such as Hopcroft and Ullman [38], this definition corresponds to a Moore automaton, and the notion of acceptor that we define next corresponds to a DFA.

  2. Note that infinite derivations are not allowed because input strings are finite. Empty elementary trees can be avoided in the same way that \(\epsilon \)-productions are eliminated from CFGs.
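The correspondence drawn in the first note can be made concrete with a small sketch. It is only an illustration, not the construction from the paper or from [5]: the class name `MooreAutomaton`, the helper `accepts`, and the example automaton `even_a` are hypothetical, and the sketch assumes that the output for a word is the label of the state reached after reading it. Under that assumption, restricting the labels to two values (accept or reject) yields a DFA acceptor.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Tuple

State = Hashable


@dataclass
class MooreAutomaton:
    """Deterministic automaton whose states carry output labels (illustrative)."""
    start: State
    delta: Dict[Tuple[State, str], State]  # total transition function
    label: Dict[State, Hashable]           # output attached to each state

    def run(self, word: str) -> State:
        # Follow exactly one transition per input symbol.
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state

    def output(self, word: str) -> Hashable:
        # Assumed convention: the output is the label of the state reached.
        return self.label[self.run(word)]


def accepts(automaton: MooreAutomaton, word: str) -> bool:
    # With labels restricted to True/False the automaton acts as a DFA acceptor.
    return automaton.output(word) is True


# Hypothetical example over the alphabet {a, b}: accept words with an even number of a's.
even_a = MooreAutomaton(
    start="even",
    delta={
        ("even", "a"): "odd",
        ("even", "b"): "even",
        ("odd", "a"): "even",
        ("odd", "b"): "odd",
    },
    label={"even": True, "odd": False},
)
```

Calling `accepts(even_a, "abba")` follows the transitions even, odd, odd, odd, even and returns the label of the final state. Replacing the Boolean labels with a larger label set gives the more general automaton with output.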

References

  1. Association for Computational Linguistics (ACL). https://www.aclweb.org/. Accessed 19 Jan 2017

  2. The 2012 ACM Computing Classification System. http://www.acm.org/publications/class-2012. Accessed 25 Jan 2017

  3. Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987)


  4. Angluin, D., Becerra-Bonache, L.: Learning meaning before syntax. In: Clark, A., Coste, F., Miclet, L. (eds.) ICGI 2008. LNCS (LNAI), vol. 5278, pp. 1–14. Springer, Heidelberg (2008). doi:10.1007/978-3-540-88009-7_1


  5. Angluin, D., Becerra-Bonache, L., Dediu, A.H., Reyzin, L.: Learning finite automata using label queries. In: Gavaldà, R., Lugosi, G., Zeugmann, T., Zilles, S. (eds.) ALT 2009. LNCS (LNAI), vol. 5809, pp. 171–185. Springer, Heidelberg (2009). doi:10.1007/978-3-642-04414-4_17


  6. Bangalore, S., Joshi, A.K.: Supertagging: an approach to almost parsing. Comput. Linguist. 25, 237–265 (1999)


  7. Bangalore, S., Joshi, A.K. (eds.): Supertagging. A Bradford Book. The MIT Press, Cambridge (2010)


  8. Bellegarda, J.R., Monz, C.: State of the art in statistical methods for language and speech processing. Comput. Speech Lang. 35, 163–184 (2016). http://dx.doi.org/10.1016/j.csl.2015.07.001


  9. Bikel, D.M.: Intricacies of Collins’ parsing model. Comput. Linguist. 30(4), 479–511 (2004)


  10. Branscombe, M.: Review: Nuance dragon for windows offers strong voice recognition. Computer World, January 2016. http://www.computerworld.com/article/3018071/desktop-apps/review-nuance-dragon-for-windows-offers-strong-voice-recognition.html

  11. Brill, E.: Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Comput. Linguist. 21(4), 543–565 (1995)


  12. Chierchia, G.: Anaphora and dynamic binding. Linguist. Philos. 15, 111–183 (1992)


  13. Cole, R. (ed.): Survey of the State of the Art in Human Language Technology. Cambridge University Press, New York (1997)


  14. Collins, M.: Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia, PA (1999)


  15. Crossley, S.A., Kyle, K., McNamara, D.S.: The tool for the automatic analysis of text cohesion (taaco): automatic assessment of local, global, and text cohesion. Behav. Res. Methods 2015, 1–11 (2015)


  16. De Beaugrande, R., Dressler, W.: Introduction to Text Linguistics. Longman Linguistics Library. Routledge, London (2016). https://books.google.pt/books?id=gQrrjwEACAAJ


  17. Dediu, A.-H., Klempien-Hinrichs, R., Kreowski, H.-J., Nagy, B.: Contextual hypergraph grammars – a new approach to the generation of hypergraph languages. In: Ibarra, O.H., Dang, Z. (eds.) DLT 2006. LNCS, vol. 4036, pp. 327–338. Springer, Heidelberg (2006). doi:10.1007/11779148_30


  18. Dediu, A.H., Tîrnăucă, C.I.: Evolutionary algorithms for parsing tree adjoining grammars. In: Bel-Enguix, G., Jiménez-López, M. (eds.) Bio-Inspired Models for Natural and Formal Languages, pp. 277–304. Cambridge Scholars (2011)


  19. Deep Learning. https://en.m.wikipedia.org/wiki/Deep_learning. Accessed 31 Jan 2017

  20. Dekker, P.: Coreference and representationalism. In: von Heusinger, K., Egli, U. (eds.) Reference and Anaphorical Relations, pp. 287–310. Kluwer, Dordrecht (2000)


  21. Dempsey, I., O’Neill, M., Brabazon, A.: Foundations in Grammatical Evolution for Dynamic Environments. Springer, Heidelberg (2009)


  22. Denkowski, M.: A Survey of Techniques for Unsupervised Word Sense Induction. Lang. Stat. II Lit. Rev. (2009)


  23. Dorow, B., Widdows, D.: Discovering corpus-specific word senses. In: 82. Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics, Budapest, Hungary (2003)


  24. Erekhinskaya, T., Moldovan, D.: Lexical chains on wordnet and extensions. In: Proceedings of the Twenty-Sixth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2013 (2013)


  25. Fahrenberg, U., Biondi, F., Corre, K., Jegourel, C., Kongshøj, S., Legay, A.: Measuring global similarity between texts. In: Besacier, L., Dediu, A.-H., Martín-Vide, C. (eds.) SLSP 2014. LNCS (LNAI), vol. 8791, pp. 220–232. Springer, Cham (2014). doi:10.1007/978-3-319-11397-5_17


  26. Fellbaum, C. (ed.): WordNet: An Electronic Database. MIT Press, Cambridge (1998)


  27. Ferenčík, M.: A Survey of English Stylistics. http://www.pulib.sk/elpub2/FF/Ferencik/INDEX.HTM. Accessed 22 Jan 2017

  28. Fortu, O., Moldovan, D.: Identification of textual contexts. In: Dey, A., Kokinov, B., Leake, D., Turner, R. (eds.) CONTEXT 2005. LNCS (LNAI), vol. 3554, pp. 169–182. Springer, Heidelberg (2005). doi:10.1007/11508373_13


  29. Freund, Y., Kearns, M.J., Ron, D., Rubinfeld, R., Schapire, R.E., Sellie, L.: Efficient learning of typical finite automata from random walks. Inf. Comput. 138(1), 23–48 (1997)


  30. Gécseg, F., Steinby, M.: Tree Automata. Akadémiai Kiadó, Budapest (1984)


  31. Gécseg, F., Steinby, M.: Tree languages. In: Salomaa, A., Rozenberg, G. (eds.) Handbook of Formal Languages. Beyond Words, vol. 3, pp. 1–68. Springer, New York (1997)


  32. Global WordNet Association. http://globalwordnet.org/. Accessed 24 Jan 2017

  33. Gold, E.M.: Language identification in the limit. Inf. Control 10(5), 447–474 (1967)


  34. Harabagiu, S.M.: From lexical cohesion to textual coherence: a data driven perspective. Int. J. Patt. Recognit. Artif. Intell. 13(2), 247–265 (1999)


  35. Hemberg, E.A.P.: An Exploration of Grammars in Grammatical Evolution. Ph.D. thesis, University College Dublin, September 2010


  36. Hirschberg, J., Manning, C.D.: Advances in natural language processing. Science 349(6245), 261–266 (2015). http://dx.doi.org/10.1126/science.aaa8685


  37. Hopcroft, J.E., Motwani, R., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation, 3rd edn. Addison-Wesley, Reading (2006)


  38. Hopcroft, J.E., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading (1979)


  39. Hovy, E., Lin, C.Y.: Automated text summarization in SUMMARIST. In: Proceedings of the Intelligent Scalable Text Summarization Workshop, pp. 18–24 (1997)


  40. Joshi, A., Levy, L., Takahashi, M.: Tree adjunct grammars. J. Comput. Syst. Sci. 10(1), 136–163 (1975)


  41. Jurafsky, D., Martin, J.H.: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, 1st edn. Prentice Hall PTR, Upper Saddle River (2000)


  42. Kakkonen, T.: Framework and resources for natural language parser evaluation. Computing Research Repository abs/0712.3705 (2007)


  43. Kamp, H., Reyle, U.: From Discourse to Logic. Kluwer, Dordrecht (1993)


  44. Kastner, I., Monz, C.: Automatic single-document key fact extraction from newswire articles. In: Proceedings of the 12th Conference on European Chapter of the ACL (EACL 2009), Athens, Greece, pp. 415–423 (2009)


  45. Kay, M.: Machine translation: the disappointing past and present. In: Cole, R. (ed.) Survey of the State of the Art in Human Language Technology, pp. 248–250. Cambridge University Press, New York (1997). http://dl.acm.org/citation.cfm?id=278696.278813

  46. Knight, K., Marcu, D.: Summarization beyond sentence extraction: a probabilistic approach to sentence compression. Artif. Intell. 13(1), 91–107 (2001)


  47. Koehn, P.: Statistical Machine Translation. Cambridge University Press, Cambridge (2009)


  48. Kudlek, M., Martín-Vide, C., Mateescu, A., Mitrana, V.: Contexts and the concept of mild context-sensitivity. Linguist. Philos. 26, 703–725 (2002)


  49. Le, Q., Mikolov, T.: Distributed representations of sentences and documents. In: Proceedings of the 31st International Conference on Machine Learning, pp. 1188–1196 (2014)


  50. Lopez, A.: Statistical machine translation. ACM Comput. Surv. 40(3), 1–8 (2008)


  51. Manning, C.D.: Part-of-speech tagging from 97% to 100%: is it time for some linguistics? In: Gelbukh, A.F. (ed.) CICLing 2011. LNCS, vol. 6608, pp. 171–189. Springer, Heidelberg (2011). doi:10.1007/978-3-642-19400-9_14


  52. Marcus, S.: Contextual grammars. Rev. Roum. Math. Pures et Appl. 14(10), 1525–1534 (1969). http://citeseer.ist.psu.edu/marcus69contextual.html


  53. Mariño, J.B., Banchs, R.E., Crego, J.M., de Gispert, A., Lambert, P., Fonollosa, J.A.R., Costa-Jussà, M.R.: N-gram-based machine translation. Comput. Linguist. Arch. 32(4), 527–549 (2006). MIT Press, Cambridge


  54. McCarthy, J.: Notes on formalizing context. In: Proceedings of the 13th International Joint Conference on Artificial Intelligence, IJCAI 1993, vol. 1, pp. 555–560. Morgan Kaufmann Publishers Inc., San Francisco (1993). http://dl.acm.org/citation.cfm?id=1624025.1624103

  55. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 26, pp. 3111–3119. Curran Associates, Inc. (2013). http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf

  56. Miller, G.A.: Dictionaries of the mind. In: Proceedings of the 23rd Annual Meeting on Association for Computational Linguistics, ACL 1985, pp. 305–314. Association for Computational Linguistics, Stroudsburg (1985). http://dx.doi.org/10.3115/981210.981248

  57. Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995). http://doi.acm.org/10.1145/219717.219748


  58. Mohri, M.: Finite-state transducers in language and speech processing. Comput. Linguist. 23(2), 269–311 (1997). http://dl.acm.org/citation.cfm?id=972695.972698


  59. Montague, R.: Universal grammar. Theoria 36, 373–398 (1970)


  60. Morris, J., Hirst, G.: Lexical cohesion computed by thesaural relations as an indicator of the structure of text. Comput. Linguist. 17(1), 21–48 (1991). http://dl.acm.org/citation.cfm?id=971738.971740


  61. MultiJEDI - Multilingual joint word sense disambiguation. http://multijedi.org/. Accessed 28 Jan 2017

  62. Muskens, R.: Combining Montague semantics and discourse representation. Linguist. Philos. 19(2), 143–186 (1996)


  63. Navigli, R.: Word sense disambiguation: a survey. ACM Comput. Surv. (CSUR) 41(2), 1–69 (2009)


  64. Navigli, R., Ponzetto, S.P.: Babelnet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artif. Intell. 193, 217–250 (2012). http://dx.doi.org/10.1016/j.artint.2012.07.001


  65. Navigli, R., Velardi, P.: Structural semantic interconnections: a knowledge-based approach to word sense disambiguation. IEEE Trans. Patt. Anal. Mach. Intell 27(7), 1075–1088 (2005)


  66. O’Neill, M., Ryan, C.: Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language. Kluwer, Dordrecht (2003)


  67. Pal, A.R., Saha, D.: Word sense disambiguation: a survey. Int. J. Control Theor. Comput. Model. (IJCTCM) 5(3), 1–16 (2015)


  68. Păun, G.: Marcus Contextual Grammars. Kluwer Academic Publishers, Norwell (1997)


  69. Raganato, A., Bovi, C.D., Navigli, R.: Automatic construction and evaluation of a large semantically enriched wikipedia. In: Proceedings of 25th International Joint Conference on Artificial Intelligence (IJCAI 2016), New York, USA, July 2016


  70. Rieger, B.B.: On distributed representation in word semantics. Technical report, Forschungsbericht TR-91-012, International Computer Science Institute (ICSI) (1991)


  71. Roen, D.H.: The effects of cohesive conjunctions, reference, response rhetorical predicates, and topic on reading rate and written free recall. J. Read. Behav. 16(1), 15–26 (1984)


  72. Rothlauf, F.: Design of Modern Heuristics: Principles and Application, 1st edn. Springer, Heidelberg (2011)


  73. Rowcliffe, I.C.: Seven Standards of Textuality? http://web.letras.up.pt/icrowcli/textual.html. Accessed 22 Jan 2017

  74. Sahlgren, M.: The distributional hypothesis. Ital. J. Linguist. 20(1), 33–54 (2008)


  75. Schabes, Y., Joshi, A.K.: An Earley-type parsing algorithm for tree adjoining grammars. In: Proceedings of the 26th Annual Meeting of the Association for Computational Linguistics (ACL 1988), pp. 258–269. Association for Computational Linguistics (1988)


  76. Schmid, H.: Probabilistic part-of-speech tagging using decision trees. In: International Conference on New Methods in Language Processing, Manchester, UK, pp. 44–49 (1994)


  77. SemEval Portal. https://www.aclweb.org/aclwiki/index.php?title=SemEval_Portal. Accessed 19 Jan 2017

  78. Sikkel, K.: Parsing Schemata: A Framework for Specification and Analysis of Parsing Algorithms, 1st edn. Springer, Heidelberg (2013)


  79. Sudkamp, T.A.: Languages and Machines: An Introduction to the Theory of Computer Science, 3rd edn. Addison-Wesley, Reading (2006)


  80. Ulbaek, I.: Second order coherence: a new way of looking at incoherence in texts. Linguist. Beyond and Within 2, 167–179 (2016)


  81. Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)


  82. Vijay-Shanker, K., Joshi, A.K.: Some computational properties of tree adjoining grammars. In: Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics (ACL 1985), pp. 82–93. Association for Computational Linguistics (1985)


  83. Véronis, J.: Hyperlex: lexical cartography for information retrieval. Comput. Speech Lang. 18(3), 223–252 (2004)


  84. Weng, F., Angkititrakul, P., Shriberg, E., Heck, L.P., Peters, S., Hansen, J.H.L.: Conversational in-vehicle dialog systems: the past, present, and future. IEEE Signal Process. Mag. 33(6), 49–60 (2016). http://dx.doi.org/10.1109/MSP.2016.2599201


  85. Widdows, D., Dorow, B.: A graph model for unsupervised lexical acquisition. In: Proceedings of the 19th International Conference on Computational Linguistics, COLING, Taipei, Taiwan, pp. 1–7 (2002)


  86. Winston, P.H.: Artificial Intelligence, 3rd edn. Addison-Wesley Longman Publishing Co., Inc., Boston (1992)


  87. Zajic, D., Dorr, B., Schwartz, R., Monz, C., Lin, J.: A sentence-trimming approach to multi-document summarization. In: Proceedings of EMNLP 2005 Workshop on Text Summarization (2005)



Author information


Correspondence to Adrian-Horia Dediu.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Dediu, AH., M. Matos, J., Martín-Vide, C. (2017). Natural Language Processing, Moving from Rules to Data. In: Gopal, T., Jäger, G., Steila, S. (eds) Theory and Applications of Models of Computation. TAMC 2017. Lecture Notes in Computer Science, vol 10185. Springer, Cham. https://doi.org/10.1007/978-3-319-55911-7_3


  • DOI: https://doi.org/10.1007/978-3-319-55911-7_3


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-55910-0

  • Online ISBN: 978-3-319-55911-7

  • eBook Packages: Computer Science (R0)
