Artificial Intelligence and Language

A Guided Tour of Artificial Intelligence Research

Abstract

This chapter provides an overview of the role of artificial intelligence in natural language processing. We follow the chronology of the development of natural language processing systems (Sect. 2). This review is necessarily partial and subjective: rather than providing a general introduction to natural language processing, it focuses on logical and discursive aspects (Sect. 3) and on the contributions of machine learning (Sect. 4).

This chapter extends an earlier version written in French with Laurence Danlos.

Notes

  1. This defines selectional restrictions.


  • van Rooij R (2004) Signalling games select horn strategies. Linguist Philos 27:493–527

    Article  Google Scholar 

  • van Rooij R, Schultz K (2004) Exhaustive interpretation of complex sentences. J Log Lang Inf 13(4):491–519

    Google Scholar 

  • Veltman F (1996) Defaults in update semantics. J Philos Log 25:221–261

    Article  MathSciNet  MATH  Google Scholar 

  • Voorhees EM, Harman DK (2005) TREC: Experiment and Evaluation in Information Retrieval. The MIT Press, Cambridge

    Google Scholar 

  • Vulić I, Smet WD, Moens MF (2011) Identifying word translations from comparable corpora using latent topic models. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT), pp 479–484

    Google Scholar 

  • Wang L, Li Y, Lazebnik S (2016) Learning deep structure-preserving image-text embeddings. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

    Google Scholar 

  • Weizenbaum J (1966) Eliza-a computer program for the study of natural language communication between man and machine. Commun ACM 9(1):1–5

    Article  Google Scholar 

  • Yao L, Riedel S, McCallum A (2012) Unsupervised relation discovery with sense disambiguation. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics

    Google Scholar 

  • Yarowsky D (1995) Unsupervised word sense disambiguation rivalling supervised methods. In: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Cambridge, Massachusetts, USA

    Google Scholar 

  • Young P, Lai A, Hodosh M, Hockenmaier J (2014) From image descriptions to visual denotations: new similarity metrics for semantic inference over event descriptions. Transactions of the Association for Computational Linguistics 2:67–78. https://transacl.org/ojs/index.php/tacl/article/view/229

  • Yu M, Dredze M (2014) Improving lexical embeddings with semantic knowledge. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (volume 2: short papers), Association for Computational Linguistics, Baltimore, Maryland, pp 545–550. http://www.aclweb.org/anthology/P14-2089

  • Zhang M, Liu Y, Luan H, Sun M (2017) Earth mover’s distance minimization for unsupervised bilingual lexicon induction. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Copenhagen, Denmark, pp 1924–1935

    Google Scholar 

  • Zhao B, Xing EP (2007) HM-BiTAM: Bilingual topic exploration, word alignment, and translation. In: NIPS

    Google Scholar 

Corresponding author

Correspondence to Nicholas Asher.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Asher, N., Zweigenbaum, P. (2020). Artificial Intelligence and Language. In: Marquis, P., Papini, O., Prade, H. (eds) A Guided Tour of Artificial Intelligence Research. Springer, Cham. https://doi.org/10.1007/978-3-030-06170-8_4
