Designing a Uniform Meaning Representation for Natural Language Processing

  • Technical Contribution
  • KI - Künstliche Intelligenz

Abstract

In this paper we present Uniform Meaning Representation (UMR), a meaning representation designed to annotate the semantic content of a text. UMR is primarily based on Abstract Meaning Representation (AMR), an annotation framework initially designed for English, but also draws from other meaning representations. UMR extends AMR to other languages, particularly morphologically complex, low-resource languages. UMR also adds features to AMR that are critical to semantic interpretation and enhances AMR by proposing a companion document-level representation that captures linguistic phenomena such as coreference as well as temporal and modal dependencies that potentially go beyond sentence boundaries.

Notes

  1. As can be seen from this example, the document-level representation is a list of triples in the form of <dependent relation parent>, and deviates from the Penn notation used for the sentence-level representation.
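
The triple format described in this note can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' tooling: the `Triple` type, the `group_by_parent` helper, the node identifiers, and the relation labels are all hypothetical, and the only thing taken from the note is the <dependent relation parent> ordering.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One document-level edge, in the <dependent relation parent> order of the note."""
    dependent: str
    relation: str
    parent: str


def group_by_parent(triples):
    """Index triples by parent so that each node's dependents can be looked up."""
    index = defaultdict(list)
    for t in triples:
        index[t.parent].append((t.relation, t.dependent))
    return dict(index)


# Hypothetical node identifiers and relation labels, for illustration only.
doc_level = [
    Triple("e1", ":before", "dct"),
    Triple("e2", ":after", "e1"),
    Triple("p2", ":same-entity", "p1"),
]

index = group_by_parent(doc_level)
```

Storing the document-level layer as a flat list keeps it separate from the bracketed sentence-level graphs, and an edge can link nodes from different sentences without any nesting.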

Author information

Corresponding author

Correspondence to Nianwen Xue.

Additional information

This work is supported in part by a grant from the IIS Division of the National Science Foundation (Award Nos. 1763926, 1764048, 1764091) entitled “Building a Uniform Meaning Representation for Natural Language Processing”, awarded to Nianwen Xue, James Pustejovsky, Martha Palmer and William Croft. All views expressed in this paper are those of the authors and do not necessarily represent the views of the National Science Foundation. This work is also supported in part by a grant from the Ministry of Education, Youth and Sports of the Czech Republic (Project No. LM2018101) and in part by the Czech Science Foundation (Award No. GX20-16819X).

About this article

Cite this article

Van Gysel, J.E.L., Vigus, M., Chun, J. et al. Designing a Uniform Meaning Representation for Natural Language Processing. Künstl Intell 35, 343–360 (2021). https://doi.org/10.1007/s13218-021-00722-w

  • DOI: https://doi.org/10.1007/s13218-021-00722-w
