Selected Challenges in Grammar-Based Text Generation from the Semantic Web

Mille, Simon

doi:10.1007/978-3-030-33274-7_5

Simon Mille (ORCID: orcid.org/0000-0002-8852-2764)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11866)

Abstract

In this paper, drawing on the recent outcomes of two shared tasks on structured data verbalisation and examining one system in particular, we present evidence of why grammar-based systems are particularly relevant for the verbalisation of structured data as found in the Semantic Web. We then define possible future lines of research, centred on the FORGe system and the linguistic grounding of Semantic Web datasets.

Notes

1. https://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData.
2. http://lod-cloud.net/.
3. https://wiki.dbpedia.org/.
4. https://www.wikidata.org/wiki/Wikidata:Main_Page.
5. This information appears in the infobox of the corresponding Wikipedia page: https://en.wikipedia.org/wiki/Arr%C3%B2s_negre.
6. http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=45093.
7. One of the first papers mentioning multiple datasets was published in 2012 [10].
8. It took about two hours to adapt FORGe to a hundred new DBpedia properties.
9. See also [6] for an overview of models to represent linked data and their issues.

References

1. Androutsopoulos, I., Lampouras, G., Galanis, D.: Generating natural language descriptions from OWL ontologies: the NaturalOWL system. J. Artif. Intell. Res. 48, 671–715 (2013)
2. Banerjee, S., Lavie, A.: METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pp. 65–72 (2005)
3. Belz, A.: Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models. J. Nat. Lang. Eng. 14(4), 431–455 (2008)
4. Belz, A., White, M., Espinosa, D., Kow, E., Hogan, D., Stent, A.: The first surface realisation shared task: overview and evaluation results. In: Proceedings of the Generation Challenges Session at the 13th European Workshop on Natural Language Generation (ENLG), Nancy, France, pp. 217–226 (2011)
5. Bontcheva, K., Wilks, Y.: Automatic report generation from ontologies: the MIAKT approach. In: Meziane, F., Métais, E. (eds.) NLDB 2004. LNCS, vol. 3136, pp. 324–335. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27779-8_28
6. Bosque-Gil, J., Gracia, J., Montiel-Ponsoda, E., Gómez-Pérez, A.: Models to represent linguistic linked data. Nat. Lang. Eng. 24(6), 811–859 (2018)
7. Bouayad-Agha, N., Casamayor, G., Mille, S., Wanner, L.: Perspective-oriented generation of football match summaries: old tasks, new challenges. ACM Trans. Speech Lang. Process. 9(2), 3:1–3:31 (2012)
8. Bouayad-Agha, N., Casamayor, G., Wanner, L.: Natural language generation in the context of the semantic web. Semant. Web 5(6), 493–513 (2014)
9. Corcoglioniti, F., Rospocher, M., Aprosio, A.P., Tonelli, S.: PreMON: a lemon extension for exposing predicate models as linked data. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC), pp. 877–884 (2016)
10. Dannélls, D., Damova, M., Enache, R., Chechev, M.: Multilingual online generation from semantic web ontologies. In: Proceedings of the 21st International Conference on World Wide Web, pp. 239–242. ACM (2012)
11. Elder, H., Gehrmann, S., O’Connor, A., Liu, Q.: E2E NLG challenge submission: towards controllable generation of diverse natural language. In: Proceedings of the 11th International Conference on Natural Language Generation, pp. 457–462 (2018)
12. Fillmore, C.J., Baker, C.F., Sato, H.: The FrameNet database and software tools. In: Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC), Las Palmas, Canary Islands, Spain, pp. 1157–1160 (2002)
13. Galanis, D., Androutsopoulos, I.: Generating multilingual descriptions from linguistically annotated OWL ontologies: the NaturalOWL system. In: Proceedings of the Eleventh European Workshop on Natural Language Generation, pp. 143–146. Association for Computational Linguistics (2007)
14. Gardent, C., Shimorina, A., Narayan, S., Perez-Beltrachini, L.: Creating training corpora for micro-planners. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Vancouver, Canada, August 2017
15. Gardent, C., Shimorina, A., Narayan, S., Perez-Beltrachini, L.: The WebNLG challenge: generating text from RDF data. In: Proceedings of the 10th International Conference on Natural Language Generation, pp. 124–133 (2017)
16. Gatt, A., Krahmer, E.: Survey of the state of the art in natural language generation: core tasks, applications and evaluation. J. Artif. Intell. Res. 61, 65–170 (2018)
17. Höffner, K., Walter, S., Marx, E., Usbeck, R., Lehmann, J., Ngonga Ngomo, A.C.: Survey on challenges of question answering in the semantic web. Semant. Web 8(6), 895–920 (2017)
18. Kingsbury, P., Palmer, M.: From TreeBank to PropBank. In: Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC), Las Palmas, Canary Islands, Spain, pp. 1989–1993 (2002)
19. Kwok, C., Etzioni, O., Weld, D.S.: Scaling question answering to the web. ACM Trans. Inf. Syst. (TOIS) 19(3), 242–262 (2001)
20. Lareau, F., Lambrey, F., Dubinskaite, I., Galarreta-Piquette, D., Nejat, M.: GenDR: a generic deep realizer with complex lexicalization. In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC), Miyazaki, Japan, pp. 3018–3025 (2018)
21. Mel’čuk, I.: Dependency Syntax: Theory and Practice. State University of New York Press, Albany (1988)
22. Meyers, A., et al.: The NomBank project: an interim report. In: Proceedings of the Workshop on Frontiers in Corpus Annotation, Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), Boston, MA, USA, pp. 24–31 (2004)
23. Mille, S., Belz, A., Bohnet, B., Graham, Y., Pitler, E., Wanner, L.: The first multilingual surface realisation shared task (SR 2018): overview and evaluation results. In: Proceedings of the 1st Workshop on Multilingual Surface Realisation (MSR), 56th Annual Meeting of the Association for Computational Linguistics (ACL), Melbourne, Australia, pp. 1–12 (2018)
24. Mille, S., Carlini, R., Burga, A., Wanner, L.: FORGe at SemEval-2017 task 9: deep sentence generation based on a sequence of graph transducers. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, Canada, pp. 917–920. Association for Computational Linguistics, August 2017. http://www.aclweb.org/anthology/S17-2158
25. Mille, S., Wanner, L.: Towards large-coverage detailed lexical resources for data-to-text generation. In: Proceedings of the First International Workshop on Data-to-text Generation, Edinburgh, Scotland (2015)
26. Nayak, N., Hakkani-Tür, D., Walker, M.A., Heck, L.P.: To plan or not to plan? Discourse planning in slot-value informed sequence to sequence models for language generation. In: Proceedings of INTERSPEECH, Stockholm, Sweden, pp. 3339–3343 (2017)
27. Novikova, J., Dušek, O., Rieser, V.: The E2E dataset: new challenges for end-to-end generation. In: Proceedings of the 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Saarbrücken, Germany (2017). https://arxiv.org/abs/1706.09254, arXiv:1706.09254
28. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 311–318. Association for Computational Linguistics (2002)
29. Perez-Beltrachini, L., Gardent, C.: Learning embeddings to lexicalise RDF properties. In: *SEM 2016, The Fifth Joint Conference on Lexical and Computational Semantics, pp. 219–228 (2016)
30. Rambow, O., Korelsky, T.: Applied text generation. In: Proceedings of the 3rd Conference on Applied Natural Language Processing (ANLP), Trento, Italy, pp. 40–47 (1992)
31. Schuler, K.K.: VerbNet: a broad-coverage, comprehensive verb lexicon. Ph.D. thesis, University of Pennsylvania (2005)
32. Shimorina, A., Gardent, C., Narayan, S., Perez-Beltrachini, L.: The WebNLG challenge: report on human evaluation. Technical report, Université de Lorraine, Nancy, France (2017)
33. Stevens, R., Malone, J., Williams, S., Power, R., Third, A.: Automating generation of textual class definitions from OWL to English. J. Biomed. Semant. 2, S5 (2011). BioMed Central
34. Walter, S., Unger, C., Cimiano, P.: M-ATOLL: a framework for the lexicalization of ontologies in multiple languages. In: Mika, P., et al. (eds.) ISWC 2014. LNCS, vol. 8796, pp. 472–486. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11964-9_30
35. Wanner, L., Bohnet, B., Bouayad-Agha, N., Lareau, F., Nicklaß, D.: MARQUIS: generation of user-tailored multilingual air quality bulletins. Appl. Artif. Intell. 24(10), 914–952 (2010)

Acknowledgements

The work reported in this paper has been partly supported by the European Commission in the framework of the H2020 Programme under the contract numbers 700475-IA, 700024-RIA, 779962-RIA, 786731-RIA and 825079-ICT-STARTS.

Author information

Authors and Affiliations

Universitat Pompeu Fabra, Barcelona, Spain
Simon Mille

Corresponding author

Correspondence to Simon Mille.

Editor information

Editors and Affiliations

Federal Research Center "Computer Science and Control", Moscow, Russia
Gennady S. Osipov, Aleksandr I. Panov, Konstantin S. Yakovlev

About this chapter

Cite this chapter

Mille, S. (2019). Selected Challenges in Grammar-Based Text Generation from the Semantic Web. In: Osipov, G., Panov, A., Yakovlev, K. (eds) Artificial Intelligence. Lecture Notes in Computer Science, vol 11866. Springer, Cham. https://doi.org/10.1007/978-3-030-33274-7_5

DOI: https://doi.org/10.1007/978-3-030-33274-7_5
Published: 14 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33273-0
Online ISBN: 978-3-030-33274-7
eBook Packages: Computer Science, Computer Science (R0)
