Selected Challenges in Grammar-Based Text Generation from the Semantic Web

  • Chapter
Artificial Intelligence

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11866)

Abstract

In this paper, drawing on the recent outcomes of two shared tasks on structured data verbalisation and examining one system in particular, we present evidence that grammar-based systems are particularly well suited to the verbalisation of structured data as found in the Semantic Web. We then outline possible future lines of research, centered on the FORGe system and the linguistic grounding of Semantic Web datasets.

Notes

  1. https://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData.

  2. http://lod-cloud.net/.

  3. https://wiki.dbpedia.org/.

  4. https://www.wikidata.org/wiki/Wikidata:Main_Page.

  5. This information appears in the infobox of the corresponding Wikipedia page: https://en.wikipedia.org/wiki/Arr%C3%B2s_negre.

  6. http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=45093.

  7. One of the first papers mentioning multiple datasets was published in 2012 [10].

  8. It took about two hours to adapt FORGe to a hundred new DBpedia properties.

  9. See also [6] for an overview of models to represent linked data and their issues.

References

  1. Androutsopoulos, I., Lampouras, G., Galanis, D.: Generating natural language descriptions from OWL ontologies: the NaturalOWL system. J. Artif. Intell. Res. 48, 671–715 (2013)

  2. Banerjee, S., Lavie, A.: METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pp. 65–72 (2005)

  3. Belz, A.: Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models. J. Nat. Lang. Eng. 14(4), 431–455 (2008)

  4. Belz, A., White, M., Espinosa, D., Kow, E., Hogan, D., Stent, A.: The first surface realisation shared task: overview and evaluation results. In: Proceedings of the Generation Challenges Session at the 13th European Workshop on Natural Language Generation (ENLG), Nancy, France, pp. 217–226 (2011)

  5. Bontcheva, K., Wilks, Y.: Automatic report generation from ontologies: The MIAKT approach. In: Meziane, F., Métais, E. (eds.) NLDB 2004. LNCS, vol. 3136, pp. 324–335. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27779-8_28

  6. Bosque-Gil, J., Gracia, J., Montiel-Ponsoda, E., Gómez-Pérez, A.: Models to represent linguistic linked data. Nat. Lang. Eng. 24(6), 811–859 (2018)

  7. Bouayad-Agha, N., Casamayor, G., Mille, S., Wanner, L.: Perspective-oriented generation of football match summaries: old tasks, new challenges. ACM Trans. Speech Lang. Process. 9(2), 3:1–3:31 (2012)

  8. Bouayad-Agha, N., Casamayor, G., Wanner, L.: Natural language generation in the context of the semantic web. Semant. Web 5(6), 493–513 (2014)

  9. Corcoglioniti, F., Rospocher, M., Aprosio, A.P., Tonelli, S.: PreMON: a lemon extension for exposing predicate models as linked data. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC), pp. 877–884 (2016)

  10. Dannélls, D., Damova, M., Enache, R., Chechev, M.: Multilingual online generation from semantic web ontologies. In: Proceedings of the 21st International Conference on World Wide Web, pp. 239–242. ACM (2012)

  11. Elder, H., Gehrmann, S., O’Connor, A., Liu, Q.: E2E NLG challenge submission: towards controllable generation of diverse natural language. In: Proceedings of the 11th International Conference on Natural Language Generation, pp. 457–462 (2018)

  12. Fillmore, C.J., Baker, C.F., Sato, H.: The FrameNet database and software tools. In: Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC), Las Palmas, Canary Islands, Spain, pp. 1157–1160 (2002)

  13. Galanis, D., Androutsopoulos, I.: Generating multilingual descriptions from linguistically annotated OWL ontologies: the NaturalOWL system. In: Proceedings of the Eleventh European Workshop on Natural Language Generation, pp. 143–146. Association for Computational Linguistics (2007)

  14. Gardent, C., Shimorina, A., Narayan, S., Perez-Beltrachini, L.: Creating training corpora for micro-planners. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Vancouver, Canada, August 2017

  15. Gardent, C., Shimorina, A., Narayan, S., Perez-Beltrachini, L.: The WebNLG challenge: generating text from RDF data. In: Proceedings of the 10th International Conference on Natural Language Generation, pp. 124–133 (2017)

  16. Gatt, A., Krahmer, E.: Survey of the state of the art in natural language generation: core tasks, applications and evaluation. J. Artif. Intell. Res. 61, 65–170 (2018)

  17. Höffner, K., Walter, S., Marx, E., Usbeck, R., Lehmann, J., Ngonga Ngomo, A.C.: Survey on challenges of question answering in the semantic web. Semant. Web 8(6), 895–920 (2017)

  18. Kingsbury, P., Palmer, M.: From TreeBank to PropBank. In: Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC), Las Palmas, Canary Islands, Spain, pp. 1989–1993 (2002)

  19. Kwok, C., Etzioni, O., Weld, D.S.: Scaling question answering to the web. ACM Trans. Inf. Syst. (TOIS) 19(3), 242–262 (2001)

  20. Lareau, F., Lambrey, F., Dubinskaite, I., Galarreta-Piquette, D., Nejat, M.: GenDR: a generic deep realizer with complex lexicalization. In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC), Miyazaki, Japan, pp. 3018–3025 (2018)

  21. Mel’čuk, I.: Dependency Syntax: Theory and Practice. State University of New York Press, Albany (1988)

  22. Meyers, A., et al.: The NomBank project: an interim report. In: Proceedings of the Workshop on Frontiers in Corpus Annotation, Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), Boston, MA, USA, pp. 24–31 (2004)

  23. Mille, S., Belz, A., Bohnet, B., Graham, Y., Pitler, E., Wanner, L.: The first multilingual surface realisation shared task (SR 2018): overview and evaluation results. In: Proceedings of the 1st Workshop on Multilingual Surface Realisation (MSR), 56th Annual Meeting of the Association for Computational Linguistics (ACL), Melbourne, Australia, pp. 1–12 (2018)

  24. Mille, S., Carlini, R., Burga, A., Wanner, L.: FORGe at SemEval-2017 task 9: deep sentence generation based on a sequence of graph transducers. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, Canada, pp. 917–920. Association for Computational Linguistics, August 2017. http://www.aclweb.org/anthology/S17-2158

  25. Mille, S., Wanner, L.: Towards large-coverage detailed lexical resources for data-to-text generation. In: Proceedings of the First International Workshop on Data-to-text Generation, Edinburgh, Scotland (2015)

  26. Nayak, N., Hakkani-Tür, D., Walker, M.A., Heck, L.P.: To plan or not to plan? Discourse planning in slot-value informed sequence to sequence models for language generation. In: Proceedings of INTERSPEECH, Stockholm, Sweden, pp. 3339–3343 (2017)

  27. Novikova, J., Dušek, O., Rieser, V.: The E2E dataset: new challenges for end-to-end generation. In: Proceedings of the 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Saarbrücken, Germany (2017). https://arxiv.org/abs/1706.09254, arXiv:1706.09254

  28. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 311–318. Association for Computational Linguistics (2002)

  29. Perez-Beltrachini, L., Gardent, C.: Learning embeddings to lexicalise RDF properties. In: *SEM 2016, The Fifth Joint Conference on Lexical and Computational Semantics, pp. 219–228 (2016)

  30. Rambow, O., Korelsky, T.: Applied text generation. In: Proceedings of the 3rd Conference on Applied Natural Language Processing (ANLP), Trento, Italy, pp. 40–47 (1992)

  31. Schuler, K.K.: VerbNet: a broad-coverage, comprehensive verb lexicon. Ph.D. thesis, University of Pennsylvania (2005)

  32. Shimorina, A., Gardent, C., Narayan, S., Perez-Beltrachini, L.: The WebNLG challenge: report on human evaluation. Technical report, Université de Lorraine, Nancy, France (2017)

  33. Stevens, R., Malone, J., Williams, S., Power, R., Third, A.: Automating generation of textual class definitions from OWL to English. J. Biomed. Semant. 2, S5 (2011). BioMed Central

  34. Walter, S., Unger, C., Cimiano, P.: M-ATOLL: a framework for the lexicalization of ontologies in multiple languages. In: Mika, P., et al. (eds.) ISWC 2014. LNCS, vol. 8796, pp. 472–486. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11964-9_30

  35. Wanner, L., Bohnet, B., Bouayad-Agha, N., Lareau, F., Nicklaß, D.: MARQUIS: generation of user-tailored multilingual air quality bulletins. Appl. Artif. Intell. 24(10), 914–952 (2010)

Acknowledgements

The work reported in this paper has been partly supported by the European Commission in the framework of the H2020 Programme under the contract numbers 700475-IA, 700024-RIA, 779962-RIA, 786731-RIA and 825079-ICT-STARTS.

Author information

Corresponding author

Correspondence to Simon Mille.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Mille, S. (2019). Selected Challenges in Grammar-Based Text Generation from the Semantic Web. In: Osipov, G., Panov, A., Yakovlev, K. (eds) Artificial Intelligence. Lecture Notes in Computer Science, vol 11866. Springer, Cham. https://doi.org/10.1007/978-3-030-33274-7_5

  • DOI: https://doi.org/10.1007/978-3-030-33274-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33273-0

  • Online ISBN: 978-3-030-33274-7

  • eBook Packages: Computer Science (R0)
