A Framework for Automatic Population of Ontology-Based Digital Libraries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10037)

Abstract

Maintaining up-to-date ontology-based digital libraries raises two main issues. First, documents are often unstructured and stored in heterogeneous data formats, which makes it difficult to extract information from them and to search them. Second, manual ontology population is time-consuming, so automatic methods to support this process are needed.

In this paper, we present an ontology-based framework for populating ontologies. In particular, we propose an approach for extracting triplets from heterogeneous, unstructured documents in order to automatically populate ontology-based digital libraries. Finally, we evaluate the proposed framework on a real-world case study.
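As a rough illustration of what populating an ontology from extracted triplets can look like, the sketch below serializes (subject, predicate, object) tuples as N-Triples statements. It is not the authors' implementation: the namespace `BASE`, the helper names `to_iri` and `triplets_to_ntriples`, and the sample triplet are all hypothetical, and the extraction step itself is assumed to have already happened.

```python
from urllib.parse import quote

# Hypothetical namespace for illustration only; it does not come from the paper.
BASE = "http://example.org/library#"


def to_iri(label: str) -> str:
    """Turn a free-text label into an IRI under the hypothetical BASE namespace."""
    return "<" + BASE + quote(label.strip().replace(" ", "_")) + ">"


def triplets_to_ntriples(triplets):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    return [f"{to_iri(s)} {to_iri(p)} {to_iri(o)} ." for s, p, o in triplets]


# A hand-written, made-up triplet standing in for the output of an extraction step.
extracted = [("letter 42", "sent by", "Mario Rossi")]
for line in triplets_to_ntriples(extracted):
    print(line)
```

The resulting lines could then be loaded into any RDF store; a real pipeline would map predicates to properties already defined in the target ontology rather than minting new IRIs from raw text.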

Notes

  1. http://www-nlpir.nist.gov/related_projects/muc/.

  2. http://stardog.com.

  3. We used OpenNLP default parameters, namely cutoff frequencies set to 5, number of iterations set to 100.

Acknowledgments

The authors would like to thank Dr. Anastasia Di Nunzio for helpful discussion on linguistic typology studies.

Author information

Corresponding author

Correspondence to Laura Pandolfo.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Pandolfo, L., Pulina, L., Adorni, G. (2016). A Framework for Automatic Population of Ontology-Based Digital Libraries. In: Adorni, G., Cagnoni, S., Gori, M., Maratea, M. (eds) AI*IA 2016 Advances in Artificial Intelligence. AI*IA 2016. Lecture Notes in Computer Science, vol 10037. Springer, Cham. https://doi.org/10.1007/978-3-319-49130-1_30

  • DOI: https://doi.org/10.1007/978-3-319-49130-1_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49129-5

  • Online ISBN: 978-3-319-49130-1

  • eBook Packages: Computer Science (R0)
