Legislative Document Content Extraction Based on Semantic Web Technologies

A Use Case About Processing the History of the Law
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11503)


This paper describes the system architecture for generating the History of the Law, developed for the Chilean National Library of Congress (BCN). The production system uses Semantic Web technologies, Akoma-Ntoso, and tools that automate the markup of plain text into XML while enriching and linking documents. These semantically annotated documents make it possible to develop specialized political and legislative services and to extract knowledge for a Legal Knowledge Base for public use. We present the strategies used to implement the automatic markup tools and describe the knowledge graph generated from the semantic documents. Finally, we compare document processing time using semantic technologies against manual processing and discuss the lessons learnt, laying the groundwork for replicating a technological model that enables useful services in diverse contexts.


Linked Open Data, Legal information systems, Legal domain, Legal Knowledge Base, Automatic markup, Semantic Web



We wish to thank David Vilches, Eridan Otto, and Christian Sifaqui for their contributions to the development of the HL project, which was funded by the Library of Congress of Chile. The research activities described here were partially funded by the Spanish Ministry of Economy and Competitiveness (Society challenges: TIN2017-88877-R).



Copyright information

© Springer Nature Switzerland AG 2019

Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Department of Computer Science, University of Oviedo, Oviedo, Spain
  2. Biblioteca del Congreso Nacional de Chile - BCN, Valparaíso, Chile
