Abstract
This paper describes the system architecture for generating the History of the Law, developed for the Chilean National Library of Congress (BCN). The production system uses Semantic Web technologies, Akoma-Ntoso, and tools that automate the markup of plain text into XML, enriching and linking documents. These semantically annotated documents make it possible to develop specialized political and legislative services and to extract knowledge for a Legal Knowledge Base for public use. We present the strategies used to implement the automatic markup tools and describe the knowledge graph generated from the semantic documents. Finally, we compare document processing time using semantic technologies against manual processing, and report the lessons learnt in the process, laying a foundation for replicating a technological model that enables useful services in diverse contexts.
Notes
- 9. A test tool of the Automatic XML Marker can be found at http://bcn.cl/28n7h.
Acknowledgements
We wish to thank David Vilches, Eridan Otto, and Christian Sifaqui for their contributions to the development of the HL project, which was funded by the Library of Congress of Chile. The research activities described here were partially funded by the Spanish Ministry of Economy and Competitiveness (Society challenges: TIN2017-88877-R).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Cifuentes-Silva, F., Labra Gayo, J.E. (2019). Legislative Document Content Extraction Based on Semantic Web Technologies. In: Hitzler, P., et al. The Semantic Web. ESWC 2019. Lecture Notes in Computer Science, vol 11503. Springer, Cham. https://doi.org/10.1007/978-3-030-21348-0_36
DOI: https://doi.org/10.1007/978-3-030-21348-0_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21347-3
Online ISBN: 978-3-030-21348-0
eBook Packages: Computer Science (R0)