Legislative Document Content Extraction Based on Semantic Web Technologies

Cifuentes-Silva, Francisco; Labra Gayo, Jose Emilio

doi:10.1007/978-3-030-21348-0_36

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11503)

Included in the following conference series:

European Semantic Web Conference

Abstract

This paper describes the system architecture for generating the History of the Law, developed for the Chilean National Library of Congress (BCN). The production system uses Semantic Web technologies, Akoma-Ntoso, and tools that automate the markup of plain text into XML, enriching and linking documents. These semantically annotated documents make it possible to develop specialized political and legislative services and to extract knowledge for a Legal Knowledge Base for public use. We present the strategies used to implement the automatic markup tools and describe the knowledge graph generated from the semantic documents. Finally, we compare document processing times using semantic technologies against manual processing and discuss the lessons learnt in the process, laying a foundation for replicating a technological model that enables useful services in diverse contexts.

Notes

1. http://docs.oasis-open.org/legaldocml/ns/akn/3.0
2. https://code.google.com/archive/p/weso-desh/
3. http://purl.org/vocab/frbr/core
4. http://www.geonames.org/ontology
5. http://www.w3.org/2003/01/geo/wgs84_pos
6. http://datos.bcn.cl/sparql
7. https://www.leychile.cl
8. https://datos.bcn.cl/es/ontologias
9. A test tool of the Automatic XML Marker can be found in http://bcn.cl/28n7h
10. https://nlp.stanford.edu/software/CRF-NER.shtml
11. https://spacy.io/usage/linguistic-features#section-named-entities
12. https://opennlp.apache.org
13. http://lime.cirsfid.unibo.it
14. https://xcential.com/legispro-xml-tech/
15. https://at4am.eu
16. https://github.com/bungeni-org
17. http://www.ittig.cnr.it/lab/xmlegeseditor
18. https://ec.europa.eu/isa2/solutions/leos
19. https://www.bcn.cl/historiadelaley
20. https://www.bcn.cl/laborparlamentaria
21. https://www.bcn.cl/presupuesto

Acknowledgements

We wish to thank David Vilches, Eridan Otto, and Christian Sifaqui for their contribution to the development of the HL project, which was funded by the Library of Congress of Chile. The described research activities were partially funded by the Spanish Ministry of Economy and Competitiveness (Society challenges: TIN2017-88877-R).

Author information

Authors and Affiliations

Department of Computer Science, University of Oviedo, Oviedo, Asturias, Spain
Francisco Cifuentes-Silva & Jose Emilio Labra Gayo
Biblioteca del Congreso Nacional de Chile - BCN, Valparaíso, Chile
Francisco Cifuentes-Silva

Corresponding authors

Correspondence to Francisco Cifuentes-Silva or Jose Emilio Labra Gayo.

Editor information

Editors and Affiliations

Wright State University, Dayton, OH, USA
Pascal Hitzler
KMi, The Open University, Milton Keynes, UK
Miriam Fernández
University of California, Santa Barbara, CA, USA
Krzysztof Janowicz
Maastricht University, Maastricht, The Netherlands
Amrapali Zaveri
Heriot-Watt University, Edinburgh, UK
Alasdair J.G. Gray
IBM Research, Dublin, Ireland
Vanessa Lopez
The Australian National University, Canberra, ACT, Australia
Armin Haller
Jönköping University, Jönköping, Sweden
Karl Hammar

Cite this paper

Cifuentes-Silva, F., Labra Gayo, J.E. (2019). Legislative Document Content Extraction Based on Semantic Web Technologies. In: Hitzler, P., et al. (eds.) The Semantic Web. ESWC 2019. Lecture Notes in Computer Science, vol 11503. Springer, Cham. https://doi.org/10.1007/978-3-030-21348-0_36

DOI: https://doi.org/10.1007/978-3-030-21348-0_36
Published: 25 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21347-3
Online ISBN: 978-3-030-21348-0
eBook Packages: Computer Science, Computer Science (R0)
