Chapter

Approaches to Legal Ontologies

Volume 1 of the Law, Governance and Technology Series, pp 179-200

From Thesaurus Towards Ontologies in Large Legal Databases

  • Ángel Sancho Ferrer (corresponding author), Research and Development Department, Wolters Kluwer Spain
  • Carlos Fernández Hernández, Research and Development Department, Wolters Kluwer Spain
  • José Manuel Mateo Rivero, Research and Development Department, Wolters Kluwer Spain

Abstract

We are in the middle of a historical paradigm shift. It is a change similar in scale to those confronting the Library of Alexandria, twenty-two centuries ago. Metadata, indexes and taxonomies were the paradigm during the age of paper and print, and librarians and publishers leveraged them for searching. Now the number of documents has grown to levels that make those traditional tools less efficient for users and less affordable for publishers. In the last three decades, however, search technologies have created new solutions such as direct queries, relevance ranking and faceted results, as well as the promise of conceptual search engines and ontologies. Yet this integration of legal knowledge has not proven scalable in large databases: the improvements in recall have a negative effect on precision and performance. We have focused on one key behavior of legal experts in legal searches: the creation of “better queries” as a result of their knowledge of the domain and of search techniques. The same thing happens in classical taxonomy-based searches, but in full-text search we can try to encode part of that knowledge in the search engine itself. To achieve this goal, we have developed both the technology to semantically analyze documents and queries, and a methodology to fill a dictionary with 10,000 concepts and 40,000 expressions. This has been put into production with a database of 3 million legal documents. In addition to the semantic improvements, these developments have produced significant improvements in the relevance algorithm and complementary tools such as dynamic summaries and query reformulation through local context analysis.
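
To make the idea of a dictionary of concepts and expressions concrete, the following is a minimal sketch in Python. It is not the authors' implementation, which the abstract does not describe. The concept identifiers, the sample expressions and the function names (build_index, find_concepts, expand_query) are all hypothetical and chosen only for illustration. The sketch assumes that each concept groups several equivalent expressions, and that a query mentioning one expression can be reformulated with the others.

```python
import re

# Hypothetical sample dictionary: concept id -> expressions that denote it.
# The system described in the abstract holds about 10,000 concepts and 40,000 expressions.
CONCEPTS = {
    "unfair_dismissal": ["unfair dismissal", "wrongful termination", "unjustified dismissal"],
    "lease": ["lease", "tenancy agreement", "rental contract"],
}


def build_index(concepts):
    """Map each lowercased expression to the id of its concept."""
    return {expr.lower(): cid for cid, exprs in concepts.items() for expr in exprs}


def find_concepts(text, index):
    """Return the ids of the concepts whose expressions occur in text.

    Longer expressions are tried first, and matched text is blanked out,
    so a short expression cannot match inside a longer one already found.
    """
    found = []
    remaining = text.lower()
    for expr in sorted(index, key=len, reverse=True):
        pattern = r"\b" + re.escape(expr) + r"\b"
        if re.search(pattern, remaining):
            cid = index[expr]
            if cid not in found:
                found.append(cid)
            remaining = re.sub(pattern, " ", remaining)
    return found


def expand_query(query, concepts=CONCEPTS):
    """Return, for each concept detected in the query, all of its expressions."""
    index = build_index(concepts)
    return {cid: list(concepts[cid]) for cid in find_concepts(query, index)}


if __name__ == "__main__":
    print(expand_query("compensation for wrongful termination"))
```

In this sketch a query that mentions "wrongful termination" is linked to the hypothetical concept unfair_dismissal, so a search engine could also retrieve documents that use "unfair dismissal" or "unjustified dismissal". Matching on word boundaries keeps "lease" from firing inside "release", a small example of the precision concern the abstract raises.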