PACE: A General-Purpose Tool for Authority Control

  • Paolo Manghi
  • Marko Mikulicic
Conference paper

DOI: 10.1007/978-3-642-24731-6_8

Volume 240 of the book series Communications in Computer and Information Science (CCIS)
Cite this paper as:
Manghi P., Mikulicic M. (2011) PACE: A General-Purpose Tool for Authority Control. In: García-Barriocanal E., Cebeci Z., Okur M.C., Öztürk A. (eds) Metadata and Semantic Research. MTSR 2011. Communications in Computer and Information Science, vol 240. Springer, Berlin, Heidelberg

Abstract

Curating the records of an authority file is an activity as important as it is demanding for many organizations, which have to rely on experts equipped with so-called authority control tools, capable of automatically supporting complex disambiguation workflows through user-friendly interfaces. This paper presents PACE, an open source authority control tool which offers user interfaces for (i) customizing the structure (ontology) of authority files, (ii) tuning the probabilistic disambiguation of authority files through a set of similarity functions that detect records that are candidates for duplication and overload, (iii) curating such authority files by applying record merge and split actions, and (iv) exposing authority files to third-party consumers in several ways. PACE’s back-end is based on Cassandra’s “NoSQL” technology to offer (i) read-write performance that scales linearly with the number of records and (ii) parallel and efficient (MapReduce-based) record sorting and matching algorithms.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Paolo Manghi
    • 1
  • Marko Mikulicic
    • 1
  1. Istituto di Scienza e Tecnologie dell’Informazione “Alessandro Faedo”, Consiglio Nazionale delle Ricerche, Pisa, Italy