AIR: A Semi-Automatic System for Archiving Institutional Repositories

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5723)

Abstract

Populating institutional repositories with citation data by hand is extremely time- and resource-consuming, and this cost is a bottleneck on the growth and updating of large repositories. This paper describes the AIR system, developed at the University of Wolverhampton to address this problem. The system implements a semi-automatic approach to archiving institutional repositories: first, it automatically discovers and extracts bibliographical data from the university web site; second, it interacts with users (authors or librarians), who verify and correct the extracted data. The system is integrated with the Wolverhampton Intellectual Repository and E-theses (WIRE), which was designed on the basis of standard software adopted by many UK universities. We demonstrate that the system can considerably increase the intake of new publication data into an institutional repository without compromising quality.
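
The two-stage workflow described in the abstract (automatic extraction followed by human verification) can be illustrated with a minimal Python sketch. It is not the AIR implementation: the `Record` type, the `CITATION` pattern, and the `extract_candidates` and `review` functions are hypothetical names introduced here, and the regular expression is a toy stand-in for whatever extraction method the system actually uses.

```python
import re
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical bibliographic record; fields are illustrative only."""
    authors: str
    year: int
    title: str
    verified: bool = False


# Toy pattern for lines shaped like "Authors (YYYY). Title. ..."
# (an assumption for this sketch, not the system's extraction method).
CITATION = re.compile(
    r"^(?P<authors>[^()]+?)\s*\((?P<year>\d{4})\)\.\s*(?P<title>[^.]+)\."
)


def extract_candidates(lines: List[str]) -> List[Record]:
    """Stage 1: automatically pull citation-like records out of page text."""
    records = []
    for line in lines:
        match = CITATION.match(line.strip())
        if match:
            records.append(
                Record(
                    authors=match["authors"].strip(),
                    year=int(match["year"]),
                    title=match["title"].strip(),
                )
            )
    return records


def review(
    candidates: List[Record],
    reviewer: Callable[[Record], Optional[Record]],
) -> List[Record]:
    """Stage 2: a human (author or librarian) confirms, corrects, or rejects.

    The reviewer returns the record (possibly corrected) to accept it,
    or None to reject it. Only accepted records are marked verified.
    """
    accepted = []
    for record in candidates:
        checked = reviewer(record)
        if checked is not None:
            accepted.append(replace(checked, verified=True))
    return accepted


page_lines = [
    "Ponomareva, N., Gomez, J.M., Pekar, V. (2010). AIR: A Semi-Automatic "
    "System for Archiving Institutional Repositories. In: NLDB 2009.",
    "Contact the library for more information",
]
candidates = extract_candidates(page_lines)
accepted = review(candidates, lambda record: record)  # reviewer accepts as-is
```

Keeping the reviewer as a callback separates the automatic stage from the human one, so nothing enters the accepted list without passing through a verification step.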




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ponomareva, N., Gomez, J.M., Pekar, V. (2010). AIR: A Semi-Automatic System for Archiving Institutional Repositories. In: Horacek, H., Métais, E., Muñoz, R., Wolska, M. (eds) Natural Language Processing and Information Systems. NLDB 2009. Lecture Notes in Computer Science, vol 5723. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12550-8_14

  • DOI: https://doi.org/10.1007/978-3-642-12550-8_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12549-2

  • Online ISBN: 978-3-642-12550-8

  • eBook Packages: Computer Science, Computer Science (R0)
