Estimating Digitization Costs in Digital Libraries Using DiCoMo

  • Conference paper
Research and Advanced Technology for Digital Libraries (ECDL 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6273)

Abstract

Estimating digitization costs is a very difficult task: the large number of unknown factors makes exact predictions hard. However, digitization projects need a precise idea of the economic costs and the time involved in producing their content. The common practice when starting to digitize a new collection is to set a schedule, and to make a firm commitment to meet it (in terms of both cost and deadlines), even before the actual digitization work starts. As with software development projects, incorrect estimates produce delays and cause cost overruns.

Based on methods used in Software Engineering for software development cost prediction, such as COCOMO and Function Points, and using historical data gathered over five years at the Miguel de Cervantes Digital Library during the digitization of more than 12,000 books, we have developed a method for time and cost estimation named DiCoMo (Digitization Costs Model) for digital content production in general. This method can be adapted to different production processes, such as the production of digital XML or HTML texts using scanning and OCR followed by human proofreading and error correction, or the production of digital facsimiles (scanning without OCR). The accuracy of the estimates improves with time, since the algorithms can be optimized by making adjustments based on historical data gathered from previous tasks.
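
The DiCoMo equations themselves are not given in this abstract. As a rough illustration of the COCOMO-style idea it describes, the sketch below assumes a power-law model, effort = a * size^b, and calibrates a and b from historical records by least squares in log-log space. The function names, the choice of page count as the size measure, and the sample figures are all hypothetical and are not taken from the paper.

```python
import math


def calibrate(history):
    """Fit effort = a * size**b by ordinary least squares on log-log data.

    history: list of (size, effort) pairs from completed tasks, all > 0.
    Returns the pair (a, b). This is an assumed model form, not DiCoMo itself.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical records")
    xs = [math.log(size) for size, _ in history]
    ys = [math.log(effort) for _, effort in history]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log space = log(a)
    return a, b


def estimate_effort(size, a, b):
    """Predict effort for a new task of the given size."""
    return a * size ** b


if __name__ == "__main__":
    # Made-up history for illustration only: (pages, person-hours).
    history = [(100, 12.0), (250, 33.0), (400, 55.0), (800, 118.0)]
    a, b = calibrate(history)
    print(f"a={a:.3f}, b={b:.3f}")
    print(f"estimated hours for 500 pages: {estimate_effort(500, a, b):.1f}")
```

Re-running `calibrate` as each task finishes and its record is appended to the history is one simple way to obtain the kind of improvement over time that the abstract mentions.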

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bia, A., Muñoz, R., Gómez, J. (2010). Estimating Digitization Costs in Digital Libraries Using DiCoMo. In: Lalmas, M., Jose, J., Rauber, A., Sebastiani, F., Frommholz, I. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2010. Lecture Notes in Computer Science, vol 6273. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15464-5_15

  • DOI: https://doi.org/10.1007/978-3-642-15464-5_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15463-8

  • Online ISBN: 978-3-642-15464-5

  • eBook Packages: Computer Science, Computer Science (R0)
