Generation and Use of a Digest System by Integrating OCR and Smart Searches

  • Germán Cáseres
  • Lisandro Delía
  • Pablo Thomas
  • Verónica Aguirre
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 790)

Abstract

A digest can be defined as a repository of regulations that organizations handle over extended periods of time. Searching for information in these repositories can be tedious without the assistance of an ad hoc software application. This work presents the development of a Digest Software System, its architecture, and its integration with other base tools. Finally, two case studies in which the developed product is used are presented.

Keywords

Digest, Full text search, OCR, Solr, Tika

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Germán Cáseres (1, 2)
  • Lisandro Delía (1, 2)
  • Pablo Thomas (1, 2)
  • Verónica Aguirre (1, 2)
  1. Instituto de Investigación en Informática III-LIDI, Universidad Nacional de La Plata, La Plata, Argentina
  2. Centro Asociado Comisión de Investigaciones Científicas de la Provincia de Buenos Aires, La Plata, Argentina