Using Special Text Points in the Recognition of Documents

Part of the Studies in Systems, Decision and Control book series (SSDC, volume 259)


The chapter develops the concept of a textual key point, a key point whose detector is an OCR system, and defines a descriptor for such points. It gives examples of document analysis algorithms built on textual key points and addresses three tasks: classification of recognized documents, localization of document images, and comparison of document images to find differences. Results of the algorithms are reported for data sets of documents of the Russian Federation. The proposed methods achieve high accuracy in the analysis of documents with complex structure when document images are entered into modern cyber-physical systems based on big data technologies.
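
To make the idea of a textual key point concrete, the following is a minimal sketch in Python, not the chapter's algorithm. It assumes a descriptor consisting of a recognized word and its bounding box, and it classifies a document by counting how many keywords of each template are matched by the recognized words, tolerating OCR errors through a string similarity ratio. The names `TextualKeyPoint`, `similar`, and `classify`, the 0.8 threshold, and the template keywords are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class TextualKeyPoint:
    """Hypothetical descriptor: a recognized word plus its bounding box."""
    text: str
    x: int
    y: int
    width: int
    height: int


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Compare two words case-insensitively, tolerating a few OCR errors."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def classify(points, templates):
    """Return (template name, score) for the template whose keywords are
    best covered by the recognized words; score is the matched fraction."""
    best_name, best_score = None, 0.0
    for name, keywords in templates.items():
        hits = sum(any(similar(p.text, k) for p in points) for k in keywords)
        score = hits / len(keywords)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Assumed example data: words as an OCR engine might return them,
# including one misrecognized character ("0" instead of "O").
points = [
    TextualKeyPoint("PASSP0RT", 120, 40, 210, 32),
    TextualKeyPoint("Surname", 60, 110, 90, 20),
    TextualKeyPoint("Date", 60, 170, 50, 20),
]
templates = {
    "passport": ["passport", "surname", "nationality"],
    "invoice": ["invoice", "total", "amount"],
}

label, score = classify(points, templates)
print(label, round(score, 2))
```

The bounding boxes are not used by this classifier. They are kept in the descriptor because tasks such as localization and image comparison, which the chapter also addresses, would need the positions of the matched words.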


Character recognition; Key point; Textual key point; Document classification; Document localization


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. FRC “Computer Science and Control” RAS, Moscow, Russia
