Image indexing and content analysis in children’s picture books using a large-scale database

Huang, Chengwei; Jiang, Hao

doi:10.1007/s11042-019-7440-8

Published: 05 March 2019

Volume 78, pages 20679–20695, (2019)

Multimedia Tools and Applications

Abstract

In this paper we introduce a visual database for children’s picture books and present an intelligent robot trained on this database. First, we build a large-scale image dataset containing image samples of book pages, which can be used to evaluate image indexing and content recognition algorithms. Second, we study state-of-the-art algorithms for image matching and object recognition, and compare several approaches in terms of computational efficiency and recognition accuracy. To improve speed, we propose a novel hierarchical algorithm for fast search. Finally, using this large-scale database, we build a robot that can read children’s picture books, and we present initial experimental results. The results suggest that both the training database and the algorithms are promising, yet open challenges remain concerning cost and robustness.
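
The abstract does not describe how the hierarchical search works, so the following is only a generic illustration of the coarse-to-fine idea, not the authors' algorithm. It assumes each page is already represented by a fixed-length descriptor vector and that pages are grouped by book. The names `build_index`, `search`, `pages`, and `book_of`, and the toy two-dimensional descriptors, are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of descriptors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_index(pages: Dict[str, Vector], book_of: Dict[str, str]):
    """Group page descriptors by book and keep one centroid per book."""
    groups: Dict[str, List[Tuple[str, Vector]]] = {}
    for page_id, vec in pages.items():
        groups.setdefault(book_of[page_id], []).append((page_id, vec))
    centroids = {
        book: centroid([vec for _, vec in members])
        for book, members in groups.items()
    }
    return centroids, groups


def search(query: Vector, centroids, groups, top_books: int = 1) -> str:
    """Coarse step: rank books by centroid distance.
    Fine step: exhaustive search only inside the best-ranked books."""
    ranked = sorted(centroids, key=lambda book: distance(query, centroids[book]))
    candidates = [member for book in ranked[:top_books] for member in groups[book]]
    best_page, _ = min(candidates, key=lambda member: distance(query, member[1]))
    return best_page


# Toy data (hypothetical): two books with two pages each.
pages = {
    "a1": [0.0, 0.0], "a2": [0.0, 1.0],
    "b1": [10.0, 10.0], "b2": [10.0, 11.0],
}
book_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
centroids, groups = build_index(pages, book_of)
match = search([9.8, 10.9], centroids, groups)
```

With this layout the fine step compares the query against the pages of the selected books only, which is where a speed-up over a flat scan would come from. Raising `top_books` trades speed for robustness when the coarse step picks the wrong book.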

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable advice, which greatly improved this paper, and Dr. Jingjie Yan for his help with data collection, technical writing, and experiment set-up.

Author information

Authors and Affiliations

Fandou Information Technology Co. Ltd., Shenzhen, China
Chengwei Huang & Hao Jiang

Corresponding author

Correspondence to Chengwei Huang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, C., Jiang, H. Image indexing and content analysis in children’s picture books using a large-scale database. Multimed Tools Appl 78, 20679–20695 (2019). https://doi.org/10.1007/s11042-019-7440-8

Received: 19 July 2018
Revised: 18 December 2018
Accepted: 27 February 2019
Published: 05 March 2019
Issue Date: 15 August 2019
DOI: https://doi.org/10.1007/s11042-019-7440-8
