MiikeMineStamps: A Long-Tailed Dataset of Japanese Stamps via Active Learning

Buitrago, Paola A.; Toropov, Evgeny; Prabha, Rajanie; Uran, Julian; Adal, Raja

doi:10.1007/978-3-030-86334-0_1

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12823)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

Mining existing image datasets with rich information can help advance knowledge across domains in the humanities and social sciences. In the past, the extraction of this information was often prohibitively expensive and labor-intensive. AI can provide an alternative, making it possible to speed up the labeling and mining of large and specialized datasets via a human-in-the-loop method of active learning (AL). Although AL methods are helpful for certain scenarios, they present limitations when the set of classes is not known before labeling (i.e., an open-ended set) and the distribution of objects across classes is highly unbalanced (i.e., a long-tailed distribution). To address these limitations in object detection scenarios, we propose a multi-step approach consisting of 1) object detection of a generic “object” class, and 2) image classification with an open class set and a long-tailed distribution. We apply our approach to recognizing stamps in a large compendium of historical documents from the Japanese company Mitsui Mi’ike Mine, one of the largest business archives in modern Japan, which spans half a century, includes tens of thousands of documents, and has been widely used by labor historians, business historians, and others. To test our approach, we produce and make publicly available the novel and expert-curated MiikeMineStamps dataset. This dataset consists of 5056 images of 405 different Japanese stamps and is, to the best of our knowledge, the first published dataset of historical Japanese stamps. We hope that the MiikeMineStamps dataset will become a useful tool to further explore the application of AI methods to the study of historical documents in Japan and throughout the world of Chinese characters, as well as serve as a benchmark for image classification algorithms with an open-ended and highly unbalanced class set.
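
The second step of the approach (classification with an open class set, with a human expert in the loop) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifier, so a simple nearest-centroid rule with a distance threshold is assumed here purely for illustration. The function names, the toy two-dimensional feature vectors, the `max_dist` threshold, and the labeling `budget` are all hypothetical. The sketch also assumes that step 1 (detection of a generic "object" class) has already produced one feature vector per cropped stamp.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_prototypes(labeled):
    """Map each known class name to the centroid of its labeled feature vectors."""
    by_class = defaultdict(list)
    for vec, name in labeled:
        by_class[name].append(vec)
    return {name: centroid(vecs) for name, vecs in by_class.items()}


def classify_open_set(vec, prototypes, max_dist):
    """Return (class_name, distance); class_name is None if no known class is close enough."""
    name, dist = min(
        ((n, math.dist(vec, p)) for n, p in prototypes.items()),
        key=lambda pair: pair[1],
    )
    return (name, dist) if dist <= max_dist else (None, dist)


def select_for_labeling(unlabeled, prototypes, max_dist, budget):
    """Indices of up to `budget` crops that look least like any known class.

    These are the crops a human expert would be asked to label next,
    possibly creating a new class (the open-ended part of the problem).
    """
    unknown = []
    for i, vec in enumerate(unlabeled):
        name, dist = classify_open_set(vec, prototypes, max_dist)
        if name is None:
            unknown.append((dist, i))
    return [i for _, i in sorted(unknown, reverse=True)[:budget]]


# Toy data (hypothetical): two known stamp classes, one of them with a single example.
labeled = [([0.0, 0.0], "stamp_a"), ([0.2, 0.0], "stamp_a"), ([5.0, 5.0], "stamp_b")]
unlabeled = [[0.1, 0.1], [5.1, 4.9], [10.0, 0.0], [0.0, 9.0]]

prototypes = build_prototypes(labeled)
queries = select_for_labeling(unlabeled, prototypes, max_dist=1.0, budget=1)
print(queries)  # [2]
```

In this toy run the first two crops fall within the threshold of a known class, while the last two do not; the one farthest from every prototype is queried first. In a long-tailed setting, most rare stamps would initially fall into this "unknown" group, which is why the choice of threshold and labeling budget matters in practice.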

References

1. Aghdam, H.H., González-García, A., van de Weijer, J., López, A.M.: Active learning for deep detection neural networks. In: ICCV, pp. 3671–3679 (2019)
2. Beluch, W.H., Genewein, T., Nurnberger, A., Kohler, J.M.: The power of ensembles for active learning in image classification. In: CVPR, pp. 9368–9377 (2018). https://doi.org/10.1109/CVPR.2018.00976
3. Buitrago, P.A., Nystrom, N.A.: Neocortex and bridges-2: a high performance AI+HPC ecosystem for science, discovery, and societal good. In: Nesmachnow, S., Castro, H., Tchernykh, A. (eds.) High Performance Computing, pp. 205–219. Springer International Publishing, Cham (2021)
4. Clanuwat, T., Lamb, A., Kitamoto, A.: KuroNet: pre-modern Japanese Kuzushiji character recognition with deep learning. In: ICDAR, pp. 607–614 (2019)
5. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
6. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. IJCV 88(2), 303–338 (2010)
7. Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: Precup, D., Teh, Y.W. (eds.) ICML, vol. 70, pp. 1126–1135 (2017)
8. Gal, Y., Islam, R., Ghahramani, Z.: Deep Bayesian active learning with image data. ICML 70, 1183–1192 (2017)
9. Geifman, Y., El-Yaniv, R.: Deep active learning over the long tail (2017)
10. Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. CalTech Report, March 2007
11. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
12. Kao, C.-C., Lee, T.-Y., Sen, P., Liu, M.-Y.: Localization-aware active learning for object detection. In: Jawahar, C.V., Li, H., Mori, G., Schindler, K. (eds.) ACCV 2018. LNCS, vol. 11366, pp. 506–522. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20876-9_32
13. Krishna, R., et al.: The visual genome dataset v1.0 + v1.2 images. https://visualgenome.org/
14. Krishnamurthy, A., Agarwal, A., Huang, T.K., Daume, H., III., Langford, J.: Active learning for cost-sensitive classification. JMLR 20(65), 1–50 (2019)
15. Krizhevsky, A., Nair, V., Hinton, G.: CIFAR-100 (Canadian Institute for Advanced Research)
16. Lin, T., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: ICCV, pp. 2999–3007 (2017)
17. Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
18. Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: ICCV (2015)
19. Liu, Z., Miao, Z., Zhan, X., Wang, J., Gong, B., Yu, S.X.: Large-scale long-tailed recognition in an open world. In: CVPR (2019)
20. Nystrom, N.A., Levine, M.J., Roskies, R.Z., Scott, J.R.: Bridges: a uniquely flexible HPC resource for new communities and data analytics. In: XSEDE 2015: Scientific Advancements Enabled by Enhanced Cyberinfrastructure (2015). https://doi.org/10.1145/2792745.2792775
21. Qu, Z., Du, J., Cao, Y., Guan, Q., Zhao, P.: Deep active learning for remote sensing object detection (2020)
22. Roy, S., Unmesh, A., Namboodiri, V.: Deep active learning for object detection. In: BMVC (2019)
23. Russell, B., Torralba, A., Murphy, K., Freeman, W.: LabelMe: a database and web-based tool for image annotation. Int. J. Comput. Vis. 77, 157–173 (2008)
24. Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. CoRR abs/1503.03832 (2015)
25. Sener, O., Savarese, S.: Active learning for convolutional neural networks: a core-set approach. In: ICLR (2018)
26. Sinha, S., Ebrahimi, S., Darrell, T.: Variational adversarial active learning. In: ICCV, pp. 5971–5980 (2019). https://doi.org/10.1109/ICCV.2019.00607
27. Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. NIPS 30, 4077–4087 (2017)
28. Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: Deepface: closing the gap to human-level performance in face verification. In: CVPR, pp. 1701–1708 (2014). https://doi.org/10.1109/CVPR.2014.220
29. Toropov, E., Buitrago, P.A., Moura, J.M.F.: Shuffler: A large scale data management tool for machine learning in computer vision. In: PEARC (2019)
30. Towns, J., Cockerill, T., Dahan, M., Foster, I., Gaither, K., Grimshaw, A., Hazlewood, V., Lathrop, S., Lifka, D., Peterson, G.D., Roskies, R., Scott, J., Wilkins-Diehr, N.: XSEDE: accelerating scientific discovery. Comput. Sci. Eng. 16(05), 62–74 (2014). https://doi.org/10.1109/MCSE.2014.80
31. Villalonga, G., Lopez, A.M.: Co-training for on-board deep object detection (2020)
32. Vinyals, O., Blundell, C., Lillicrap, T., Kavukcuoglu, K., Wierstra, D.: Matching networks for one shot learning. In: Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., Garnett, R. (eds.) NIPS, vol. 29, pp. 3630–3638 (2016)
33. Wang, K., Zhang, D., Li, Y., Zhang, R., Lin, L.: Cost-effective active learning for deep image classification. IEEE Trans. Circ. Syst. Video Technol. 27(12), 2591–2600 (2017). https://doi.org/10.1109/TCSVT.2016.2589879
34. Wang, Y., Yao, Q., Kwok, J., Ni, L.: Few-shot learning: a survey. arXiv preprint arXiv:1904.05046 (2019)
35. Xia, G., et al.: DOTA: a large-scale dataset for object detection in aerial images. In: CVPR, pp. 3974–3983 (2018). https://doi.org/10.1109/CVPR.2018.00418
36. Xu, H., Gao, Y., Yu, F., Darrell, T.: End-to-end learning of driving models from large-scale video datasets. In: CVPR, pp. 3530–3538 (2017)
37. Yoo, D., Kweon, I.S.: Learning loss for active learning. In: CVPR, pp. 93–102 (2019). https://doi.org/10.1109/CVPR.2019.00018
38. Zhang, S., Benenson, R., Schiele, B.: CityPersons: a diverse dataset for pedestrian detection. In: CVPR, pp. 4457–4465 (2017). https://doi.org/10.1109/CVPR.2017.474

Acknowledgements

This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. Specifically, it used the Bridges and Bridges-2 systems, which are supported by NSF award numbers ACI-1445606 and ACI-1928147, at the Pittsburgh Supercomputing Center (PSC) [3, 20, 30]. The work was made possible through the XSEDE Extended Collaborative Support Service (ECSS) program.

We are grateful to the Mitsui Archives for giving us permission to reproduce their documents and publish the stamps.

Finally, this work would not have been possible without the expert labeling and assistance of Ms. Mieko Ueda.

Author information

Authors and Affiliations

Pittsburgh Supercomputing Center, Pittsburgh, PA, 15203, USA
Paola A. Buitrago, Rajanie Prabha & Julian Uran
Carnegie Mellon University, Pittsburgh, PA, 15203, USA
Paola A. Buitrago, Rajanie Prabha & Julian Uran
DeepMap Inc., East Palo Alto, CA, 94303, USA
Evgeny Toropov
University of Pittsburgh, Pittsburgh, PA, 15260, USA
Raja Adal

Corresponding author

Correspondence to Evgeny Toropov.

Editor information

Editors and Affiliations

Universitat Autònoma de Barcelona, Barcelona, Spain
Josep Lladós
Lehigh University, Bethlehem, PA, USA
Daniel Lopresti
Kyushu University, Fukuoka-shi, Japan
Seiichi Uchida

About this paper

Cite this paper

Buitrago, P.A., Toropov, E., Prabha, R., Uran, J., Adal, R. (2021). MiikeMineStamps: A Long-Tailed Dataset of Japanese Stamps via Active Learning. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12823. Springer, Cham. https://doi.org/10.1007/978-3-030-86334-0_1

DOI: https://doi.org/10.1007/978-3-030-86334-0_1
Published: 02 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86333-3
Online ISBN: 978-3-030-86334-0
eBook Packages: Computer Science; Computer Science (R0)
