Background

Yin, Xu-Cheng; Yang, Chun; Liu, Chang

doi:10.1007/978-981-97-0361-6_2

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

This chapter introduces the background of the open-set text recognition (OSTR) task, covering open-set identification and recognition, conventional OCR methods, and their applications. First, we introduce the concept of open-set (or open-world). We discuss how the term "open-set" is used in different research areas and state that the OSTR task is concerned with identification or recognition capability on both seen and novel (abnormal) samples. Then, we review work on open-set identification and open-set recognition to show how unknown instances are handled in various recognition-related tasks, including image classification, object detection, semantic segmentation, and instance segmentation. This review focuses on general ideas rather than implementations. The chapter also covers the background of conventional OCR methods, which can be categorized into three main frameworks: Word Level Prediction, Feature Aggregation, and Label Aggregation. Finally, we introduce existing studies that went beyond closed-set text recognition and predate the OSTR task.

Notes

1. For tracking and/or ReID, the category label is not a direct part of the task formulation, so it is considered as a domain here.
2. https://commons.wikimedia.org/wiki/File:Egypt_Hieroglyphe4.jpg

Author information

Authors and Affiliations

School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Xu-Cheng Yin
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Chun Yang
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Chang Liu

Corresponding authors

Correspondence to Xu-Cheng Yin or Chun Yang.

About this chapter

Cite this chapter

Yin, XC., Yang, C., Liu, C. (2024). Background. In: Open-Set Text Recognition. SpringerBriefs in Computer Science. Springer, Singapore. https://doi.org/10.1007/978-981-97-0361-6_2

DOI: https://doi.org/10.1007/978-981-97-0361-6_2
Published: 02 April 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0360-9
Online ISBN: 978-981-97-0361-6
eBook Packages: Computer Science, Computer Science (R0)
