Background

  • Chapter
Open-Set Text Recognition

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

This chapter introduces the background of the open-set text recognition (OSTR) task, covering open-set identification and recognition, conventional OCR methods, and their applications. First, we introduce the concept of open-set (or open-world). We discuss how the term open-set is used in different research areas and clarify that the OSTR task concerns identification or recognition capability over both seen and novel (abnormal) samples. Then, we review studies on open-set identification and open-set recognition to show how unknown instances are handled in recognition-related tasks, including image classification, object detection, semantic segmentation, and instance segmentation. This review focuses on general ideas rather than implementations. The chapter also covers the background of conventional OCR methods, which fall into three main frameworks: Word Level Prediction, Feature Aggregation, and Label Aggregation. Finally, we introduce existing studies that go beyond closed-set text recognition and predate the OSTR task.

Notes

  1. For tracking, and/or ReID, the category label is not a direct part of the task formulation, so it is considered as a domain here.

  2. https://commons.wikimedia.org/wiki/File:Egypt_Hieroglyphe4.jpg

Author information

Correspondence to Xu-Cheng Yin or Chun Yang.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Yin, XC., Yang, C., Liu, C. (2024). Background. In: Open-Set Text Recognition. SpringerBriefs in Computer Science. Springer, Singapore. https://doi.org/10.1007/978-981-97-0361-6_2

  • DOI: https://doi.org/10.1007/978-981-97-0361-6_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-0360-9

  • Online ISBN: 978-981-97-0361-6

  • eBook Packages: Computer Science, Computer Science (R0)
