Few-shot learning in realistic settings for text CAPTCHA recognition

  • Original Article
  • Neural Computing and Applications

Abstract

Text-based CAPTCHAs are used by many commercial websites. Most existing CAPTCHA recognition methods rely on deep learning and large-scale labeled data. Few-shot learning has recently proven effective in various visual classification tasks where data are scarce. However, the performance of current few-shot learning methods deteriorates in realistic settings that involve class imbalance and cross-domain shift. In this paper, we propose a novel CAPTCHA solver based on prototypical networks and model-agnostic meta-learning. Two major improvements, multi-source domain data augmentation and an intra-class variance distance weighting method, are proposed to alleviate the performance degradation caused by cross-domain shift and class imbalance. Our approach achieves an average character accuracy of more than 90% in 5-shot and 10-shot tasks and an attack rate of 88% in one-shot tasks. These results may promote the application of few-shot learning in realistic settings.
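
The prototypical-network component named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`prototypes`, `sq_dist`, `classify`), the toy two-dimensional embeddings, and the choice of squared Euclidean distance are assumptions made for illustration, and the multi-source augmentation and intra-class variance weighting proposed in the paper are not reproduced here. The sketch only shows the standard idea of averaging support embeddings per class and assigning a query to the nearest class prototype.

```python
import math
from collections import defaultdict


def prototypes(support):
    """Average the embeddings of each class to get one prototype per class.

    `support` is a list of (label, embedding) pairs. Embeddings are
    equal-length lists of floats, assumed to come from some encoder
    (the encoder itself is outside this sketch).
    """
    grouped = defaultdict(list)
    for label, emb in support:
        grouped[label].append(emb)
    return {
        label: [sum(dim) / len(embs) for dim in zip(*embs)]
        for label, embs in grouped.items()
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, protos):
    """Return (predicted_label, class_probabilities) for one query embedding.

    Probabilities are a softmax over negative squared distances.
    """
    logits = {label: -sq_dist(query, p) for label, p in protos.items()}
    m = max(logits.values())  # subtract the max for numerical stability
    exp = {label: math.exp(v - m) for label, v in logits.items()}
    z = sum(exp.values())
    probs = {label: e / z for label, e in exp.items()}
    return max(probs, key=probs.get), probs


# Hypothetical 2-way episode with an imbalanced support set:
# three shots of class "A" and a single shot of class "B".
support = [
    ("A", [0.0, 0.0]), ("A", [0.2, 0.0]), ("A", [0.1, 0.3]),
    ("B", [1.0, 1.0]),
]
protos = prototypes(support)
label, probs = classify([0.9, 0.8], protos)
```

In this toy episode the query lies closer to the prototype of the single-shot class, so `label` is `"B"`. With only one shot, that prototype is just the lone support embedding, which hints at why imbalanced support sets can make plain prototype averaging unreliable.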

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2021YFB2012400), the Fundamental Research Funds for the Central Universities (HIT.NSRIF.2020098), the Key Technology Research and Development Program of Shandong (2017CXGC0706), and the National Regional Innovation Center Science and Technology Special Project of China (2017QYCX14).

Author information

Corresponding authors

Correspondence to Guodong Xin or Bailing Wang.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Y., Wei, Y., Zhang, Y. et al. Few-shot learning in realistic settings for text CAPTCHA recognition. Neural Comput & Applic 35, 10751–10764 (2023). https://doi.org/10.1007/s00521-023-08262-0

  • DOI: https://doi.org/10.1007/s00521-023-08262-0
