Abstract
Text-based CAPTCHAs are widely used by commercial websites. Most existing CAPTCHA recognition methods rely on deep learning and large-scale labeled data. Few-shot learning has recently proven effective in various visual classification tasks when data are scarce. However, the performance of current few-shot learning methods deteriorates in realistic settings that involve class imbalance and cross-domain conditions. In this paper, we propose a novel CAPTCHA solver based on prototypical networks and model-agnostic meta-learning. Two major improvements, multi-source domain data augmentation and an intra-class variance distance weighting method, are proposed to alleviate the performance degradation caused by cross-domain conditions and class imbalance. Our approach achieves an average character accuracy of more than 90% in 5-shot and 10-shot tasks and an attack rate of 88% in one-shot tasks. These results may promote the application of few-shot learning in realistic settings.
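As a rough illustration of the prototypical-network idea the solver builds on, the following minimal sketch classifies a query embedding by its distance to per-class mean embeddings. It is not the authors' implementation: the function names, the toy 2-D embeddings (standing in for features of character crops), and the plain squared Euclidean distance are all assumptions made for the example, and the multi-source augmentation and intra-class variance weighting described above are not reproduced here.

```python
import math
from collections import defaultdict


def prototypes(support):
    """Average the support embeddings of each class into one prototype.

    `support` is a list of (embedding, label) pairs (hypothetical format).
    """
    by_class = defaultdict(list)
    for emb, label in support:
        by_class[label].append(emb)
    return {
        label: [sum(dim) / len(embs) for dim in zip(*embs)]
        for label, embs in by_class.items()
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, protos):
    """Softmax over negative squared distances to each prototype."""
    logits = {label: -sq_dist(query, p) for label, p in protos.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - peak) for label, v in logits.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    return max(probs, key=probs.get), probs


# Toy 2-way, 2-shot episode with made-up embeddings (assumption).
support = [
    ([0.0, 0.0], "a"), ([0.2, 0.0], "a"),
    ([1.0, 1.0], "b"), ([1.0, 0.8], "b"),
]
protos = prototypes(support)
label, probs = classify([0.1, 0.1], protos)
```

In a real solver the embeddings would come from a trained feature extractor, and the number of support examples per class sets the shot count (one, five, or ten in the tasks reported above).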
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
This work was supported by the National Key Research and Development Program of China (2021YFB2012400), the Fundamental Research Funds for the Central Universities (HIT.NSRIF.2020098), the Key Technology Research and Development Program of Shandong (2017CXGC0706), and the National Regional Innovation Center Science and Technology Special Project of China (2017QYCX14).
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Y., Wei, Y., Zhang, Y. et al. Few-shot learning in realistic settings for text CAPTCHA recognition. Neural Comput & Applic 35, 10751–10764 (2023). https://doi.org/10.1007/s00521-023-08262-0
DOI: https://doi.org/10.1007/s00521-023-08262-0