TICS: text–image-based semantic CAPTCHA synthesis via multi-condition adversarial learning

Jia, Xinkang; Xiao, Jun; Wu, Chao

doi:10.1007/s00371-021-02061-1

Original article
Published: 09 February 2021

Volume 38, pages 963–975 (2022)

The Visual Computer

Abstract

CAPTCHAs distinguish humans from automated programs and play an important role in multimedia security mechanisms. Traditional methods such as image-based and text-based CAPTCHAs usually rely on word-level understanding, and the recent success of deep learning techniques has made them easy to crack. To address this, this paper proposes a text–image-based CAPTCHA grounded in the cognition process and semantic reasoning, together with a novel model to generate it. The method combines three features (sentence, object, and location) to generate a multi-conditional CAPTCHA that resists CNN-based classification attacks. Experiments show that a ResNet-50 classifier achieves only 3.38% accuracy on the proposed TIC.
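
The accuracy figure above is the share of CAPTCHA samples that an attacking classifier labels correctly, so a lower value means stronger resistance. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `top1_accuracy` and the sample labels are hypothetical and chosen only for illustration.

```python
def top1_accuracy(predicted, expected):
    """Return the fraction of samples whose predicted label matches the expected one."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one sample is required")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical labels for illustration only; these are not data from the paper.
predicted = ["bird", "cat", "bird", "dog"]
expected = ["bird", "bird", "cat", "cat"]

accuracy = top1_accuracy(predicted, expected)
print(f"attack accuracy: {accuracy:.2%}")
```

In a real evaluation the predicted labels would come from a trained classifier such as ResNet-50 run on the generated CAPTCHA images.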

Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities, Artificial Intelligence Research Foundation of Baidu Inc., Zhejiang University and Cybervein Joint Research Lab, Zhejiang Natural Science Foundation (LY19F020051, R19F020009, LZ17F020001), National Natural Science Foundation of China (61572431, U19B2042), Key R&D Program of Zhejiang Province (2018C01006), Program of China Knowledge Center for Engineering Sciences and Technology, Program of ZJU and Tongdun Joint Research Lab, Program of ZJU and Horizon Robotics Joint Research Lab, Joint Research Program of ZJU and Hikvision Research Institute, Major Scientific Research Project of Zhejiang Lab (No. 2018EC0ZX01-1), and CAS Earth Science Research Project (XDA19020104).

Author information

Authors and Affiliations

Zhejiang University, Hangzhou, China
Xinkang Jia, Jun Xiao & Chao Wu

Corresponding author

Correspondence to Chao Wu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jia, X., Xiao, J. & Wu, C. TICS: text–image-based semantic CAPTCHA synthesis via multi-condition adversarial learning. Vis Comput 38, 963–975 (2022). https://doi.org/10.1007/s00371-021-02061-1

Accepted: 03 January 2021
Published: 09 February 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s00371-021-02061-1