Abstract
Quick Response (QR) codes are increasingly used in e-commerce due to their versatility and ability to connect offline and online content, reaching almost every aspect of a business, from posters to payments. Consequently, many efforts have aimed at improving the visual quality of QR codes so that they can be easily included in publicity designs on billboards and in magazines. The most successful approaches, however, are slow, since optimization algorithms are required to generate each beautified QR code, hindering online customization. The aim of this paper is the fast generation of visually pleasing and robust QR codes. The proposed framework leverages state-of-the-art deep-learning algorithms to embed a color image into a baseline QR code in seconds while keeping the probability of error during the decoding procedure below a maximum value. Halftoning techniques that exploit the human visual system (HVS) are used to smooth the embedding of the QR code structure in the final QR code image while reinforcing decoding robustness. Compared to optimization-based methods, our framework provides similar qualitative results but is three orders of magnitude faster.
Code Availability
The manuscript focuses on the development of the underlying method. However, the code used for the experiments in this manuscript is available from the corresponding author on reasonable request.
References
7 Proven ways ecommerce uses QR codes to bring users back. https://blog.beaconstac.com/2019/10/how-ecommerce-is-using-qr-codes-in-interesting-ways-to-bring-back-users/. Accessed 20 March 2021
Agustsson E, Timofte R (2017) Ntire 2017 challenge on single image super-resolution: dataset and study. In: IEEE conf comput vis pattern recognit (CVPR) workshops
Baharav Z, Kakarala R (2013) Visually significant QR codes: image blending and statistical analysis. In: 2013 IEEE int. conf. on multimedia and expo (ICME). IEEE, pp 1–6
Chen Q, Xu J, Koltun V (2017) Fast image processing with fully-convolutional networks. In: Proceedings of the IEEE int. conf. on computer vision, pp 2497–2506
Chidambaram N, Raj P, Thenmozhi K, Amirtharajan R (2021) A new method for producing 320-bit modified hash towards tamper detection and restoration in colour images. Multimed Tools Appl 80(15):23359–23375
Chu HK, Chang CS, Lee RR, Mitra NJ (2013) Halftone QR codes. ACM Trans Graph (TOG) 32(6):1–8
Cox R (2012) QArt codes. https://research.swtch.com/qart. Accessed 30 June 2018
Garateguy GJ, Arce GR, Lau DL, Villarreal OP (2014) QR images: optimized image embedding in QR codes. IEEE Trans Image Process 23(7):2842–2853
GoArt - AI photo effects. https://www.fotor.com/features/goart.html. Accessed 15 March 2021
Huang G, Liu Z, Maaten LVD, Weinberger KQ (2017) Densely connected convolutional networks. In: Proc. IEEE comput. soc. conf. comput. vis. pattern recognit, pp 4700–4708
Isola P, Zhu J, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: Proc. IEEE comput. soc. conf. comput. vis. pattern recognit, pp 1125–1134
Johnson J, Alahi A, Fei-Fei L (2016) Perceptual losses for real-time style transfer and super-resolution. In: European conference on computer vision. Springer, pp 694–711
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980
Koch G, Zemel R, Salakhutdinov R (2015) Siamese neural networks for one-shot image recognition. In: ICML deep learning workshop, vol 2. Lille
Lau DL, Arce GR (2018) Modern digital halftoning, vol 1. CRC Press
Lau DL, Ulichney R, Arce GR (2003) Blue and green noise halftoning models. IEEE Signal Proc Mag 20(4):28–38
Lin SS, Hu MC, Lee CH, Lee TY (2015) Efficient QR code beautification with high quality visual content. IEEE Trans Multimed 17(9):1515–1524
Mullen C (2020) The pandemic has given QR codes a shot in the arm. Will it last post-Covid? https://www.bizjournals.com/bizwomen/news/latest-news/2020/10/qr-codes-get-popular-with-dining-other-retailers.html?page=all. Accessed 15 Nov 2020
Owen S (2016) Multi-format 1D/2D barcode image processing library with clients for Android, Java and C++ (ZXing). https://opensource.google.com/projects/zxing. Accessed 30 May 2018
Rodriguez JB, Arce GR, Lau DL (2008) Blue-noise multitone dithering. IEEE Trans Image Process 17(8):1368–1382
Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: Int. conf. on med image comput comput assist interv. Springer, pp 234–241
Samretwit D, Wakahara T (2011) Measurement of reading characteristics of multiplexed image in QR code. In: 2011 third int. conf. on intelligent networking and collaborative systems. IEEE, pp 552–557
Schonfeld E, Schiele B, Khoreva A (2020) A u-net based discriminator for generative adversarial networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8207–8216
ISO/IEC 18004 (2006) Information technology - automatic identification and data capture techniques - bar code symbology - QR code
The new marketing opportunity for the e-commerce industry is QR codes. https://www.qr-code-generator.com/blog/e-commerce-industry-qr-codes/. Accessed 20 March 2021
Visualead company free visual QR code generator. https://www.visualead.com/. Accessed 30 May 2018
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612
Wang Z, Simoncelli EP, Bovik AC (2003) Multiscale structural similarity for image quality assessment. In: The thirty-seventh Asilomar conference on signals, systems & computers, vol 2, pp 1398–1402
Wengrowski E, Dana K (2019) Light field messaging with deep photographic steganography. In: Proc. IEEE comput. soc. conf. comput. vis. pattern recognit, pp 1515–1524
Xu M, Li Q, Niu J, Liu X, Xu W, Lv P, Zhou B (2018) ART-UP: a novel method for generating scanning-robust aesthetic QR codes. arXiv:1803.02280 [cs.MM]
Xu M, Su H, Li Y, Li X, Liao J, Niu J, Lv P, Zhou B (2019) Stylized aesthetic QR code. IEEE Trans Multimed 21(8):1960–1970
Zhang Y, Deng S, Liu Z, Wang Y (2015) Aesthetic QR codes based on two-stage image blending. In: Int. conf. on multimedia modeling. Springer, pp 183–194
Zhao T, Wu X (2019) Pyramid feature attention network for saliency detection. In: IEEE conf comput vis pattern recognit (CVPR)
Acknowledgements
We would like to thank Gonzalo Garateguy and Jose Luis Paredes for their collaboration in this project.
Funding
This work was supported in part by Graphiclead LLC.
Ethics declarations
The logo images used in this paper were not endorsed by the trademark owners; they were used here only to evaluate the capabilities of our method. The Coca Cola, Starbucks, FCB, Golden State Warriors, and Juan Valdez logos are trademarks of the Coca Cola Company, Starbucks U.S. Brands, LLC, Futbol Club Barcelona, Golden State Warriors, LLC, and the National Federation of Coffee Growers of Colombia, respectively. They do not sponsor, authorize, or endorse the images in this paper. All rights reserved.
Conflict of Interests
Karelia Pena-Pena and Andrew Arce declare that they have no competing interests. Gonzalo Arce and Daniel Lau are part of Graphiclead LLC. Gonzalo Arce received support from Graphiclead LLC.
Additional information
Availability of Data and Material
The datasets used for the experiments in this manuscript are available from the corresponding author on reasonable request. Additionally, the method used for the generation of the dataset is described in the manuscript.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Encoding procedure
QR codes have a well-defined structure for correcting geometric transformations and achieving quick machine detection and decoding. Some encoders analyze the input message to find the most suitable encoding mode (numeric, alphanumeric, byte, or Kanji), version, and error correction capability, while in others these are chosen by the user. Although the method used to convert the input text into bits varies according to the encoding mode, a general procedure is followed to complete the bit stream. That is, the encoded message always occupies the first m bits of the bit stream, which is padded with zeros and later broken up into 8-bit codewords.
Due to the QR code structure, the QR code standard requires adding specific padding bits p after the message to completely fill the total capacity of the QR code. Finally, u error correction bits are generated for the message m and padding bits p, producing an n-bit Reed-Solomon (RS) code. This bit stream is distributed through the QR code, starting from the bottom right, as shown in Fig. 12.
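For illustration, the bit-stream assembly described above can be sketched for the byte encoding mode as follows (a simplified sketch of the ISO/IEC 18004 conventions; the capacity of 19 data codewords assumes a hypothetical version 1-L symbol, and the Reed-Solomon step that would generate the u error correction bits is omitted):

```python
# Illustrative sketch of QR byte-mode bit-stream assembly: mode indicator
# (0100 = byte mode), 8-bit character count, the m message bits, a terminator,
# zero padding to a codeword boundary, and alternating pad codewords 0xEC/0x11
# that fill the remaining capacity. Reed-Solomon codewords would follow.
def build_bitstream(message: str, data_codewords: int = 19) -> bytes:
    bits = "0100"                               # mode indicator: byte mode
    bits += format(len(message), "08b")         # character count (small versions)
    for ch in message.encode("iso-8859-1"):
        bits += format(ch, "08b")               # the m message bits
    capacity = data_codewords * 8               # 19 codewords ~ version 1-L
    bits += "0" * min(4, capacity - len(bits))  # terminator (up to 4 zeros)
    bits += "0" * (-len(bits) % 8)              # pad with zeros to full codewords
    data = bytearray(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    pads, k = [0xEC, 0x11], 0
    while len(data) < data_codewords:           # padding bits p fill the capacity
        data.append(pads[k % 2])
        k += 1
    return bytes(data)
```

For the message "HI", for example, the first four codewords carry the mode, count, and message bits, and the rest are the alternating pad codewords.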
Appendix B: Decoding procedure
As illustrated in Fig. 13, after an image of a QR code is captured by the decoder, it is binarized into a black-and-white image. The QR code standard [24] does not define a specific binarization method; thus, there is no unique way to perform this step. However, many widely used QR code decoders apply local thresholding strategies to overcome the effects of uneven lighting conditions. For instance, the widely used open-source ZXing library [19] for QR code generation and reading computes local thresholds for non-overlapping blocks Bm,n as the mean value of the image intensity in a local window W as

\(t_{m,n}=\frac {1}{N}{\sum }_{(i,j)\in W}Y_{i,j}\)
where Yi,j is the image luminance value at pixel location (i,j), tm,n is the local threshold for the block Bm,n, and N is the number of pixels in the local window W of size Wa × Wa pixels. This binarization method has been used before to design a probability of error model for beautified QR codes [8, 30] and to evaluate their robustness [31].
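A minimal sketch of this block-wise binarization, assuming a simple per-block mean threshold (the block size is a hypothetical parameter; ZXing's actual binarizer adds refinements such as averaging thresholds over neighboring blocks):

```python
# Sketch of local thresholding: each non-overlapping block is binarized
# against the mean luminance of its own window.
def binarize(lum, block=8):
    h, w = len(lum), len(lum[0])
    out = [[0] * w for _ in range(h)]
    for bi in range(0, h, block):
        for bj in range(0, w, block):
            pixels = [lum[i][j]
                      for i in range(bi, min(bi + block, h))
                      for j in range(bj, min(bj + block, w))]
            t = sum(pixels) / len(pixels)        # local threshold t_{m,n}
            for i in range(bi, min(bi + block, h)):
                for j in range(bj, min(bj + block, w)):
                    out[i][j] = 1 if lum[i][j] >= t else 0  # 1 = white, 0 = black
    return out
```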
The decoder then searches for the finder patterns and the alignment pattern regions, which possess a black-white-black-white-black pattern with a ratio of 1:1:3:1:1 and a black-white-black pattern with a ratio of 1:1:1, respectively. Based on the three finder patterns and the closest-to-corner alignment pattern, a perspective transform is applied to compensate for geometric distortions from the scanning process. Once the detected barcode region is converted into a square region, a sampling grid is estimated using as reference the detected coordinates of the finder and alignment patterns, as well as the timing patterns. As shown in Fig. 13, the grid samples the central pixel of each binarized module and demodulates these data as ‘1’ or ‘0’ for white and black, respectively. After the error correction algorithm is applied, the message is extracted from the QR code.
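The 1:1:3:1:1 finder-pattern test above can be sketched as a run-length check along a single scanline (illustrative only; `tol` is a hypothetical tolerance parameter, and real detectors also cross-check the candidate vertically and diagonally):

```python
def runs(row):
    """Run-length encode a binary scanline as [value, length] pairs."""
    out = []
    for v in row:
        if out and out[-1][0] == v:
            out[-1][1] += 1
        else:
            out.append([v, 1])
    return out

def has_finder_ratio(row, tol=0.5):
    """Look for five consecutive runs black-white-black-white-black whose
    widths match 1:1:3:1:1 within a relative tolerance (0 = black, 1 = white)."""
    r = runs(row)
    for k in range(len(r) - 4):
        five = r[k:k + 5]
        if [v for v, _ in five] != [0, 1, 0, 1, 0]:
            continue                              # must start and end on black
        unit = sum(n for _, n in five) / 7.0      # the five runs span 7 modules
        target = [1, 1, 3, 1, 1]
        if all(abs(n - t * unit) <= tol * unit for (_, n), t in zip(five, target)):
            return True
    return False
```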
Appendix C: Probability of error models
The luminance modification in the original binary structure of the QR code distorts the binarization threshold, increasing the probability of error during the decoding process. To account for this error, a normalized version of the probability of error model proposed in [8] is used. In this model, two independent probability models are developed based on the sampling accuracy of the central pixels of the QR code modules. These two models are then combined through a probability of sampling error model.
C.1 Probability of binarization error model
This first model considers the probability that any given pixel inside a QR code module is binarized in the wrong direction during the decoding procedure. The probability of error at the detected binary pixel (i,j) is then given by

\(P_{Berr}(i,j)=p_{0}P(\hat {Q}_{i,j}=1|Q_{i,j}=0)+p_{1}P(\hat {Q}_{i,j}=0|Q_{i,j}=1)\)  (6)
where p0 = P(Qi,j = 0) and p1 = P(Qi,j = 1) are the probabilities of having a black or a white pixel in the QR code Q, respectively. As mentioned before, the calculation of the threshold tm,n depends on the decoder. In this case, a decoder built on the open-source ZXing library [19] is considered. Consequently, the threshold depends on the local distribution of luminance values in the image, the distribution of pixels in the QR code, and the values of the parameters of the luminance transformation. Since \(\hat {Q}_{i,j}\) represents the binary decision made by the decoder depending on the threshold, the conditional probabilities above can be written as

\(P(\hat {Q}_{i,j}=1|Q_{i,j}=0)=P(Y_{i,j}\geq t_{m,n}|Q_{i,j}=0),\qquad P(\hat {Q}_{i,j}=0|Q_{i,j}=1)=P(Y_{i,j}<t_{m,n}|Q_{i,j}=1)\)  (7)
The exact solution of (7) requires knowledge of the joint distribution of the image and threshold values. The problem can be simplified by assuming that the components of the embedding luminance are independent and can be represented as a function of the luminance transformation parameters as [8]
Using (8) and (5) together with certain reasonable assumptions stated in [8], a model for the probability distribution of the threshold is given by
where n = ⌊pcN⌋ is the total number of pixels selected for modification in the window W, \(t_{k}=\lfloor \frac {k\beta }{N}+\frac {(n-k)}{N}\alpha \rfloor \) are all the possible average contributions of n modified pixels to the total threshold, and f is the convolution of fY(x) and fη(x). fY(x) represents the probability distribution of the unmodified pixels in the image, modeled as independent Gaussian random variables that depend on the window W, and fη(x) accounts for the noise at the detector, modeled also as a Gaussian distribution. Thus, conditioning on the set of modified pixels, the conditional probabilities in (7) can be determined as follows
where
Substituting (10) and (11) in (6), we finally estimate the probability of binarization error PBerr.
C.2 Probability of detection error
As in [8], assuming the central pixels in the QR code modules are precisely sampled, this model considers the probability that a given central pixel is detected in the wrong direction during the decoding procedure. The probability of detection error for each QR code module is defined as
In this case, the binarization threshold at the decoder can be decomposed as t = μ + b + η, considering not only the average contribution μ to the threshold from the modified pixels, computed as

\(\mu =\frac {1}{N}\left (n_{\alpha }\alpha +n_{\beta }\beta +n_{\alpha _{c}}\alpha _{c}+n_{\beta _{c}}\beta _{c}\right )\)  (15)
but also the contribution from the non-modified pixels b in the window, as well as the noise at the decoder η. In (15), \(n_{\alpha }, n_{\beta }, n_{\alpha _{c}}\) and \(n_{\beta _{c}}\) are the numbers of pixels at each modified luminance level and N is the total number of pixels in the window W.
Non-modified pixels are modeled as Gaussian random variables as before, with mean μb = E[Yi,j](1 − pc) and variance \({\sigma _{b}^{2}}=\frac {Var[Y_{i,j}]}{N} (1- p_{c})\). As a result, the probability distribution of the threshold t is Gaussian with mean μt = μ + μb and variance \({\sigma _{t}^{2}}={\sigma _{b}^{2}}+\sigma _{\eta }^{2}\) where \(\sigma _{\eta }^{2}\) is the variance of the Gaussian noise at the decoder. Substituting this probability distribution of the threshold in the normalized probabilities defined in (12), we can estimate the probability of detection error PDerr in (14).
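As a numeric illustration of this stage, once the threshold t is modeled as Gaussian with mean μt and standard deviation σt as above, the per-module error can be evaluated with the standard normal CDF (the function names and the default priors p0 = p1 = 0.5 are our own choices for the sketch):

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def detection_error(alpha, beta, mu_t, sigma_t, p0=0.5, p1=0.5):
    """Probability that a sampled center pixel is binarized in the wrong
    direction when the threshold t ~ N(mu_t, sigma_t^2): a dark module
    (luminance alpha) errs when t <= alpha, and a light module
    (luminance beta) errs when t > beta."""
    err_dark = phi((alpha - mu_t) / sigma_t)   # dark module read as white
    err_light = phi((mu_t - beta) / sigma_t)   # light module read as black
    return p0 * err_dark + p1 * err_light
```

With well-separated luminance levels (e.g. 50 vs. 200 around a threshold near 125) the error is negligible; as the embedded levels approach the threshold, the error grows toward 0.5, which is the trade-off the embedding must control.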
C.3 Global probability of error
A probability of sampling error model is used to combine the two independent stages into a global probability of error [8]. According to the decoding procedure, central pixels are ideally more likely to be sampled. Thus, it is reasonable to model the sampling process by means of a Gaussian distribution around the center regions with σ = Wa/6. The probability ps of sampling outside the center region can be determined by integrating this distribution outside the central region of size da × da. In this paper, this probability is also normalized. Once the probability of sampling error ps is known, the global probability of error is determined by

\(P_{err}=(1-p_{s})P_{Derr}+p_{s}P_{Berr}\)
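The sampling probability and the combination of the two error models can be sketched numerically as follows (the weighting of PDerr and PBerr is an illustrative assumption; Wa and da follow the definitions above):

```python
import math

def prob_sample_outside(Wa, da):
    """Probability that a 2D Gaussian sample (sigma = Wa/6, centered on
    the module) falls outside the central da x da region."""
    sigma = Wa / 6.0
    p_axis = math.erf((da / 2.0) / (sigma * math.sqrt(2.0)))  # P(|X| <= da/2)
    return 1.0 - p_axis ** 2          # independent axes: outside if either exceeds

def global_error(p_derr, p_berr, Wa, da):
    """Illustrative combination: detection error when the center region is
    sampled, binarization error otherwise."""
    ps = prob_sample_outside(Wa, da)
    return (1.0 - ps) * p_derr + ps * p_berr
```

Shrinking the central region da increases ps and therefore shifts weight toward the binarization error model, matching the intuition that off-center samples are governed by the pixel-level model.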
About this article
Cite this article
Pena-Pena, K., Lau, D.L., Arce, A.J. et al. QRnet: fast learning-based QR code image embedding. Multimed Tools Appl 81, 10653–10672 (2022). https://doi.org/10.1007/s11042-022-12357-6