Abstract
Quick Response (QR) codes are increasingly used in e-commerce due to their versatility and ability to connect offline and online content, reaching almost every aspect of a business, from posters to payments. Consequently, many efforts have aimed at improving the visual quality of QR codes so that they can be easily included in publicity designs on billboards and in magazines. The most successful approaches, however, are slow, since optimization algorithms are required to generate each beautified QR code, hindering online customization. The aim of this paper is the fast generation of visually pleasing and robust QR codes. The proposed framework leverages state-of-the-art deep-learning algorithms to embed a color image into a baseline QR code in seconds while keeping the probability of error during the decoding procedure below a maximum value. Halftoning techniques that exploit the human visual system (HVS) are used to smooth the embedding of the QR code structure in the final QR code image while reinforcing decoding robustness. Compared to optimization-based methods, our framework provides similar qualitative results but is three orders of magnitude faster.
Code Availability
The manuscript focuses on the development of the underlying method. However, the code used for the experiments in this manuscript is available from the corresponding author on reasonable request.
References
7 Proven ways ecommerce uses QR codes to bring users back. https://blog.beaconstac.com/2019/10/how-ecommerce-is-using-qr-codes-in-interesting-ways-to-bring-back-users/. Accessed 20 March 2021
Agustsson E, Timofte R (2017) Ntire 2017 challenge on single image super-resolution: dataset and study. In: IEEE conf comput vis pattern recognit (CVPR) workshops
Baharav Z, Kakarala R (2013) Visually significant QR codes: image blending and statistical analysis. In: 2013 IEEE int. conf. on multimedia and expo (ICME). IEEE, pp 1–6
Chen Q, Xu J, Koltun V (2017) Fast image processing with fully-convolutional networks. In: Proceedings of the IEEE int. conf. on computer vision, pp 2497–2506
Chidambaram N, Raj P, Thenmozhi K, Amirtharajan R (2021) A new method for producing 320-bit modified hash towards tamper detection and restoration in colour images. Multimed Tools Appl 80(15):23359–23375
Chu HK, Chang CS, Lee RR, Mitra NJ (2013) Halftone QR codes. ACM Trans Graph (TOG) 32(6):1–8
Cox R (2012) QArt codes. https://research.swtch.com/qart. Accessed 30 June 2018
Garateguy GJ, Arce GR, Lau DL, Villarreal OP (2014) QR images: optimized image embedding in QR codes. IEEE Trans Image Process 23(7):2842–2853
GoArt - AI photo effects. https://www.fotor.com/features/goart.html. Accessed 15 March 2021
Huang G, Liu Z, Maaten LVD, Weinberger KQ (2017) Densely connected convolutional networks. In: Proc. IEEE comput. soc. conf. comput. vis. pattern recognit, pp 4700–4708
Isola P, Zhu J, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: Proc. IEEE comput. soc. conf. comput. vis. pattern recognit, pp 1125–1134
Johnson J, Alahi A, Fei-Fei L (2016) Perceptual losses for real-time style transfer and super-resolution. In: European conference on computer vision. Springer, pp 694–711
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980
Koch G, Zemel R, Salakhutdinov R (2015) Siamese neural networks for one-shot image recognition. In: ICML deep learning workshop, vol 2. Lille
Lau DL, Arce GR (2018) Modern digital halftoning, vol 1. CRC Press
Lau DL, Ulichney R, Arce GR (2003) Blue and green noise halftoning models. IEEE Signal Proc Mag 20(4):28–38
Lin SS, Hu MC, Lee CH, Lee TY (2015) Efficient QR code beautification with high quality visual content. IEEE Trans Multimed 17(9):1515–1524
Mullen C (2020) The pandemic has given QR codes a shot in the arm. Will it last post-Covid? https://www.bizjournals.com/bizwomen/news/latest-news/2020/10/qr-codes-get-popular-with-dining-other-retailers.html?page=all. Accessed 15 Nov 2020
Owen S (2016) Multi-format 1D/2D barcode image processing library with clients for Android, Java and C++ (ZXing). https://opensource.google.com/projects/zxing. Accessed 30 May 2018
Rodriguez JB, Arce GR, Lau DL (2008) Blue-noise multitone dithering. IEEE Trans Image Process 17(8):1368–1382
Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: Int. conf. on med image comput comput assist interv. Springer, pp 234–241
Samretwit D, Wakahara T (2011) Measurement of reading characteristics of multiplexed image in QR code. In: 2011 third int. conf. on intelligent networking and collaborative systems. IEEE, pp 552–557
Schonfeld E, Schiele B, Khoreva A (2020) A u-net based discriminator for generative adversarial networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8207–8216
ISO/IEC 18004 (2006) Information technology - automatic identification and data capture techniques - bar code symbology - QR code
The new marketing opportunity for the e-commerce industry is QR codes. https://www.qr-code-generator.com/blog/e-commerce-industry-qr-codes/. Accessed 20 March 2021
Visualead company free visual QR code generator. https://www.visualead.com/. Accessed 30 May 2018
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612
Wang Z, Simoncelli EP, Bovik AC (2003) Multiscale structural similarity for image quality assessment. In: The thirty-seventh Asilomar conference on signals, systems & computers, vol 2, pp 1398–1402
Wengrowski E, Dana K (2019) Light field messaging with deep photographic steganography. In: Proc. IEEE comput. soc. conf. comput. vis. pattern recognit, pp 1515–1524
Xu M, Li Q, Niu J, Liu X, Xu W, Lv P, Zhou B (2018) ART-UP: a novel method for generating scanning-robust aesthetic QR codes. arXiv:1803.02280 [cs.MM]
Xu M, Su H, Li Y, Li X, Liao J, Niu J, Lv P, Zhou B (2019) Stylized aesthetic QR code. IEEE Trans Multimed 21(8):1960–1970
Zhang Y, Deng S, Liu Z, Wang Y (2015) Aesthetic QR codes based on two-stage image blending. In: Int. conf. on multimedia modeling. Springer, pp 183–194
Zhao T, Wu X (2019) Pyramid feature attention network for saliency detection. In: IEEE conf comput vis pattern recognit (CVPR)
Acknowledgements
We would like to thank Gonzalo Garateguy and Jose Luis Paredes for their collaboration in this project.
Funding
This work was supported in part by Graphiclead LLC.
Ethics declarations
The logo images used in this paper were not endorsed by the trademark owners; they were used here only to evaluate the capabilities of our method. The Coca Cola, Starbucks, FCB, Golden State Warriors, and Juan Valdez logos are trademarks of the Coca Cola Company, Starbucks U.S. Brands, LLC, Futbol Club Barcelona, Golden State Warriors, LLC, and the National Federation of Coffee Growers of Colombia, respectively. They do not sponsor, authorize, or endorse the images in this paper. All rights reserved.
Conflict of Interests
Karelia Pena-Pena and Andrew Arce declare that they have no competing interests. Gonzalo Arce and Daniel Lau are part of Graphiclead LLC. Gonzalo Arce received support from Graphiclead LLC.
Additional information
Availability of Data and Material
The datasets used for the experiments in this manuscript are available from the corresponding author on reasonable request. Additionally, the method used for the generation of the dataset is described in the manuscript.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Encoding procedure
QR codes have a well-defined structure for correcting geometric transformations and achieving quick machine detection and decoding. Some encoders analyze the input message to find the most suitable encoding mode (numeric, alphanumeric, byte, or Kanji), version, and error correction capability, while in others these are chosen by the user. Although the method used to convert the input text into bits varies according to the encoding mode, a general procedure is followed to complete the bit stream. That is, the encoded message always occupies the first m bits of the bit stream, which is padded with zeros and later broken up into 8-bit codewords.
Due to the QR code structure, the QR code standard requires adding specific padding bits p after the message to completely fill the total capacity of the QR code. Finally, u error correction bits are generated for the message m and padding bits p, producing an n-bit Reed-Solomon (RS) code. This bit stream is distributed through the QR code, starting from the bottom right, as shown in Fig. 12.
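For illustration, the bit-stream assembly described above can be sketched for the byte encoding mode as follows (a simplified sketch of the ISO/IEC 18004 conventions; the capacity of 19 data codewords assumes a hypothetical version 1-L symbol, and the Reed-Solomon step that would generate the u error correction bits is omitted):

```python
# Illustrative sketch of QR byte-mode bit-stream assembly: mode indicator
# (0100 = byte mode), 8-bit character count, the m message bits, a terminator,
# zero padding to a codeword boundary, and alternating pad codewords 0xEC/0x11
# that fill the remaining capacity. Reed-Solomon codewords would follow.
def build_bitstream(message: str, data_codewords: int = 19) -> bytes:
    bits = "0100"                               # mode indicator: byte mode
    bits += format(len(message), "08b")         # character count (small versions)
    for ch in message.encode("iso-8859-1"):
        bits += format(ch, "08b")               # the m message bits
    capacity = data_codewords * 8               # 19 codewords ~ version 1-L
    bits += "0" * min(4, capacity - len(bits))  # terminator (up to 4 zeros)
    bits += "0" * (-len(bits) % 8)              # pad with zeros to full codewords
    data = bytearray(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    pads, k = [0xEC, 0x11], 0
    while len(data) < data_codewords:           # padding bits p fill the capacity
        data.append(pads[k % 2])
        k += 1
    return bytes(data)
```

For the message "HI", for example, the first four codewords carry the mode, count, and message bits, and the rest are the alternating pad codewords.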
Appendix B: Decoding procedure
As illustrated in Fig. 13, after an image of a QR code is captured by the decoder, it is binarized into a black-and-white image. The QR code standard [24] does not define a specific binarization method; thus, there is no unique way to perform this step. However, many widely used QR code decoders apply local thresholding strategies to overcome the effects of uneven lighting conditions. For instance, the widely used open-source ZXing library [19] for QR code generation and reading computes local thresholds for non-overlapping blocks Bm,n as the mean value of the image intensity in a local window W as

\(t_{m,n}=\frac {1}{N}{\sum }_{(i,j)\in W}Y_{i,j}\)
where Yi,j is the image luminance value at pixel location (i,j), tm,n is the local threshold for the block Bm,n, and N is the number of pixels in the local window W of size Wa × Wa pixels. This binarization method has been used before to design a probability of error model for beautified QR codes [8, 30] and to evaluate their robustness [31].
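A minimal sketch of this block-wise binarization, assuming a simple per-block mean threshold (the block size is a hypothetical parameter; ZXing's actual binarizer adds refinements such as averaging thresholds over neighboring blocks):

```python
# Sketch of local thresholding: each non-overlapping block is binarized
# against the mean luminance of its own window.
def binarize(lum, block=8):
    h, w = len(lum), len(lum[0])
    out = [[0] * w for _ in range(h)]
    for bi in range(0, h, block):
        for bj in range(0, w, block):
            pixels = [lum[i][j]
                      for i in range(bi, min(bi + block, h))
                      for j in range(bj, min(bj + block, w))]
            t = sum(pixels) / len(pixels)        # local threshold t_{m,n}
            for i in range(bi, min(bi + block, h)):
                for j in range(bj, min(bj + block, w)):
                    out[i][j] = 1 if lum[i][j] >= t else 0  # 1 = white, 0 = black
    return out
```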
The decoder then searches for the finder patterns and the alignment pattern regions, which possess a black-white-black-white-black pattern with a ratio of 1:1:3:1:1 and a black-white-black pattern with a ratio of 1:1:1, respectively. Based on the three finder patterns and the closest-to-corner alignment pattern, a perspective transform is applied to compensate for geometric distortions from the scanning process. Once the detected barcode region is converted into a square region, a sampling grid is estimated using as reference the detected coordinates of the finder and alignment patterns, as well as the timing patterns. As shown in Fig. 13, the grid samples the central pixel of each binarized module and demodulates these data as ‘1’ or ‘0’ for white and black, respectively. After the error correction algorithm is applied, the message is extracted from the QR code.
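The 1:1:3:1:1 finder-pattern test above can be sketched as a run-length check along a single scanline (illustrative only; `tol` is a hypothetical tolerance parameter, and real detectors also cross-check the candidate vertically and diagonally):

```python
def runs(row):
    """Run-length encode a binary scanline as [value, length] pairs."""
    out = []
    for v in row:
        if out and out[-1][0] == v:
            out[-1][1] += 1
        else:
            out.append([v, 1])
    return out

def has_finder_ratio(row, tol=0.5):
    """Look for five consecutive runs black-white-black-white-black whose
    widths match 1:1:3:1:1 within a relative tolerance (0 = black, 1 = white)."""
    r = runs(row)
    for k in range(len(r) - 4):
        five = r[k:k + 5]
        if [v for v, _ in five] != [0, 1, 0, 1, 0]:
            continue                              # must start and end on black
        unit = sum(n for _, n in five) / 7.0      # the five runs span 7 modules
        target = [1, 1, 3, 1, 1]
        if all(abs(n - t * unit) <= tol * unit for (_, n), t in zip(five, target)):
            return True
    return False
```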
Appendix C: Probability of error models
The luminance modification in the original binary structure of the QR code distorts the binarization threshold, increasing the probability of error during the decoding process. To account for this error, a normalized version of the probability of error model proposed in [8] is used. In this model, two independent probability models are developed based on the sampling accuracy of the central pixels of the QR code modules. These two models are then combined through a probability of sampling error model.
C.1 Probability of binarization error model
This first model considers the probability that any given pixel inside a QR code module is binarized in the wrong direction during the decoding procedure. The probability of error at the detected binary pixel (i,j) is then given by

\(P_{Berr}(i,j)=p_{0}P(\hat {Q}_{i,j}=1|Q_{i,j}=0)+p_{1}P(\hat {Q}_{i,j}=0|Q_{i,j}=1)\)  (6)
where p0 = P(Qi,j = 0) and p1 = P(Qi,j = 1) are the probabilities of having a black or a white pixel in the QR code Q, respectively. As mentioned before, the calculation of the threshold tm,n depends on the decoder. In this case, a decoder built on the open-source ZXing library [19] is considered. Consequently, the threshold depends on the local distribution of luminance values in the image, the distribution of pixels in the QR code, and the values of the parameters of the luminance transformation. Since \(\hat {Q}_{i,j}\) represents the binary decision made by the decoder depending on the threshold, the conditional probabilities above can be written as

\(P(\hat {Q}_{i,j}=1|Q_{i,j}=0)=P(Y_{i,j}\geq t_{m,n}|Q_{i,j}=0),\qquad P(\hat {Q}_{i,j}=0|Q_{i,j}=1)=P(Y_{i,j}<t_{m,n}|Q_{i,j}=1)\)  (7)
The exact solution of (7) requires knowledge of the joint distribution of the image and threshold values. The problem can be simplified by assuming that the components of the embedding luminance are independent and can be represented as a function of the luminance transformation parameters as [8]
Using (8) and (5) together with certain reasonable assumptions stated in [8], a model for the probability distribution of the threshold is given by
where n = ⌊pcN⌋ is the total number of pixels selected for modification in the window W, \(t_{k}=\lfloor \frac {k\beta }{N}+\frac {(n-k)}{N}\alpha \rfloor \) are all the possible average contributions of n modified pixels to the total threshold, and f is the convolution of fY(x) and fη(x). fY(x) represents the probability distribution of the unmodified pixels in the image, modeled as independent Gaussian random variables that depend on the window W, and fη(x) accounts for the noise at the detector, modeled also as a Gaussian distribution. Thus, conditioning on the set of modified pixels, the conditional probabilities in (7) can be determined as follows
where
Substituting (10) and (11) in (6), we finally estimate the probability of binarization error PBerr.
C.2 Probability of detection error
As in [8], assuming the central pixels in the QR code modules are precisely sampled, this model considers the probability that a given central pixel is detected in the wrong direction during the decoding procedure. The probability of detection error for each QR code module is defined as
In this case, the binarization threshold at the decoder can be decomposed as t = μ + b + η, considering not only the average contribution μ to the threshold from the modified pixels, computed as

\(\mu =\frac {1}{N}\left (n_{\alpha }\alpha +n_{\beta }\beta +n_{\alpha _{c}}\alpha _{c}+n_{\beta _{c}}\beta _{c}\right )\)  (15)
but also the contribution from the non-modified pixels b in the window, as well as the noise at the decoder η. In (15), \(n_{\alpha }, n_{\beta }, n_{\alpha _{c}}\) and \(n_{\beta _{c}}\) are the numbers of pixels at each modified luminance level and N is the total number of pixels in the window W.
Non-modified pixels are modeled as Gaussian random variables as before, with mean μb = E[Yi,j](1 − pc) and variance \({\sigma _{b}^{2}}=\frac {Var[Y_{i,j}]}{N} (1- p_{c})\). As a result, the probability distribution of the threshold t is Gaussian with mean μt = μ + μb and variance \({\sigma _{t}^{2}}={\sigma _{b}^{2}}+\sigma _{\eta }^{2}\) where \(\sigma _{\eta }^{2}\) is the variance of the Gaussian noise at the decoder. Substituting this probability distribution of the threshold in the normalized probabilities defined in (12), we can estimate the probability of detection error PDerr in (14).
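As a numeric illustration of this stage, once the threshold t is modeled as Gaussian with mean μt and standard deviation σt as above, the per-module error can be evaluated with the standard normal CDF (the function names and the default priors p0 = p1 = 0.5 are our own choices for the sketch):

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def detection_error(alpha, beta, mu_t, sigma_t, p0=0.5, p1=0.5):
    """Probability that a sampled center pixel is binarized in the wrong
    direction when the threshold t ~ N(mu_t, sigma_t^2): a dark module
    (luminance alpha) errs when t <= alpha, and a light module
    (luminance beta) errs when t > beta."""
    err_dark = phi((alpha - mu_t) / sigma_t)   # dark module read as white
    err_light = phi((mu_t - beta) / sigma_t)   # light module read as black
    return p0 * err_dark + p1 * err_light
```

With well-separated luminance levels (e.g. 50 vs. 200 around a threshold near 125) the error is negligible; as the embedded levels approach the threshold, the error grows toward 0.5, which is the trade-off the embedding must control.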
C.3 Global probability of error
A probability of sampling error model is used to combine the two independent stages into a global probability of error [8]. According to the decoding procedure, central pixels are ideally more likely to be sampled. Thus, it is reasonable to model the sampling process by means of a Gaussian distribution around the center regions with σ = Wa/6. The probability ps of sampling outside the center region can be determined by integrating this distribution outside the central region of size da × da. In this paper, this probability is also normalized. Once the probability of sampling error ps is known, the global probability of error is determined by

\(P_{err}=(1-p_{s})P_{Derr}+p_{s}P_{Berr}\)
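The sampling probability and the combination of the two error models can be sketched numerically as follows (the weighting of PDerr and PBerr is an illustrative assumption; Wa and da follow the definitions above):

```python
import math

def prob_sample_outside(Wa, da):
    """Probability that a 2D Gaussian sample (sigma = Wa/6, centered on
    the module) falls outside the central da x da region."""
    sigma = Wa / 6.0
    p_axis = math.erf((da / 2.0) / (sigma * math.sqrt(2.0)))  # P(|X| <= da/2)
    return 1.0 - p_axis ** 2          # independent axes: outside if either exceeds

def global_error(p_derr, p_berr, Wa, da):
    """Illustrative combination: detection error when the center region is
    sampled, binarization error otherwise."""
    ps = prob_sample_outside(Wa, da)
    return (1.0 - ps) * p_derr + ps * p_berr
```

Shrinking the central region da increases ps and therefore shifts weight toward the binarization error model, matching the intuition that off-center samples are governed by the pixel-level model.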
About this article
Cite this article
Pena-Pena, K., Lau, D.L., Arce, A.J. et al. QRnet: fast learning-based QR code image embedding. Multimed Tools Appl 81, 10653–10672 (2022). https://doi.org/10.1007/s11042-022-12357-6