Quantum convolutional neural network for classical data classification

  • Research Article
  • Published in Quantum Machine Intelligence

Abstract

With the rapid advance of quantum machine learning, several proposals for the quantum analogue of the convolutional neural network (CNN) have emerged. In this work, we benchmark fully parameterized quantum convolutional neural networks (QCNNs) for classical data classification. In particular, we propose a quantum neural network model inspired by CNN that uses only two-qubit interactions throughout the entire algorithm. We investigate the performance of various QCNN models differentiated by structures of parameterized quantum circuits, quantum data encoding methods, classical data pre-processing methods, cost functions, and optimizers on the MNIST and Fashion MNIST datasets. In most instances, QCNN achieved excellent classification accuracy despite having a small number of free parameters. The QCNN models performed noticeably better than CNN models under similar training conditions. Since the QCNN algorithm presented in this work utilizes fully parameterized and shallow-depth quantum circuits, it is suitable for Noisy Intermediate-Scale Quantum (NISQ) devices.



Acknowledgements

We thank the Quantum Open Source Foundation, as this work was initiated under its Quantum Computing Mentorship program.

Funding

This research was supported by the National Research Foundation of Korea (Grant Nos. 2019R1I1A1A01050161 and 2021M3H3A1038085) and the Quantum Computing Development Program (Grant No. 2019M3E4A1080227).

Author information


Corresponding author

Correspondence to Daniel K. Park.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Data availability

The source code used in this study is available at https://github.com/takh04/QCNN.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Tak Hur and Leeseok Kim contributed equally to this work and are listed in alphabetical order.

Appendices

Appendix A: Related works

The term quantum convolutional neural network (QCNN) appears in several places, but it refers to a number of different frameworks. Several proposals have been made in the past to reproduce the classical CNN on a quantum circuit by imitating the basic arithmetic of the convolutional layer for a given filter (Kerenidis et al. 2019; Li et al. 2020; Wei et al. 2021). Although these algorithms have the potential to achieve exponential speedups over their classical counterparts in the asymptotic limit, they require an efficient means to implement quantum random access memory (QRAM), expensive subroutines such as the linear combination of unitaries or quantum phase estimation with extra qubits, and they work only for specific types of quantum data embedding. Another branch of CNN-inspired QML algorithms focuses on implementing the convolutional filter as a parameterized quantum circuit, which can be stacked by inserting a classical pooling layer in between (Liu et al. 2021; Henderson et al. 2020; Chen et al. 2020). Following the nomenclature of Henderson et al. (2020), we refer to this approach as the quanvolutional neural network to distinguish it from QCNN. The potential quantum advantage of using quanvolutional layers lies in the fact that quantum computers can access kernel functions in high-dimensional Hilbert spaces much more efficiently than classical computers. In quanvolutional neural networks, a challenge is to find a good structure for the parameterized quantum circuit, in which the number of qubits equals the size of the filter. This approach is also limited to qubit encoding, since each layer requires a quantum data embedding, which has a non-negligible cost. Furthermore, stacking quanvolutional layers via pooling requires each parameterized quantum circuit to be measured many times to gather measurement statistics.

Fig. 7

A schematic of the CNN used in this work for comparison with the classification performance of QCNN. To make the comparison as fair as possible, the number of free parameters is adjusted to be similar to that used in QCNN, which leads to starting with a small number of input nodes. While we used two CNN structures with 8 and 16 input nodes, the figure shows the CNN structure with 8 input nodes as an example

Variational quantum circuits with a hierarchical structure consisting of \(O(\log (n))\) layers do not exhibit the “barren plateau” problem (Pesah et al. 2021). In other words, the precision required in the measurement grows at most polynomially with the system size. This result guarantees the trainability of the fully parameterized QCNN models studied in this work when their parameters are initialized randomly. Furthermore, numerical calculations in Pesah et al. (2021) show that the cost function gradient vanishes at a slower rate (with n, the number of initial qubits) when all unitary operators in the same layer are identical, as in QCNN (Cong et al. 2019). The hierarchical structure inspired by tensor networks, without translational invariance, was first introduced in Grant et al. (2018). Such a hierarchical quantum circuit can also be combined with a classical neural network, as demonstrated in Huang et al. (2021).
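
To illustrate this structure, the following is a minimal PennyLane sketch of a hierarchical circuit with translational invariance: all two-qubit unitaries within the same layer share one set of parameters, so the number of trainable parameters grows only with the number of layers, i.e., \(O(\log (n))\). The two-qubit block, the 8-qubit width, and the angle encoding are illustrative assumptions, not the exact ansatz used in this work.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits = 8  # illustrative example size
dev = qml.device("default.qubit", wires=n_qubits)

def conv_block(theta, wires):
    # Illustrative two-qubit "convolution" unitary (not the paper's exact ansatz).
    qml.RY(theta[0], wires=wires[0])
    qml.RY(theta[1], wires=wires[1])
    qml.CNOT(wires=[wires[0], wires[1]])

@qml.qnode(dev)
def qcnn(params, features):
    # Qubit (angle) encoding of the pre-processed classical input vector.
    qml.AngleEmbedding(features, wires=range(n_qubits), rotation="Y")
    active = list(range(n_qubits))
    layer = 0
    while len(active) > 1:
        # Convolution: the SAME parameters act on every qubit pair in this layer.
        for i in range(0, len(active) - 1, 2):
            conv_block(params[layer], wires=[active[i], active[i + 1]])
        # Pooling: keep every other qubit and ignore the rest.
        active = active[::2]
        layer += 1
    # Outcome probabilities of the single remaining (readout) qubit.
    return qml.probs(wires=active[0])

# One parameter pair per layer: log2(8) = 3 layers, 6 parameters in total.
params = np.random.uniform(0, 2 * np.pi, size=(3, 2))
print(qcnn(params, np.random.uniform(0, np.pi, size=n_qubits)))
```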

We note in passing that several works have proposed quantum versions of the perceptron for binary classification (Tacchino et al. 2020; Mangini et al. 2020; Monteiro et al. 2021). While our QCNN model differs from them in that it implements the entire network as a parameterized quantum circuit, an interesting direction for future work is to investigate the alternative approach of constructing a complex network of quantum artificial neurons developed in these works.

Appendix B: Classical CNN

In order to compare the classification accuracy of CNN and QCNN under fair conditions, we fixed the hyperparameters used in the optimization step to be the same, including the number of iterations, batch size, optimizer type, and learning rate. In addition, we modified the CNN structure so that the number of parameters subject to optimization is as close as possible to that used in QCNN. For example, since QCNN attains the best results with about 40 to 50 free parameters, we adjusted the CNN structure accordingly. This led to two CNNs, one with the input shape of (8, 1, 1) and another with the input shape of (16, 1, 1). To reduce the MNIST and Fashion MNIST data to this small number of input nodes, PCA and autoencoding are used for data pre-processing, as done for QCNN. The CNNs go through convolutional and pooling stages twice, followed by a fully connected layer. The number of free parameters used in the CNN models is 26 or 44 for the case of 8 input nodes and 34 or 56 for the case of 16 input nodes.

The training also mimics that of QCNN. At every iteration step, 25 samples are randomly selected from the training dataset and trained with the Adam optimizer at a learning rate of 0.01. We also fixed the number of iterations to 200, as done for QCNN. The numbers of training (test) samples are 12665 (2115) and 12000 (2000) for the MNIST and Fashion MNIST datasets, respectively.
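
As an illustration of this setup, the following is a minimal PyTorch sketch of the small CNN baseline, assuming a 1D convolution over the 8 pre-processed features; it reproduces only the overall structure (two convolution and pooling stages followed by a fully connected layer) and the training hyperparameters (Adam optimizer, learning rate 0.01, batch size 25, 200 iterations), not the exact layer widths and kernel sizes of the CNN used in this work. The random tensors stand in for the PCA- or autoencoder-compressed data.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_inputs=8):
        super().__init__()
        # Two convolution + pooling stages over a 1D input of length n_inputs.
        self.features = nn.Sequential(
            nn.Conv1d(1, 2, kernel_size=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(2, 2, kernel_size=2), nn.ReLU(), nn.MaxPool1d(2),
        )
        # Flattened feature size depends on n_inputs; computed here for clarity.
        with torch.no_grad():
            flat = self.features(torch.zeros(1, 1, n_inputs)).numel()
        self.classifier = nn.Linear(flat, 2)   # two classes (binary classification)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = SmallCNN(n_inputs=8)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):                        # 200 iterations, as for the QCNN
    x = torch.randn(25, 1, 8)                  # placeholder batch of 25 compressed samples
    y = torch.randint(0, 2, (25,))             # placeholder binary labels
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```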

Appendix C: QCNN simulation results for MSE loss

In Section 4 of the main text, we presented the PennyLane simulation results of QCNN trained with the cross-entropy loss. When the mean squared error (MSE) is used as the cost function, similar results are obtained. We report the classification results for MNIST and Fashion MNIST data obtained from QCNN models trained with the MSE loss in Tables 6 and 7.
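
For concreteness, the following is a minimal sketch of how the MSE and cross-entropy costs can be defined for a binary classifier whose quantum circuit returns the two outcome probabilities of the readout qubit (e.g., the hierarchical circuit sketch in Appendix A). It illustrates the cost definitions only and is not the exact implementation used in this work.

```python
from pennylane import numpy as np

def mse_cost(circuit, params, features_batch, labels_batch):
    # `circuit(params, features)` is assumed to return [p(0), p(1)] of the readout qubit.
    loss = 0.0
    for features, label in zip(features_batch, labels_batch):
        probs = circuit(params, features)
        one_hot = np.array([1.0 - label, float(label)])    # one-hot encoded label
        loss = loss + np.sum((probs - one_hot) ** 2)       # squared error per sample
    return loss / len(features_batch)

def cross_entropy_cost(circuit, params, features_batch, labels_batch):
    loss = 0.0
    for features, label in zip(features_batch, labels_batch):
        probs = circuit(params, features)
        loss = loss - np.log(probs[int(label)] + 1e-12)    # small offset for numerical stability
    return loss / len(features_batch)
```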

Table 6 Mean accuracy and one standard deviation of the classification for 0 and 1 in the MNIST dataset when the QCNN model is trained with MSE
Table 7 Mean accuracy and one standard deviation of the classification for 0 (t-shirt/top) and 1 (trouser) in the Fashion MNIST dataset when the QCNN model is trained with MSE

Appendix D: Classification with the hierarchical quantum classifier

The hierarchical structure inspired by tensor networks, named the hierarchical quantum classifier (HQC), was first introduced in Grant et al. (2018). The HQC therein does not enforce translational invariance, and hence the number of free parameters subject to optimization grows as O(n) for a quantum circuit with n input qubits. Although the simulation presented in the main manuscript aims to benchmark the classification performance of the QML model in which the number of parameters grows as \(O(\log (n))\), we also report the simulation results of the HQC with the tree tensor network (TTN) structure (Grant et al. 2018) in this supplementary section for interested readers. The TTN classifier does not employ parameterized quantum gates for pooling. Thus, for certain ansätze, the number of parameters differs from that of the QCNN models. For example, although the convolutional circuit 2 in Fig. 2 has two free parameters, only one of them is effective, since one of the qubits is traced out as soon as the parameterized gate is applied. For brevity, here we only report the results obtained with the cross-entropy loss, but similar results can be obtained with MSE. As can be seen from Tables 8 and 9, the number of effective parameters (i.e., the second column) grows faster than that of the QCNN models. An interesting observation is that there is no clear trend as the number of parameters is increased beyond 42, which is close to the maximum number of parameters used in QCNN. In other words, there is no clear motivation to increase the number of free parameters beyond 42 or so when seeking to improve the classification performance. Studying overfitting as the number of parameters grows remains an interesting open problem.
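
For contrast with the translationally invariant circuit sketched in Appendix A, the following is a minimal PennyLane sketch of a TTN-style hierarchical classifier without parameter sharing: every two-qubit block in the binary tree carries its own parameters, so the parameter count grows as O(n) with the number of input qubits. The two-qubit block and the angle encoding are illustrative assumptions, not the exact ansatz of Grant et al. (2018).

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits = 8
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def ttn_classifier(params, features):
    qml.AngleEmbedding(features, wires=range(n_qubits), rotation="Y")
    active = list(range(n_qubits))
    block = 0
    while len(active) > 1:
        for i in range(0, len(active) - 1, 2):
            # Each pair of qubits gets its OWN parameters (no sharing within a layer).
            qml.RY(params[block, 0], wires=active[i])
            qml.RY(params[block, 1], wires=active[i + 1])
            qml.CNOT(wires=[active[i], active[i + 1]])
            block += 1
        active = active[::2]   # discard (ignore) one qubit of each pair
    return qml.probs(wires=active[0])

# A binary tree over 8 qubits contains 7 two-qubit blocks, i.e., O(n) of them.
params = np.random.uniform(0, 2 * np.pi, size=(n_qubits - 1, 2))
print(ttn_classifier(params, np.random.uniform(0, np.pi, size=n_qubits)))
```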

Table 8 Mean accuracy and one standard deviation of the classification for 0 and 1 in the MNIST dataset when the HQC model is trained with cross-entropy loss
Table 9 Mean accuracy and one standard deviation of the classification for 0 (t-shirt/top) and 1 (trouser) in the Fashion MNIST dataset when the HQC model is trained with cross-entropy loss


About this article


Cite this article

Hur, T., Kim, L. & Park, D.K. Quantum convolutional neural network for classical data classification. Quantum Mach. Intell. 4, 3 (2022). https://doi.org/10.1007/s42484-021-00061-x

