A continuous-variable quantum-inspired algorithm for classical image segmentation
Abstract
The probabilistic nature of quantum particles, state space, and the superposition principle are among the important concepts in quantum mechanics. The authors previously developed a framework that exploits these quantum aspects in the field of image processing. This was done by modeling each image pixel as a two-state quantum system, which allowed efficient single-object segmentation. However, extending the framework to multi-object segmentation would be highly complex and computationally expensive. In this paper, we propose a classical image segmentation algorithm inspired by continuous-variable quantum theory that overcomes the challenges in extending the framework to multi-object segmentation. By associating each pixel with a quantum harmonic oscillator, the space of coherent states becomes continuous. Thus, each pixel can evolve from an initial state to any of the continuous coherent states under the influence of an external resonant force. The Hamiltonian operator is designed to account for this force and is derived from the features extracted at the pixel. Therefore, the system evolves from an initial ground state to a final coherent state depending on the image features. Finally, the fidelity between the final state and a set of reference states representing the objects in the image is calculated, and the reference state with the highest fidelity is selected. The collective states of all pixels produce the final segmentation. The proposed method is tested on a database of synthetic and natural images, and compared with other methods. Average sensitivity and specificity of 97.86% and 99.61% were obtained, respectively, indicating the high segmentation accuracy of the algorithm.
Keywords
Quantum-inspired algorithms, Coherent states, Quantum harmonic oscillator, Signal processing
1 Introduction
Quantum image processing has been an active area of research recently. The usual tasks of image processing are performed utilizing the theory of quantum mechanics. This includes image representation (Yan et al. 2016), image matching (Jiang et al. 2016), similarity analysis (Zhou et al. 2018b), interpolation (Zhou et al. 2018a), denoising (Mastriani 2015a), coding (Chapeau-Blondeau and Belin 2016), watermarking (Li et al. 2016), and segmentation (Caraiman and Manta 2015). These attempts are based on representing the pixels of an image as qubits operated on via suitable quantum circuits. However, there are major challenges facing this approach that need further investigation as in Mastriani (2015b). On the other hand, another approach exists in which classical images are processed on classical computers using quantum-inspired models. This includes for example the classical image segmentation algorithm in Youssry et al. (2015), and its application in segmenting biomedical retinal images (Youssry et al. 2016), quantum K-means (Casper et al. 2012), quantum Gaussian mixture models (Tanaka and Tsuda 2008), and quantum pattern recognition (Sergioli et al. 2016). Some of these methods can be ported to work on a quantum computer such as Youssry et al. (2015, 2016) which opens the path for many future applications.
Conventionally, discrete degrees of freedom of particles (such as the spin of an electron or the polarization of a photon) are used for encoding information in the form of qubits. In general, this approach faces some technological difficulties when it comes to the implementation. For example, we might have to control the polarization of a single photon system or the spin of a single electron which is not easily realizable. As a result, many of the developed ideas and algorithms have not been fully realized yet. Continuous-variable quantum information processing is another approach that depends on using the continuous degrees of freedom of the particle (such as position and momentum) for manipulating the information. This approach provides easier technological implementations, but can be challenging in porting algorithms and protocols from the discrete domain to the continuous domain. Examples of this category of information processing include quantum computation (Adcock et al. 2016), machine learning (Lau et al. 2016), quantum key distribution (Borelli et al. 2016), and identity authentication (Ma et al. 2016).
Image segmentation is an area of image processing that has many applications. It deals with delineating the significant objects in the image and isolating them from the background. Many methods exist to segment images, including thresholding, edge detection, supervised and unsupervised machine learning, morphological methods, and deformable models. There also exist quantum-based methods for image segmentation. A short review of these techniques is given in Youssry et al. (2015). Some of the authors previously proposed a general framework that uses the theory of two-state quantum mechanical systems to process images (Youssry et al. 2015). Based on this framework, a general single-object image segmentation algorithm was developed and applied to generic images as well as to determining the vessel tree in retinal images. This algorithm showed high efficiency in segmenting images. Although the framework provides the theory for the extension to multi-object segmentation by utilizing discrete multi-state quantum systems, this extension is complex and computationally expensive. The continuous-variable quantum theory provides a solution to this challenge.
In this paper, we propose a new algorithm for image segmentation based on the continuous-variable coherent quantum states that occur in the theory of quantum harmonic oscillators. The work is built upon the framework presented in Youssry et al. (2015), but extends to the case of multi-object segmentation. The paper starts in Section 2 with a brief theoretical overview essential for introducing the new methodology. Next, the proposed algorithm is presented in Section 3. After that, the materials used for testing as well as the obtained results are shown in Sections 4 and 5. In Section 6, the analysis and the significance of the results are discussed. Finally, the conclusion and the future perspectives are given in Section 7. The Appendices A and B include some additional proofs given for the sake of completeness.
2 Background
This section starts with a brief overview on the theory of quantum harmonic oscillators, needed for developing the proposed methodology. The details can be found in any standard reference on quantum mechanics or quantum optics such as Griffiths (2005) or Gerry and Knight (2005). Afterwards, a short review on the quantum fidelity measure is given. Finally, the performance measures used for evaluating the segmentation algorithm are discussed.
2.1 Quantum harmonic oscillator
- They are not orthogonal:
$$ \langle\alpha|\beta\rangle=e^{\frac{1}{2}(\alpha^{*}\beta-\alpha\beta^{*})}e^{-\frac{1}{2}|\beta-\alpha|^{2}}. $$ (11)
In the limit of large magnitudes of β − α, the states tend to be orthogonal.
- They form an overcomplete basis:
$$ \int|\alpha\rangle\langle\alpha|\,d^{2}\alpha=\pi. $$ (12)
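The non-orthogonality of coherent states can be checked numerically. The following is a minimal sketch, not part of the original method: it compares the standard closed form ⟨α|β⟩ = exp(−|α|²/2 − |β|²/2 + α*β) with an inner product computed in a truncated number-state basis. The function names and the truncation size `dim` are assumptions made for illustration.

```python
import math
import numpy as np

def coherent_overlap(alpha, beta):
    """Closed-form inner product <alpha|beta> of two coherent states."""
    return np.exp(-0.5 * abs(alpha) ** 2 - 0.5 * abs(beta) ** 2
                  + np.conj(alpha) * beta)

def coherent_fock(alpha, dim=60):
    """Coefficients of |alpha> in a number-state basis truncated to `dim` levels.

    The truncation is an approximation that is only accurate when
    |alpha|^2 is much smaller than `dim`.
    """
    n = np.arange(dim)
    # sqrt(n!) computed through lgamma to avoid overflowing the factorial.
    sqrt_fact = np.exp(np.array([0.5 * math.lgamma(k + 1) for k in n]))
    return np.exp(-0.5 * abs(alpha) ** 2) * alpha ** n / sqrt_fact

a, b = 1.0 + 0.5j, -0.3 + 1.2j
numeric = np.vdot(coherent_fock(a), coherent_fock(b))  # vdot conjugates the first argument
closed = coherent_overlap(a, b)
```

For these values the two results agree to numerical precision, and the magnitude of the overlap equals exp(−|β − α|²/2), which is nonzero for any finite separation and decays quickly as the separation grows.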
2.2 Fidelity distance measure
2.3 Performance measures
The proposed image segmentation algorithm takes a classification approach, where each pixel is classified as belonging to either the foreground or the background of one of the objects in the image. Accordingly, the sensitivity and specificity measures are suitable for evaluating the performance of the algorithm quantitatively (Youssry et al. 2015). Sensitivity measures the percentage of pixels of the object’s foreground that are correctly classified by the algorithm as foreground. Specificity measures the percentage of pixels of the object’s background that are correctly classified as background. Ideally, both sensitivity and specificity would be 100%. In practice, however, this may not be possible. In the case of multi-object segmentation, the algorithm may succeed in segmenting some objects and fail to segment others. Thus, the sensitivity and specificity for each individual object in the image are calculated to evaluate the performance in all cases.
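The per-object measures described above can be sketched as follows. This is an illustrative implementation, assuming the segmentation and the ground truth are integer label maps of equal shape; the function name and the toy arrays are hypothetical.

```python
import numpy as np

def sensitivity_specificity(predicted, truth, label):
    """Sensitivity and specificity (in percent) for one object label.

    Pixels with `label` in the ground truth are that object's foreground;
    all other pixels are its background. Assumes both are non-empty.
    """
    predicted = np.asarray(predicted)
    truth = np.asarray(truth)
    fg_true = truth == label
    fg_pred = predicted == label
    tp = np.sum(fg_true & fg_pred)    # foreground classified as foreground
    fn = np.sum(fg_true & ~fg_pred)   # foreground missed
    tn = np.sum(~fg_true & ~fg_pred)  # background classified as background
    fp = np.sum(~fg_true & fg_pred)   # background marked as foreground
    return 100.0 * tp / (tp + fn), 100.0 * tn / (tn + fp)

# Toy 2 x 4 label maps: 0 = background, 1 and 2 = two objects.
truth = np.array([[0, 0, 1, 1],
                  [0, 2, 2, 1]])
predicted = np.array([[0, 0, 1, 0],
                      [0, 2, 1, 1]])
sens1, spec1 = sensitivity_specificity(predicted, truth, label=1)
sens2, spec2 = sensitivity_specificity(predicted, truth, label=2)
```

In this toy example, object 1 has a sensitivity of about 66.7% and a specificity of 80%, while object 2 has a sensitivity of 50% and a specificity of 100%, which shows how an algorithm can do well on one object and poorly on another within the same image.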
3 Methods
In this section, the proposed methodology is presented. First, an overview is given on the quantum-based framework upon which the proposed image segmentation algorithm is built. This framework has been proposed recently in Youssry et al. (2015). The challenges of extending this framework to the multi-object case are discussed, as well as how the novel algorithm overcomes these challenges. After that, the detailed steps of the developed algorithm for multi-object image segmentation are elaborated.
3.1 Overview of the framework
The proposed algorithm follows the quantum-based framework developed in Youssry et al. (2015). In this framework, an analogy is formed between the signal processing task to be performed and quantum mechanics. This allows transforming the signal processing problem into a problem that can be solved easily within the well-developed theory of quantum mechanics. Afterwards, the obtained quantum solution can be transformed back to the signal processing domain. This idea was used to develop an image segmentation algorithm suitable for segmenting single-object images. A classification approach is adopted, where each pixel in the image is classified into one of two possible classes: background or foreground of the object. In order to accomplish this task, each pixel in the image is associated with a two-level quantum system (qubit). The quantum system starts from an initial state and is evolved to a final state. By measuring the final state, the final outcome representing the class of the pixel is obtained. In order to reach a correct final state, the Hamiltonian operator is designed to be a function of the features extracted from the pixel. In other words, the feature vector guides the quantum system to reach its correct final state. This requires estimating some parameters so that the features can be combined together in the Hamiltonian. This is done using a supervised learning method. A small window in the image is selected together with its ground truth, and both are used to estimate the parameters by minimizing the error between the resulting segmentation of this window and its ground truth. After this learning phase, the obtained parameters are fixed for this image, and they can also be used to segment any other visually similar image.
A straightforward approach to extend this algorithm to the case of multi-object image segmentation is to use multi-level quantum systems (qudits). However, there are four problems with this approach. First, the complexity of computations will increase, as the state vector of an N-level quantum system is represented as an N × 1 vector, and the quantum operators will be represented by N × N matrices. Since the framework is mainly designed to work on a classical computer, this can form a bottleneck in the execution when the number of objects is large. Second, it may be difficult to derive a closed-form solution of Schrödinger’s equation in the general case (N-level system) as was done for the qubit case (2-level system). In this case, the solution must be obtained numerically. When the number of levels increases, this again increases the overall complexity. Third, there is an important issue concerning the controllability of quantum systems. Not every Hamiltonian allows an arbitrary transition between states. Therefore, this issue must be taken into consideration while choosing the Hamiltonian form. Besides increasing the difficulty of the design process, the result may be a Hamiltonian that does not correspond to an actual physical process. This may prevent the realization of the algorithm on a quantum computer. This is in contrast to the case of the 2-level system, where any Hamiltonian can be realized easily. Finally, the number of Hamiltonian parameters (degrees of freedom in the matrix representation) generally increases for larger systems, which adds more complexity.
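To make the cost of the qudit route concrete, the sketch below evolves an N-level state numerically under a constant Hamiltonian by eigendecomposition. It is only an illustration of the scaling argument, not the framework's actual Hamiltonian: the random Hermitian matrix, the function names, and the choice ħ = 1 are assumptions.

```python
import numpy as np

def evolve_qudit(hamiltonian, state, t):
    """Evolve an N-level state under a constant Hermitian Hamiltonian (hbar = 1)."""
    energies, vectors = np.linalg.eigh(hamiltonian)
    # U(t) = V exp(-i E t) V^dagger; the eigendecomposition costs O(N^3).
    phases = np.exp(-1j * energies * t)
    return vectors @ (phases * (vectors.conj().T @ state))

def random_hermitian(n, seed=0):
    """Hypothetical stand-in for a feature-dependent N x N Hamiltonian."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

n_levels = 5                      # one level per class in the qudit approach
h = random_hermitian(n_levels)    # N^2 real degrees of freedom in general
psi0 = np.zeros(n_levels, dtype=complex)
psi0[0] = 1.0
psi_t = evolve_qudit(h, psi0, t=1.0)

# Sanity check on the 2-level case: exp(-i * sigma_x * pi/2)|0> = -i|1>.
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])
qubit_out = evolve_qudit(sigma_x, np.array([1.0, 0.0], dtype=complex), np.pi / 2)
```

Every pixel would need its own N × N decomposition in such a scheme, and a general Hermitian matrix carries N² real parameters, which is the growth in storage, computation, and parameter count that the paragraph above describes.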
In principle, these challenges can be solved to obtain a generalized model for multi-object image segmentation. However, in this paper a novel model is proposed that does not face those challenges. Additionally, it can be generalized to any number of image objects without an increase in the overall complexity. The basic idea is to map each pixel in the image to a quantum harmonic oscillator system instead of a qudit system. The oscillator is initialized to the ground state. By applying an external resonant force, the oscillator evolves to a final state which will be a particular coherent state. By choosing the Hamiltonian parameters, the final state can be controlled. Therefore, image features are extracted at each pixel and combined together. Next, a training phase is performed to estimate the Hamiltonian parameters. A small window of the image and its ground truth are provided for this step. The training pixels should include representative pixels for all objects. The pixels of each object of the image in the ground truth are assigned to a particular coherent state referred to as the reference state in this paper. For instance, if the image contains N − 1 objects plus the background, then we need to define a set of N coherent states to be used as reference states. So, the background is treated as an object as well. Each pixel is associated to its corresponding reference state according to the ground truth segmentation. Consequently, the Hamiltonian will be trained such that it results in the evolution of all the pixels in the training set from the initial state (which is the ground state) to the final state (which should be the corresponding reference coherent state). Once the Hamiltonian is constructed, it is used afterwards without further change. It will be used to evolve the states of pixels in the testing set (the remaining image pixels that are outside the training set). 
In general, the final state of those test pixels may not coincide exactly with any of the reference states. So, in order to determine the class/state of the pixel, the final state is compared with the whole set of reference coherent states representing the objects. If the final state of the system is close to an object’s reference state, the pixel is classified as belonging to the foreground of this object. The quantum fidelity measure is used as a distance measure to quantify the closeness of the final state to any of the reference states.
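The classification step can be sketched as follows, assuming each pixel's final state and each reference state is a coherent state labeled by a complex number α. For two coherent states the fidelity |⟨α|β⟩|² reduces to exp(−|α − β|²). The reference values, the test-pixel values, and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def fidelity(alpha, beta):
    """Fidelity |<alpha|beta>|^2 between two coherent states."""
    return np.exp(-np.abs(alpha - beta) ** 2)

def classify(final_alphas, reference_alphas):
    """Assign each pixel to the reference state with the highest fidelity."""
    final_alphas = np.asarray(final_alphas, dtype=complex)
    reference_alphas = np.asarray(reference_alphas, dtype=complex)
    # Pairwise fidelities with shape (n_pixels, n_references).
    fid = fidelity(final_alphas[:, None], reference_alphas[None, :])
    return np.argmax(fid, axis=1)

# Hypothetical reference states: index 0 = background, 1 and 2 = two objects.
references = np.array([0.0 + 0.0j, 3.0 + 0.0j, 0.0 + 3.0j])
# Hypothetical final states of four test pixels after evolution.
finals = np.array([0.2 - 0.1j, 2.7 + 0.4j, -0.3 + 3.2j, 1.4 + 0.1j])
labels = classify(finals, references)
```

Because the fidelity decreases monotonically with |α − β|, picking the highest fidelity is equivalent to picking the nearest reference point in the complex plane, and adding more reference states only adds columns to the comparison.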
The quantum harmonic oscillator (QHO) is an infinite-dimensional quantum system. Working with the number states of the QHO results in infinite-dimensional matrices, which cannot be stored or processed on a classical computer. However, although the system is infinite-dimensional, a coherent state is completely defined by a single complex-valued parameter α. All operations can be done by manipulating this parameter, which can be stored and manipulated efficiently on a classical computer. Consequently, the representation of the quantum states as well as the required quantum operators will be of a fixed size independent of the number of classes (objects) in the image. This solves the complexity problem in the original framework. The choice of the Hamiltonian generating the coherent states of the QHO solves the second problem, as the solution exists in a closed form, as shown in Appendix A, independent of the number of objects in the image. Additionally, this form guarantees that starting from the ground state, any final coherent state can be reached. Thus, the third challenge related to controllability is resolved. Moreover, the chosen Hamiltonian can be realized easily when implemented on a quantum computer. Finally, as will be shown later, there are only three degrees of freedom in the Hamiltonian representation irrespective of the number of image objects.
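As a hedged illustration of why a single complex number suffices, consider a textbook interaction-picture Hamiltonian for a resonantly driven oscillator. The constant complex drive amplitude ε is an assumption introduced here and need not match the form derived in Appendix A. With the displacement operator $\hat{D}(\alpha)=e^{\alpha\hat{a}^{\dagger}-\alpha^{*}\hat{a}}$:

$$ \hat{H}_{I}=i\hbar\left(\varepsilon\hat{a}^{\dagger}-\varepsilon^{*}\hat{a}\right) $$

$$ \hat{U}(t)=e^{-i\hat{H}_{I}t/\hbar}=e^{t\left(\varepsilon\hat{a}^{\dagger}-\varepsilon^{*}\hat{a}\right)}=\hat{D}(\varepsilon t) $$

$$ |\psi(t)\rangle=\hat{D}(\varepsilon t)|0\rangle=|\alpha\rangle,\qquad \alpha=\varepsilon t $$

Under this assumed form, the state at any time is fully described by the complex number α = εt, so evolving a pixel amounts to one complex multiplication, whatever the number of objects.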
3.2 Proposed algorithm
3.2.1 Reference state preparation
3.2.2 Pixel state initialization
3.2.3 Feature extraction
3.2.4 State evolution
3.2.5 Measurement
4 Materials
Following the work in Youssry et al. (2015), the proposed algorithm is tested on two datasets. The first dataset consists of synthetic images of geometric shapes with different types and numbers of objects. The ground truth of these images is generated manually. Moreover, noise is applied to some of these images in order to test the performance of the algorithm in the presence of noise. The second dataset consists of natural images chosen from the publicly available image segmentation database of Alpert et al. (2007). This database provides images with single and double objects, as well as the human segmentation for all images, to test the accuracy of segmentation methods. There are 19 synthetic images, 11 of which are noisy. Five natural images are included, giving a total of 24 images.
5 Results
Proposed algorithm segmentation of single-object images from Youssry et al. (2015)
Proposed algorithm segmentation of multi-object images
Proposed algorithm segmentation of noisy images
Table 1 Comparison of the proposed algorithm with other algorithms on different data subsets, showing the number of images in each subset (Num), and the sensitivity (Sens) and specificity (Spec) as percentages
Dataset | Num | Proposed Sens | Proposed Spec | K-means Sens | K-means Spec | Multithreshold Sens | Multithreshold Spec | Graph cuts Sens | Graph cuts Spec | Random forests Sens | Random forests Spec |
---|---|---|---|---|---|---|---|---|---|---|---|
Paper (Youssry et al. 2015) | 9 | 98.23 | 99.53 | 92.75 | 99.76 | 93.16 | 99.85 | 95.53 | 96.88 | 98.55 | 98.75 |
Multi-objects | 12 | 97.52 | 99.61 | 91.12 | 99.64 | 89.41 | 96.57 | 88.31 | 92.36 | 92.78 | 98.99 |
Noisy images | 11 | 98.30 | 99.30 | 91.71 | 99.59 | 91.63 | 94.98 | 91.06 | 99.66 | 98.84 | 99.48 |
All | 24 | 97.86 | 99.56 | 91.17 | 99.69 | 90.07 | 97.57 | 90.03 | 93.92 | 94.58 | 99.00 |
The sensitivity and specificity measures are calculated for each class in each image. First, the algorithm is applied to the single-object images that were used to validate the original framework (Youssry et al. 2015). Since the proposed system is an enhancement of that framework, the purpose of this step is to verify that it produces comparable results in the single-object case before proceeding to multi-object images. Average sensitivity and specificity of 98.23% and 99.53% were obtained, respectively. This shows that the coherent state–based algorithm segments single-object images accurately. Samples of segmented objects are shown in Fig. 1. In addition, the results are very similar to those of the former framework (sensitivity = 98.5% and specificity = 99.7%), which was shown to outperform other existing segmentation methods such as active contours and graph cuts. In this paper, four methods are compared against the proposed algorithm: K-means clustering (Lloyd 1982), Otsu’s multithreshold (Otsu 1979), lazy snapping graph cuts (Li et al. 2004), and random forests (Sommer et al. 2011). The results of applying the five algorithms to the entire dataset are summarized in Table 1. Regarding the single-object images, the sensitivities of the quantum (98.23%) and random forests (98.55%) methods were close to each other and markedly higher than those of the other three methods. The quantum, Otsu, and K-means techniques produced similar specificities (around 99.72%), which were about 1–3 percentage points better than those of the other two algorithms.
Proposed algorithm segmentation of images with compression noise
The reported overall average performance measures indicate that the specificities of all methods, except graph cuts (93.92%), are close to each other (97.57 to 99.69%), with quantum and K-means at the top of the range. However, the superiority of the proposed method over all the other methods under comparison in capturing the target objects for different types of images is evident from the sensitivity results. The quantum technique’s average sensitivity is 97.86%, compared with the closest value of 94.58% from random forests and the lowest value of 90.03% from the graph cuts method.
6 Discussion
The theory of quantum harmonic oscillators and coherent states provides the basis for the proposed quantum-based image segmentation algorithm. This method relies on treating each pixel in a classical image as a QHO initially in the vacuum state. By allowing the system to evolve under the control of features extracted from the image’s pixels, the oscillator can reach any of the continuous eigenstates. In principle, this allows for the segmentation of an infinite number of objects. The Hamiltonian parameters are estimated by supervised learning from the image features to lead the evolution to the desired class. The results of applying the system to segment different images indicate that the algorithm can accurately segment multiple objects in many types of images, including noisy ones.
The presented method inherits the design flexibility of the original framework (Youssry et al. 2015). So, many design aspects, such as the form of the Hamiltonian, can be adjusted to suit different types of applications. In this work, the Hamiltonian was selected to lead to a closed-form solution. However, for other problems, a more complicated form might be needed, which may necessitate obtaining a numerical solution of Schrödinger’s equation. The construction of the Hamiltonian was performed using supervised learning, which could also be replaced by an unsupervised learning approach. Additionally, the fidelity metric was adopted in this work as it can be optically implemented, as will be discussed later in this section. However, other metrics can be used.
K-means, multithreshold, and graph cuts segmentation of a natural image of a bird
K-means, multithreshold, and graph cuts segmentation of a gradient image with Gaussian noise of variance 0.1
Multithreshold segmentation of a natural image of marbles
In Sergioli et al. (2016), a framework for pattern recognition has been proposed in which features are mapped to quantum density matrices on the Bloch sphere via a stereographic mapping. This is suitable for a 2D feature vector. For a larger number of features, the model can be generalized geometrically by using Bloch spheres of higher dimensions. In this case, higher-dimensional matrices are used. The classification is done by a nearest mean classifier rule based on trace distance. In a broader sense, the work in Sergioli et al. (2016) shares a quantum-based classification approach with the proposed method. Nevertheless, the two methods have many differences. First, the presented work is based on the theory of quantum harmonic oscillators, which uses continuous states of infinite dimension and is independent of the number of classes, as previously discussed. Second, the features are encoded through the Hamiltonian governing the evolution of states. Third, the identification of the final state is performed by first evolving the system and then using the fidelity as a distance metric. Fourth, the learning is done in a least-squares sense for evaluating the Hamiltonian parameters. Finally, despite the ability of the suggested algorithm to be used as a classifier, the focus is on developing a complete image segmentation technique.
Random forests segmentation of a textured image
7 Conclusions and future work
This paper proposes an algorithm for segmenting classical images that is formulated from the foundations of quantum mechanics. It can be considered an enhanced extension of the work done in Youssry et al. (2015). In addition to deploying beneficial aspects of quantum mechanics in image segmentation, as the original framework does, this algorithm has a major advantage in that it can handle images with multiple objects at no additional computational complexity. This is accomplished by utilizing the theory of quantum harmonic oscillators rather than two-state quantum systems. Since the coherent states form a continuum, they represent a continuous variable and can thus model any number of objects. The results demonstrate the high accuracy of the proposed method even when noise is present, while it is superior to the original work in Youssry et al. (2015) in terms of complexity. Despite being developed as a classical algorithm, we provided a suggestion for the quantum implementation of the system using the aforementioned optical hardware. The following points should be considered in the future to enhance the system. First, the interaction between neighboring pixels is considered only indirectly in the feature extraction. In order to provide a direct way to incorporate this information and replace the use of complicated image features, the mathematical model of coupled quantum harmonic oscillators could be exploited. Second, the algorithm was presented generally and tested on generic images to show its functionality. It remains to take advantage of its flexibility to apply it efficiently to a particular application. Finally, the system could be physically realized, as described previously, to validate its practicality.
Notes
Funding information
AY is supported by an Australian Government Research Training Program Scholarship. This work is supported by the National Natural Science Foundation of China under Grant No. 61463016, 61763014, National Key R&D Plan under Grant No. SQ2018YFC120002 and “Science and technology innovation action plan” of Shanghai in 2017 under Grant No. 17510740300.
References
- Adcock MR, Høyer P, Sanders BC (2016) Quantum computation with coherent spin states and the close Hadamard problem. Quantum Inf Process 15(4):1361–1386
- Alpert S, Galun M, Basri R, Brandt A (2007) Image segmentation by probabilistic bottom-up aggregation and cue integration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
- Borelli L, Aguiar L, Roversi J, Vidiella-Barranco A (2016) Quantum key distribution using continuous-variable non-Gaussian states. Quantum Inf Process 15(2):893–904
- Caraiman S, Manta VI (2015) Image segmentation on a quantum computer. Quantum Inf Process 14(5):1693–1715
- Casper E, Hung CC, Jung E, Yang M (2012) A quantum-modeled k-means clustering algorithm for multi-band image segmentation. In: Proceedings of the 2012 ACM research in applied computation symposium. ACM, pp 158–163
- Chan TF, Vese LA (2001) Active contours without edges. Trans Img Proc 10(2):266–277
- Chapeau-Blondeau F, Belin E (2016) Quantum image coding with a reference-frame-independent scheme. Quantum Inf Process: 1–16
- Ekert AK, Alves CM, Oi DK, Horodecki M, Horodecki P, Kwek LC (2002) Direct estimations of linear and nonlinear functionals of a quantum state. Phys Rev Lett 88(21):217901
- Gerry C, Knight P (2005) Introductory quantum optics. Cambridge University Press, Cambridge
- Griffiths DJ (2005) Introduction to quantum mechanics. Pearson Education India
- Jiang N, Dang Y, Wang J (2016) Quantum image matching. Quantum Inf Process: 1–30
- Lau HK, Pooser R, Siopsis G, Weedbrook C (2016) Quantum machine learning over infinite dimensions. arXiv:1603.06222
- Li Y, Sun J, Tang CK, Shum HY (2004) Lazy snapping. ACM Trans Graphics (ToG) 23(3):303–308
- Li P, Xiao H, Li B (2016) Quantum representation and watermark strategy for color images based on the controlled rotation of qubits. Quantum Inf Process: 1–26
- Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–137
- Ma H, Huang P, Bao W, Zeng G (2016) Continuous-variable quantum identity authentication based on quantum teleportation. Quantum Inf Process: 1–16
- Mastriani M (2015a) Quantum Boolean image denoising. Quantum Inf Process 14(5):1647–1673
- Mastriani M (2015b) Quantum image processing? arXiv:1512.02942
- Nielsen MA, Chuang IL (2010) Quantum computation and quantum information. Cambridge University Press, Cambridge
- Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Systems, Man, Cybern 9(1):62–66
- Sergioli G, Santucci E, Didaci L, Miszczak JA, Giuntini R (2016) Pattern recognition on the quantum Bloch sphere. arXiv:1603.00173
- Sommer C, Straehle C, Koethe U, Hamprecht FA (2011) Ilastik: interactive learning and segmentation toolkit. In: 2011 IEEE International symposium on biomedical imaging: from nano to macro. IEEE, pp 230–233
- Tanaka K, Tsuda K (2008) A quantum-statistical-mechanical extension of Gaussian mixture model. In: Journal of physics: conference series, vol 95. IOP Publishing, p 012023
- Yan F, Iliyasu AM, Venegas-Andraca SE (2016) A survey of quantum image representations. Quantum Inf Process 15(1):1–35
- Youssry A, El-Rafei A, Elramly S (2015) A quantum mechanics-based framework for image processing and its application to image segmentation. Quantum Inf Process 14(10):3613–3638
- Youssry A, El-Rafei A, Elramly S (2016) A quantum mechanics-based algorithm for vessel segmentation in retinal images. Quantum Inf Process 15(6):2303–2323. https://doi.org/10.1007/s11128-016-1292-1
- Zhou R, Hu W, Luo G, Liu X, Fan P (2018a) Quantum realization of the nearest neighbor value interpolation method for INEQR. Quantum Inf Process 17(7):166
- Zhou RG, Liu X, Zhu C, Wei L, Zhang X, Ian H (2018b) Similarity analysis between quantum images. Quantum Inf Process 17(6):121