Ghost polarimetry (GP) is one of the important branches of ghost imaging (GI). Ghost imaging, which has now formed a separate scientific field, lies at the intersection of statistical and quantum optics and intelligent imaging systems [1, 2]. As far as we know, the term “ghost polarimetry” was first introduced in [3]. GP methods, based on the idea of GI [4–6], make it possible to construct polarization ghost images (PGI) of an object. GP addresses the effect of radiation polarization on the generated PGI [7–10], as well as the possibility of extracting information about the polarization properties of objects by the GI method [11–15]. To date, most of the results have been obtained for bright (classical) light. However, it was shown in [16–18] that the ideas underlying GP in bright light can be combined with the principles of quantum optics. This allowed the authors to develop the theory of quantum GP with single photons, as well as to demonstrate experimentally [19] the effectiveness of this approach using objects with linear amplitude anisotropy as an example.

In our opinion, two key problems must be addressed first for the further development of GP. The first problem, which is basically technological, is that the GI process takes a relatively long time to obtain a high-quality PGI. This is because the mathematical algorithms currently in use are statistical and require a significant number of patterns for object illumination. This significantly increases the image restoration time and, as a result, limits the use of such algorithms in real-time practical applications. Recently, algorithms that use additional information about the object, such as compressed sensing [20] and the measurement reduction method [21], have been used to reduce the image reconstruction time in GI. Adapting such algorithms to GP could to some extent speed up the process of PGI formation. However, our experience of using such algorithms in ghost fiber optics [22] shows that, although such an approach is indeed capable of reducing the time needed to obtain a PGI, the problem of obtaining a PGI in on-line mode remains unsolved.

The second problem can be classified as fundamental. In the case of forming an amplitude GI, it is sufficient to calculate the correlation intensity function (CIF) \(G\left( {\mathbf{r}} \right)\), which turns out to be proportional to the amplitude image \(T\left( {\mathbf{r}} \right)\) of the object. However, in GP it is not possible to form an object image in such a relatively simple way. As was shown in [12], restoring the PGI, i.e., the spatial polarization profile of an object, requires measuring a complete set of CIF obtained at different orientations of polarizers and analyzers. This leads to a rather complicated inverse problem, namely, calculating the elements of the Jones matrix from a set of CIF. Recently, we managed to find a complete set of CIF for determining the polarization parameters of an object with linear dichroism [12]. However, for an object with arbitrary polarization properties, this problem remains unsolved and, moreover, it cannot be said in advance that it can be solved in principle.

In our opinion, using neural networks for the formation of PGI could significantly help in solving both of the problems mentioned above. As far as we know, the idea of using neural networks to form GI was proposed in 2017 in [23]. Recently, a number of reports have appeared on the successful application of neural networks in the construction of GI [23–27], but this approach has not yet been used in GP.

In this paper, we report, for the first time to our knowledge, on the use of neural networks in GP. We demonstrate this for objects with one of four types of anisotropy [28]: linear amplitude anisotropy, linear phase anisotropy, circular amplitude anisotropy, and circular phase anisotropy. To address this challenge, we developed a specialized Ghost Polarimetry Deep Neural Network (GPNN) that identifies the type of anisotropy.

As we have already formulated in [12, 16], the task of the GP is to restore the spatial distribution of the object’s polarization properties from the measured CIF.

For clarity, consider the GP scheme shown in Fig. 1. By analogy with computational GI [29], a source of linearly polarized light with pseudo-thermal statistics [30] is used. The source consists of a laser (\(532\) nm) and an amplitude spatial light modulator (SLM). An objective lens transfers the spatial intensity distribution from the SLM to the object. A polarization controller installed in front of the object makes it possible to set an arbitrary polarization state of light. Before interaction with the object, the polarization of light is uniform over the beam cross section. Behind the object is a polarizer that projects the polarization of the radiation onto the chosen direction. The transmitted radiation is recorded by a “bucket” detector. The object that we consider has one of four types of anisotropy, and its polarization properties are generally described by the Jones matrix:

$${\mathbf{\hat {M}}}\left( {\mathbf{r}} \right) = \left( {\begin{array}{*{20}{c}} {{{M}_{{11}}}\left( {\mathbf{r}} \right)}&{{{M}_{{12}}}\left( {\mathbf{r}} \right)} \\ {{{M}_{{21}}}\left( {\mathbf{r}} \right)}&{{{M}_{{22}}}\left( {\mathbf{r}} \right)} \end{array}} \right), \tag{1}$$

where \({\mathbf{r}}\) is the radius vector in the cross section of the object.

Fig. 1. (Color online) Scheme of the GP facility. Light from a source of linearly polarized radiation with pseudo-thermal statistics (Source) passes through an objective lens, a polarization controller, a polarization-sensitive object, and a polarizer. The light is then collected by a lens onto the photosensitive area of a bucket detector. The source is controlled and the detector signal is processed by a computer (PC).

Let \(I\left( {\mathbf{r}} \right)\) be the distribution of light intensity in the object plane, and let \({{W}_{0}}\) be the signal registered by the photodetector. The normalized CIF \(g\left( {\mathbf{r}} \right)\) is defined by the expression [11]

$$g\left( {\mathbf{r}} \right) = \frac{{\left\langle {I\left( {\mathbf{r}} \right){{W}_{0}}} \right\rangle - \left\langle {I\left( {\mathbf{r}} \right)} \right\rangle \left\langle {{{W}_{0}}} \right\rangle }}{{{{G}_{0}}\left( {\mathbf{r}} \right)}}, \tag{2}$$

where \({{G}_{0}}\left( {\mathbf{r}} \right)\) is the CIF measured in the absence of polarization elements and the object in the scheme; \(g\left( {\mathbf{r}} \right)\) is a function of the elements of the Jones matrix, the form of which depends on the position of the polarization controller and the analyzer.
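As a hedged illustration of Eq. (2), the following minimal sketch estimates the normalized CIF numerically for a toy one-dimensional object. It rests on several assumptions that are not part of the scheme described here: the illumination patterns are independent uniform random numbers per pixel, the object is represented only by a per-pixel intensity transmission, and \({{G}_{0}}\left( {\mathbf{r}} \right)\) is estimated from a reference bucket signal computed with the object removed. The function name `normalized_cif` and its parameters are hypothetical.

```python
import random


def normalized_cif(transmission, n_patterns=20000, seed=0):
    """Estimate g(r) of Eq. (2) for a toy object (assumed model, see lead-in).

    transmission: per-pixel intensity transmission of the toy object.
    """
    rng = random.Random(seed)
    n_pix = len(transmission)
    sum_i = [0.0] * n_pix    # accumulates I(r)
    sum_iw = [0.0] * n_pix   # accumulates I(r) * W0 (object in place)
    sum_ir = [0.0] * n_pix   # accumulates I(r) * W_ref (object removed)
    sum_w = 0.0
    sum_ref = 0.0
    for _ in range(n_patterns):
        pattern = [rng.random() for _ in range(n_pix)]
        # Bucket signals: total transmitted intensity with and without object.
        w0 = sum(t * i for t, i in zip(transmission, pattern))
        w_ref = sum(pattern)
        sum_w += w0
        sum_ref += w_ref
        for k, i in enumerate(pattern):
            sum_i[k] += i
            sum_iw[k] += i * w0
            sum_ir[k] += i * w_ref
    g = []
    for k in range(n_pix):
        mean_i = sum_i[k] / n_patterns
        # Numerator of Eq. (2): <I W0> - <I><W0>.
        cov_obj = sum_iw[k] / n_patterns - mean_i * sum_w / n_patterns
        # Denominator G0(r): the same correlation without the object.
        cov_ref = sum_ir[k] / n_patterns - mean_i * sum_ref / n_patterns
        g.append(cov_obj / cov_ref)
    return g


if __name__ == "__main__":
    print(normalized_cif([1.0, 0.5, 0.0, 0.25]))
```

Under these assumptions the estimate converges to the per-pixel transmission as the number of patterns grows, which illustrates why a statistical reconstruction needs many illumination patterns.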

In order to determine the explicit form of the function g(r), let us turn to the scheme in Fig. 1. In the linear polarization basis (x, y), the normalized Jones vector at the output of the polarization controller has the form \(e({\mathbf{r}}) = {{({{E}_{x}}({\mathbf{r}}),{{E}_{y}}({\mathbf{r}}))}^{T}}\), where \({{\left| {{{E}_{x}}({\mathbf{r}})} \right|}^{2}} + {{\left| {{{E}_{y}}({\mathbf{r}})} \right|}^{2}}\) = 1. The Jones matrix describing the linear polarizer used in the scheme is: \({{{\mathbf{\hat {M}}}}_{p}} = \left( {\begin{array}{*{20}{c}} {{{{\cos }}^{2}}\gamma }&{\sin \gamma \cos \gamma } \\ {\sin \gamma \cos \gamma }&{{{{\sin }}^{2}}\gamma } \end{array}} \right)\), where \(\gamma \) is the angle between the polarizer transmission axis and the \(x\) axis. In such a geometry, it is not difficult to show [11] that \(g\left( {\mathbf{r}} \right) = {{\left| {{{{{\mathbf{\hat {M}}}}}_{p}}{\mathbf{\hat {M}}}\left( {\mathbf{r}} \right)e\left( {\mathbf{r}} \right)} \right|}^{2}}\).
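The expression \(g = |{\mathbf{\hat {M}}}_{p}{\mathbf{\hat {M}}}e|^{2}\) at a single point of the object can be evaluated directly. The sketch below is only an illustration: the helper names `polarizer`, `apply`, and `cif` are hypothetical, and Jones matrices are represented as plain nested lists.

```python
import math


def polarizer(gamma):
    """Jones matrix of a linear polarizer at angle gamma to the x axis."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c * c, s * c], [s * c, s * s]]


def apply(m, v):
    """Multiply a 2x2 matrix by a 2-component Jones vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def cif(m_obj, e_in, gamma):
    """Normalized CIF g = |M_p M e|^2 at one point of the object."""
    out = apply(polarizer(gamma), apply(m_obj, e_in))
    return abs(out[0]) ** 2 + abs(out[1]) ** 2


if __name__ == "__main__":
    identity = [[1, 0], [0, 1]]
    # Light polarized at 45 degrees, analyzer along x, isotropic object.
    e45 = [1 / math.sqrt(2), 1 / math.sqrt(2)]
    print(cif(identity, e45, 0.0))
```

Each measurement configuration then corresponds to one choice of the input Jones vector and the analyzer angle \(\gamma \).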

As can be seen, the value of the function \(g\left( {\mathbf{r}} \right)\) is determined by the measurement configuration, i.e., by the Jones vector \(e\left( {\mathbf{r}} \right)\) and the angle \(\gamma \). By enumerating various measurement configurations, one can select a set of measurements sufficient to determine the type of anisotropy of the object. Five configurations were selected empirically (see Table 1). Although the calculations are presented here for the classical version of GP, a similar result can be obtained for the quantum implementation (see [16]).

Table 1. Set of CIF \({{g}_{k}}({\mathbf{r}})\)

As already mentioned, in this work we limit our consideration only to those objects that have one of the 4 types of anisotropy. The Jones matrices [28] for such objects are given below:

$$\begin{array}{c} {{\mathbf{\hat {M}}}_{\text{LA}}} = \left( {\begin{array}{cc} {\cos^{2}\theta + P\sin^{2}\theta }&{\left( {1 - P} \right)\cos \theta \sin \theta } \\ {\left( {1 - P} \right)\cos \theta \sin \theta }&{\sin^{2}\theta + P\cos^{2}\theta } \end{array}} \right), \\ {{\mathbf{\hat {M}}}_{\text{LP}}} = \left( {\begin{array}{cc} {\cos^{2}\alpha + {{e}^{ - i\Delta }}\sin^{2}\alpha }&{[1 - {{e}^{ - i\Delta }}]\sin \alpha \cos \alpha } \\ {[1 - {{e}^{ - i\Delta }}]\sin \alpha \cos \alpha }&{\sin^{2}\alpha + {{e}^{ - i\Delta }}\cos^{2}\alpha } \end{array}} \right), \\ {{\mathbf{\hat {M}}}_{\text{CP}}} = \left( {\begin{array}{cc} {\cos \phi }&{\sin \phi } \\ { - \sin \phi }&{\cos \phi } \end{array}} \right),\quad {{\mathbf{\hat {M}}}_{\text{CA}}} = \left( {\begin{array}{cc} 1&{ - iR} \\ {iR}&1 \end{array}} \right), \end{array} \tag{3}$$

where LA is the linear amplitude anisotropy, LP is the linear phase anisotropy, CP is the circular phase anisotropy, and CA is the circular amplitude anisotropy; \(P\) is the value of LA, i.e., the relative transmission of the field component perpendicular to the transmission axis with respect to the parallel component; \(\theta \) is the azimuth of LA, i.e., the angle between the x-axis and the transmission axis; \(\Delta \) is the value of LP, i.e., the relative phase delay of the field component perpendicular to the fast axis with respect to the parallel component; \(\alpha \) is the azimuth of LP, i.e., the angle between the fast axis and the x-axis; \(\phi \) is the azimuth of CP, i.e., the phase shift between two orthogonal circular field components; \(R\) is the value of CA, i.e., the relative absorption of two orthogonal circular field components. An example of an object with a random distribution of polarization properties in its cross section is shown in Fig. 2a. The parameters \(P\), \(\theta \), \(\alpha \), \(\Delta \), \(\phi \) and \(R\) take the following values: \(0 \leqslant P \leqslant 1\), \( - \frac{\pi }{2} \leqslant \theta \leqslant \frac{\pi }{2}\), \( - \frac{\pi }{2} \leqslant \alpha \leqslant \frac{\pi }{2}\), \(0 \leqslant \Delta \leqslant 2\pi \), \(0 \leqslant \phi \leqslant 2\pi \) and \( - 1 \leqslant R \leqslant 1\). Note that, for brevity, the argument \({\mathbf{r}}\) of all parameters is omitted.
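The four Jones matrices of Eq. (3) are straightforward to write down in code. The sketch below is a minimal illustration using only the Python standard library; the function names `m_la`, `m_lp`, `m_cp`, and `m_ca` are hypothetical, and matrices are represented as nested lists.

```python
import cmath
import math


def m_la(p, theta):
    """Linear amplitude anisotropy with value P and azimuth theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c + p * s * s, (1 - p) * c * s],
            [(1 - p) * c * s, s * s + p * c * c]]


def m_lp(delta, alpha):
    """Linear phase anisotropy with phase delay delta and azimuth alpha."""
    c, s = math.cos(alpha), math.sin(alpha)
    ph = cmath.exp(-1j * delta)
    return [[c * c + ph * s * s, (1 - ph) * s * c],
            [(1 - ph) * s * c, s * s + ph * c * c]]


def m_cp(phi):
    """Circular phase anisotropy with parameter phi."""
    return [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]


def m_ca(r):
    """Circular amplitude anisotropy with value R."""
    return [[1, -1j * r], [1j * r, 1]]


if __name__ == "__main__":
    # P = 0 and theta = 0 give an ideal polarizer along the x axis.
    print(m_la(0.0, 0.0))
    # Delta = pi and alpha = 0 give a half-wave plate with its fast axis along x.
    print(m_lp(math.pi, 0.0))
```

A quick sanity check is that each matrix reduces to the identity when the corresponding anisotropy vanishes (\(P = 1\), \(\Delta = 0\), \(\phi = 0\), \(R = 0\)).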

Fig. 2. (Color online) Distribution of the type of anisotropy in the cross section of the random object: (a) original distribution of the type of anisotropy; (b) distribution of the type of anisotropy obtained using the GPNN. LA is the linear amplitude anisotropy, LP is the linear phase anisotropy, CP is the circular phase anisotropy, CA is the circular amplitude anisotropy, and UP is an undefined type of anisotropy. Classifier errors are highlighted with circles.

As in the general case, for the objects under consideration the GP problem with spatial resolution reduces to an inverse problem, in which the distribution of the type of anisotropy in the cross section of the object must be restored from the measured CIF. To solve this problem, a specialized neural network, the GPNN, was developed, which determines the type of anisotropy point by point. Note that this formulation of the GP problem is conceptually similar to the problem of quantum tomography [31, 32].

To describe the functioning of the GPNN, we introduce a vector \({\text{OP}} = \left[ {{{n}_{1}},{{n}_{2}},{{n}_{3}},{{n}_{4}}} \right]\), where each \({{n}_{i}}\) takes the value 0 or 1, and \({{n}_{1}}\) corresponds to LA, \({{n}_{2}}\) to LP, \({{n}_{3}}\) to CA, and \({{n}_{4}}\) to CP. A given type of anisotropy is absent when \({{n}_{i}} = 0\) and present when \({{n}_{i}} = 1\). For example, if \({\text{OP}} = \left[ {0,1,0,0} \right],\) the anisotropy at this point of the object is of the LP type. The GPNN predicts the components of the vector OP and, consequently, the anisotropy type. The input of the GPNN is a measurement vector, i.e., the set of CIF (see Table 1) at a given point of the object. To train the GPNN, we calculate the normalized CIF with the aid of Eq. (1). The accuracy of determining each CIF is 99 percent. An example of the generated data used for training the neural network is available in [33]. We store the 4-dimensional data array in an “npy” format file. Each point of the object is described by the measurement vector (the set of five normalized CIF), the OP vector, and the vector storing the anisotropy parameters. For training and testing, we generate a numerical dataset consisting of 7000 points.
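The OP encoding introduced above can be illustrated with a short sketch. The helper names `encode_op` and `decode_op` are hypothetical. Mapping a vector with several active components to the undefined type UP follows the convention used in Fig. 2; mapping a vector with no active component to UP as well is an assumption made here for completeness.

```python
# Order of the components n1..n4 of the OP vector.
TYPES = ("LA", "LP", "CA", "CP")


def encode_op(label):
    """Return the OP vector for a single anisotropy type."""
    if label not in TYPES:
        raise ValueError("unknown anisotropy type: " + str(label))
    return [1 if t == label else 0 for t in TYPES]


def decode_op(op):
    """Return the anisotropy type for an OP vector, or 'UP' if undefined."""
    active = [t for t, n in zip(TYPES, op) if n == 1]
    # Exactly one active component is a valid answer; anything else is
    # treated as undefined (the zero-component case is an assumption).
    return active[0] if len(active) == 1 else "UP"


if __name__ == "__main__":
    print(encode_op("LP"))          # [0, 1, 0, 0]
    print(decode_op([0, 1, 0, 0]))  # LP
    print(decode_op([1, 1, 0, 0]))  # UP
```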

Now let us discuss the structure of the model (see Fig. 3).

Fig. 3. (Color online) Block diagram of the GPNN neural network.

The vector formed from the five normalized CIF is fed to the input of the “Embedding stack,” which consists of four Linear stack blocks. Each Linear stack block includes a fully connected layer, batch normalization, and a ReLU nonlinear activation function. The sizes of the fully connected layers are shown in Fig. 3. The Embedding stack forms a vector representation of the data by mapping it into a 700-dimensional space. The new vector is fed to the input of the Classifier stack. The Classifier stack consists of four independent “structures,” each of which is associated with one of the four anisotropy types. Each structure consists of one “Linear stack,” a one-dimensional fully connected layer, and a “Sigmoid” nonlinear activation function. Finally, at the output of the Classifier stack we get four numbers, each of which is interpreted as the probability that anisotropy of the corresponding type \(({{p}_{1}}|{\text{LA}},{{p}_{2}}|{\text{LP}},{{p}_{3}}|{\text{CA}},{{p}_{4}}|{\text{CP}})\) is present at the given point of the object. The numbers \({{p}_{i}}\) take the value 0 or 1. Consequently, the Classifier stack is a classifier that solves four binary classification tasks.
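The structure described above can be summarized compactly in formulas. The following is only a sketch: the symbols \({\mathbf{x}}\), \({\mathbf{h}}_{k}\), \({\mathbf{W}}_{k}\), \({\mathbf{b}}_{k}\), \({\mathbf{V}}_{i}\), \({\mathbf{c}}_{i}\), \({\mathbf{w}}_{i}\) and \(d_{i}\) are notation introduced here for illustration, BN denotes batch normalization, \(\sigma \) is the sigmoid function, and the sizes of the intermediate layers are those shown in Fig. 3. With \({\mathbf{x}}\) the vector of five normalized CIF, the Embedding stack computes

$${\mathbf{h}}_{0} = {\mathbf{x}} \in \mathbb{R}^{5},\quad {\mathbf{h}}_{k} = \operatorname{ReLU}\left( \operatorname{BN}\left( {\mathbf{W}}_{k}{\mathbf{h}}_{k - 1} + {\mathbf{b}}_{k} \right) \right),\quad k = 1, \ldots ,4,\quad {\mathbf{h}}_{4} \in \mathbb{R}^{700},$$

and each of the four structures of the Classifier stack computes

$$p_{i} = \sigma \left( {\mathbf{w}}_{i}^{T}\operatorname{ReLU}\left( \operatorname{BN}\left( {\mathbf{V}}_{i}{\mathbf{h}}_{4} + {\mathbf{c}}_{i} \right) \right) + d_{i} \right),\quad i = 1, \ldots ,4.$$

In this notation, the four structures share the embedding \({\mathbf{h}}_{4}\) but have separate weights, which is what makes the four binary classification tasks independent.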

For convenience, we normalize all the anisotropy parameters in the dataset to unity and then train our neural network. During training, we calculate the loss function according to the formula \(L = {{L}_{{{{p}_{1}}}}} + {{L}_{{{{p}_{2}}}}} + {{L}_{{{{p}_{3}}}}} + {{L}_{{{{p}_{4}}}}}\), where \({{L}_{{{{p}_{i}}}}}\) is the binary cross-entropy for the \(i\)th structure of the “Classifier stack.” The neural network architecture was determined by applying basic principles of neural network construction [34], with an empirical selection of the number and sizes of layers, as well as the activation functions. Although the model may not be optimal, the results obtained allow us to conclude that this approach has considerable potential for GP.
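The loss described above, a sum of four binary cross-entropies with one term per structure of the Classifier stack, can be sketched for a single point of the object as follows. The function names `bce` and `gpnn_loss` and the clipping constant `eps` are assumptions made for this illustration; averaging over a batch is omitted.

```python
import math


def bce(p, n, eps=1e-12):
    """Binary cross-entropy for predicted probability p and target n (0 or 1)."""
    # Clip p away from 0 and 1 so that the logarithms stay finite.
    p = min(max(p, eps), 1.0 - eps)
    return -(n * math.log(p) + (1 - n) * math.log(1.0 - p))


def gpnn_loss(probs, op):
    """Total loss: one binary cross-entropy term per anisotropy type."""
    return sum(bce(p, n) for p, n in zip(probs, op))


if __name__ == "__main__":
    target = [0, 1, 0, 0]  # OP vector of an LP point
    print(gpnn_loss([0.5, 0.5, 0.5, 0.5], target))      # uninformative output
    print(gpnn_loss([0.01, 0.99, 0.01, 0.01], target))  # confident, correct output
```

An uninformative output of 0.5 for every type gives a loss of \(4\ln 2\), and the loss approaches zero as the predicted probabilities approach the target OP vector.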

As a result, the loss function decreases significantly on both the training and the test data (see Fig. 4). We assess the quality of the classifier using the F1-score metric [35]. In Fig. 4, for each structure of the Classifier stack, we show the dependence of the F1-score on the epoch number. Figure 4 shows that, by the 19th epoch, the prediction accuracy for the presence of LA and CA exceeds 98% on average, while the prediction accuracy for LP and CP exceeds 95% at a similar epoch. We hypothesize that the lower accuracy in determining LP and CP stems from the extensive period [28] of the parameters \(\Delta \) and \(\phi \). The use of the GPNN to reconstruct the distribution of the type of anisotropy of a random object is illustrated in Fig. 2b, where UP denotes an erroneous determination of the type of anisotropy, including the identification of several types of anisotropy at one point. The reconstructed distribution agrees quite well with the original distribution.
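For reference, the F1-score of one binary task (the presence or absence of one anisotropy type) is the harmonic mean of precision and recall. The sketch below is a generic implementation of this standard definition, not the evaluation code used in this work; the function name `f1_score` and the convention of returning 0 when there are no true positives are assumptions.

```python
def f1_score(y_true, y_pred):
    """F1-score for binary labels (1 = anisotropy type present, 0 = absent)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # Precision or recall is zero or undefined; report 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Four points of the object: three truly have the type, one is missed.
    print(f1_score([1, 0, 1, 1], [1, 0, 0, 1]))
```

Computing this quantity separately for the four outputs of the Classifier stack gives one curve per anisotropy type, as plotted against the epoch number in Fig. 4.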

Fig. 4. (Color online) (a) Dependence of the loss function on the epoch number and (b) dependence of the F1-score on the epoch number. The curves labeled train dataset and test dataset show how the loss function changes during training. Curves 1, 2, 3, and 4 show how the F1-score changes during training for LA, LP, CA, and CP, respectively. Curves 5, 6, 7, and 8 show the F1-score obtained when verifying on the test data for LA, LP, CA, and CP, respectively.

In conclusion, we have shown for the first time that deep neural networks can be used to reconstruct the distribution of the type of anisotropy of random objects whose properties are determined by linear and circular amplitude and phase anisotropy.

The successful use of neural networks in traditional polarimetry [36] and polarization imaging [37, 38] indicates their potential for advancing GP as well. GP is useful for generating maps of polarization properties in scenarios where the use of detectors with spatial resolution is impractical. Given the substantial capabilities of neural networks, it is conceivable to develop a network capable of addressing the first problem outlined in the introduction, namely, increasing the speed of PGI reconstruction to achieve real-time operation. Such advances may pave the way for practical applications of GP across various scientific and technological domains, particularly in diagnostic medicine. In addition, the use of deep learning will make it possible to solve the inverse problem of ghost polarimetry, i.e., to determine the values of the anisotropy parameters (\(P\), \(\theta \), \(\alpha \), \(\Delta \), \(\phi \) and \(R\)).