Blur Invariants for Image Recognition

Blur is an image degradation that makes object recognition challenging. Restoration approaches solve this problem via image deblurring, while deep learning methods rely on augmentation of the training set. Invariants with respect to blur offer an alternative way of describing and recognising blurred images without any deblurring or data augmentation. In this paper, we present an original theory of blur invariants. Unlike all previous attempts, the new theory requires no prior knowledge of the blur type. The invariants are constructed in the Fourier domain by means of orthogonal projection operators, and moment expansion is used for their efficient and stable computation. Applying a general substitution rule, combined invariants to blur and spatial transformations are easy to construct and use. Experimental comparison to Convolutional Neural Networks shows the advantages of the proposed theory.


I. INTRODUCTION
In image processing and analysis, we often have to deal with images that are degraded versions of the original scene. One of the most common degradations is blur, which usually appears as smoothing or suppression of high-frequency details of the image. Capturing an ideal scene f by an imaging device with the point-spread function (PSF) h, the observed image g can be modeled as a convolution of both,

g(x) = (f * h)(x). (1)

This linear image formation model, even if it is very simple, is a reasonably accurate approximation of many imaging devices and acquisition scenarios. The blur may come from various physical sources. Based on our prior knowledge about the PSF, we distinguish a blind case when no information about the PSF is available, a semi-blind case when some (incomplete) information about the PSF is available (for instance, its parametric form), and a non-blind case when the PSF is known completely.
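The convolution model (1) is easy to reproduce numerically. The following is a minimal 1D sketch (the signal and the kernel below are illustrative values, not from the paper); note that a brightness-preserving PSF, i.e., one summing to 1, keeps the total intensity of the image unchanged.

```python
import numpy as np

# A minimal 1D sketch of the image formation model g = f * h.
# The signal f and the PSF h are made-up illustrative values.
f = np.array([0.0, 1.0, 3.0, 2.0, 0.5, 0.0])   # "ideal scene"
h = np.array([0.25, 0.5, 0.25])                 # normalized blur kernel (PSF)

g = np.convolve(f, h)                           # observed blurred signal

# Since h.sum() == 1, the total intensity is preserved by the blur.
print(g.sum(), f.sum())
```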
In classical image processing monographs [1], [2], the first methods of solving Eq. (1) for f were proposed for the non-blind case. The semi-blind and blind cases are much more difficult. Despite their extensive study (see [3]-[5] for a survey), they have not been fully resolved yet. Although some of the current image deconvolution methods yield good results, they rely on prior knowledge incorporated into regularization terms or other constraints. If such prior knowledge is not available, the methods may converge to solutions that are far from the ground truth. If noise is present, the inverse problem becomes even more ill-posed and its solution numerically less stable. In the 1990s, some researchers not only realized all the above-mentioned difficulties connected with solving Eq. (1) but also found out that in many applications a complete restoration of f is not necessary and can be avoided, provided that an appropriate image representation is used. A typical example is the recognition of objects and patterns in blurred images, where a blur-robust object description forms a sufficient input for the classifier (see Fig. 1 for an illustration of the difference between the recognition and restoration approaches). This led to the introduction of the idea of blur invariants, which are powerful in many semi-blind cases. Roughly speaking, a blur invariant I is a functional fulfilling the constraint I(f) = I(f * h) for any h from a certain set S of admissible PSFs. Many systems of blur invariants have been proposed so far (see [6], Chapter 6 and further references thereof). They differ from one another by the assumptions on the PSF, by the mathematical tools used for invariant construction, by the domain in which the invariants are defined, and by the application area for which the invariants were designed.
The main drawback of all current blur invariants is that they lack a unified mathematical framework. For each class of PSFs, the invariants had to be derived "from scratch", which means that one had to prove the invariance property for each PSF type separately. Although one can re-use similar calculation techniques for various families of PSFs, both the explicit derivation and the formal proof of invariance always had to be customized for any particular family of PSFs.
In this paper, we present for the first time a unified theoretical background of blur invariants. We show that all previously published blur invariants are particular cases of a general theory, which provides this topic with a unifying framework. Two key theorems, referred to here as Theorem 6 and Theorem 7, are formulated and proved regardless of the particular PSF type. This is a significant theoretical contribution of this paper, which has an immediate practical consequence. If we want to derive blur invariants w.r.t. a new class of PSFs, Theorems 6 and 7 offer the solution directly, provided that the PSF in question complies with the assumptions of the Theorems. Verifying that is, however, much easier than constructing the invariants from the beginning.

A. State of the art of blur invariants
Unlike geometric invariants, which can be traced over two centuries back to Hilbert [7], blur invariants are a relatively new topic. The problem formulation and the basic idea appeared originally in the '90s in the series of papers by Flusser et al. [8]-[10]. The invariants presented in these pioneering papers were found heuristically, without any theoretical background. The authors observed that certain moments of a symmetric PSF vanish. They derived the relation between the moments of the blurred image and the original, and thanks to the vanishing moments of the PSF they eliminated the non-zero PSF moments by recursive subtraction and multiplication. They did it for axially symmetric [8], [9] and centrosymmetric [10] PSFs. These invariants, despite their heuristic derivation and the restriction to centrosymmetric PSFs, have been adopted by many researchers in further theoretical studies [11]-[29] and in many application-oriented papers [30]-[39].
Significant progress in the theory of blur invariants was made by Flusser et al. in [54], where invariants to arbitrary N-fold rotation symmetric blur were proposed. In that paper, a derivation based on a mathematical theory rather than on heuristics was presented for the first time. The invariants were constructed by means of a projection of the blurred image onto the subspace of the PSFs. A similar result achieved by another technique was later published by Pedone et al. [55].
The main limitation of all the above-mentioned methods is their restriction to a single given class of blurs. In other words, the authors first defined the blur type they were considering and then derived the invariants based on the specific properties of that blur. In this paper, we approach the problem the other way round. Regardless of the particular blur, we find a general formula for blur invariants. Then, for any type of admissible PSF, this general formula immediately provides specific invariants. This is the major original contribution of this paper that differentiates the proposed theory from the previous ones.

II. MATHEMATICAL PRELIMINARIES

Definition 1. By an image function (or image) we understand any real function f(x) of d variables, f ∈ L1(R^d) ∩ L2(R^d), with a compact support.¹ The set of all image functions is denoted as I. For convenience, we assume the Dirac δ-function to be an element of I.²

Definition 2. Let B = {π_p(x)} be a set of d-variable polynomials. Then the integral

M_p^(f) = ∫ π_p(x) f(x) dx

is called the moment of function f with respect to the set B. The non-negative integer |p|, where p is a d-dimensional multi-index, is the order of the moment.
Moments are widely used descriptors of compactly-supported functions. Depending on B, we recognize various types of moments. If π_k(x) = x^k, we speak about geometric moments. If d = 2 and π_pq(x, y) = (x + iy)^p (x − iy)^q, we obtain complex moments. If the polynomials π_k(x) are orthogonal (or orthogonal with a weight), we get orthogonal (OG) moments; Legendre, Zernike, Chebyshev, and Fourier-Mellin moments are the most common examples. For the theory of moments and their applications in image analysis we refer to [6].

Definition 3. A linear operator P : I → I is called a projection operator (or projector for short) if it is idempotent, i.e., P² = P.
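The geometric and complex moments of Definition 2 can be sketched numerically as follows (a minimal illustration, not from the paper; the test image and the centered grid are made up). Note that c_00 coincides with m_00 and c_11 = m_20 + m_02, since (x + iy)(x − iy) = x² + y².

```python
import numpy as np

# Geometric and complex moments of a small 2D image (Definition 2 with
# B the power basis). Coordinates and the test image are illustrative.
def geometric_moment(img, p, q, x, y):
    """m_pq = sum over the support of x^p y^q f(x, y)."""
    X, Y = np.meshgrid(x, y, indexing="ij")
    return float(np.sum(X**p * Y**q * img))

def complex_moment(img, p, q, x, y):
    """c_pq = sum over the support of (x + iy)^p (x - iy)^q f(x, y)."""
    X, Y = np.meshgrid(x, y, indexing="ij")
    z = X + 1j * Y
    return np.sum(z**p * np.conj(z)**q * img)

rng = np.random.default_rng(0)
img = rng.random((5, 5))
x = np.arange(-2, 3, dtype=float)   # centered coordinates
y = np.arange(-2, 3, dtype=float)

# c_00 coincides with m_00, the total "mass" of the image
print(complex_moment(img, 0, 0, x, y), geometric_moment(img, 0, 0, x, y))
```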
The image space I and any projector P satisfy the following lemma.

Lemma 4. The following statements hold:
2) For any f, g ∈ I, their convolution exists and f * g ∈ I. This convolution closure property follows from Young's inequality.
3) For any f ∈ I, its Fourier transform F(f) exists.
4) For any f ∈ I, its moments w.r.t. arbitrary B exist and are finite.
5) Let S = P(I). The set S is also a vector space, and I can be expressed as a direct sum I = S ⊕ A, where A is called the complement of S and is also a vector space. Any f ∈ I can be unambiguously written as a sum f = Pf + f_A, where Pf is the projection of f onto S and f_A ∈ A is simply defined as f_A = f − Pf.
6) For any f ∈ I, Pf_A = 0 and, consequently, S ∩ A = {0}. If f ∈ S then f = Pf, and vice versa.

This lemma shows, among other things, that the image space is sufficiently "large".

¹ Symbol L_p(R^d) denotes the space of all functions of d real variables such that ∫ |f|^p < ∞.
² From the mathematical point of view, this is formally incorrect, since δ ∉ L1 ∩ L2. We could correctly include δ by means of the theory of distributions, but this would be superfluous for the purpose of this paper.
Definition 5. Projector P is called orthogonal (OG), if the respective subspaces S and A are orthogonal.

III. BLUR INVARIANTS
In this Section, we show how blur invariants can be constructed by means of suitable projectors. Let S be, from now on, the set of blurring functions (PSFs) with respect to which we want to design the invariants. Any meaningful S must contain at least one non-zero function and must be closed under convolution; in other words, for any h1, h2 ∈ S we must have h1 * h2 ∈ S. This is the basic assumption without which the question of invariance does not make sense. If S were not closed under convolution, then any potential invariant would in fact be invariant w.r.t. convolution with functions from the "convolution closure" of S, which is the smallest superset of S closed under convolution.
Under the closure assumption, (S, *) forms a commutative semi-group (it is not a group because the existence of inverse elements is not guaranteed). Hence, the convolution may be understood as a semi-group action of S on I. The convolution defines the following equivalence relation on I: f ∼ g if and only if there exist h1, h2 ∈ S such that h1 * f = h2 * g. Thanks to the closure property of S and to the commutativity of convolution, this relation is transitive, while symmetry and reflexivity are obvious. This relation factorizes I into classes of blur-equivalent images. In particular, all elements of S are blur-equivalent. The image space partitioning and the action of the projector are visualized in Fig. 2. Now we are ready to formulate the following General Theorem of Blur Invariants (GTBI), which constitutes the main contribution of the paper and a significant departure from all previous work in this field.

Fig. 2: Partitioning of the image space. Projection operator P decomposes image f into its projection Pf onto S and its complement f_A, which is a projection onto A. The ellipsoids depict blur-equivalent classes. Blur-invariant information is contained in the primordial images f_r and g_r (see the text).

Theorem 6 (GTBI). Let S be a linear subspace of I, which is closed under convolution and correlation. Let P be an orthogonal projector of I onto S. Then

I(f)(u) = F(f)(u) / F(Pf)(u)

is an invariant w.r.t. convolution with arbitrary h ∈ S at all frequencies u where I(f) is well defined.
Proof. Let us assume that P is "distributive" over convolution with functions from S, which means P(f * h) = Pf * h for arbitrary f ∈ I and any h ∈ S. Then the proof is trivial; we just employ the basic properties of the Fourier transform:

I(f * h) = F(f * h) / F(P(f * h)) = (F(f) F(h)) / (F(Pf) F(h)) = I(f).

The "distributive" property of P is equivalent to the constraint that the complement A is closed w.r.t. convolution with functions from S. This follows from

P(f * h) = P((Pf + f_A) * h) = P(Pf * h) + P(f_A * h) = Pf * h + P(f_A * h),

where we used the closure of S, which yields Pf * h ∈ S. Hence, P(f * h) = Pf * h holds if and only if P(f_A * h) = 0, i.e., if f_A * h ∈ A. Now let us show that this constraint is implied by the orthogonality of P, regardless of its particular form.
Since S ⊥ A and the Fourier transform on L1 ∩ L2 preserves the scalar product (this property is known as the Plancherel Theorem), then F(S) ⊥ F(A). Let us consider arbitrary functions a ∈ A and h1, h2 ∈ S. Using the Plancherel Theorem, the convolution theorem, and the correlation theorem (the correlation of two functions is just a convolution with a flipped function), we have

⟨a * h1, h2⟩ = ⟨F(a) F(h1), F(h2)⟩ = ⟨F(a), F(h1 ⋆ h2)⟩ = ⟨a, h1 ⋆ h2⟩ = 0,

where ⋆ denotes correlation. The last equality follows from the closure of S w.r.t. correlation, since h1 ⋆ h2 ∈ S and a ⊥ S. Hence, A has been proven to be closed w.r.t. convolution with functions from S, which completes the entire proof.
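Theorem 6 can be checked numerically. The sketch below (not part of the paper; the signals are made up) takes the 1D case where S is the space of even functions on a discrete circle, with the even-part projector Pf(x) = (f(x) + f(−x))/2. There, F(Pf) = Re F(f) for real f, and the ratio F(f)/F(Pf) is unchanged by circular convolution with any even PSF:

```python
import numpy as np

# Numerical check of Theorem 6 for S = even functions on a discrete circle.
rng = np.random.default_rng(1)
N = 64
f = rng.random(N) + 1.0                         # test "image"

k = np.arange(N)
h = np.exp(-0.5 * np.minimum(k, N - k)**2)      # even PSF: h[k] == h[-k mod N]

def P(f):
    """Orthogonal projector onto even functions: Pf(x) = (f(x) + f(-x)) / 2."""
    return 0.5 * (f + np.roll(f[::-1], 1))      # f[::-1] rolled by 1 is f[-k mod N]

g = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(h)))   # circular blur g = f * h

def I(f):
    return np.fft.fft(f) / np.fft.fft(P(f))

# Compare the invariant of the sharp and the blurred image at frequencies
# where the denominator is safely away from zero.
mask = np.abs(np.fft.fft(P(f))) > 0.5
print(np.max(np.abs(I(f)[mask] - I(g)[mask])))  # ~ 0 up to rounding
```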
The invariant I(f) is not defined if Pf = 0, which means this Theorem cannot be applied if f ∈ A. In all other cases, I(f) is well defined almost everywhere. Since S contains compactly-supported functions only, F(Pf)(u) cannot vanish on any open set, and therefore the set of frequencies where I(f) is not defined has a zero measure.³ In addition to the blur invariance, I is also invariant w.r.t. correlation with functions from S. This is a "side-product" of the assumptions imposed on S. The proof of that is the same as before, with only the operations of convolution and correlation swapped.
Under the assumptions of the GTBI, S itself is always an equivalence class and δ ∈ S (to see this, note that for arbitrary h ∈ S we have h = δ * h = P (δ * h) = P δ * h which leads to P δ = δ).
The GTBI is a very strong theorem because it constructs the blur invariants in a unified form, regardless of the particular class of blurring PSFs and regardless of the image dimension d. The only thing we have to do in a particular situation is to find, for a given subspace S of the admissible PSFs, an orthogonal projector P. This is usually a much easier job than constructing blur invariants "from scratch" for each S. This is the most important distinction from our previous paper [54], where the invariants were constructed specifically for N-fold symmetric blur, without a possibility of generalization.
Before we proceed further, let us show that the assumptions laid on S and P cannot be skipped.
As a counterexample, let us consider a 1D case where S is the set of even functions. Let P be defined as Pf(x) = f(|x|). So, P is a kind of "mirroring" of f and it actually is a linear (but not orthogonal) projector onto S. In this case, A is the set of functions that vanish for x ≥ 0. Clearly, A is not closed under convolution with even functions, and the functional I defined in the GTBI is not an invariant. Let us consider another example, again in 1D. Let S be the set of functions that vanish for any x < 0. S is a linear subspace closed under convolution, but it is not closed under correlation. Let us define the operator P as

Pf(x) = f(x) for x ≥ 0 and Pf(x) = 0 otherwise.

Obviously, P is a linear orthogonal projector onto S. However, A is again not closed under convolution with functions from S, and the GTBI does not hold. These two simple examples show that the assumptions of convolution and correlation closure of S and of orthogonality of P cannot be generally relaxed (although the GTBI may stay valid in some cases even if these assumptions are violated, see Section V).
The property of blur invariance does not say anything about the ability of the invariant to distinguish two different images. In an ideal case, the invariant should be able to distinguish any two images belonging to distinct blur-equivalence classes (images sharing the same equivalence class of course cannot be distinguished, due to the invariance). Such invariants are called complete. The following completeness theorem shows that I is a complete invariant within its definition area.
Theorem 7 (Completeness theorem).Let I be the invariant defined by GTBI and let f, g ∈ I \ A. Then I(f ) = I(g) almost everywhere if and only if f ∼ g.
Proof. The backward implication follows immediately from the blur invariance of I. To prove the forward implication, we set h1 = Pg and h2 = Pf. From I(f) = I(g) we get F(f)F(Pg) = F(g)F(Pf), i.e., f * h1 = g * h2, which means f ∼ g by the definition of the equivalence relation.
To summarize, I cannot distinguish functions belonging to the same equivalence class, due to the invariance, nor functions from A, since they do not lie in its definition area. All other functions are fully distinguishable. Note that the completeness may be violated on other image spaces, for instance, on a space of functions with unlimited support, where we could find such f and g that I(f) = I(g) at all frequencies where both I(f) and I(g) are well defined but f and g belong to different equivalence classes.
Understanding which properties of f are reflected by I(f) is important both for theoretical considerations and for practical applications of the invariant. I(f) is a ratio of two Fourier transforms. As such, it may be interpreted as a deconvolution of f with the kernel Pf. This "deconvolution" eliminates the part of f belonging to S (more precisely, it transfers Pf to the δ-function) and effectively acts on f_A only. I(f) can be viewed as the Fourier transform of the so-called primordial image f_r. Even if the primordial image itself may not exist (the existence of F^(-1)(I(f)) is not guaranteed in I), it is a useful concept that helps to understand how the blur invariants work. The primordial image is unique for each equivalence class; it is the "most deconvolved" representative of the class. Two images f and g share the same equivalence class if and only if f_r = g_r. For instance, the primordial image of all elements of S is the δ-function.
Any element of the equivalence class can be reached from the primordial image through a convolution. Any features that describe the primordial image are unique blur-invariant descriptors of the entire equivalence class. At the same time, the primordial image can also be viewed as a kind of normalization. It plays the role of a canonical form of f, obtained as the result of the "maximally possible" deconvolution of f (see Fig. 3 for a schematic illustration). As the last topic in this section, we briefly analyze the robustness of I(f) to noise. Let us assume an additive zero-mean white noise n, so we have g = f * h + n and, consequently, Pg = h * Pf + Pn. As we will see in Section V, all meaningful projection operators contain summation/integration over a certain (often large) set of pixels, which makes Pn converge to the mean value of n, which is zero. So, we have

I(g) ≈ (F(f)F(h) + F(n)) / (F(Pf)F(h)) = I(f) + F(n) / (F(Pf) · F(h)).

Considering the magnitude of the second term, note that |F(n)(u)| ≈ σ because the noise is white.
Hence, at least at low frequencies where F(Pf) · F(h) dominates, this term is close to zero and I exhibits a robust behavior, as I(g) ≈ I(f). However, this may be violated at high frequencies, where F(Pf) · F(h) is often low.
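The noise analysis above can be checked numerically. The sketch below (not from the paper; the signal, the PSF width, and the noise level are made-up illustrative choices) uses the 1D even-blur case with the even-part projector and verifies that, at low frequencies where F(Pf)·F(h) dominates, I(g) stays close to I(f):

```python
import numpy as np

# Rough numerical check of the noise robustness of I at low frequencies.
rng = np.random.default_rng(5)
N = 256
k = np.arange(N)
t = 2 * np.pi * k / N
# A smooth signal with a strong even part at low frequencies (illustrative):
f = 2.0 + np.cos(t) + 0.7 * np.sin(t) + 0.5 * np.cos(3 * t) + 0.4 * np.sin(3 * t)

h = np.exp(-0.5 * np.minimum(k, N - k)**2)     # even PSF
h /= h.sum()                                   # brightness-preserving

def P(f):
    """Even-part projector Pf(x) = (f(x) + f(-x)) / 2 on the discrete circle."""
    return 0.5 * (f + np.roll(f[::-1], 1))

def I(f):
    return np.fft.fft(f) / np.fft.fft(P(f))

n = 0.01 * rng.standard_normal(N)              # mild zero-mean white noise
g = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(h))) + n

low = np.array([0, 1, 3, N - 3, N - 1])        # frequencies where F(Pf) is large
err = np.max(np.abs(I(g)[low] - I(f)[low]))
print(err)                                     # small compared to |I(f)| ~ 1
```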

IV. INVARIANTS AND MOMENTS
The blur invariants defined in the frequency domain by the GTBI may suffer from several drawbacks when we use them in practical object recognition tasks. Since I(f) is a ratio, we possibly divide by very small numbers, which requires careful numerical treatment. Moreover, if the input image is noisy, the high-frequency components of I(f) may be significantly corrupted. This can be overcome by suppressing them with a low-pass filter, but this procedure introduces a user-defined parameter (the cut-off frequency), which should be set up with respect to the particular noise level. That is why we prefer to work directly in the image domain. Some heuristically discovered image-domain blur invariants were already published in the early papers [8]-[10]. Here we present a general theory, which originates from the GTBI.
A straightforward solution might be to calculate an inverse Fourier transform of I(f), which leads to obtaining the primordial image f_r, and to characterize f_r by some popular descriptors such as moments. This would, however, be time-consuming and also problematic from the numerical point of view. We would not only have to calculate the projection Pf and two forward and one inverse Fourier transforms, but, even worse, the result may not lie in I. In this Section, we show how to substantially shorten and simplify this process. We show that the moments of the primordial image can be calculated directly from the input blurred image, without an explicit construction of Pf and I(f). Since f_r is a blur invariant, each of its moments must be a blur invariant, too. This direct construction of blur invariants in the image domain, again without specifying a particular S and P, is a major theoretical result of the paper and provides a very useful tool for practical image recognition.
Image moments can be defined w.r.t. an arbitrary polynomial basis (see Definition 2). In the image analysis literature, various bases have been employed to construct moment invariants [56]. There is no significant difference among them, since between any two polynomial bases there exists a transition matrix. In other words, from the theoretical point of view, all polynomial bases and all respective moments carry the same information, provide the same recognition power, and generate equivalent invariants. However, working with some bases might be easier in a particular situation than with others, and the numerical properties and stability of the moments may also differ.
Here we choose to work with a basis that separates the moments of P f and f A , although equivalent invariants could be derived in any basis at the expense of the complexity of respective formulas.
Let B = {π_p(x)} be a polynomial basis. When considering the polynomials on a bounded support, then B ⊂ I and all moments M_p^(f) exist and are finite. Let S and P fulfill the assumptions of the GTBI. Considering the decomposition f = Pf + f_A, we have for the moments

M_p^(f) = M_p^(Pf) + M_p^(f_A).

We say that B separates the moments if there exists a non-empty set of multi-indices D such that, for any f ∈ I,

M_p^(Pf) = M_p^(f) if p ∈ D

and

M_p^(Pf) = 0 if p ∉ D.

In other words, this condition says that the moments are either preserved or vanish under the action of P. If fulfilled, the condition also says that M_p^(f_A) = 0 for p ∈ D and M_p^(f_A) = M_p^(f) for p ∉ D. A sufficient condition for B to separate the moments is that π_p ∈ S if p ∈ D and π_p ∈ A otherwise. Since S and A are assumed to be mutually orthogonal, the separability of such B is obvious. This has nothing to do with the (non-)orthogonality of B itself, as we show in the following simple 1D example. Let S be the set of even functions and A the set of odd functions. Let π_p(x) = x^p. If we take D = {p = 2k | k ≥ 0}, we obtain moment-separating polynomials.
For given S and projector P, the existence of a basis that separates the moments is not guaranteed, although in most cases of practical interest we can find one. If it does not exist, the moment blur invariants still can be derived. It is sufficient if the moments M_p^(Pf) can be expressed in terms of M_p^(f) for p ∈ D and some functions of M_p^(f) equal zero for p ∉ D. This makes the derivation more laborious and the formulas more complicated but does not make a principal difference. Anyway, to keep things simple, we try for any particular S to find such B that provides the moment separability.
To get the link between I(f) and the moments M_p^(f), we recall the Taylor expansion of the Fourier transform,

F(f)(u) = Σ_p ((−2πi)^|p| / p!) m_p u^p, (11)

where m_p is a geometric moment. In the sequel, we assume that the power basis π_p(x) = x^p separates the moments. If it were not the case, one would substitute into (11) any separating basis through the polynomial transition relation.
The GTBI can be rewritten as

F(f)(u) = F(Pf)(u) · I(f)(u).

All three Fourier transforms can be expanded, similarly to (11), into absolutely convergent Taylor series. Thanks to the moment separability, we can for any p ∈ D simply write m_p for the moments of Pf, and we denote by C_p the Taylor coefficients of I(f), which can be understood as the moments of the primordial image f_r. Comparing the coefficients of the same powers of u, we obtain, for any p,

m_p = Σ_{k ∈ D, k ≤ p} (p choose k) m_k C_{p−k}.

The summation goes over those k ∈ D for which 0 ≤ k_i ≤ p_i, i = 1, ..., d. Note that always 0 ∈ D. (To see that, it is sufficient to find an image whose zero-order moment is preserved under the projection. Such an example is the δ-function, because P(δ) = δ, as we already showed.) After isolating C_p on the left-hand side, we obtain the final recurrence

C_p = (1/m_0) (m_p − Σ_{k ∈ D, 0 < |k|, k ≤ p} (p choose k) m_k C_{p−k}). (16)

This recurrence formula is a general definition of blur invariants in the image domain (provided that m_0 ≠ 0).⁴ Since I(f) has been proven to be invariant to blur belonging to S, all coefficients C_p must also be blur invariants. The beauty of Eq. (16) lies in the fact that we can calculate the invariants from the moments of f, without constructing the primordial image explicitly either in the frequency or in the spatial domain.
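The recurrence (16) is easy to evaluate. The sketch below (an illustration under assumed conditions, not the paper's implementation) takes the 1D centrosymmetric case, where D is the set of even indices, computes C_p from geometric moments, and checks the invariance against a copy blurred with an even PSF; the signals are made up.

```python
import numpy as np
from math import comb

# Moment blur invariants via the recurrence (16), 1D centrosymmetric case:
# D = even indices, so the sum runs over even k only.
def moments(vals, pos, max_order):
    """Geometric moments m_p = sum pos^p * vals up to max_order."""
    return [float(np.sum(pos**p * vals)) for p in range(max_order + 1)]

def blur_invariants(m):
    """C_p = (m_p - sum_{k even, 0<k<=p} C(p,k) m_k C_{p-k}) / m_0."""
    C = [1.0]                                   # C_0 = 1 always
    for p in range(1, len(m)):
        s = sum(comb(p, k) * m[k] * C[p - k] for k in range(2, p + 1, 2))
        C.append((m[p] - s) / m[0])
    return C

f = np.array([1.0, 4.0, 2.0, 5.0, 3.0, 1.0])    # test signal, support 0..5
xf = np.arange(len(f), dtype=float)

h = np.array([1.0, 2.0, 1.0])                   # even PSF centered at 0
g = np.convolve(f, h)                           # blurred signal
xg = np.arange(-1, len(f) + 1, dtype=float)     # support shifts by the PSF radius

Cf = blur_invariants(moments(f, xf, 7))
Cg = blur_invariants(moments(g, xg, 7))
print(Cf[3], Cg[3])   # odd-order invariants agree; even orders vanish
```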
Some of the invariants C_p are trivial for any f and useless for recognition. We always have C_0 = 1, and some other invariants may be constrained depending on the index set D. If for arbitrary p, k ∈ D also (p − k) ∈ D, then C_p = 0 for any non-zero p ∈ D, as can be deduced from Eq. (16) by induction. This commonly happens in many particular cases of practical interest, and then only the invariants with p ∉ D should be used. In addition to that, some invariants may vanish depending on f. In particular, if f ∈ S, then C_p = 0 for any p ≠ 0.
The numerical behavior of one particular moment invariant of the type (16), of order 7, can be seen in Fig. 4, where the mean relative error (MRE) between the invariant of the blurred and noisy image and that of the original is depicted as a function of the blur size and SNR. Note that the MRE almost does not depend on the blur size (since the blur was synthetic, we eliminated the boundary effect), is below 0.2% if the noise is mild, and even for heavy noise of SNR = 10 the MRE is still below 1%, which shows an excellent robustness. The behavior of other invariants is similar. However, when increasing the order of the moments used, the MRE slightly increases as well. Summarizing, the robustness to noise is determined by the robustness of the moments, which has been thoroughly studied in many papers (see [6] and further references thereof) and is known to be quite good.

V. BLUR EXAMPLES
In this Section, we show the blur invariants provided by the GTBI for several concrete choices of S and P, with a particular focus on those of practical importance in image recognition. Some of them are equivalent to invariants already published in earlier papers; in such cases, we show the link between them. Some other invariants are published here for the first time.

A. Trivial cases
The formally simplest case ever is S = I and Pf = f. Although this choice fulfills the assumptions of the GTBI, it is not of practical importance because the entire image space forms a single equivalence class and any two images are blur equivalent. Actually, the GTBI yields I(f) = 1 for any f. An opposite extreme is to choose S = {aδ | a ∈ R}. This "blur" is in fact only a contrast stretching. If we set Pf = (∫f) · δ, then P is not orthogonal but still P(f * h) = Pf * h, and the GTBI can be applied provided that ∫f ≠ 0. We obtain I(f) = F(f)/∫f, which leads to a contrast-normalized primordial image f_r = f/∫f. Another rather trivial case is S = {h | ∫h = 1}, the set of all brightness-preserving blurs without any additional constraints. We may construct Pf = f/∫f, which actually is a projector; however, it is neither linear nor orthogonal. Since P(f * h) = Pf * h, we can still apply the GTBI, which yields a single-valued blur invariant I(f) = ∫f that corresponds to the primordial image f_r = (∫f) · δ.

B. Symmetric blur in 1D
In 1D, the only blur space S which can be defined generically and is of practical interest is the space of all even functions. 1D symmetric blur invariants were first described in [57] and later adapted to the wavelet domain by Makaremi [26]. Kautsky [18] rigorously investigated these invariants and showed how to construct them in terms of arbitrary moments. Galigekere [28] studied the blur invariants of 2D images in the Radon domain, which inherently led to 1D blur invariants.
If we consider the projector

Pf(x) = (f(x) + f(−x)) / 2,

then A is the space of odd functions, P is orthogonal, and the GTBI can be applied directly. As for the moment expansion, the simplest solution is to use the standard monomials π_p(x) = x^p, which separate the geometric moments for D being the set of even non-negative indices.

C. Centrosymmetric blur in 2D
Invariants w.r.t. centrosymmetric blur in 2D have attracted the attention of the majority of authors who have been involved in studying blur invariants. The number of papers on this kind of blur significantly exceeds the number of all other papers in this field. This is basically for two reasons: such blur appears often in practice, and the invariants are easy to find heuristically, without knowledge of the state-of-the-art theory of projection operators.
A natural way of defining P is

Pf(x, y) = (f(x, y) + f(−x, −y)) / 2.

Then the standard geometric moments are separated at D = {(p, q) | (p + q) even}, and Eq. (16) leads to the moment expansion that appeared in earlier papers such as [10] and others cited in Section I-A. This approach can be extended into 3D, where the definition of centrosymmetry is analogous. Existing 3D blur invariants [21], [22] are just special cases of Eq. (16).
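The centrosymmetric projector and the moment separation at D = {(p, q) | p + q even} can be verified numerically as follows (a minimal sketch on a centered grid; the test image is made up):

```python
import numpy as np

# Centrosymmetric projector Pf(x,y) = (f(x,y) + f(-x,-y)) / 2 on a centered
# grid, and the separation of geometric moments at D = {(p,q) | p+q even}.
rng = np.random.default_rng(2)
f = rng.random((7, 7))
x = np.arange(-3, 4, dtype=float)
X, Y = np.meshgrid(x, x, indexing="ij")

Pf = 0.5 * (f + f[::-1, ::-1])                  # point reflection about the center

def m(img, p, q):
    return float(np.sum(X**p * Y**q * img))

for p in range(4):
    for q in range(4):
        if (p + q) % 2 == 0:
            assert np.isclose(m(Pf, p, q), m(f, p, q))   # preserved for p+q even
        else:
            assert abs(m(Pf, p, q)) < 1e-9               # vanish for p+q odd
print("geometric moments separated at D = {(p,q): p+q even}")
```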

D. Radially symmetric blur
Radially (circularly) symmetric PSFs satisfying h(r, φ) = h(r) appear in imaging namely as an out-of-focus blur with a circular aperture (see Fig. 5 (a) for an example). The projector P∞ is defined as the angular average

P∞f(r, φ) = (1/2π) ∫₀^{2π} f(r, θ) dθ.

The standard power basis does not separate the moments. This is why various radial moments have been used to ensure the separation. Basis B consists of circular harmonic-like functions of the form π(r, φ) = R_pq(r) e^{iχ(p,q)φ}, where R_pq(r) is a radial polynomial and χ(p, q) is a simple function of the indices. There are several choices of B which separate the respective moments and yield blur invariants (the index set D depends on the particular B). Some of them were introduced even without the use of projection operators. They mostly employed Zernike moments [15], [41]-[43], Fourier-Mellin moments [44], and complex moments [40].

E. N -fold symmetric blur
N-fold rotationally symmetric blur is one of the most interesting cases, both from the theoretical and practical points of view. This kind of blur appears as an out-of-focus blur with a polygonal aperture. Most cameras have an aperture the size of which is controlled by physical diaphragm blades, which leads to polygonal or close-to-polygonal aperture shapes if the diaphragm is not fully open (see Fig. 5 (b) and (c)). The blur space is defined as

S_N = {h ∈ I | h(r, φ) = h(r, φ + 2π/N)}.

S_N is a vector space closed under convolution and correlation. We can construct the projector P_N as

P_N f(r, φ) = (1/N) Σ_{j=0}^{N−1} f(r, φ + α_j),

where α_j = 2πj/N. Since P_N is an orthogonal projector, the GTBI can be immediately applied. Complex moments are separated with

D = {(p, q) | (p − q)/N is an integer},

which allows us to get particular blur invariants from Eq. (16). Invariants to N-fold symmetric blur were originally studied in [54], where the idea of projection operators appeared for the first time. Their application to registration of blurred images was reported in [55].
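For N = 4, the projector P_N can be sketched with exact 90-degree array rotations, which avoids interpolation (an illustration, not the paper's implementation; image and grid are made up). The complex moments of the projected image vanish unless (p − q) is a multiple of N, in line with the separation stated above:

```python
import numpy as np

# Projector P_N for N = 4 using exact 90-degree rotations (np.rot90), and the
# separation of complex moments: c_pq survives iff (p - q) is a multiple of 4.
rng = np.random.default_rng(3)
f = rng.random((9, 9))
x = np.arange(-4, 5, dtype=float)
X, Y = np.meshgrid(x, x, indexing="ij")
Z = X + 1j * Y

P4 = lambda f: 0.25 * (f + np.rot90(f, 1) + np.rot90(f, 2) + np.rot90(f, 3))

def c(img, p, q):
    return np.sum(Z**p * np.conj(Z)**q * img)

g = P4(f)
assert np.allclose(P4(g), g)                    # P_4 is idempotent

# Moments with (p - q) % 4 != 0 vanish on the projected image,
# while, e.g., c_51 (p - q = 4) is preserved.
print(abs(c(g, 1, 0)), abs(c(g, 2, 1)), abs(c(g, 5, 1) - c(f, 5, 1)))
```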

F. Dihedral blur
The N-fold symmetry, discussed in the previous subsection, may be coupled with the axial symmetry. In such a case, the number of the axes equals N and we speak about the N-fold dihedral symmetry. Many out-of-focus blur PSFs are actually dihedral, particularly if the diaphragm blades are straight (see Fig. 5 (c)).
The blur space D_N is a subset of S_N given as

D_N = {h ∈ S_N | h_α = h},

where α ∈ [0, π/2) is the angle between the symmetry axis a and the x-axis, and h_α(x, y) denotes the function h(x, y) flipped over a. However, the set D_N is not closed under convolution if we allow various axis directions. Only if we fix the symmetry axis orientation to a constant angle α do we get the closure property. Then we can define the corresponding projection operator analogously, and the GTBI can be applied. Dihedral blur invariants were first studied in [58]. Their major limitation comes from the fact that the orientation of the symmetry axis must be known a priori (and must be the same for all images entering the classifier). This is far from realistic, and the only possibility is to estimate α from the blurred image itself [59].

G. Directional blur
Directional blur (sometimes called linear motion blur) is a 2D blur of a 1D nature that acts in a constant direction only. Directional blur may be caused by camera shake, scene vibrations, and camera or scene motion. The velocity of the motion may vary during the acquisition, but this model assumes motion along a line. We do not consider a general motion blur along an arbitrary curve in this paper. The respective PSF has the form (for the sake of simplicity, we start with the horizontal direction) h(x, y) = h_1(x)δ(y) (25), where h_1(x) is an arbitrary 1D image function. The space S is defined as the set of all functions of the form (25). When considering a constant direction only, S is closed under 2D convolution and correlation. The projection operator P is defined in (26) by means of a line integral in the vertical direction. P is not orthogonal, but geometric moments are separated with D = {(p, q) | q = 0} and Eq. (16) yields the directional blur invariants in terms of geometric moments.
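An elementary consequence of this projector is easy to check numerically: the marginal of the image taken perpendicular to the motion direction is unchanged by any normalized horizontal blur. The sketch below uses circular boundary handling to keep the identity exact; it illustrates the underlying idea rather than the invariants (16) themselves.

```python
import numpy as np
from scipy.ndimage import convolve1d

rng = np.random.default_rng(2)
f = rng.random((32, 48))

# normalized 1D motion-blur kernel acting along the x-axis (axis=1)
h1 = np.array([1.0, 4.0, 6.0, 4.0, 1.0])
h1 /= h1.sum()
g = convolve1d(f, h1, axis=1, mode='wrap')  # horizontal directional blur

# the marginal perpendicular to the motion direction is blur invariant
assert np.allclose(f.sum(axis=1), g.sum(axis=1))
```

Each row sum is preserved because convolving a row with a kernel of unit sum redistributes, but does not change, its total mass.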
If the blur direction, given by a constant angle β, is known, the projector P_β f is defined analogously to (26) by means of a line integral along a line perpendicular to the blur direction (see Fig. 5 (d) for an example of a real directional PSF).
The idea of invariants to linear motion blur appeared for the first time in [45] and in a similar form in [46], without any connection to the projection operator. Zhong used the motion blur invariants for the recognition of reflections on a wavy water surface [49]. Peng et al. used them for weed recognition from a camera moving quickly above the field [47] and for the classification of wood slices on a moving conveyor belt [48] (these applications were later enhanced by Flusser et al. [60], [61]). Other applications can be found in [50], [51]. The necessity of knowing the blur direction beforehand is, however, an obstacle to the wider usage of these invariants.

H. Gaussian blur
Gaussian blur appears whenever the image has been acquired through a turbulent medium. It is also introduced into images as sensor blur due to the finite size of the sampling pulse, and it may sometimes be applied intentionally as a part of denoising.
Since the Gaussian function has unlimited support, we have to extend our current definition of I by including functions of exponential decay. We define the set S as the set of Gaussians G_Σ, where Σ is the covariance matrix that controls the shape of the Gaussian G_Σ. S is closed under convolution, but it is not a vector space. We define P f to be the element of S that has the same integral and covariance matrix as the image f itself. Clearly, P² = P, but P is neither linear nor orthogonal. Although the assumptions of GTBI are violated, the Theorem still holds thanks to P(f * h) = P f * h. The moment expansion analogous to Eq. (16) can be obtained by employing the parametric shape of the blurring function. Thanks to this, we express all moments of order higher than two as functions of the low-order ones, which substantially increases the number of non-trivial invariants.
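The identity P(f * h) = P f * h holds here because moments up to order two (more precisely, cumulants) add under convolution: the Gaussian matched to f * h equals the Gaussian matched to f convolved with h. A numerical sanity check of this additivity, using a hypothetical discretized Gaussian PSF:

```python
import numpy as np
from scipy.signal import fftconvolve

def centroid_and_cov(f):
    """Mean vector and covariance matrix of an image treated as a
    normalized 2D density, i.e. its moments up to order two."""
    f = f / f.sum()
    y, x = np.mgrid[:f.shape[0], :f.shape[1]]
    mu = np.array([(x * f).sum(), (y * f).sum()])
    cxx = (x**2 * f).sum() - mu[0]**2
    cyy = (y**2 * f).sum() - mu[1]**2
    cxy = (x * y * f).sum() - mu[0] * mu[1]
    return mu, np.array([[cxx, cxy], [cxy, cyy]])

rng = np.random.default_rng(3)
f = rng.random((40, 40))

# a hypothetical discretized Gaussian PSF (covariance diag(4, 9))
yy, xx = np.mgrid[-10:11, -10:11]
h = np.exp(-(xx**2 / 8 + yy**2 / 18))
g = fftconvolve(f, h, mode='full')  # blurred image

_, cf = centroid_and_cov(f)
_, ch = centroid_and_cov(h)
_, cg = centroid_and_cov(g)

# second-order cumulants add under convolution, hence P(f*h) = Pf * h
assert np.allclose(cg, cf + ch, atol=1e-6)
```

The additivity holds for any PSF; the Gaussian is special in that the matched element P f stays inside S after convolution.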

VI. EXPERIMENTAL EVALUATION
In this section, we show the performance of the proposed invariants in the recognition of blurred facial photographs, in template matching within a blurred scene, and in two common image processing problems, multichannel deconvolution and multifocus fusion, where we use the proposed invariants for the registration of blurred frames. The first experiment was performed on simulated data, which makes it possible to evaluate the results quantitatively, while the other three experiments show the performance on real images and blurs.

A. Face recognition
The use of various CNNs for the recognition of blurred images has been tested recently in several papers that studied the impact of blur on the network recognition performance [66]-[69]. They all reported that introducing even a small or moderate blur decreases the performance of networks trained on clear images only. Some of the above papers recommended eliminating this drawback by network fine-tuning or by augmenting the training set with many blurred versions of the training images, however at the expense of a massive increase of the training time.
We used 38 facial images of distinct persons from the YaleB dataset [70] (frontal views only). Each class was represented by a single image resized to 256×256 pixels and normalized in brightness. As the test images, we used synthetically blurred and noisy instances of the database images, ranging from mild (5 × 5 blur, SNR = 50 dB) to heavy (125 × 125 blur, SNR = 5 dB) distortions. We used four types of centrosymmetric blur (circular, random, linear motion, Gaussian) and Gaussian white noise in these simulations (see Fig. 6 for some examples). In each setting, we generated 10 instances of each database image.
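The degradation pipeline for generating such test images can be sketched as follows. The square (uniform) blur kernel and the SNR convention (signal variance over noise variance, in dB) are our assumptions for the illustration; the paper does not fix them here.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def degrade(img, blur_size, snr_db, rng):
    """Blur with a normalized square kernel, then add white Gaussian
    noise scaled to a target SNR (signal variance / noise variance, dB)."""
    g = uniform_filter(img.astype(float), size=blur_size, mode='nearest')
    noise_var = g.var() / 10 ** (snr_db / 10)
    return g + rng.normal(0.0, np.sqrt(noise_var), g.shape)

rng = np.random.default_rng(4)
img = rng.random((64, 64))
g = degrade(img, blur_size=5, snr_db=50, rng=rng)
```

Varying `blur_size` and `snr_db` reproduces the mild-to-heavy settings described above.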
The faces were classified by four different methods: blur invariants, a CNN trained on clear images only, a CNN trained on images augmented with blur, and Gopalan's distance [71]. As blur invariants, we used the particular version of I(f) from Theorem 6 with the operator P defined in Section V-C. As the CNN, we used a ResNet18 [72] pre-trained on the ImageNet dataset [73]. Data augmentation was done by adding 100 differently blurred and noisy instances to the training set such that the blur was of the same size as that of the test images. Gopalan's distance belongs to the "handcrafted" features and measures the "distance" between two images in a way that should be insensitive to blur. Unlike the proposed invariants, Gopalan's method requires knowledge of the blur support size, which is not a problem in simulated experiments. The recognition results are summarized in Table I. The performance of the proposed invariants is excellent except for the last two settings, where the blur caused extreme smoothing and a significant boundary effect (but even there the performance over 90% is very good). Fig. 7 shows examples of a very heavy blur that was handled correctly by the proposed invariants. The CNN trained on clear images only fails for mid-size and large blurs, which corresponds to the results of the earlier studies. However, if we augment the training data extensively with blurred images, the performance is close to 100%, but the training took about four hours compared to the few seconds required by the invariants. In this scenario, introducing new images/persons to the database requires an additional lengthy training of the CNN. The performance of Gopalan's method decreases as the blur increases because this method is blur-invariant only approximately. Its computing complexity is less than that of the augmented CNN but much higher than that of the proposed invariants and the CNN without augmentation.

B. Template matching
Localization of sharp templates in a blurred scene is a common task in many application areas, such as landmark-based image registration and stereo matching. In this experiment, we show how the blur invariants can be used for this purpose.
We took two pictures of the same indoor scene: the first one was sharp, while the other one was intentionally taken with a wrong focus. In the sharp image, we selected 21 square templates (see Fig. 8a), and the goal was to find these templates in the blurred scene. Since the out-of-focus blur has an approximately circular shape, we used the blur invariants w.r.t. radially symmetric blur (see Section V-D). Since the templates are relatively small, we used the invariants defined directly in the image domain by means of moments (16). The matching was performed by searching over the whole scene, without using any prior information about the template position. The matching criterion was the minimum distance in the space of blur invariants. Nine templates were localized with an error less than or equal to 10 pixels, eight templates with an error of 11-20 pixels, three templates with an error of 21-30 pixels, and one template with an error greater than 30 pixels (see Fig. 8b). In the sense of target error, each template was localized in a position that is less than half of the template size from the ground truth.
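The exhaustive search itself is straightforward to sketch. For brevity, the placeholder feature vector below consists of a few plain central moments, i.e. the non-invariant baseline of Fig. 8c; in the actual experiment, the blur invariants (16) take their place.

```python
import numpy as np

def central_moments(w, orders=((2, 0), (0, 2), (1, 1), (2, 2))):
    """Placeholder features: a few central moments of the window.
    (The experiment uses blur invariants built from moments instead.)"""
    w = w / w.sum()
    y, x = np.mgrid[:w.shape[0], :w.shape[1]]
    mx, my = (x * w).sum(), (y * w).sum()
    return np.array([(((x - mx) ** p) * ((y - my) ** q) * w).sum()
                     for p, q in orders])

def match(scene, templ, step=2):
    """Return the window position minimizing the feature distance."""
    h, w = templ.shape
    ft = central_moments(templ)
    best, pos = np.inf, None
    for i in range(0, scene.shape[0] - h + 1, step):
        for j in range(0, scene.shape[1] - w + 1, step):
            d = np.linalg.norm(central_moments(scene[i:i+h, j:j+w]) - ft)
            if d < best:
                best, pos = d, (i, j)
    return pos

rng = np.random.default_rng(5)
scene = 0.5 + 0.01 * rng.random((60, 60))
templ = np.outer(np.linspace(0.1, 1.0, 16), np.linspace(0.1, 1.0, 16))
scene[20:36, 30:46] = templ                 # plant the template
assert match(scene, templ) == (20, 30)
```

With an exact, unblurred copy of the template even plain moments succeed; it is the blur that makes invariant features necessary, as Fig. 8c shows.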
The localization error is caused by the fact that the blurred template is not exactly a convolution of the ground-truth template and the PSF. We observe a strong boundary effect as pixels outside the template influence pixels inside the template. This interaction is, of course, beyond the assumed convolution model. In the case of a large PSF, it influences the matching: if the distance matrix has a flat minimum, then a small disturbance of the invariants due to the boundary effect may result in an inaccurate match.
For comparison, we performed the same task using plain moments instead of the invariants, while keeping the number and order of the features the same. The results are unacceptable; most of the templates were matched in totally wrong positions (see Fig. 8c). This clearly shows that introducing blur-invariant features brings a significant improvement.

C. Multichannel deconvolution
Multichannel blind deconvolution (MBD) is a process in which two or more differently blurred images of the same scene are given as an input and a single deblurred image is obtained as an output [4]. The restoration is blind, so no parametric form of the PSFs is required. Compared to single-channel deconvolution, it is more stable and usually produces much better results. However, the crucial requirement is that the input frames must be registered before entering the deconvolution procedure. A registration accuracy of up to several pixels is sufficient because advanced MBD algorithms are able to compensate for a small misalignment [74]. Since the input frames are blurred, most of the common registration techniques designed originally for sharp images [75] fail.
For the registration of blurred frames, the proposed invariants can be used. In Fig. 9 (left and middle), we see two input images of a statue blurred by camera shake. Since the camera was handheld and there was a few-second interval between the acquisitions, the images differ from each other not only by the particular blur but also by a shift and a small rotation. To register them, we use the "blur-invariant phase correlation" method. It is an efficient landmark-free technique inspired by traditional phase correlation [76]. Our method uses directly the blur invariants I(f) and I(g) (instead of the whitened Fourier spectra F/|F| and G/|G| used in phase correlation) to find the correlation peak. Since we do not have much prior information about the blurs, we use the operator P_2 from Section V-E to design the invariants, because it is less specific than the others and should work for many blurs. By switching between the Cartesian and polar domains, the method can register both shift and rotation.
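For reference, classical phase correlation [76] can be sketched as below; the paper's blur-invariant variant feeds I(f) and I(g) into the same pipeline instead of the whitened spectra.

```python
import numpy as np

def phase_correlation(f, g):
    """Classical phase correlation: the peak of the inverse FFT of the
    whitened cross-power spectrum gives the translation between f and g.
    (The blur-invariant variant replaces F/|F|, G/|G| with I(f), I(g).)"""
    F, G = np.fft.fft2(f), np.fft.fft2(g)
    R = F * np.conj(G)
    R /= np.abs(R) + 1e-12          # whiten: keep phase, drop magnitude
    corr = np.real(np.fft.ifft2(R))
    return np.unravel_index(np.argmax(corr), corr.shape)

rng = np.random.default_rng(6)
f = rng.random((64, 64))
g = np.roll(f, shift=(5, 9), axis=(0, 1))   # circularly shifted copy
assert phase_correlation(g, f) == (5, 9)
```

Running the same machinery on a log-polar resampling of the spectra handles rotation, which is what switching between the Cartesian and polar domains refers to.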
In this real-data example, we do not have any ground truth, so we cannot measure the registration accuracy explicitly. However, it is documented by the good performance of the subsequent MBD algorithm. The registered frames were used as the input of the MBD proposed in [77]. The result can be seen in Fig. 9 (right). We obtain a sharp image with very few artifacts, which proves a sufficient registration accuracy (and, of course, a good performance of the MBD algorithm itself).

D. Multifocus fusion
Multifocus image fusion (MIF) is a well-known technique of combining two or more images of the same 3D scene that were taken by a camera with a shallow depth of field [78]. Typically, one frame is focused on the foreground while the other one on the background (see Fig. 10 for an example). The fusion algorithms basically decide locally in which frame each part of the scene is best focused and generate the fused image by stitching the selected parts together without performing any deconvolution. Obviously, an accurate registration of the inputs is a key requirement.

Fig. 9: Multichannel deconvolution. The original frames blurred by a camera shake (left and middle). Note the shift and rotation misalignment between them, which was registered by blur-invariant phase correlation. The result of MBD [77] applied to the registered frames (right).
The registration problem is even more challenging here than in the previous experiment, because the convolution model holds only on the foreground or the background and the required accuracy is higher than in the MBD case.
The input frames and the fused product are shown in Fig. 10. As in the previous experiment, we applied the blur-invariant phase correlation. Since there was just a shift between the frames, the entire procedure ran in Cartesian coordinates. We assumed a circular out-of-focus blur, so we used the operator P_∞ from Section V-D. After the registration, the fusion itself was performed by the method proposed in [79]. The high visual quality of the fused product, with almost no artifacts, proves the accuracy of the registration.

E. Discussion
The experiments demonstrate a very good performance of the proposed invariants in the recognition of blurred objects and in the registration of blurred frames. The blur invariants exist in equivalent forms in the Fourier domain, where they are expressed directly by the projection operator, and in the image domain, where they use the moment expansion. Both domains can be used in experiments; our choice mostly depends on the image size (for large images, the Fourier invariants are more efficient, and vice versa). In terms of recognition power and speed, the proposed invariants are probably the best "handcrafted" blur-invariant features ever published.
The comparison to deep-learning methods, represented here by the ResNet CNN, is perhaps even more interesting. We showed that if the scenario is convenient for using "handcrafted" features, our invariants outperform the CNN. By a convenient scenario, we mean situations where the number of classes may be high but the classes are relatively small, typically represented by a single (or very few) training sample(s). To reach a comparable recognition rate, CNNs require a massive augmentation over a wide range of blurs, which makes the training extremely time-consuming.
On the other hand, the proposed invariants can hardly be used for classification into generic classes such as "person", "car", "animal", "tree", etc. The invariants do not have the ability to analyze the image content and they are not "continuous", which means that two visually similar objects (two dogs or two cars, for instance) might have very different invariant values. These scenarios can be well resolved by deep learning; however, a large-scale augmentation of the training set with blur is still necessary if blurred images are expected at the input of the system.
To summarize, the proposed invariants and CNNs with augmentation are complementary rather than competitive approaches; each of them dominates in distinct situations. One of the challenges for future work is to "fuse" both approaches for situations that lie somewhere in between the above-mentioned extremes.

VII. CONCLUSION
In this paper, we presented the general theory of invariants with respect to blur. The main original contribution of the paper lies in Theorem 6.
The benefit of the paper is twofold. We showed that all previously published examples of blur invariants are just particular cases of a unified theory, which can be formulated by means of projection operators without a limitation to a single blur type. This significantly contributes to the understanding of blur invariants. The application of this theory to the blur types that have not been fully explored yet makes it possible to derive new specific blur invariants that would be difficult to construct otherwise.
Several questions, important for the theory and practice of blur invariants, still remain open for future research. A challenging area is the investigation of linear non-orthogonal projection operators. We have shown that they may generate useful blur invariants in some cases, such as directional blur, but we lack a general theorem similar to GTBI. At the same time, non-orthogonal projectors might provide solutions to many practically important cases for which no blur invariants have been known. Another, even more difficult, open problem is to go beyond linearity and to study blur invariants constructed by means of non-linear projectors. In the case of Gaussian blur, we showed that a non-linear projector may produce blur invariants in a natural way. Unlike the linear projectors, the non-linear ones have not been consistently investigated, which has been partly due to their variability.
Another challenge comes from 3D images. Blur invariants in 3D have been explored much less than those in 2D. In 3D, 17 symmetry groups exist [80] and each of them can create a blur space. Although the definition of the respective projection operators seems to be similar to the 2D case, a non-trivial problem is to find an appropriate basis B that separates the moments [81].
The presented blur invariants, both in the Fourier and moment domains, can be made invariant also to rotation, scaling, and even to an affine transform. Due to the space limitation, it is not possible to explain these "combined invariants" rigorously in this paper.
A way to improve the success rate in the recognition of blurred images could be a fusion of blur invariants with deep learning approaches, which could compensate for the weaknesses of both. That could be done either by inserting the invariants into the hidden layers of the network or by decision fusion on the top level. The research in this field is at a very early stage and we envisage its dynamic development in the near future.

Fig. 1 :
Fig. 1: The flowchart of image restoration (left) and of the recognition by blur invariants (right).

Fig. 3 :
Fig. 3: The concept of the primordial image: the blurred image is projected onto S and this projection is used to "deconvolve" the input image in the Fourier domain. The blur-invariant primordial image is obtained as a seeming Fourier inversion of I(f). Its moments are blur invariant and can be calculated directly from f.

Fig. 4 :
Fig. 4: The MRE between the invariant of the blurred and noisy image and the original one as a function of the blur size and SNR.

Fig. 7 :
Fig. 7: Extreme cases recognized correctly by the blur invariants but misclassified both by the CNN and by Gopalan's method.

Fig. 10 :
Fig. 10: Multifocus fusion. The input frames focused on the foreground (left) and on the background (middle). The frames were registered by blur-invariant phase correlation and fused by the method from [79] (right).

TABLE I :
TABLE I: Recognition rate [%] for different degradations achieved by the proposed invariants (In), the CNN trained on clear images only, the CNN with augmentation by blurred images, and Gopalan's method [71] (G).