1 Introduction

Human image analysis still poses huge challenges for scientists. Automatic recognition and interpretation of objects in an image is crucial for building good image analysis software, especially for extracting higher-level information.

Real-life applications forced the development of the idea of describing object characteristics by a set of numbers, thus enabling a spectrum of numerical quantifications. Many shape descriptors have been created and used [1, 12]. Some of them have a generic purpose, such as the Fourier descriptors [3] or moment invariants [10, 15]. On the other hand, for the specific purpose of classification, several shape descriptors are useful for describing and differentiating a variety of objects: convexity [23], rectangularity [24], linearity [27], symmetry [29], etc. Note that, due to the diversity of shapes, descriptors have applications in various areas such as computer science, medicine, biology, and robotics.

Recently, there have been several applications that concentrate on finding and classifying circular and elliptical objects in images. They range from identifying traffic lights and signs [4] to face detection [16]. In this paper we define new measures for two important descriptors, circularity and ellipticity, in arbitrary dimensions. Future applications might include, for example, the identification of sick individual cells based on their boundaries in medical imaging—compare Fig. 1. Our motivation, from the theoretical point of view, can be illustrated by the following problem.

Fig. 1

Sickle cell anemia [20]. Individuals with sickle cell anemia have crescent-shaped red blood cells. Diseases such as this one cause a decreased ability to deliver oxygen throughout the body. The new ellipticity measure (presented in this paper) differentiates well the cell with anemia (\(\mathcal {E}_2(A)=0.902\)) from the normal cells (\(\mathcal {E}_2(B)=0.999\), \(\mathcal {E}_2(C)=0.999\), \(\mathcal {E}_2(D)=0.993\), \(\mathcal {E}_2(E)=0.999\))

Problem 1

If the actual area occupied by an object can be estimated using the well-known area formula \(\pi r_1 r_2\), the object has a good chance of being an ellipse, or, if \(\pi r^2\) works, a circle. Obviously, the questions of how to estimate the major and minor ellipse radii \(r_1\), \(r_2\) (or r), and how to formalize “good chance,” still need to be answered.

To develop a solution to the presented problem, we consider closely the existing ones, namely the methods which verify whether a set is an ellipsoid and which differentiate between ellipses and circles. The work presented in this article aims to partially generalize the methods presented by Žunić et al. [31], Žunić and Žunić [30], and Rosin [24], which describe circularity and ellipticity measures. The reason for choosing these methods is their superior performance in the case of shape boundary defects compared to the other standard methods: the behavior of these measures (i.e., numerical shape characteristics) can be relatively easily understood and reasonably predicted. The aforementioned articles describe explicit formulas which use the first two Hu moment invariants to evaluate how much a planar shape differs from a circle or an ellipse. Detailed information on those measures—the circularity \(\mathcal {C}_H\) and the ellipticities \(\mathcal {E}_H\) and \(\mathcal {E}_I\)—is presented in Sect. 2 of this article.

Our aim is to deal with the problem of ellipsoids and balls recognition in the general case of \(\mathbb {R}^N\). The main idea of our approach uses the information theory concept called Kullback–Leibler divergence and can be reduced to verifying whether the value of

$$\begin{aligned} \lambda _N(S)\big /\sqrt{\det (\Sigma _S)} \end{aligned}$$
(1)

is maximal for \(S \subset \mathbb {R}^N\), where \(\Sigma _S\) denotes the covariance matrix of S and \(\lambda _N\) the N-dimensional volume (Lebesgue measure) of S.

The main result of this work, presented in Theorem 2, gives an estimate of (1) in the case of circles and ellipses. This allows us to derive conditions testing whether a given set is elliptical-like (\(\mathcal {E}_N\)) or circular-like (\(\mathcal {C}_N\)); hence, they can be used as measures of ellipticity and circularity. Our measures were tested on several examples from [30, 31]. Experiments verify many advantages of our approach, e.g., behavior consistent with human intuition and invariance under similarity transformations. Moreover, our measures can be applied to higher-dimensional data (see Figs. 9, 10, 11).

This paper is organized as follows. In the next section the state of the art is introduced. In Sect. 3 we briefly describe the main result of this work with a sketch of the proof. In Sect. 4 we set up notation and terminology for the Kullback–Leibler divergence and cross-entropy. In Sect. 5 we provide the formulas for the circularity and ellipticity measurements. Comments and conclusions can be found in the last section.

2 State of the Art

In this section several of the most standard measures of circularity and ellipticity are mentioned. These measures take values in (0, 1] and equal 1 if and only if the measured shape is a circle or an ellipse, respectively.

Let us consider an arbitrary set \(S\subset \mathbb {R}^2\).

2.1 Circularity Measure

From the historical point of view, the first circularity measure was introduced in [6] and later, in the digital plane, in [13]. It is given by

$$\begin{aligned} \mathcal {C}_{st}(S)=\frac{4 \pi \cdot \lambda _2(S)}{(\mathrm {perimeter}(S))^2}, \end{aligned}$$

where \(\lambda _2(S)\) is the area of the set S (compare with [26]).

Geometric moments can also be used to measure circularity. An example of such an approach can be found in [31], where circularity is measured by the quantity

$$\begin{aligned} \mathcal {C}_H(S)=\frac{(\mu _{0,0}(S))^2}{2\pi (\mu _{2,0}(S)+\mu _{0,2}(S))}, \end{aligned}$$

where the centralized (p,q)-moment \(\mu _{p,q}(S)\) of a planar set S is

$$\begin{aligned} \mu _{p,q}(S)=\int \int _S\bigg (x-\mu _x(S)\bigg )^p\bigg (y-\mu _y(S)\bigg )^q\mathrm{{d}}x\mathrm{{d}}y \end{aligned}$$

where \((\mu _x(S),\mu _y(S))\) is the centroid of S [15].

Other examples of methods for measuring the circularity can be found in [7, 13, 14, 22].
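For concreteness, here is a minimal NumPy sketch of \(\mathcal {C}_H\) computed from the centralized moments of a binary mask (the helper names and the test disk are ours; pixels are treated as points, so the value only approaches 1 as the resolution grows):

```python
import numpy as np

def central_moment(mask, p, q):
    """Centralized (p,q) moment of a binary shape given as a 2-D boolean array."""
    ys, xs = np.nonzero(mask)
    dx, dy = xs - xs.mean(), ys - ys.mean()
    return (dx ** p * dy ** q).sum()

def circularity_CH(mask):
    """C_H = mu00^2 / (2*pi*(mu20 + mu02)); mu00 is just the pixel count."""
    mu00 = central_moment(mask, 0, 0)
    return mu00 ** 2 / (2 * np.pi * (central_moment(mask, 2, 0)
                                     + central_moment(mask, 0, 2)))

yy, xx = np.mgrid[:201, :201]
disk = (xx - 100) ** 2 + (yy - 100) ** 2 <= 80 ** 2
print(circularity_CH(disk))   # close to 1 for a disk
```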

2.2 Ellipticity Measure

The first approach is based on moment invariants. Since any ellipse can be obtained by applying an affine transformation to a circle, we use the simplest affine moment invariant [9] (based on the central moments \(\mu _{p,q}\)) of the circle to characterize ellipses:

$$\begin{aligned} I_1(S)=\frac{\mu _{2,0}(S)\mu _{0,2}(S)-\mu _{1,1}^2(S)}{\mu _{0,0}^4(S)}, \end{aligned}$$

where the centralized (p,q)-moments \(\mu _{p,q}(S)\) are defined as above. To discriminate the shape and measure its ellipticity, we use

$$\begin{aligned} \mathcal {E}_I(S)= {\left\{ \begin{array}{ll} 16 \pi ^2 I_1(S),\quad \text {if } I_1(S) \le \frac{1}{16\pi ^2},\\ \frac{1}{16 \pi ^2 I_1(S)},\quad \text {otherwise,} \end{array}\right. } \end{aligned}$$

(compare with [24]).

The second approach is based on the first two Hu moment invariants [10, 15]. The ellipticity measure of a given shape S can be computed by the formula [30]:

$$\begin{aligned} \mathcal {E}_H(S)=\bigg (2\pi ^2\bigg (\mathcal {I}_1(S)\cdot \sqrt{4\mathcal {I}_2(S)+1/\pi ^2}-2\mathcal {I}_2(S)\bigg )\bigg )^{-1}, \end{aligned}$$

where

  • \(\mathcal {I}_1(S)=m_{2,0}(S)+m_{0,2}(S)\),

  • \(\mathcal {I}_2(S)=(m_{2,0}(S)-m_{0,2}(S))^2+4(m_{1,1}(S))^2\),

for the geometric moments of a given shape defined by

$$\begin{aligned} m_{p,q}(S)=\int \int _S x^py^q\mathrm{{d}}x\mathrm{{d}}y. \end{aligned}$$

Other examples of methods for measuring the ellipticity can be found in [8, 21, 22, 25, 26].
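A short sketch of the two-branch measure \(\mathcal {E}_I\) on a binary mask (helper name and test shape are ours; pixels are treated as points):

```python
import numpy as np

def ellipticity_EI(mask):
    """Two-branch ellipticity E_I from the affine invariant I_1."""
    ys, xs = np.nonzero(mask)
    x, y = xs - xs.mean(), ys - ys.mean()
    mu00 = float(len(xs))
    mu20, mu02, mu11 = (x * x).sum(), (y * y).sum(), (x * y).sum()
    I1 = (mu20 * mu02 - mu11 ** 2) / mu00 ** 4
    c = 16 * np.pi ** 2
    return c * I1 if I1 <= 1 / c else 1 / (c * I1)

yy, xx = np.mgrid[:300, :300]
ellipse = ((xx - 150) / 120.0) ** 2 + ((yy - 150) / 60.0) ** 2 <= 1.0
print(ellipticity_EI(ellipse))   # close to 1 for any ellipse
```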

3 Main Theorem

In this section we present the main result of this paper: a set S is an ellipsoid if the uniform probability density on it has minimal Kullback–Leibler divergence from the family of Gaussian densities. The Kullback–Leibler divergence is a fundamental quantity of information theory that measures the proximity of two probability distributions (a brief summary and the proof are presented later in this work). Using the Kullback–Leibler divergence, we show that it is enough to know three moments of the object (in \(\mathbb {R}^2\)) to check whether a given set is an ellipse.

This condition for an arbitrary set \(S \subset \mathbb {R}^N\) reduces to verifying whether the value of

$$\begin{aligned} \lambda _N(S)/\sqrt{\det (\Sigma _S)} \end{aligned}$$

is maximal, where \(\Sigma _S\) denotes the covariance of the uniform probability measure on S, and equals

$$\begin{aligned} \frac{((N+2)\pi )^{N/2}}{\varGamma (N/2+1)}=:e_N. \end{aligned}$$

For \(N=1\) (the line) and \(N=2\) (the plane), \(e_N\) simplifies to \(e_1=2\sqrt{3}\) and \(e_2= 4 \pi \), while for \(N=3\) (three-dimensional space) we get \(e_3=\frac{20\sqrt{5}\,\pi }{3}\).

We also obtain analogous estimates for circles and balls in \(\mathbb {R}^N\). Given a symmetric positive-definite matrix \(\Sigma \), we recall that the Mahalanobis distance [18] is given by

$$\begin{aligned} \Vert x-y\Vert _{\Sigma }:=\sqrt{(x-y)^T\Sigma ^{-1}(x-y)}. \end{aligned}$$

Thus, our main result (compare with Corollaries 1 and 2) may be stated as follows:

Theorem 1

Let \(S \subset \mathbb {R}^N\) with mean \(\mathrm {m}_S\) and covariance \(\Sigma _S\) be given.

  • Then

    $$\begin{aligned} \mathcal {E}_N(S):=\frac{\varGamma (N/2+1)}{((N+2)\pi )^{N/2}}\cdot \frac{\lambda _N(S)}{\sqrt{\det (\Sigma _S)}}\le 1, \end{aligned}$$

    where the equality holds if S is an ellipsoid. If this is the case, then \(S=\mathcal {B}_{\Sigma _S}(\mathrm {m}_S,\sqrt{N+2})\).

  • Then

    $$\begin{aligned} \mathcal {C}_N(S):=\frac{\varGamma (N/2+1)}{((N+2)\pi /N)^{N/2}}\cdot \frac{\lambda _N(S)}{(\mathrm {tr}(\Sigma _S))^{N/2}}\le 1, \end{aligned}$$

    where the equality holds if S is a ball. If this is the case, then \(S=\mathcal {B}(\mathrm {m}_S,\sqrt{\frac{N+2}{N}\mathrm {tr}(\Sigma _S)})\).

Fig. 2

Ellipse vs. circle—a comparison of the measure values used to test whether the object is elliptical-like (\(\mathcal {E}_2\)) or circle-like (\(\mathcal {C}_2\)); a higher value means that the indicated shape kind describes the object better

We postpone the proof till Sect. 5. However, the basic idea consists of the following steps:

  • we first observe that we can restrict to the case when the mean of S is zero and the covariance equals the identity;

  • next we fit the optimal uniform density on a ball \(\mathcal {B}(0,R)\), with R chosen so that the volume of S equals the volume of \(\mathcal {B}(0,R)\);

  • finally we show that if S contained elements outside of \(\mathcal {B}(0,R)\), then by “moving” those elements inside of \(\mathcal {B}(0,R)\) we would decrease the value of the respective Kullback–Leibler divergence.

For the convenience of the reader, we now discuss the situation on the plane. We consider \(S \subset \mathbb {R}^2\) with mean \(\mathrm {m}_S\) and covariance \(\Sigma _S\). Then

$$\begin{aligned}&\mathcal {E}_2(S):=\frac{\lambda _2(S)}{4\pi \sqrt{\det (\Sigma _S)}}, \end{aligned}$$
(2)
$$\begin{aligned}&\mathcal {C}_2(S):=\frac{\lambda _2(S)}{2\pi \mathrm {tr}(\Sigma _S)}. \end{aligned}$$
(3)

Figure 2 presents the example values of \(\mathcal {E}_2\) and \(\mathcal {C}_2\) for given sets.

Under the above definitions, \(\mathcal {E}_2\) is invariant under affine transformations, while \(\mathcal {C}_2\) is invariant under similarity transformations (compare with Theorem 3).
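As a quick sanity check of (2) and (3), take the unit square, whose exact covariance matrix is \(\mathrm {I}/12\); both measures then equal \(3/\pi \approx 0.955\):

```python
import numpy as np

# Unit square [0,1]^2: lambda_2 = 1 and Sigma = I/12 (exact values).
area, cov = 1.0, np.eye(2) / 12.0
E2 = area / (4 * np.pi * np.sqrt(np.linalg.det(cov)))  # = 3/pi ~ 0.9549
C2 = area / (2 * np.pi * np.trace(cov))                # = 3/pi ~ 0.9549
print(E2, C2)
```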

Remark 1

If S is an ellipse, then one can easily verify that \(4\pi \sqrt{\det (\Sigma _S)}\) equals its area: for semi-axes \(r_1,r_2\) we have, up to rotation, \(\Sigma _S=\mathrm {diag}(r_1^2/4,\,r_2^2/4)\), so \(4\pi \sqrt{\det (\Sigma _S)}=\pi r_1 r_2\). Thus, we see that (2) is a realization of the idea given in Problem 1.

Analogously, if S is a circle of radius r, then \(\Sigma _S=\mathrm {diag}(r^2/4,\,r^2/4)\), so its area equals \(2\pi \mathrm {tr}(\Sigma _S)=\pi r^2\); consequently, (3) gives a formalization of an analogue of Problem 1 for circles.

Directly from Theorem 1 (namely, Eqs. (2) and (3)), we can compare the new measures with those recalled in the state of the art (Sect. 2).

Observation 1

For an arbitrary set \(S\subset \mathbb {R}^2\):

  • \(\mathcal {C}_2(S)=\mathcal {C}_H(S)\);

  • \(\mathcal {E}_2(S)\le a \Rightarrow \mathcal {E}_I(S) \le a^2\) for \(a \in (0,1]\).

Proof

Recall now that the covariance matrix of S is given by [10]

$$\begin{aligned} \Sigma _S= \frac{1}{\mu _{0,0}(S)} \begin{pmatrix} \mu _{2,0}(S) &{} \mu _{1,1}(S)\\ \mu _{1,1}(S) &{} \mu _{0,2}(S) \end{pmatrix} \end{aligned}$$

Thus, by definition of \(\mathcal {C}_H(S)\)

$$\begin{aligned} \mathcal {C}_2(S)= & {} \frac{\lambda _2(S)}{2\pi \mathrm {tr}(\Sigma _S)}\\= & {} \frac{\mu _{0,0}(S)}{2 \pi \frac{1}{\mu _{0,0}(S)}(\mu _{2,0}(S)+\mu _{0,2}(S))}\\= & {} \frac{\mu _{0,0}^2(S)}{2 \pi (\mu _{2,0}(S)+\mu _{0,2}(S))}\\= & {} \mathcal {C}_H(S) \end{aligned}$$

which completes the proof of the first equality.

Fig. 3

Discretization of the set \(\{(x,y)\in \mathbb {R}^2:x^2+y^2\le 1\}\). The gray squares have a side length of \(\delta \). Additionally, the measures of circularity \(\mathcal {C}_2\) and \(\mathcal {C}_{st}\) are presented—it is easy to notice that they depend on the discretization, but \(\mathcal {C}_2\) converges to 1 as \(\delta \rightarrow 0\)

For the second property, we have

$$\begin{aligned} \mathcal {E}_2(S)= & {} \frac{\mu _{0,0}(S)}{4\pi \sqrt{(\mu _{2,0}(S)\mu _{0,2}(S)-\mu _{1,1}^2(S))/\mu _{0,0}^2(S)}}\\= & {} \bigg (\frac{1}{16 \pi ^2} \cdot \frac{\mu _{0,0}^4(S)}{\mu _{2,0}(S)\mu _{0,2}(S)-\mu _{1,1}^2(S)}\bigg )^{1/2}\\= & {} \bigg (16 \pi ^2 I_1(S)\bigg )^{-1/2}. \end{aligned}$$

Consequently, by Theorem 2 we obtain \(\big (16 \pi ^2 I_1(S)\big )^{-1/2} \le 1\), which implies that \(I_1(S) \ge \frac{1}{16 \pi ^2}\) for an arbitrary set \(S\subset \mathbb {R}^2\). Hence \(\mathcal {E}_I(S)=\frac{1}{16 \pi ^2 I_1(S)}\), and therefore \(\mathcal {E}_2(S)=(\mathcal {E}_I(S))^{1/2}\), which completes the proof. \(\square \)

Consequently, the authors’ approach for two-dimensional data leads to the same conclusions as the indexes \(\mathcal {E}_I\) [24] and \(\mathcal {C}_H\) [31].

At the end of this section we discuss how the above measures can be adapted for discrete (finite) sets. In general this is, in our opinion, a nontrivial problem; luckily, we can easily deal with the case when S is a discrete subset of \(\delta \cdot \mathbb {Z}^N\) (\(\delta > 0\)). Then instead of S we consider the set

$$\begin{aligned} \tilde{S}:=S+\delta \cdot [-\tfrac{1}{2},\tfrac{1}{2}]^N. \end{aligned}$$

Then, as one can easily check

$$\begin{aligned} \lambda _N({\tilde{S}})=\delta ^N \cdot \mathrm {card}(S) \text { and } \Sigma _{\tilde{S}}=\Sigma _S+\frac{\delta ^{2}}{12}I, \end{aligned}$$
(4)

where by \(\Sigma _S\) on the RHS we understand the standard covariance of the discrete set. Examples of discretization for a few \(\delta \) values are presented in Fig. 3.
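A quick numerical illustration of (4), with an arbitrary grid step \(\delta \): sampling the unit disk on a \(\delta \)-grid recovers its area \(\pi \) and its covariance \(\mathrm {I}/4\) (compare Example 3 below, where \(\Sigma _{\mathcal {B}_N(0,1)}=\frac{1}{N+2}\mathrm {I}\)):

```python
import numpy as np

delta = 0.05
g = np.arange(-1.0, 1.0 + delta, delta)
X, Y = np.meshgrid(g, g)
inside = X ** 2 + Y ** 2 <= 1.0
pts = np.column_stack([X[inside], Y[inside]])

vol_tilde = delta ** 2 * len(pts)        # lambda_2(S~) = delta^N * card(S)
cov_tilde = np.cov(pts, rowvar=False, bias=True) + (delta ** 2 / 12) * np.eye(2)

print(vol_tilde / np.pi)                 # ~1: area of the unit disk is pi
print(4 * np.diag(cov_tilde))            # ~[1, 1]: Sigma of the unit disk is I/4
```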

The discrete case appears to be important also in our numerical experiments, where we approximate the “exact” shape \(S \subset \mathbb {R}^N\) by its discrete approximation given by

$$\begin{aligned} S_{\delta }:=\{x \in \delta \mathbb {Z}^N : x \in S\} \text { with }\delta \rightarrow 0. \end{aligned}$$

Observe that for compact sets with nonzero Lebesgue measure all the moments of \(S_{\delta }\) converge, as \(\delta \rightarrow 0\), to the respective moments of S. Consequently, the same holds for the constants \(\mathcal {E}_N\) and \(\mathcal {C}_N\).

Surprisingly, this is not the case for \(\mathcal {C}_{st}\), where even for the unit circle

$$\begin{aligned} \mathcal {C}_{st}(\mathcal {B}(0,1)_{\delta }) \not \rightarrow \mathcal {C}_{st}(\mathcal {B}(0,1)) \text { as } \delta \rightarrow 0, \end{aligned}$$

which follows from the fact that the length of the boundary of the discrete approximation of the set usually does not converge to the exact length of the boundary—it turns out that the optimal value is obtained for the octagon instead of the circle. This follows from the fact that when we measure the length of the discretization \(S_{\delta }\) of the set S, by following the discrete boundary we can move only in directions whose angle with the axes is a multiple of \(\pi /4\). Thus although \(\mathcal {C}_{st}(\mathcal {B}(0,1))=1\),

$$\begin{aligned} \mathcal {C}_{st}(\mathcal {B}(0,1)_{\delta }) \le \mathcal {C}_{st}(\text {octagon})=\frac{1}{8} \left( 1+\sqrt{2}\right) \pi \approx 0.948059. \end{aligned}$$

Remark 2

The measure \(\mathcal {C}_{st}\) uses one of the most popular and standard approaches to circularity measurement, derived from the relation between the area of the shape and the length of its perimeter. As one can show, and as can be observed in the above examples, this measure stabilizes on the octagon, where it achieves its highest value. This is caused by the fact that when tracing the boundary of a given discrete shape we can “move” only along lines which form an angle that is a multiple of \(\pi /4\) with the axes; see Fig. 3c for an illustration.

Since this measure was for a long time successfully applied for circle discovery in images, we conclude that, from the practical point of view, the octagon presents a sufficient numerical approximation of the circle in most commonly encountered applications.

4 Kullback–Leibler Divergence and Cross-Entropy

4.1 Basic Definitions on Kullback–Leibler Divergence

We now remind the reader of the concept of differential entropy, which is the entropy of a continuous random variable [5].

Definition 1

The differential entropy h(f) of a continuous random variable with a density function \(f:\mathbb {R}^N \rightarrow \mathbb {R}_+\) is defined as

$$\begin{aligned} h(f)=-\int _{\mathbb {R}^N} f(x)\ln f(x) \mathrm{{d}}x. \end{aligned}$$

Differential entropy is also related to the shortest description length and is similar in many ways to the well-known entropy of a discrete random variable, since it extends the idea of Shannon entropy, a measure of the expected information content of a message, to continuous probability distributions. The value of differential entropy depends only on the probability density of the random variable [5]. In this paper we abbreviate differential entropy as entropy.

Let us now calculate the differential entropy of the simplest density, the uniform density.

Example 1

(Uniform distribution) Consider the random variable distributed uniformly on the set \(S\subset \mathbb {R}^N\), so that its density \(\mathrm {u}_S\) is \(1/\lambda _N(S)\) on S and 0 elsewhere. Then its differential entropy is

$$\begin{aligned} h(\mathrm {u}_S)=-\int _S \frac{1}{\lambda _N(S)} \ln \frac{1}{\lambda _N(S)} \mathrm{{d}}x =\ln \lambda _N(S). \end{aligned}$$

Since our aim is to study mainly uniform densities, to shorten notation we will, for a measurable \(S \subset \mathbb {R}^N\) with finite and nonzero measure, use the symbol \(\mathrm {u}_S\) for the uniform probability density

$$\begin{aligned} \mathrm {u}_S:=\frac{1}{\lambda _N(S)}\mathrm {1}_S \end{aligned}$$

on S, where

$$\begin{aligned} \mathrm {1}_S(x):= {\left\{ \begin{array}{ll} 1, &{} \text{ if } x \in S,\\ 0, &{} \text{ if } x \not \in S.\\ \end{array}\right. } \end{aligned}$$

As a consequence, we will write \(\mu _S\) and \(\Sigma _S\) to denote the mean and covariance of \(\mathrm {u}_S\).

Remark 3

In the general case one could consider various densities, not just the uniform one. However, in practical applications Gaussian densities are typically considered, as they are easy to work with. In many practical cases, methods developed under the assumption that the data have a normal distribution work quite well even when the density is not normal. Furthermore, the central limit theorem provides a theoretical basis for this wide applicability. Therefore, this density approximates many natural phenomena well, and it has developed into a standard of reference for many probability problems. As an excellent example we refer the reader to [11], where Gaussian distributions were used for modeling contours and applied to shape retrieval.

Since in this paper we focus on circular and elliptical shapes, to detect them we could theoretically use any densities which have ellipses or circles as level sets. We have decided to use Gaussian ones, since for them we have accurate, explicit, and numerically efficient formulas for the estimation of their parameters.

The differential entropy of the Gaussian density is considered in the following example.

Example 2

(Gaussian distribution [28]) For the multivariate Gaussian distribution, the entropy grows with the log-determinant of the covariance; specifically, the differential entropy of an N-dimensional random variable with the density function

$$\begin{aligned} \mathcal {N}_{\mu ,\Sigma }(x):=\frac{1}{\sqrt{(2\pi )^N\det (\Sigma )}} \exp \big (-\frac{1}{2}\Vert x-\mu \Vert _\Sigma ^2\big ) \end{aligned}$$

is given by the formula

$$\begin{aligned} h( \mathcal {N}_{\mu ,\Sigma })= & {} -\int _{\mathbb {R}^N} \mathcal {N}_{\mu ,\Sigma }(x) \ln \mathcal {N}_{\mu ,\Sigma }(x)\mathrm{{d}}x \\= & {} \frac{N}{2} \ln (2\pi e) + \frac{1}{2} \ln (\det (\Sigma )). \end{aligned}$$
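A small Monte Carlo check of this closed form (the sample size, seed, and covariance below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.zeros(2)
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = rng.multivariate_normal(mu, Sigma, size=200_000)

# Monte Carlo estimate of h = -E[ln N(x)] versus the closed form.
inv = np.linalg.inv(Sigma)
logdet = np.log(np.linalg.det(Sigma))
quad = np.einsum('ij,jk,ik->i', x - mu, inv, x - mu)
log_pdf = -0.5 * (quad + 2 * np.log(2 * np.pi) + logdet)
print(-log_pdf.mean())                            # Monte Carlo estimate
print(np.log(2 * np.pi * np.e) + 0.5 * logdet)    # (N/2)ln(2 pi e) + (1/2)ln det
```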

We can now proceed to the Kullback–Leibler divergence, which is the “cost” associated with selecting a distribution q from a distribution family \(\mathbb {Q}\) to approximate the true distribution p [5].

Definition 2

The Kullback–Leibler divergence (or relative entropy) \(D_{KL}(p\Vert q)\) between two densities p and q is defined by

$$\begin{aligned} D_{KL}(p\Vert q):=\int _{\mathbb {R}^N} \ln (p(x)/q(x)) \cdot p(x) \, \mathrm{{d}}x. \end{aligned}$$

For a family of densities \(\mathbb {Q}\), the Kullback–Leibler divergence is given by

$$\begin{aligned} D_{KL}(p\Vert \mathbb {Q}):=\inf _{q \in \mathbb {Q}} D_{KL}(p\Vert q). \end{aligned}$$

\(D_{KL}(p\Vert q)\) is nonnegative, equals zero if the distributions match exactly, and can potentially be infinite. However, since the Kullback–Leibler divergence is a nonsymmetric information-theoretical measure of the distance of density p from q, namely \(D_{KL}(p\Vert q)\not = D_{KL}(q\Vert p)\), it is not strictly a distance metric. There are some natural modifications which deal with this problem, e.g., [2, 17].

By introducing the next definition, cross-entropy, we can simplify \(D_{KL}(p\Vert q)\) for arbitrary densities p and q.

Definition 3

The cross-entropy \(H^{\times }(p\Vert q)\) of two continuous probability densities p and q is defined as

$$\begin{aligned} H^{\times }(p\Vert q)=-\int _{\mathbb {R}^N} p(x)\ln q(x) \mathrm{{d}}x. \end{aligned}$$

It is worth noting that cross-entropy is a variant of the entropy definition that allows us to compare two probability distributions over the same random variable. We treat the first argument as the “target” probability distribution and the second as the estimated one, for which we evaluate how well it “fits” the target.

In the case when p has finite entropy, by Definitions 2 and 3 we can write

$$\begin{aligned} D_{KL}(p\Vert q)= & {} -\int _{\mathbb {R}^N} p(x)\ln q(x) \mathrm{{d}}x+\int _{\mathbb {R}^N} p(x)\ln p(x) \mathrm{{d}}x\\= & {} H^{\times }(p\Vert q)-h(p). \end{aligned}$$

Thus, \(D_{KL}(p\Vert q)\) is the measure of the additional cost to pay for the model mismatch—the difference between the descriptions of the random variables by q and by p.

4.2 Kullback–Leibler Divergence Between Uniform and Gaussian Densities

We proceed to the comparison of uniform and normal densities by relative entropy.

Let \(\mathcal {G}\) denote the set of all Gaussian densities. One can easily verify that for arbitrary density f and Gaussian density \(g \in \mathcal {G}\), we have

$$\begin{aligned} H^{\times }(f\Vert g)=H^{\times }(\mathcal {G}[f]\Vert g), \end{aligned}$$

where \(\mathcal {G}[f]\) denotes Gaussian density with the same mean and covariance as f. This means that the Kullback–Leibler divergence \(D_{KL}(f\Vert \mathcal {G})\) is realized for \(g=\mathcal {G}[f]\).

We will now show the formula for the Kullback–Leibler divergence of uniform densities.

Observation 2

For a given uniform density \(\mathrm {u}_S\) on a set \(S\subset \mathbb {R}^N\) and the family of Gaussian densities, we have

$$\begin{aligned} D_{KL}(\mathrm {u}_S\Vert \mathcal {G})=\frac{N}{2}\ln (2\pi e)+\frac{1}{2}\ln (\det (\Sigma _S))-\ln (\lambda _N(S)). \end{aligned}$$

Proof

Clearly \( D_{KL}(\mathrm {u}_S \Vert \mathcal {G})=H^{\times }(\mathrm {u}_S\Vert \mathcal {G})-h(\mathrm {u}_S)= H^{\times }(\mathcal {G}[\mathrm {u}_S]\Vert \mathcal {G})-\ln (\lambda _N(S)) =H^{\times }(\mathcal {G}[\mathrm {u}_S]\Vert \mathcal {G}[\mathrm {u}_S])-\ln (\lambda _N(S))= h(\mathcal {G}[\mathrm {u}_S])-\ln (\lambda _N(S)) \) \( =\frac{N}{2}\ln (2\pi e)+\frac{1}{2}\ln (\det (\Sigma _S))-\ln (\lambda _N(S))\). \(\square \)
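As a worked instance of Observation 2, take the unit square, for which \(\Sigma _S=\mathrm {I}/12\) and \(\lambda _2(S)=1\); the resulting divergence exceeds the lower bound \(d_2\) computed in Example 4 below:

```python
import math

# Unit square: Sigma = I/12 (det = 1/144), lambda_2 = 1, N = 2.
dkl_square = math.log(2 * math.pi * math.e) + 0.5 * math.log(1 / 144) - math.log(1.0)
print(dkl_square)   # ln(pi*e/6) ~ 0.353, above the bound d_2 = 1 - ln 2 ~ 0.307
```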

A crucial role in our investigation will be played by the following constant

$$\begin{aligned} d_N:=D_{KL}(\mathrm {u}_{\mathcal {B}_N(0,1)}\Vert \mathcal {G}), \end{aligned}$$

where \(\mathcal {B}_N(0,1)\) denotes the unit ball in \(\mathbb {R}^N\) and \(\mathcal {G}\) denotes the set of all Gaussian densities. Intuitively, it describes how well the uniform density on the unit ball can be compressed by Gaussians, compared to the optimal compression. Since \(d_N\) plays a basic role in our considerations, let us calculate it in the following series of examples.

Example 3

Consider a uniform probability density on the unit ball \(\mathcal {B}_N(0,1)\) in \(\mathbb {R}^N\). Clearly \(\mathrm {m}_{\mathcal {B}_N(0,1)}=0\). Its covariance matrix \(\Sigma _{\mathcal {B}_N(0,1)}=[s_{ij}]\) shall be computed. Obviously, \(s_{ij}=0\) if \(i \ne j\). Consider the case \(i=j\). Since \(s_{ii}=\frac{1}{\lambda _N(\mathcal {B}_N(0,1))}\int \limits _{\mathcal {B}_N(0,1)}x_i^2 \mathrm{{d}}x\), the constant \(s=s_{ii}\) is well defined and does not depend on the choice of i, and

$$\begin{aligned} Ns= & {} s_{11}+\cdots +s_{NN}=\frac{1}{\lambda _N(\mathcal {B}_N(0,1))}\int _{\mathcal {B}_N(0,1)}\Vert x\Vert ^2 \mathrm{{d}}x\\= & {} \frac{1}{\lambda _N(\mathcal {B}_N(0,1))} \int _0^1 r^2 \cdot r^{N-1} \lambda _{N-1}(\partial \mathcal {B}_N(0,1))dr \\= & {} \frac{1}{N+2} \frac{\lambda _{N-1}(\partial \mathcal {B}_N(0,1))}{\lambda _N(\mathcal {B}_N(0,1))}. \end{aligned}$$

Since

$$\begin{aligned} \lambda _N(\mathcal {B}_N(0,1))= & {} \int _0^1 1 \cdot r^{N-1} \lambda _{N-1}(\partial \mathcal {B}_N(0,1))dr\\= & {} \frac{1}{N} \lambda _{N-1}(\partial \mathcal {B}_N(0,1)), \end{aligned}$$

we obtain \(\Sigma _{\mathcal {B}_N(0,1)}=\frac{1}{N+2}\mathrm {I}\), where \(\mathrm {I}\) denotes the identity matrix. As a direct consequence, we derive \(\Sigma _{\mathcal {B}(0,\sqrt{N+2})}=\mathrm {I}\).

Example 4

From Example 3 we have \(\Sigma _{\mathcal {B}_N(0,1)}=\frac{1}{N+2}\mathrm {I}\), which implies that

$$\begin{aligned} H^{\times }(\mathrm {u}_{\mathcal {B}_N(0,1)}\Vert \mathcal {G})=\frac{N}{2} \ln \Bigg (\frac{2\pi e}{N+2}\Bigg ). \end{aligned}$$

Comparing this with \(h(\mathrm {u}_{\mathcal {B}_N(0,1)})=\ln (\lambda _N({\mathcal {B}_N(0,1)}))=\frac{N}{2}\ln \pi -\ln \varGamma (\frac{N}{2}+1)\), we obtain

$$\begin{aligned} d_N=\frac{N}{2}\ln \Bigg (\frac{e}{N/2+1}\Bigg )+\ln (\varGamma (N/2+1)). \end{aligned}$$
(5)

Consequently,

$$\begin{aligned} d_1= & {} \frac{1}{2}(1+\ln (\pi /6)) \approx 0.18,\\ d_2= & {} 1-\ln (2) \approx 0.31,\\ d_3= & {} \frac{1}{2}(3+\ln \frac{9}{250}+\ln \pi ) \approx 0.41. \end{aligned}$$

Using the Stirling formula \(\varGamma (k+1) \approx \sqrt{2\pi k}(k/e)^k\), we obtain

$$\begin{aligned} d_N \approx \ln \bigg (\frac{\sqrt{\pi N}}{e}\bigg ) \end{aligned}$$

for large N.
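Equation (5) and the Stirling approximation are easy to check numerically (a minimal sketch):

```python
import math

def d_N(N):
    """d_N = (N/2) * ln(e / (N/2 + 1)) + ln Gamma(N/2 + 1), Eq. (5)."""
    return (N / 2) * math.log(math.e / (N / 2 + 1)) + math.lgamma(N / 2 + 1)

print([round(d_N(n), 2) for n in (1, 2, 3)])    # [0.18, 0.31, 0.41]
# Stirling approximation for large N:
print(d_N(1000), math.log(math.sqrt(math.pi * 1000) / math.e))
```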

5 Optimal Estimations and Main Results

5.1 Basis for the Simple Case

We shall now show that \(d_N\) gives a lower bound on the compression of uniform densities; namely, we will calculate the mismatch between the optimal model given by the uniform density of \(S\subset \mathbb {R}^N\) and the approximation given by the normal densities \(\mathcal {G}\), measured by the Kullback–Leibler divergence (see Definition 2).

Proposition 1

Let \(S \subset \mathbb {R}^N\) be such that \(\mu _S=0\) and \(\Sigma _S=\mathrm {I}\). Then

$$\begin{aligned} D_{KL}(\mathrm {u}_S\Vert g) \ge d_N \hbox { for }g \in \mathcal {G}\end{aligned}$$

with the equality holding if \(S=\mathcal {B}(0,\sqrt{N+2})\) and \(g=\mathcal {N}(0,\mathrm {I})\).

Proof

By the observation from the previous section, we know that \(D_{KL}(\mathrm {u}_S\Vert \mathcal {G})\) is realized for \(\mathcal {N}{(0,\mathrm {I})}\):

$$\begin{aligned} D_{KL}(\mathrm {u}_S\Vert \mathcal {G})=H^{\times }(\mathrm {u}_S\Vert \mathcal {N}{(0,\mathrm {I})})-h(\mathrm {u}_S). \end{aligned}$$

This means that if \(g \in \mathcal {G}\) is arbitrary, then \(D_{KL}(\mathrm {u}_S\Vert g)\ge D_{KL} (\mathrm {u}_S\Vert \mathcal {N}(0,\mathrm {I}))\) with the equality holding if \(g=\mathcal {N}(0,\mathrm {I})\). Consequently, we may reduce to the case \(g=\mathcal {N}(0,\mathrm {I})\).

Clearly, if S is a ball centered at zero such that \(\Sigma _S=\mathrm {I}\), then by Example 3 we obtain \(S=\mathcal {B}(0,\sqrt{N+2})\).

Consider now the case when S is not a ball centered at zero (modulo a set of zero measure), and let \(\mathcal {B}(0,r)\) denote the ball centered at zero with the same Lebesgue measure as S. We will show that

$$\begin{aligned} d_N\le & {} D_{KL}(\mathrm {u}_{\mathcal {B}(0,r)}\Vert \mathcal {G})=H^{\times }(\mathrm {u}_{\mathcal {B}(0,r)}\Vert \mathcal {G})-h(\mathrm {u}_{\mathcal {B}(0,r)}) \\\le & {} H^{\times }(\mathrm {u}_{\mathcal {B}(0,r)}\Vert \mathcal {N}{(0,\mathrm {I})})-\ln (\lambda _N(\mathcal {B}(0,r))) \\< & {} H^{\times }(\mathrm {u}_S\Vert \mathcal {N}{(0,\mathrm {I})})-\ln (\lambda _N(S))=D_{KL}(\mathrm {u}_S\Vert \mathcal {G}) , \end{aligned}$$

which will complete the proof. Since by the assumptions \(\lambda _N(\mathcal {B}(0,r))=\lambda _N(S)\), it is sufficient to show that

$$\begin{aligned} H^{\times }(\mathrm {u}_{\mathcal {B}(0,r)}\Vert \mathcal {N}{(0,\mathrm {I})})<H^{\times }(\mathrm {u}_S\Vert \mathcal {N}{(0,\mathrm {I})}). \end{aligned}$$
(6)

Since

$$\begin{aligned}&H^{\times }(\mathrm {u}_S\Vert \mathcal {N}{(0,\mathrm {I})})\nonumber \\&\quad =\frac{1}{\lambda _N(S)} \int _S \bigg [\frac{\Vert x\Vert ^2}{2}+\frac{N}{2} \ln (2\pi )\bigg ]\mathrm{{d}}x\\&\quad =\frac{1}{\lambda _N(S)}\int _S \frac{\Vert x\Vert ^2}{2}\mathrm{{d}}x+\frac{N}{2} \ln (2\pi ), \\ \end{aligned}$$

and

$$\begin{aligned}&H^{\times }(\mathrm {u}_{\mathcal {B}(0,r)}\Vert \mathcal {N}{(0,\mathrm {I})})\nonumber \\&\quad =\frac{1}{\lambda _N(\mathcal {B}(0,r))}\int _{\mathcal {B}(0,r)} \frac{\Vert x\Vert ^2}{2}\mathrm{{d}}x+\frac{N}{2} \ln (2\pi ) \end{aligned}$$

to verify (6) it is sufficient to show that the following inequality is true

$$\begin{aligned} \int _{\mathcal {B}_N(0,r)}\Vert x\Vert ^2 \mathrm{{d}}x <\int _{S} \Vert x\Vert ^2\mathrm{{d}}x. \end{aligned}$$
(7)

Let \(C=\mathcal {B}(0,r) {\setminus } S\), \(D=S {\setminus } \mathcal {B}(0,r)\). Clearly from the assumptions both C and D have nonzero measures.

Since

$$\begin{aligned} \lambda _N(C)+\lambda _N(\mathcal {B}(0,r) \cap S)= & {} \lambda _N(\mathcal {B}(0,r))\\= & {} \lambda _N(S)\\= & {} \lambda _N(D)+\lambda (\mathcal {B}(0,r)\cap S), \end{aligned}$$

the measures of C and D are equal. To prove (7), it is sufficient to observe that, since \(C \subset \mathcal {B}(0,r)\) and \(D \subset \mathbb {R}^N {\setminus } \mathcal {B}(0,r)\), we have

$$\begin{aligned} \int _C \Vert x\Vert ^2 \mathrm{{d}}x < \int _C r^2 \mathrm{{d}}x =\int _D r^2 \mathrm{{d}}x < \int _D \Vert x\Vert ^2\mathrm{{d}}x, \end{aligned}$$

and therefore

$$\begin{aligned} \int _{\mathcal {B}(0,r)}\Vert x\Vert ^2 \mathrm{{d}}x= & {} \int _{\mathcal {B}(0,r)\cap S}\Vert x\Vert ^2 \mathrm{{d}}x+\int _{C}\Vert x\Vert ^2 \mathrm{{d}}x\\< & {} \int _{\mathcal {B}(0,r)\cap S}\Vert x\Vert ^2 \mathrm{{d}}x+\int _{D}\Vert x\Vert ^2 \mathrm{{d}}x\\= & {} \int _S \Vert x\Vert ^2 \mathrm{{d}}x. \end{aligned}$$

\(\square \)

5.2 Main Results: New Measures

We shall now broaden the previous proposition to a more general case. This will provide the grounds for defining the new measures.

Theorem 2

Let \(S \subset \mathbb {R}^N\) with mean \(\mu _S\) and covariance \(\Sigma _S\) be given. Then

$$\begin{aligned} D_{KL}(\mathrm {u}_S\Vert g) \ge d_N \hbox { for }g \in \mathcal {G}\end{aligned}$$

with the equality holding if \(S=\mathcal {B}_{\Sigma _S}(\mu _S,\sqrt{N+2})\) and \(g=\mathcal {N}(\mu _S,\Sigma _S)\).

Proof

Without loss of generality, by applying a translation if necessary, we can reduce to the case \(\mu _S=0\). Next, by applying the transformation \(x \rightarrow (\Sigma _S)^{-1/2}x\), we reduce the theorem to the case \(\Sigma _S=\mathrm {I}\). Proposition 1 completes the proof. \(\square \)

It is worth observing that the above allows one to check whether a given set is a ball or an ellipsoid with given radius and covariance.

Fig. 4

Approximation of the letter P by an ellipse (a), the original object (b), and a circle (c), according to Corollaries 1 and 2. Since the ellipticity of the original shape is higher than its circularity, the approximation by the ellipse is more accurate

Corollary 1

Let \(S \subset \mathbb {R}^N\) with mean \(\mu _S\) and covariance \(\Sigma _S\) be given. Then

$$\begin{aligned} \lambda _N(S)/\sqrt{\det (\Sigma _S)} \le \frac{((N+2)\pi )^{N/2}}{\varGamma (N/2+1)}, \end{aligned}$$

with the equality holding if S is an ellipsoid. In such a case

$$\begin{aligned} S=\mathcal {B}_{\Sigma _S}(\mu _S,\sqrt{N+2}). \end{aligned}$$

Proof

From Observation 2 and Example 4

$$\begin{aligned} D_{KL}(\mathrm {u}_S\Vert \mathcal {G})= & {} \ln \bigg ((2\pi e)^{N/2} \cdot \frac{\sqrt{\det \Sigma _S} }{\lambda (S)}\bigg )\\\ge & {} \frac{N}{2}\ln \bigg (\frac{e}{N/2+1}\bigg )+\ln (\varGamma (N/2+1)), \end{aligned}$$

with the equality holding if S is an ellipsoid. After trivial calculations, we obtain that for every S

$$\begin{aligned} \frac{\varGamma (N/2+1) \lambda _N(S)}{((N+2)\pi )^{N/2} \cdot \sqrt{\det (\Sigma _S)}} \le 1 \end{aligned}$$

with the equality holding if S is an ellipsoid.

We denote for an arbitrary \(S\subset \mathbb {R}^N\):

$$\begin{aligned} \mathcal {E}_N(S) := \frac{\varGamma (N/2+1)}{((N+2)\pi )^{N/2}} \cdot \frac{\lambda _N(S)}{\sqrt{\det (\Sigma _S)}}, \end{aligned}$$

which is an ellipticity measure.

By considering the family of all spherical Gaussians \(\mathcal {G}_{(\cdot \mathrm {I})}\), that is, the Gaussians with covariance proportional to the identity, we obtain the formula for N-ball identification.

Corollary 2

Let \(S \subset \mathbb {R}^N\) with mean \(\mu _S\) and covariance \(\Sigma _S\) be given. Then

$$\begin{aligned} \lambda _N(S)/(\mathrm {tr}( \Sigma _S))^{N/2} \le \frac{((N+2)\pi /N)^{N/2}}{\varGamma (N/2+1)}, \end{aligned}$$

with the equality holding if S is a ball. In such a case

$$\begin{aligned} S=\mathcal {B}\bigg (\mu _S,\sqrt{\frac{N+2}{N}\mathrm {tr}(\Sigma _S)}\bigg ). \end{aligned}$$

Proof

By Observation 2 and Example 4

$$\begin{aligned} D_{KL}(\mathrm {u}_S\Vert \mathcal {G}_{(\cdot I)})= & {} \ln \bigg (\bigg (\frac{2\pi e}{N}\bigg )^{N/2}\bigg (\mathrm {tr}(\Sigma _S)\bigg )^{N/2}\frac{1}{\lambda (S)}\bigg ) \\\ge & {} \frac{N}{2}\ln \bigg (\frac{e}{N/2+1}\bigg )+\ln (\varGamma (N/2+1)), \end{aligned}$$

with the equality holding if S is a ball. After trivial calculations, we obtain

$$\begin{aligned} \frac{\varGamma (N/2+1) \cdot \lambda _N(S)}{((N+2)\pi /N)^{N/2} \cdot (\mathrm {tr}(\Sigma _S))^{N/2}} \le 1 \end{aligned}$$

with the equality holding if S is a ball.

We denote for an arbitrary \(S\subset \mathbb {R}^N\):

$$\begin{aligned} \mathcal {C}_N(S) := \frac{\varGamma (N/2+1)}{((N+2)\pi /N)^{N/2}} \cdot \frac{\lambda _N(S)}{(\mathrm {tr}( \Sigma _S))^{N/2}} \end{aligned}$$

which is a circularity measure.
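Both corollaries translate directly into a dimension-agnostic computation. The following is a minimal sketch (the function name is ours) for a shape sampled on a unit grid (\(\delta =1\)), using the \(\mathrm {I}/12\) correction from Eq. (4):

```python
import math
import numpy as np

def ellipticity_circularity(points):
    """E_N and C_N of a delta = 1 grid sample (I/12 is the per-cell correction)."""
    pts = np.asarray(points, dtype=float)
    n, N = pts.shape
    vol = float(n)                                        # lambda_N for delta = 1
    cov = np.cov(pts, rowvar=False, bias=True) + np.eye(N) / 12.0
    gamma = math.gamma(N / 2 + 1)
    eN = gamma / ((N + 2) * math.pi) ** (N / 2) * vol / math.sqrt(np.linalg.det(cov))
    cN = gamma / ((N + 2) * math.pi / N) ** (N / 2) * vol / np.trace(cov) ** (N / 2)
    return eN, cN
```

For a general grid step \(\delta \), the volume becomes \(\mathrm {card}(P)\cdot \delta ^N\) and the correction term \((\delta ^2/12)\mathrm {I}\).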

Remark 4

Directly from Corollaries 1 and 2, we can summarize that the investigated set \(S\subset \mathbb {R}^N\) can be approximated or replaced by the ball

$$\begin{aligned} \mathcal {B}\bigg (\mu _S,\sqrt{\frac{N+2}{N}\mathrm {tr}(\Sigma _S)}\bigg ) \end{aligned}$$

or the ellipsoid

$$\begin{aligned} \mathcal {B}_{\Sigma _S}(\mu _S,\sqrt{N+2}) \end{aligned}$$

with accuracies described by \(\mathcal {C}_N\) and \(\mathcal {E}_N\), respectively. This highlights that the given theory can be used as a technique for reducing the complexity of a given object (e.g., in motion tracking)—see Fig. 4. Moreover, in the case of an elliptical shape, we can obtain the object's orientation in space from the \(\Sigma _S\) matrix.

6 The New Measure Properties in Simple Illustrations

The following theorem summarizes the desirable properties of \(\mathcal {E}_N\) and \(\mathcal {C}_N\).

Theorem 3

The ellipticity measure \(\mathcal {E}_N(S)\) and the circularity measure \(\mathcal {C}_N(S)\) of a given nonempty set \(S\subset \mathbb {R}^N\) satisfy

  (a) \(\mathcal {E}_N(S)\in (0,1]\) for all sets S;

  (b) \(\mathcal {C}_N(S)\in (0,1]\) for all sets S;

  (c) \(\mathcal {E}_N(S)=1 \Leftrightarrow S \text { is an ellipsoid}\);

  (d) \(\mathcal {C}_N(S)=1 \Leftrightarrow S \text { is an } N\text {-ball}\);

  (e) \(\mathcal {C}_N\) is invariant with respect to similarity (in particular, isometric) transformations;

  (f) \(\mathcal {E}_N\) is invariant with respect to affine transformations.

Proof

Items (a)–(d) follow directly from Corollaries 1 and 2.

Items (e) and (f) follow from the properties of the covariance matrix. \(\square \)

In the following part of this section the properties of the new measures are illustrated.

When we apply the measure to image data, we treat a single pixel (a point with coordinates \((x,y)\in ([1,w]\cap \mathbb {Z})\times ([1,h]\cap \mathbb {Z})\), where \(w\times h\) is the size of the image) as a square. This allows us to convert the discrete data of an image into continuous data, which is more natural and consistent with the human perception of images. To calculate the measures under this convention, we use the formulas from Eq. (4), which provide the required characteristics of the set, namely the covariance matrix. Moreover, it is convenient to treat a discrete set as a set of hypercubes, since then we can introduce the continuous densities in a more natural way. In the case of image data we set \(\delta =1\), because the coordinates of the pixels are integers. In 3-dimensional examples we use an arbitrary \(\delta \) value.
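The recipe above is short enough to state directly in code. Below is a minimal NumPy sketch (the helper name e2_c2_image and the synthetic test mask are ours): pixel coordinates are collected, the \(\mathrm {I}/12\) covariance of a unit pixel square is added per Eq. (4) with \(\delta =1\), and Eqs. (2) and (3) are evaluated.

```python
import numpy as np

def e2_c2_image(mask):
    """E_2 and C_2 of a binary image mask, pixels treated as unit squares."""
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([xs, ys]).astype(float)
    area = float(len(pts))                               # lambda_2 with delta = 1
    cov = np.cov(pts, rowvar=False, bias=True) + np.eye(2) / 12.0
    return (area / (4 * np.pi * np.sqrt(np.linalg.det(cov))),
            area / (2 * np.pi * np.trace(cov)))

yy, xx = np.mgrid[:400, :400]
ellipse = ((xx - 200) / 150.0) ** 2 + ((yy - 200) / 90.0) ** 2 <= 1.0
print(e2_c2_image(ellipse))   # E_2 ~ 1; C_2 = 2ab/(a^2+b^2) ~ 0.88 for a=150, b=90
```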

Fig. 5

Measured ellipticity \(\mathcal {E}_2\) of a shape with added noise [31]

Fig. 6

Circularity measure for regular polygons from the equilateral triangle to the dodecagon. \(\mathcal {C}_2\) of the hexagon gives one-decimal-place accuracy, while the decagon gives an accuracy of three decimal places. Images in boxes represent approximations of the circle with an accuracy of two and three decimal places with respect to the \(\mathcal {C}_2\) measure

Fig. 7

Shapes ranked with respect to their \(\mathcal {C}_2\) circularity measure [31]. The description under the images also contains the \(\mathcal {C}_{st}\) and \(\mathcal {C}_H\) values

6.1 Non-frontal View Image Correction

A basic limitation of many image processing algorithms is that they require an “on-axis” image of the investigated object. Clearly, in most “real-life” pictures we have only a side view of the object. Figure 12 presents a basic concept of how to deal with such a situation. Namely, we modify the picture by a respective affine transformation so that the elliptical object becomes circle-shaped. To do so, we fit an optimal ellipse (denote it by C) to the object we assume to be circular, in our case a road sign (to obtain the shape of the road sign we first use a red filter and then fill the inside). Then we apply to the picture the affine operation which transforms this ellipse into a circle, given by the formula

$$\begin{aligned} x \rightarrow \Sigma _C^{-1/2}(x-\mu _C). \end{aligned}$$

This procedure transforms the elliptical object almost into a circle. For further examples we refer the reader to [19].
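A sketch of this correction step, assuming the points of the fitted ellipse C are available as an array (the function name whiten is ours; the eigendecomposition is one standard way to form \(\Sigma ^{-1/2}\)):

```python
import numpy as np

def whiten(points):
    """Apply x -> Sigma^{-1/2} (x - mu), mapping the fitted ellipse to a circle."""
    pts = np.asarray(points, dtype=float)
    mu = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False, bias=True)
    vals, vecs = np.linalg.eigh(cov)              # Sigma = V diag(vals) V^T
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return (pts - mu) @ inv_sqrt.T
```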

6.2 Noise Resistance

6.2.1 Shape Boundary Noise

Figure 5 illustrates the robustness of \(\mathcal {E}_2\). The presented shapes have similar measured ellipticity even though the last shape has a very high noise level. The noise is added to the shape boundary; thus, the perimeter of the object is increased. This experiment shows that the new measure can cope with such a situation.

6.2.2 Salt and Pepper

In common applications the images we work with contain some noise, i.e., a random, unwanted signal. Figure 13 illustrates the reliability of \(\mathcal {C}_2\) under salt-and-pepper noise, in which a certain fraction of the pixels in the image are either black or white. The percentage level describes the probability of occurrence of this kind of noise. The experiments show that, since the covariance matrix is the base component of \(\mathcal {C}_2\), the measure behaves well in such situations.

6.2.3 Missing Values Resistance

\(\mathcal {C}_2\) is capable of handling unknown or missing values. Figure 14 presents the resistance of \(\mathcal {C}_2\) to such data. The percentage describes the overall level of unknown data. It is important to notice that the circularity value increases as the amount of present data increases.

Fig. 8

Shapes ranked with respect to their \(\mathcal {E}_2\) ellipticity measure [30]. The description under the images also contains the \(\mathcal {E}_H\) and \(\mathcal {C}_2\) values

Fig. 9

Approximation of a set \(A= \{(x,y,z)\in \mathbb {R}^3: x^2 + y^2 + z^2 \le 1\}\) by cubes with side \(\delta \). Circularity of set A increases with the approximation accuracy

Fig. 10

Approximation of a set \(B= \{(x,y,z)\in \mathbb {R}^3: x^2 + 3 y^2 + 2 z^2 \le 1\}\) by cubes with side \(\delta \). Ellipticity of set B increases with the approximation accuracy

6.3 Circle Estimation

Figure 6 presents the circularity measure \(\mathcal {C}_2\) for regular polygons from the equilateral triangle to the dodecagon. The aim of this example is to find a good approximation of a circle. From Theorem 3 we know that \(\mathcal {C}_2\) reaches 1 only for a perfect circle. Thus, we want to acquire a simple template which can be treated as an approximation of a circle.

First of all, we can confirm that the circularity measure behaves in a natural way—it increases with the number of polygon sides.

Figure 6 shows that the hexagon reaches a value of 0.9924, which gives two-decimal-place accuracy. Moreover, if higher precision is needed, the decagon provides an accuracy of three decimal places.
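These polygon values are easy to reproduce numerically. The sketch below rasterizes a regular n-gon on a fine grid and evaluates \(\mathcal {C}_2\) with the \(\delta ^2/12\) correction; the grid resolution and helper name are our choices, and matplotlib's point-in-polygon test is used only for convenience:

```python
import numpy as np
from matplotlib.path import Path

def c2_regular_polygon(n, grid=801):
    """Grid-based C_2 of the regular n-gon inscribed in the unit circle."""
    t = 2 * np.pi * np.arange(n) / n
    polygon = Path(np.column_stack([np.cos(t), np.sin(t)]))
    g = np.linspace(-1.0, 1.0, grid)
    X, Y = np.meshgrid(g, g)
    pts = np.column_stack([X.ravel(), Y.ravel()])
    inside = pts[polygon.contains_points(pts)]
    delta = g[1] - g[0]
    area = len(inside) * delta ** 2
    cov = np.cov(inside, rowvar=False, bias=True) + (delta ** 2 / 12) * np.eye(2)
    return area / (2 * np.pi * np.trace(cov))

print(round(c2_regular_polygon(6), 4))   # ~0.9924 for the hexagon
```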

6.4 \(\mathcal {E}_2\) and \(\mathcal {C}_2\) Behavior

Figure 7 presents images ranked with respect to \(\mathcal {C}_2\). A different ranking is obtained by the measures \(\mathcal {C}_{st}\) and \(\mathcal {C}_H\). This example illustrates how changes in a shape lead to differences in the measured circularity. In this case the changes in the measured circularity \(\mathcal {C}_2\) are in accordance with the natural perception of how a circularity measure should behave.

Figure 8 presents the same experiment for \(\mathcal {E}_2\).

We highlight that the values of \(\mathcal {C}_{st}\) and \(\mathcal {C}_H\) were taken from [31], while \(\mathcal {E}_H\) was taken from [30]. Moreover, the values of \(\mathcal {C}_2\) and \(\mathcal {C}_H\) are theoretically equal (compare with Observation 1); the differences are caused by numerical errors. On the other hand, \(\mathcal {E}_2\) and \(\mathcal {E}_H\) are in general not equal—see Fig. 8i.

6.5 3D Shapes

In this experiment, 3-dimensional sets are considered, defined as follows:

$$\begin{aligned} A= & {} \{(x,y,z)\in \mathbb {R}^3: x^2 + y^2 + z^2 \le 1\},\\ B= & {} \{(x,y,z)\in \mathbb {R}^3: x^2 + 3 y^2 + 2 z^2 \le 1\},\\ C= & {} \{(x,y,z)\in \mathbb {R}^3: 1 \le \frac{1}{2} x^2 + y^2 + z^2 \le 2 \text { and } x y z \ge 0\}. \end{aligned}$$

Figures 9a, 10a, and 11a illustrate the completed shapes of sets A, B, and C.

Fig. 11

Approximation of a set \(C= \{(x,y,z)\in \mathbb {R}^3: 1 \le \frac{1}{2} x^2 + y^2 + z^2 \le 2 \text { and } x y z \ge 0\}\) by cubes with side \(\delta \). The set C consists of 4 pieces and is empty inside; thus, both ellipticity and circularity measures are low

Fig. 12

Road sign image preprocessing. a Original road sign image. The middle image (b) is a binarization of the original image according to the values of its red channel. After the described operations the shape of the sign is changed and is circle-like, see (c)

Fig. 13

The \(\mathcal {C}_2\) measure in the case of images with salt-and-pepper noise

Fig. 14

The \(\mathcal {C}_2\) for the sets with unknown and missing values

The next step in this experiment is to approximate each shape by cubes of fixed size. Let S denote the considered object. To obtain this approximation we proceed as follows:

  • we choose \(\delta >0\);

  • by taking \(P=S\cap (\delta \mathbb {Z})^3\) we obtain a discrete representation of our shape S;

  • each point \(x \in P\) is replaced by a cube \(Q_x\) of side \(\delta \) centered at x, namely \(\mu (Q_x)=x\) for each \(x\in P\). We put \(Q=\cup _{x\in P}Q_x\);

  • we calculate the circularity and the ellipticity of the obtained shape from the equations in Corollaries 1 and 2, as follows (a numerical sketch is given below):

    $$\begin{aligned} \mathcal {E}_3(Q)= & {} \frac{3\sqrt{5}}{100\pi }\cdot \frac{\mathrm {card}(P) \cdot \delta ^3}{ \sqrt{\det (\Sigma _P+\frac{1}{12}\delta ^2\mathrm {I})}},\\ \mathcal {C}_3(Q)= & {} \frac{9\sqrt{15}}{100\pi }\cdot \frac{\mathrm {card}(P) \cdot \delta ^3}{\big (\mathrm {tr}(\Sigma _P)+\frac{1}{4}\delta ^2\big )^{3/2}}. \end{aligned}$$

It is worth mentioning that the covariance matrix of each cube \(Q_x\) equals \(\frac{1}{12}\delta ^2\mathrm {I}\).
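For illustration, a compact NumPy sketch of the above procedure for the ball A (the grid step \(\delta \) is an arbitrary choice; the \(\delta ^2/12\) cube correction is as discussed):

```python
import numpy as np

delta = 0.05
g = np.arange(-1.0, 1.0 + delta, delta)
X, Y, Z = np.meshgrid(g, g, g)
mask = X ** 2 + Y ** 2 + Z ** 2 <= 1.0            # the ball A
P = np.column_stack([X[mask], Y[mask], Z[mask]])

vol = len(P) * delta ** 3
cov = np.cov(P, rowvar=False, bias=True) + (delta ** 2 / 12) * np.eye(3)

E3 = (3 * np.sqrt(5) / (100 * np.pi)) * vol / np.sqrt(np.linalg.det(cov))
C3 = (9 * np.sqrt(15) / (100 * np.pi)) * vol / np.trace(cov) ** 1.5
print(E3, C3)   # both close to 1 for the ball
```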

Figures 9 and 10 present the examples of a sphere and an ellipsoid, respectively. The ellipticity and circularity measures increase with the approximation accuracy. It can thus be concluded that the behavior of the new measures is natural even in higher dimensions.

Figure 11 presents the situation for a shape with a hole. Both measures respond to this defect correctly and the calculated values are low (Figs. 12, 13, 14).

7 Conclusions

The authors have placed their research efforts in the field of pattern recognition to establish new moment-based measures of circularity and ellipticity. The proposed measures work in arbitrary dimensions, so we can test, e.g., whether a set in \(\mathbb {R}^N\) is an N-ball or an ellipsoid. The theoretical background and the proof that the conditions are well defined are also presented in this work.

This approach can be treated as a generalization of the measures \(\mathcal {C}_H\) and \(\mathcal {E}_I\) mentioned in Sect. 2; however, the authors' approach can be applied in arbitrary dimensions (see Sect. 6).

The fact that circles and ellipses maximize the above invariant has enabled the authors to introduce the new circularity \(\mathcal {C}_N\) and ellipticity \(\mathcal {E}_N\) measures defined in Corollaries 1 and 2. It is shown that \(\mathcal {C}_N\) and \(\mathcal {E}_N\) range over the interval (0, 1] and equal 1 if and only if the investigated set is, respectively, a ball or an ellipsoid.

The experiments illustrate the theoretical observations and demonstrate the applicability of the new ellipticity and circularity measures. They emphasize the advantages of the new measures:

  • behavior consistent with intuition;

  • invariance under similarity transformations;

  • applicability in higher dimensions;

  • a simpler description of any given object;

  • significantly reduced calculation time.