1 Introduction

The recognition and classification of texture images, either in isolated conditions or "in the wild", is one of the most important tasks in computer vision, with numerous applications in material sciences [51], medicine [38], remote sensing [65], agriculture [56], etc.

Despite the recent popularization of learning-based approaches like deep convolutional networks in this area [15], there is still room for the investigation of "hand-engineered" image descriptors, especially when there is a small amount of data for training, which is typical, for instance, in medical applications, or when the introduction of information specific to the problem at hand (from physical modeling, for example) can improve the performance of the computational algorithm.

In this context, and inspired by the recent success of methods like the nonlinear operator derived from a partial differential equation (PDE) in [25, 28], we propose here new texture descriptors based on the application of an operator that corresponds to solutions of a pseudo-parabolic PDE (e.g., [2, 3]), where the original image provides the initial condition. Pseudo-parabolic differential equations are characterized by mixed time and space derivatives appearing in the highest-order terms [40, 64] and are applied to model many physical phenomena, for instance fluid dynamics in porous media [35]. A state-of-the-art overview of physics-informed neural networks (PINNs), which embed a PDE into the loss function of the neural network, is presented in [50]. In [8], data-driven physics-informed techniques are used to improve the comprehension of the underlying PDEs in fluid mechanics. In [42], the authors proposed an improved PDE-based total variation model that enhances gray and colored brain tumor images obtained by magnetic resonance imaging. In addition, in [43] the authors introduced an optimized support vector machine (SVM) based possibilistic fuzzy c-means clustering algorithm for tumor segmentation, supported by an efficient partial differential equation model. On the other hand, image classification methods have achieved remarkable success in diverse applications; for instance, in [45] the authors discussed an image classification method based on convolutional neural networks that detects detail features by actively transforming the relative positions of the connections via application of a convolutional kernel. Our particular interest in this type of PDE is explained by its ability to regularize a function (the image profile, in our case) while preserving discontinuities (here corresponding to edges and other relevant local variations important for the image characterization).

To obtain the image representation (descriptors), we follow the PDE application with an encoding process, here accomplished by local binary patterns (LBP) [52]. The descriptors are finally submitted to the Karhunen-Loève transform [53] and classified by a linear discriminant classifier [24]. The proposed method is assessed on state-of-the-art benchmark databases (UIUC [44], UMD [75] and KTHTIPS-2b [36]) as well as on a practical problem of Brazilian plant species identification [9]. The results confirm that the classification accuracy is competitive with both classical texture descriptors and modern approaches like convolutional neural networks.

Our study presents three main contributions:

  1. A state-of-the-art mathematical model is coupled with basic local binary patterns, using only the signal component, and achieves promising results even on challenging texture databases in the literature, competitive with modern deep-learning-based approaches.

  2. We introduce a simple and efficient discrete pseudo-parabolic operator based on finite differences and combine it with the original version of the local binary pattern operator to build texture descriptors.

  3. We also present a real-world application of the methodology. While benchmark tests are relevant for comparison purposes, practical problems frequently pose new challenges to the theoretical approach: for example, we typically have strict control neither over the acquisition process nor over the amount and quality of the available data.

This text is divided into six main sections, including this introduction. In Section 2, we present a brief review of other PDE-based methods for image analysis and processing in the literature. Section 3 presents the methodology, including a short introduction to LBP theory and the proposed method; there we also introduce the pseudo-parabolic diffusion PDE model and the novel, effective discrete pseudo-parabolic differential operator used here to solve it. Section 4 describes the setup of the texture classification experiments, in which the accuracy of the proposed method is assessed over three benchmark data sets and compared with other approaches, some of them classically used for this purpose, others from the state-of-the-art in the literature. Section 5 presents the results of these experiments, together with a comparison with the state-of-the-art on this topic and the corresponding discussion. Finally, Section 6 concludes the work, summarizing the main points and raising some general discussion on the impact of the proposed new methodology.

2 Related works

Most methods for the classification of texture images can be divided into two main categories: traditional approaches and deep learning methods [48]. In the first group, we have classical second-order statistical methods like Haralick descriptors [34] and Local Binary Patterns (LBP) [52] and, more recently, the Scale-Invariant Feature Transform (SIFT) [49], Filter Banks [68], Fisher Vectors [55], Fractal Descriptors [27], PCANet [11], Locally Encoded Transform Feature Histogram [60], Local Edge Signature (LES) [29], Multipattern Maximum-Minimum-Center encoding (MMC) [74], and many others. In the second category, deep learning methods, we have those based on Convolutional Neural Networks (CNN), such as Deep Convolutional Activation Features (DeCAF) [22], Deep Filter Banks [15], Deep Texture Encoding Network (DeepTEN) [80], Locally Transferred Fisher Vectors (LFV) [61], Deep Texture Manifold (DEP) [76], First and Second Order Network (FASON) [21], Multiple-Attribute-Perceived Network (MAP-Net) [79], Bilinear CNN Pair-wise Difference Pooling (BCNN-PDP) [23], Wavelet Multi-Level Attention Capsule Network (WMACapsNet) [63], Ranking Global Pooling Multiple CNN (RaNKGP-3M-CNN) [16], and others.

The differential equation (partial and ordinary) modeling approach is effective in a wide range of applications, ranging from up-to-date texture image classification [42, 43] (SVM) to data-driven and physics-informed neural networks (PINNs) for inverse problems [50] and machine learning for fluid mechanics [8]. In [77], the authors propose a new physics-based deformation method that uses PDE surfaces as an improved modeling framework for the creation of detailed 3D virtual geometric models of characters. In [6], the authors used a differential equation inspired by Newton's law of motion for accurate coarse soft tissue modeling combined with finite element method-based fine simulation. In [70], we have a novel approach combining PDE and physics-based methods to interactively manipulate surface shapes of 3D models with C1 continuity in real time, using a fourth-order partial differential equation involving a sculpting force originating from the elastic bending of thin plates to define physics-based deformations. In [1], the authors study a feedforward neural network to solve partial differential equations in hyperbolic-transport problems motivated by multiphase fluid flow through porous media and fluid mechanics modeling. In [12], a fractional-order differential equation for modeling COVID-19 infection of epithelial cells is presented. We refer the interested reader to [1, 6, 8, 26, 42, 43, 45, 50, 70, 77] and the references cited therein for a survey of partial differential equation modeling approaches, along with several fine-tuning tools (such as physics-informed learning, physically-based simulation, data-driven methods, and data-driven enrichment) and state-of-the-art ideas on the development of multimedia applications involving differential equations as a crucial tool.

The disseminated use of PDEs to provide different viewpoints of a digital image can be traced back to the scale-space theory of Witkin [73], formulated in terms of the canonical diffusion model, i.e., the heat equation. The well-known maximum principle of elliptic equations was employed on that occasion to ensure that spurious details not present at finer scales would also not appear at coarser scales. This is a fundamental property that must be satisfied by any robust multiscale theory [41].

However, the smoothing effect intrinsic to diffusion equations can also be a problem in image analysis: at the same time that control over the smoothing parameter allows fine tuning of the desired scale of observation, it also smooths out image edges, which are known to be powerful attributes for discriminating the objects present in the image. To address this issue, Perona and Malik proposed the anisotropic diffusion equation [54], which retains the multiscale smoothing effect while preserving edges by means of a less aggressive action of the operator in regions with high-valued gradients.

Developments of the anisotropic model can be found in [10], where the image is regularized prior to the application of the PDE operator. Cottet and Germain [18] and Weickert [71] employ diffusion tensors to address the problem of "ghost" stair-case edges arising in some situations after the application of the operator. A review of this family of approaches can be found in [72]. Another strategy, focused especially on speckle reduction and avoiding the direct use of gradients, is presented in [78]. Guidotti et al. [32] and Guidotti [31] also propose an interesting alternative named Backward-Forward Regularized Diffusion, again attenuating the "stair-case" effect and using two regularizing parameters. More recently, a promising method combining the model in [32] with texture descriptors was proposed in [5].

With regard to pseudo-parabolic equations, the early existence, uniqueness, and regularity theory is well developed, for example, in [57,58,59, 64]. Such theory predicts that the additional pseudo-parabolic term weakens the smoothing property characteristic of parabolic problems. This character has consequences on the behavior of the solution and was also observed in subsequent works: for instance, if the initial data has a jump discontinuity at some point, then so does the solution at every later time [19, 20]. Here this is a particularly important feature, as it allows a controlled smoothing effect that preserves relevant edges to some extent. It is also worth mentioning that, in general, there is no maximum principle for pseudo-parabolic equations such as the one expected of solutions to parabolic equations [62], although the specific simplified model adopted here still preserves that principle. In this work, our particular interest in this type of PDE is explained by its ability to regularize an image profile while preserving discontinuities, so as to account for the edges and local variations that characterize the image.

3 Proposed methodology

3.1 The partial differential equation modeling approach and the discrete pseudo-parabolic operator

The PDE model adopted here is of pseudo-parabolic type. In its most general form, such equations can be represented by

$$ \frac{\partial}{\partial t}\mathcal{A}(u) + \mathcal{B}(u) = 0, $$
(1)

where \(\mathcal{A}\) and \(\mathcal{B}\) are nonlinear elliptic operators. More specifically, we build our model on the pseudo-parabolic equation:

$$ \frac{\partial u}{\partial t} = \nabla \cdot \left[ g(x , y , t) \nabla \left( u + \tau \frac{\partial u}{\partial t} \right) \right], $$
(2)

where g(x,y,t) is a real function and τ > 0 is a real constant called the damping coefficient. We also identify the pseudo-parabolic diffusive flux w, given by

$$ \mathbf{w} = g(x,y,t) \nabla \left( u + \tau \frac{\partial u}{\partial t} \right). $$
(3)

Let \({{\varOmega }} \subset \mathbb {R}^{2}\) denote a rectangular domain and \({\displaystyle u(\cdot , \cdot , t):{{\varOmega }} \rightarrow \mathbb {R} }\) be a family of gray scale images satisfying the pseudo-parabolic equation (2). The original image corresponds to t = 0, i.e., it provides the initial value for the differential problem. To completely define the differential problem, we impose a zero-flux condition across the domain boundary ∂Ω,

$$ \mathbf{w} \cdot \mathbf{n} = 0, \qquad (x , y) \in \partial {{\varOmega}}, $$
(4)

where n stands for the outward normal vector to ∂Ω.

In general, the coefficient g(x,y,t) is a positive definite tensor and, in the context of nonlinear diffusion models for image processing, it may depend on ∇u [10, 54, 72]. Here, we describe the numerical approach considering g as a scalar; however, the method may be directly extended to the other cases.

For the numerical modeling of the discrete pseudo-parabolic operator, we employ a uniform rectangular mesh and cell-centered finite differences. To construct this scheme, we use ideas borrowed from the mixed finite element method in [2, 3]. The mixed finite element framework has two advantages [13]: it yields a very natural and physical discretization of the boundary conditions, and it gives a consistent way of defining a gradient flux.

We now discuss a simple and effective finite difference approach for the pseudo-parabolic equation (2). Consider a uniform partition of Ω into rectangular subdomains Ωi,j, for \(i = 1 , {\dots } , m\) and \(j = 1 , {\dots } , l\), with dimensions Δx ×Δy. The center of each subdomain Ωi,j is denoted by (xi,yj). Given a final simulation time T, consider a uniform partition of the interval [0,T] into N subintervals. The time step Δt = T/N is usually defined by a stability condition. We denote the time instants by tn = nΔt, for \(n = 0 , {\dots } , N\).

Let \(U_{i,j}^{n}\) be a finite difference approximation for u(xi,yj,tn). A discretization of (2) by the finite difference method is given by

$$ \frac{U_{i,j}^{n+1} - U_{i,j}^{n}}{{{\varDelta}} t} = \frac{W_{i+\frac{1}{2},j}^{n+1} - W_{i-\frac{1}{2},j}^{n+1}}{{{\varDelta}} x} + \frac{W_{i,j+\frac{1}{2}}^{n+1} - W_{i,j-\frac{1}{2}}^{n+1}}{{{\varDelta}} y}, $$
(5)

where the approximation of the diffusive flux is given by a centered difference formula:

$$ \begin{array}{@{}rcl@{}} {}W_{i+\frac{1}{2},j}^{n+1} &=& G_{i+\frac{1}{2},j}^{n} \left( \frac{U_{i+1,j}^{n+1} - U_{i,j}^{n+1}}{{{\varDelta}} x} \right) + \frac{\tau G_{i+\frac{1}{2},j}^{n}}{{{\varDelta}} t} \left( \frac{U_{i+1,j}^{n+1} - U_{i,j}^{n+1}}{{{\varDelta}} x} - \frac{U_{i+1,j}^{n} - U_{i,j}^{n}}{{{\varDelta}} x} \right), \end{array} $$
(6a)
$$ \begin{array}{@{}rcl@{}} {}W_{i,j+\frac{1}{2}}^{n+1}& =& G_{i,j+\frac{1}{2}}^{n} \left( \frac{U_{i,j+1}^{n+1} - U_{i,j}^{n+1}}{{{\varDelta}} y} \right) + \frac{\tau G_{i,j+\frac{1}{2}}^{n}}{{{\varDelta}} t} \left( \frac{U_{i,j+1}^{n+1} - U_{i,j}^{n+1}}{{{\varDelta}} y} - \frac{U_{i,j+1}^{n} - U_{i,j}^{n}}{{{\varDelta}} y} \right). \end{array} $$
(6b)

The coefficients are chosen as the arithmetic mean at interfaces:

$$ G_{i+\frac{1}{2},j}^{n} = \frac{G_{i,j}^{n} + G_{i+1,j}^{n}}{2}, \qquad G_{i,j+\frac{1}{2}}^{n} = \frac{G_{i,j}^{n} + G_{i,j+1}^{n}}{2}, $$
(7)

where \(G_{i,j}^{n}\) is an approximation for g(xi,yj,tn). We choose to evaluate the coefficients at time tn; this choice leads to a linear algebraic system for the unknowns \(U_{i,j}^{n+1}\). The algebraic equations may be written as

$$ a_{i,j-\frac{1}{2}} U_{i,j-1}^{n+1} + a_{i-\frac{1}{2},j} U_{i-1,j}^{n+1} + a_{i,j} U_{i,j}^{n+1} + a_{i+\frac{1}{2},j} U_{i+1,j}^{n+1} + a_{i,j+\frac{1}{2}} U_{i,j+1}^{n+1} = b_{i,j}, $$
(8)

where

$$ \begin{array}{@{}rcl@{}} b_{i,j} &=& U_{i,j}^{n} -\tau \left[ \frac{ G_{i-\frac{1}{2},j}^{n} U_{i-1,j}^{n} - \left( G_{i-\frac{1}{2},j}^{n} + G_{i+\frac{1}{2},j}^{n}\right) U_{i,j}^{n} + G_{i+\frac{1}{2},j}^{n} U_{i+1,j}^{n} }{{{\varDelta}} x^{2}} \right. \\ &&\left. + \frac{ G_{i,j-\frac{1}{2}}^{n} U_{i,j-1}^{n} -\left( G_{i,j-\frac{1}{2}}^{n} + G_{i,j+\frac{1}{2}}^{n}\right) U_{i,j}^{n} + G_{i,j+\frac{1}{2}}^{n} U_{i,j+1}^{n} }{{{\varDelta}} y^{2}} \right], \end{array} $$
(9)

and the coefficients are given by

$$ \begin{array}{@{}rcl@{}} a_{i,j-\frac{1}{2}} &=& - \frac{({{\varDelta}} t + \tau)}{{{\varDelta}} y^{2}} G_{i,j-\frac{1}{2}}^{n}, \\ a_{i,j+\frac{1}{2}} &=& - \frac{({{\varDelta}} t + \tau)}{{{\varDelta}} y^{2}} G_{i,j+\frac{1}{2}}^{n}, \\ a_{i-\frac{1}{2},j} &=& - \frac{ ({{\varDelta}} t + \tau)}{{{\varDelta}} x^{2}} G_{i-\frac{1}{2},j}^{n}, \\ a_{i+\frac{1}{2},j} &=& - \frac{({{\varDelta}} t + \tau)}{{{\varDelta}} x^{2}} G_{i+\frac{1}{2},j}^{n}, \\ a_{i,j}& =& -\left( a_{i,j-\frac{1}{2}} + a_{i,j+\frac{1}{2}} + a_{i-\frac{1}{2},j} + a_{i+\frac{1}{2},j}\right) + 1. \end{array} $$
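For clarity, note that in one space dimension and with g ≡ 1 (the setting adopted later in this work), the scheme (5)-(6) can be rearranged into

$$ U_{i}^{n+1} - ({{\varDelta}} t + \tau) \frac{U_{i-1}^{n+1} - 2 U_{i}^{n+1} + U_{i+1}^{n+1}}{{{\varDelta}} x^{2}} = U_{i}^{n} - \tau \frac{U_{i-1}^{n} - 2 U_{i}^{n} + U_{i+1}^{n}}{{{\varDelta}} x^{2}}, $$

which makes explicit that the implicit operator acts with weight Δt + τ, while the pseudo-parabolic term contributes the τ-weighted discrete Laplacian of the previous state to the right-hand side.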

Finally, to impose the boundary conditions, we write the algebraic equations for the subdomains that intersect the boundary accordingly, with the fluxes across ∂Ω set to zero.

For each time step, we have to solve a linear system \(A^{n} U^{n+1} = b^{n}\), where \(U^{n+1}\) is the vector of unknowns (the numerical solution at tn+1) and \(b^{n}\) is the vector of right-hand-side terms (9). The matrix \(A^{n}\) of the linear system is symmetric positive definite and sparse, so efficient methods may be applied to solve the algebraic system. In this work, we use the Preconditioned Conjugate Gradient (PCG) method.

In this work, we consider g(x,y,t) ≡ 1, so that (2) becomes a linear pseudo-parabolic equation. For an image to be processed, each subdomain Ωi,j corresponds to a pixel, i.e., the mesh parameters are Δx = Δy = 1. We choose the time step Δt = Δx and the damping coefficient τ = 5.

Algorithms 1 and 2 show the computational routines involved in the described numerical scheme in a pseudo-code language. Tables 1 and 2 list the auxiliary notation used in the algorithms. Algorithm 2 refers to the inner subdomains; those subdomains intersecting the boundary can be implemented similarly with only small adaptations. More details can be found in [69].

Algorithm 1
Algorithm 2
Table 1 Variables used in Algorithms 1 and 2
Table 2 Pre-programmed routines assumed to be available to Algorithms 1 and 2
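To make the scheme concrete, below is a minimal Python sketch of one implicit time step for the linear case g ≡ 1 with zero-flux boundaries. The function name, the plain cell-by-cell assembly, and the use of an unpreconditioned conjugate gradient solver are illustrative simplifications of Algorithms 1 and 2, not the exact routines used in our experiments:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import cg

def pseudo_parabolic_step(U, tau=5.0, dt=1.0, dx=1.0, dy=1.0):
    """One implicit step of scheme (5)-(9) with g = 1 and zero-flux boundaries.

    Assembles A U^{n+1} = b, where off-diagonal entries are -(dt + tau)/h^2
    and b = U^n - tau * (discrete Laplacian of U^n); fluxes across the domain
    boundary are simply dropped, which realizes condition (4).
    """
    m, l = U.shape
    idx = lambda i, j: i * l + j                 # flatten the 2D cell index
    A = lil_matrix((m * l, m * l))
    b = np.empty(m * l)
    cx, cy = (dt + tau) / dx**2, (dt + tau) / dy**2
    rx, ry = tau / dx**2, tau / dy**2
    for i in range(m):
        for j in range(l):
            k, diag, rhs = idx(i, j), 1.0, float(U[i, j])
            for ii, jj, c, r in ((i - 1, j, cx, rx), (i + 1, j, cx, rx),
                                 (i, j - 1, cy, ry), (i, j + 1, cy, ry)):
                if 0 <= ii < m and 0 <= jj < l:  # neighbor inside the domain
                    A[k, idx(ii, jj)] = -c
                    diag += c
                    rhs += r * (float(U[i, j]) - float(U[ii, jj]))
            A[k, k] = diag
            b[k] = rhs
    # the paper uses PCG; plain CG is enough for this illustration
    x, info = cg(A.tocsr(), b, x0=U.ravel().astype(float))
    return x.reshape(m, l)
```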

3.2 Local binary patterns

Local binary patterns (LBP) [52] are gray level texture descriptors developed mainly for texture classification purposes. In its original and simplest version, LBP assigns a code to each pixel (the reference pixel) taking into consideration its gray value gc and the gray values gp of P neighbor pixels equally spaced on a circle of radius R centered at the reference pixel. The code is provided by

$$ LBP_{P,R} = \sum\limits_{p=0}^{P-1}H(g_{p}-g_{c})2^{p}, $$
(10)

where H is the Heaviside step function (H(x) = 1 if x ≥ 0 and H(x) = 0 otherwise) and the coordinates of gp are given by

$$ (x_{p},y_{p}) = (x_{c} - R\sin(2\pi p/P),y_{c} + R\cos(2\pi p/P)), $$
(11)

where (xc,yc) are the coordinates of the central reference pixel. If (xp,yp) are not integer values, the value of gp is obtained by bilinear interpolation.

Here we use an improved version of (10), which ensures rotation invariance and is more discriminative. It maps all rotated versions of a pattern to the same code, keeping the so-called uniform patterns (those with at most two 0/1 transitions along the circle) and collapsing the remaining, non-uniform, patterns into a single label. The new descriptor is given by

$$ LBP_{P,R}^{riu2} = \left\{ \begin{array}{ll} {\sum}_{p=0}^{P-1}H(g_{p}-g_{c}) & \text{if } \mathcal{U}(LBP_{P,R})\leq 2\\ P+1 & \text{otherwise}, \end{array} \right. $$
(12)

where the uniformity measure \(\mathcal{U}\) is given by

$$ \mathcal{U}(LBP_{P,R}) = |H(g_{P-1}-g_{c})-H(g_{0}-g_{c})| + \sum\limits_{p=1}^{P-1}|H(g_{p}-g_{c})-H(g_{p-1}-g_{c})|. $$
(13)
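For concreteness, a direct (unoptimized) NumPy sketch of the \(LBP_{P,R}^{riu2}\) encoding of (12), with the neighbor sampling of (11) and the uniformity measure of (13), could look as follows; skipping the border pixels is one possible convention, adopted here merely for simplicity:

```python
import numpy as np

def lbp_riu2(img, P=8, R=1):
    """Rotation-invariant uniform LBP codes, following (11)-(13)."""
    img = img.astype(float)
    h, w = img.shape
    codes = np.full((h, w), P + 1, dtype=np.int32)    # non-uniform label P + 1
    ang = 2 * np.pi * np.arange(P) / P
    dx, dy = -R * np.sin(ang), R * np.cos(ang)        # offsets from (11)
    pad = int(np.ceil(R)) + 1                         # keep interpolation in bounds
    for yc in range(pad, h - pad):
        for xc in range(pad, w - pad):
            gc = img[yc, xc]
            gp = np.empty(P)
            for p in range(P):                        # bilinear interpolation
                x, y = xc + dx[p], yc + dy[p]
                x0, y0 = int(np.floor(x)), int(np.floor(y))
                fx, fy = x - x0, y - y0
                gp[p] = ((1 - fx) * (1 - fy) * img[y0, x0]
                         + fx * (1 - fy) * img[y0, x0 + 1]
                         + (1 - fx) * fy * img[y0 + 1, x0]
                         + fx * fy * img[y0 + 1, x0 + 1])
            s = (gp >= gc).astype(int)                # H(g_p - g_c)
            U = np.abs(np.diff(np.r_[s, s[0]])).sum() # uniformity measure (13)
            if U <= 2:
                codes[yc, xc] = s.sum()               # uniform: count the ones
    return codes
```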

3.3 Proposed descriptors

Recalling the notation in Section 3.1, the texture descriptors developed here are obtained by evolving the pseudo-parabolic operator on the image u(⋅,⋅,t0) over a range of times t1,t2,⋯ ,tN. In this way we obtain a family of transformed images {u(⋅,⋅,tn)} for n = 1,⋯ ,N. Empirically, we found that N = 50 provides a reasonable compromise between computational cost and recognition accuracy.

In the next step, we apply the \(LBP_{P,R}^{riu2}\) operator over each evolved image. We adopted the combinations of P,R values listed in Table 3.

Table 3 Parameters of \(LBP_{P,R}^{riu2}\) codes used in this work

These combinations were chosen both based on recent publications on LBP, e.g. [47], which demonstrated their suitability, and on empirical tests specifically designed for the problem that we are addressing here.

Finally, as usual in the LBP pipeline, we compute the histogram h for all possible values of \(LBP_{P,R}^{riu2}\) codes. We can formally summarize the descriptors by

$$ \mathfrak{D}(u) = \bigcup\limits_{\substack{(P,R) = \{(8,1),\\(16,2),(24,3),\\(24,4)\}}}\bigcup\limits_{n=0}^{N} \mathrm{h}\left( LBP_{P,R}^{riu2}(u(\cdot,\cdot,t_{n}))\right). $$
(14)

We end up with a large feature vector for each image, with 4000 components. This requires some strategy to avoid issues like the curse of dimensionality. To cope with that, we apply the Karhunen-Loève transform [53] and select the number of uncorrelated features that provides sufficient discriminative capacity. Figure 1 shows a flow chart of the main steps involved in the proposed method, whereas Figs. 2, 3, and 4 visually illustrate the outcomes of those steps for three exemplar texture images. To facilitate the visualization, we show only some of the parameter values (tn, P and R) used to provide the descriptors.
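Putting the pieces together, (14) amounts to the sketch below, which reuses the hypothetical helpers pseudo_parabolic_step and lbp_riu2 outlined earlier; the histogram normalization and the number of retained Karhunen-Loève components are illustrative assumptions:

```python
import numpy as np

SCALES = [(8, 1), (16, 2), (24, 3), (24, 4)]    # the (P, R) pairs of Table 3

def texture_descriptors(img, N=50, tau=5.0):
    """Descriptors of (14): LBP^riu2 histograms of the evolved images."""
    feats = []
    U = img.astype(float)
    for n in range(N + 1):                      # n = 0 is the original image
        for P, R in SCALES:
            codes = lbp_riu2(U, P, R)
            hist = np.bincount(codes.ravel(), minlength=P + 2)
            feats.append(hist / hist.sum())     # normalization is our assumption
        if n < N:
            U = pseudo_parabolic_step(U, tau=tau)
    return np.concatenate(feats)                # (N + 1) * 80 codes, ~4000 components

# Karhunen-Loève reduction, realized here as PCA over all descriptors:
# from sklearn.decomposition import PCA
# X = np.vstack([texture_descriptors(im) for im in images])
# X_kl = PCA(n_components=200).fit_transform(X)   # 200 is illustrative
```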

Fig. 1

Flow chart of the proposed method

Fig. 2

Outcomes of the most important steps of the proposed descriptors: original texture image, application of the pseudo-parabolic operator, LBP encoding and histogram

Fig. 3

Outcomes of the most important steps of the proposed descriptors: original texture image, application of the pseudo-parabolic operator, LBP encoding and histogram

Fig. 4

Outcomes of the most important steps of the proposed descriptors: original texture image, application of the pseudo-parabolic operator, LBP encoding and histogram

3.4 Motivation

The numerical model employed here is inspired by works like [66], where the Buckley-Leverett equation is extended to two-phase flow in porous media with dynamic capillary pressure, yielding the following general equation:

$$ \frac{\partial }{\partial t}(\phi u) + \frac{\partial F(u)}{\partial x} = \frac{\partial }{\partial x} \left( G(u) \frac{\partial}{\partial x} \left( J(u) + \tau \frac{\partial }{\partial t}(\phi u) \right) \right). $$
(15)

This model presents some important characteristics that are helpful for an application like the one described here. First, it is sufficiently flexible to represent a diversity of physical phenomena, partially due to the presence of a third-order mixed derivative term. It can be demonstrated, for example, that the coefficient τ is critical for the type of solution profile of the regularized Buckley-Leverett problem. This parameter acts as a bifurcation threshold: if τ is within a certain range, the solution profile is similar to classical shock waves; for higher values of τ, on the other hand, the profile changes abruptly and new types of shock waves take place in the solution. In some extreme cases, the solution may exhibit a non-monotone profile or even damped oscillations.

Another interesting characteristic is that pseudo-parabolic equations like that in [66] can also be interpreted as a regularization of the hyperbolic Buckley-Leverett equation [2, 3]. Nevertheless, unlike classical regularization schemes based, for example, on standard diffusion processes, the pseudo-parabolic term attenuates the smoothing effect typically associated with parabolic solutions. It is known, for instance, that jump discontinuities present at some time persist in the solution for the remainder of the time evolution. In practice, this implies that the model employed here is strongly characterized by the preservation of sharp features. Whereas parabolic differential equations are known for increasing regularity regardless of the information conveyed by the image, here the edges and other important elements of the image expressed by discontinuities are not smoothed away by the operator with the same aggressiveness. Preserving discontinuities is an important feature in scale-space approaches to image analysis, well explored, for example, in the classical work of Perona and Malik [54].

Generally speaking, the numerical modeling employed here consists of a fully coupled space-time approach, capable of appropriately accounting for the diffusive flux naturally appearing in the pseudo-parabolic equation. In terms of its use as an image analysis tool, this implies controlled conditions for the regularization process. Following it with the local encoding of binary patterns makes the proposed descriptors robust in many aspects: both to geometrical variations, addressed by LBP, and in terms of edge localization, preserved by the differential operator.

4 Experiments

The accuracy of the proposed method in texture classification is assessed over three benchmark data sets and compared with other approaches, some of them classically used for this purpose, others from the state-of-the-art in the literature.

UIUC [44] is a data set of gray level texture images. Each image has dimensions 256×256, and we have 40 images per class and 25 classes, totaling 1000 images. The images are collected under uncontrolled conditions, with variations in scale, illumination, perspective and albedo. The photographed materials include bark, wood, water, floor, pebbles, wall, brick, glass, carpet, upholstery, wallpaper, fur, knit, corduroy, and plaid. The training/testing split follows the classical protocol, with half of the images from each class randomly selected for training and the remaining ones for testing. Such random division is repeated 10 times to compute the average accuracy of the classification process.

UMD [75] is another gray scale texture database that shares some similarities with UIUC, like the number of classes and images per class. The images are also acquired under uncontrolled conditions. Nevertheless, UMD images have high resolution: each sample has dimensions 1280×960. The intra-class variation in scale and viewpoint is also more intense, which adds difficulty to the class-wise discrimination. In terms of materials, we have representations of store shelves, pebbles, food, foliage, wall, fruits, fabrics, grass, wood, and others. The training/testing split follows the same protocol as UIUC.

Finally, KTHTIPS-2b [36] is a collection of color texture images (here we convert the samples to gray scale), comprising a total of 4752 images divided into 11 classes, each one corresponding to a particular material: aluminium foil, brown bread, corduroy, cork, cotton, cracker, lettuce leaf, linen, white bread, wood, and wool. The most important motivation behind the development of this database is the focus on the real material regardless of the conditions under which the object was photographed. Each class is divided into 4 samples, and each sample represents a particular setting of illumination, scale and pose. The training/testing split is the one classically used in the literature, where one sample is used for training and the remaining ones are employed for testing.

The feature vectors resulting from the concatenation of different transform parameters and LBP settings are large, and a Karhunen-Loève transform is used to reduce dimensionality. Finally, we tested three classifiers to process the descriptors: linear discriminant analysis (LDA) [24], support vector machines (SVM) [17], and random forests (RF) [37]. All the results from the literature compared with our approach were obtained on the same databases, under the best configurations reported by their authors.
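As an illustration of this setup, the UIUC/UMD evaluation protocol with the LDA pipeline could be sketched as follows, where X (the descriptor matrix) and y (the class labels) are assumed to be available and the number of retained components is an illustrative choice:

```python
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score

# X: one descriptor row per image, as in (14); y: integer class labels
clf = make_pipeline(PCA(n_components=200), LinearDiscriminantAnalysis())

# ten random half/half splits, mimicking the UIUC/UMD protocol
cv = StratifiedShuffleSplit(n_splits=10, train_size=0.5, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv)
print(f"mean accuracy: {scores.mean():.4f} +/- {scores.std():.4f}")
```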

5 Results and discussion

Figure 5 shows the classification accuracy when using LDA, SVM and RF. All these classifiers use the data in its original form, without any transformation; in this way, they are expected to highlight the discriminative power of the descriptors. LDA outperforms SVM and RF in this task by a relevant margin. The nature of each classifier can explain such difference: LDA was originally developed to handle multi-class problems, avoiding strategies like "one-against-all" employed by SVM. Especially in ideal situations when the classes are balanced, methods dealing with all classes on an equal footing tend to work more appropriately.

Fig. 5

Classification accuracy using different classifiers

Figure 6 depicts the behavior of the classification accuracy when the parameter N (maximum evolution time of the PDE operator) is varied. This is actually the most relevant free parameter to be tuned in this method. According to the plots, the results achieved for N ≥ 50 are quite similar; this is the point where the image is so smoothed that no more relevant information can be captured by the image descriptors. Based on this outcome, we employ N = 50 in the remaining experiments.

Fig. 6

Accuracy for different values of N

Table 4 lists the accuracy (percentage of images assigned by the classifier to the correct class) for each database, compared with results recently published in the literature on the same databases. Our approach outperforms several "handcrafted" descriptors and even some learning-based methods like those described in [15]. Whereas for UIUC and UMD several solutions achieving accuracy close to 100% have already been proposed, KTHTIPS-2b is much more challenging, especially due to its training protocol, in which the algorithm should be able to recognize a material under different viewpoints, illuminations, etc., based on features representing the material only under an original condition. Even here, our proposal outperforms several "hand-engineered" descriptors like SIFT and LBP, and it is also competitive with modern CNN-based approaches.

Table 4 Accuracy of the proposed descriptors compared with other texture descriptors in the literature

Another important test in texture recognition is how the classifier performance is affected by a reduced number of training samples. In particular, results on UIUC varying the number of training images are fairly common in the literature, possibly because the usual protocol (20 images for training) can be considered not sufficiently challenging. Figure 7 illustrates how the accuracy of our proposal is affected by changing the number of training samples, in comparison with other methods whose corresponding results are also published in the literature. More specifically, we show the accuracy when using 10, 15 and 20 training samples and compare with MRS4 [68], MR8 [68] and affine spin [44]. It can be noticed that the proposed algorithm is robust to the reduction of the training subset: its accuracy was only slightly lower than affine spin for 10 training images and higher than the compared approaches in the other situations.

Fig. 7

Accuracy when the number of training samples in UIUC database is set to 5, 10, 15, and 20

Figure 8 depicts the confusion matrices for the proposed method on the benchmark datasets. In this representation, a gray map visually expresses the number of samples assigned by the classifier to class A while actually belonging to class B. An ideal matrix would present a solid diagonal and no gray points outside it. The pictures confirm the accuracy achieved for each database. UIUC and UMD, for example, have a nearly solid diagonal, as expected from their classification success rates close to 100%. Figure 10 illustrates two images from UIUC classes 3 ("bark") and 13 ("wall") where we have some significant confusion. This can be explained by similar directional patterns present in both materials; even in this case, however, the error is below 5%. A similar illustration is presented in Fig. 11 for UMD. Again, we notice that the presence of a regular pattern with similar directionality made the discrimination more challenging, even though here we also have a small error of ≈ 6%. KTH-TIPS2b, on the other hand, presents a much more complex picture. It is interesting to observe the classes that were most difficult to classify: classes 3 ("corduroy"), 5 ("cotton"), 8 ("linen"), and 11 ("wool") were the most usually confused. All these materials are related, as they are types of fabric, which makes such confusion somehow expected. Figure 9 exemplifies two images from classes 5 and 8; notice how similar they look, such that discriminating them is a difficult task even by visual inspection.
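The gray-map representation itself can be reproduced with a few lines; the sketch below is a hypothetical rendering, where y_true and y_pred denote the ground-truth and predicted labels of the test images:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred)
plt.imshow(cm, cmap="gray_r")      # darker cell = more samples assigned there
plt.xlabel("predicted class")
plt.ylabel("true class")
plt.colorbar()
plt.show()
```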

Fig. 8

Confusion matrices. (a) KTHTIPS-2b. (b) UIUC. (c) UMD

Fig. 9

Examples of textures from the most confused classes in KTH-TIPS2b

Fig. 10

Examples of textures from the most confused classes in UIUC

Fig. 11

Examples of textures from the most confused classes in UMD

In summary, the proposed methodology achieved an accuracy in the classification of benchmark texture databases that is competitive with the state-of-the-art on this topic. And this was obtained by a method that requires neither much computational power nor large amounts of data for training. The pseudo-parabolic PDE-based descriptors were expected to be meaningful, as the operator filters out unnecessary details and noise while effectively preserving discontinuity regions. Here, this model confirmed its capacity to provide an alternative description of the way pixels are locally related on the image. Such locality is quantified by the well-known LBP codes, and this combination demonstrated its power in the achieved results. In terms of limitations, the only relevant point is that the proposed approach has more hyper-parameters than some deep learning approaches. Nevertheless, it is well known that hyper-parameter tuning is not a major issue if a validation set is separated for that purpose.

5.1 Application

We also apply the proposed descriptors to a practical problem, namely, the identification of Brazilian plant species based on scanned images of the leaf surface. The samples are collected in vivo, washed, and aligned with the basal/apical axis. The analyzed data set is called 1200Tex [9] and comprises a total of 1200 images, previously converted to gray scale, corresponding to 20 species (classes). Each image has dimension 128 × 128.

Table 5 lists the accuracy for the proposed PDE descriptors on this task in comparison with other results recently published in the literature for the same database. It is particularly noticeable how deep learning approaches like FC-CNN and FV-CNN are outperformed by the proposed method.

Table 5 State-of-the-art accuracies for 1200Tex

Figure 12 depicts the corresponding confusion matrix for the proposed descriptors on 1200Tex. The matrix is nearly diagonal, as ideally expected, and the most relevant confusion takes place in class 8 (confused with class 6). Figure 13 shows a few samples from those two groups, illustrating the complexity of separating the respective classes. It is known in botany that the most important elements to distinguish species by looking at the leaf surface are the nervures, especially the primary nervure. In Fig. 13, we observe that those structures are quite similar in both classes, such that the limited performance there was expected.

Fig. 12

Confusion matrix for 1200Tex

Fig. 13

Illustrative samples of the most confused classes in 1200Tex

In general terms, we see a good classification performance, competitive with the most advanced methods in the state-of-the-art. This is quite interesting considering that we have a model that is neither computationally expensive nor dependent on large amounts of data for training. It also confirms the potential of the developed methodology in practice, especially in problems where visual texture is a relevant attribute, a common situation, for example, in biological applications and material sciences.

6 Conclusions

This work presented a methodology to recognize texture images based on the application of a multiscale operator derived from a pseudo-parabolic diffusion PDE model. Using the simple and efficient discrete pseudo-parabolic differential operator based on finite differences introduced in this work, LBP codes are extracted from the transformed images to compose the feature vector, which is finally used as input to a classifier. The method was tested on benchmark databases and in a practical application in botany, and the classification accuracy was compared to classical and state-of-the-art texture recognition approaches. The results showed that our proposal is competitive even with the most recently published approaches on this topic. This also confirms how such "hand-crafted" descriptors can still be useful, especially when the computational structure at hand is not sufficiently powerful or when large amounts of data are not available, a common situation in areas like medicine, for example. It is worth highlighting that the differential model introduced in this work, along with its discrete counterpart and its effectiveness in a texture recognition task, suggests the future investigation of even more complex models, such as those in [1,2,3]. More specifically, the numerical scheme adopted here could be coupled, for instance, with modern convolutional neural networks to provide rich and robust descriptions for textures or even for general object recognition tasks.