Abstract
This work proposes a novel method for texture recognition based on a pseudo-parabolic diffusion process. The proposed operator is applied over a range of time scales, giving rise to a family of images transformed by nonlinear filters. Each of those images is then encoded by a local descriptor (we use local binary patterns for that purpose) and summarized by a simple histogram, which yields the image feature vector. Three main novelties are presented in this manuscript: (1) The introduction of a pseudo-parabolic model associated with the sign component of binary patterns to the process of texture recognition, together with a real-world application to the problem of identifying plant species based on the leaf surface image. (2) A simple and efficient discrete pseudo-parabolic differential operator, based on finite differences, used to build texture descriptors. While the work in [26] uses completed local binary patterns, here we use the original version of the local binary pattern operator. (3) A discussion, in more general terms, of the possibilities of exploring pseudo-parabolic models for image analysis, as they balance two types of processing that are fundamental for pattern recognition: they smooth undesirable details (possibly noise) while highlighting relevant borders and discontinuities anisotropically. Besides the practical application, the proposed approach is also tested on the classification of well-established benchmark texture databases. In both cases, it is compared with several state-of-the-art methodologies employed for texture recognition. Our proposal outperforms several of those methods in terms of classification accuracy, confirming its competitiveness.
The good performance can be explained to a large extent by the ability of the pseudo-parabolic operator to smooth possibly noisy details inside homogeneous regions of the image while preserving discontinuities that convey critical information for the object description. Such results also confirm that model-based approaches like the proposed one can still be competitive with the omnipresent learning-based approaches, especially when the user does not have access to powerful computational infrastructure and a large amount of labeled data for training.
1 Introduction
The recognition and classification of texture images, either in isolated conditions or “in the wild”, is one of the most important tasks in computer vision, with numerous applications in material sciences [51], medicine [38], remote sensing [65], agriculture [56], etc.
Despite the recent popularization of learning-based approaches like deep convolutional networks in this area [15], there is still room for the investigation of “hand-engineered” image descriptors, especially in situations where there is a small amount of data for training, which is typical, for instance, in medical applications, or where the introduction of specific information about the particular problem (from the physical modeling, for example) can contribute to the performance of the computational algorithm.
In this context, and inspired by the recent success of methods like the nonlinear operator derived from a partial differential equation (PDE) in [25, 28], we propose here new texture descriptors based on the application of an operator that corresponds to solutions of a pseudo-parabolic PDE (e.g., [2, 3]), with the original image providing the initial condition. Pseudo-parabolic differential equations are characterized by mixed time and space derivatives appearing in the highest-order terms [40, 64] and are applied to model many physical phenomena, for instance fluid dynamics in porous media [35]. A state-of-the-art overview of physics-informed neural networks (PINNs), which embed a PDE into the loss of the neural network, is presented in [50]. In [8], data-driven physics-informed techniques are used with the aim of improving the comprehension of the underlying PDE in fluid mechanics. In [42], the authors proposed an improved PDE-based total variation model that enhances gray and color brain tumor images obtained by magnetic resonance imaging. In addition, in [43] the authors introduced an optimized support vector machine (SVM) based possibilistic fuzzy c-means clustering algorithm for tumor segmentation supported by efficient partial differential equation modeling. On the other hand, image classification methods have achieved remarkable success in diverse applications. For instance, in [45] the authors discussed an image classification method based on convolutional neural networks that detects detail features by actively transforming the relative positions of the connections through a convolutional kernel.
Here our particular interest in this type of PDE is explained by its ability to regularize a function (the image profile in our case) while preserving discontinuities (here corresponding to edges and other relevant local variations important for the image characterization).
To obtain the image representation (descriptors), we follow the PDE application with an encoding process, which here is accomplished by local binary patterns (LBP) [52]. The descriptors are finally submitted to the Karhunen-Loève transform [53] and classified by a linear discriminant classifier [24]. The proposed method is assessed on state-of-the-art benchmark databases (UIUC [44], UMD [75] and KTHTIPS-2b [36]) as well as on a practical problem of Brazilian plant species identification [9]. The results confirm that the classification accuracy is competitive both with classical texture descriptors and with modern approaches like convolutional neural networks.
Our study presents three main contributions:
1. A state-of-the-art mathematical model is coupled with basic local binary patterns, using only the sign component, and achieves promising results even on challenging texture databases from the literature, competitive with modern deep-learning-based approaches.
2. We introduce a simple and efficient discrete pseudo-parabolic PDE operator based on finite differences, which is combined with the original version of the local binary pattern operator to produce texture descriptors.
3. We present a real-world application of the methodology. While benchmark tests are relevant for comparison purposes, practical problems frequently pose new challenges to the theoretical approach since, for example, we typically have strict control neither over the acquisition process nor over the amount and quality of the available data.
This text is divided into six main sections, including this introduction. Section 2 presents a brief review of other PDE-based methods for image analysis and processing in the literature, where we also highlight the three main contributions. Section 3 describes the methodology: it introduces the pseudo-parabolic diffusion PDE model and the discrete pseudo-parabolic differential operator used here to solve it, gives a short introduction to LBP theory, and presents the proposed method. Section 4 describes the setup for the texture classification experiments, in which the accuracy of the proposed method is assessed over three benchmark data sets and compared with other approaches, some of them classically used for this purpose and others from the state-of-the-art in the literature. Section 5 presents the results of these experiments and the corresponding discussion. Finally, Section 6 concludes the work, summarizing the main points and raising some general discussion on the impact of the proposed methodology.
2 Related works
Most methods for the classification of texture images can be divided into two main categories: traditional approaches and deep learning methods [48]. In the first group, we have the classical second-order statistical methods like Haralick descriptors [34] and Local Binary Patterns (LBP) [52]. More recently, we have seen the Scale-Invariant Feature Transform (SIFT) [49], Filter Banks [68], Fisher Vector [55], Fractal Descriptors [27], PCANet [11], Locally Encoded Transform Feature Histogram [60], Local Edge Signature (LES) [29], Multipattern Maximum-Minimum-Center encoding (MMC) [74], and many others. In the second category, deep learning methods, we have those based on Convolutional Neural Networks (CNN), such as Deep Convolutional Activation Feature (DeCAF) [22], Deep Filter Banks [15], Deep Texture Encoding Network (DeepTEN) [80], Locally Transferred Fisher Vectors (LFV) [61], Deep Texture Manifold (DEP) [76], First and Second Order Network (FASON) [21], Multiple-Attribute-Perceived (MAP-Net) [79], Bilinear CNN Pair-wise Difference Pooling (BCNN-PDP) [23], Wavelet Multi-Level Attention Capsule Network (WMACapsNet) [63], Ranking Global Pooling Multiple CNN (RaNKGP-3M-CNN) [16], and others.
Approaches based on differential equation models (partial and ordinary) are effective in a wide range of applications, from texture image classification [42, 43] (SVM) to data-driven and physics-informed neural networks (PINNs) involving inverse problems [50] and machine learning for fluid mechanics [8]. In [77], the authors propose an efficient physics-based character deformation method that uses PDE surfaces to provide an improved modeling framework for the creation of detailed 3D virtual geometric models of characters. In [6], the authors used a differential equation inspired by Newton’s law of motion for accurate coarse soft tissue modeling using finite element method-based fine simulation. In [70], a novel approach combines PDE and physics-based methods to interactively manipulate surface shapes of 3D models with C1 continuity in real time, using a fourth-order partial differential equation that involves a sculpting force originating from the elastic bending of thin plates to define physics-based deformations. In [1], the authors study a feedforward neural network to solve partial differential equations in hyperbolic-transport problems motivated by multiphase fluid flow through porous media and fluid mechanics modeling. A fractional-order differential equation for modeling COVID-19 infection of epithelial cells is presented in [12]. We refer the interested reader to [1, 6, 8, 26, 42, 43, 45, 50, 70, 77] and the references cited therein for a survey of partial differential equation modeling approaches, along with tools such as physics-informed and physically based simulation, data-driven methods, and data-driven enrichment, and for state-of-the-art ideas on the development of multimedia applications in which differential equations are a crucial tool.
The widespread use of PDEs to provide different viewpoints of a digital image can be traced back to the scale-space theory of Witkin [73], formulated in terms of the canonical model of diffusion equation, i.e., the heat equation. The well-known maximum principle of the elliptic equations was employed on that occasion to ensure that spurious details not present in finer scales would also not appear in coarser scales. This is a fundamental property that must be satisfied by any robust multiscale theory [41].
However, the smoothing effect intrinsic to diffusion equations can also represent a problem in image analysis as, at the same time that the control over the smoothing parameter allows for fine tuning of the desired scale of observation, it also smooths out image edges. Edges are known to be powerful attributes in the discrimination of objects present in the image. To address this issue, Perona and Malik proposed the anisotropic diffusion equation in [54]. This ensures the smoothing multiscale effect while preserving edges by means of a less aggressive action of the operator in regions with high-valued gradients.
Extensions of the anisotropic model can be found in [10], where the image is regularized prior to the application of the PDE operator. Cottet and Germain [18] and Weickert [71] employ diffusion tensors to address the problem of “ghost” staircase edges arising in some situations after the application of the operator. A review of this family of approaches can be found in [72]. Another strategy, focused especially on speckle reduction and avoiding the direct use of gradients, is presented in [78]. Guidotti et al. [32] and Guidotti [31] also propose an interesting alternative named Backward-Forward Regularized Diffusion, again attenuating the staircase effect and using two regularizing parameters. More recently, a promising method combining the model in [32] with texture descriptors was proposed in [5].
With regard to pseudo-parabolic equations, the early existence, uniqueness, and regularity theory is well developed, for example, in [57, 58, 59, 64]. Such theory predicts that the additional pseudo-parabolic term decreases the smoothing property characteristic of parabolic problems. This has consequences for the behavior of the solution and was also observed in subsequent works. For instance, if the initial data has a jump discontinuity at some point, then so does the solution for every time [19, 20]. Here this is a particularly important feature, as it allows a controlled smoothing effect that preserves relevant edges to some extent. It is also worth mentioning that, in general, there is no maximum principle for pseudo-parabolic equations such as the one expected for solutions of parabolic equations [62], although the specific simplified model adopted here still preserves that principle. In this work, our particular interest in this type of PDE is explained by its ability to regularize an image profile while preserving the discontinuities that account for the edges and local variations used in the image characterization.
3 Proposed methodology
3.1 The partial differential equation modeling approach and the discrete pseudo-parabolic operator
The PDE model adopted here is of pseudo-parabolic type. In their most general form, such equations can be represented by
where \(\mathcal {A}\) and \({\mathscr{B}}\) are nonlinear elliptic operators. More specifically, we build our model on the pseudo-parabolic equation:
where g(x,y,t) is a real function and τ > 0 is a real constant named damping coefficient. We also identify the pseudo-parabolic diffusive flux w as given by
Let \({{\varOmega }} \subset \mathbb {R}^{2}\) denote a rectangular domain and \({\displaystyle u(\cdot , \cdot , t):{{\varOmega }} \rightarrow \mathbb {R} }\) be a family of gray scale images that satisfies the pseudo-parabolic equation (2). The original image corresponds to t = 0, i.e., it provides the initial value for the differential problem. To completely define the differential problem, we impose a zero flux condition across the domain boundary \(\partial {\varOmega}\),
where n stands for the normal vector.
In general, the coefficient g(x,y,t) is a positive definite tensor and, in the context of nonlinear diffusion models for image processing, it may depend on ∇u [10, 54, 72]. Here, we describe the numerical approach considering g as a scalar. However, the method may be directly extended for the other cases.
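For concreteness, one form of the model that is consistent with the definitions above can be written as follows. This is a sketch under stated assumptions: a scalar coefficient g, the flux sign convention shown in (3), and a common way of writing the abstract form (1). The numbering (1) to (4) is inferred from the references to (2) in the text.

```latex
\begin{align}
  \frac{\partial}{\partial t}\,\mathcal{A}(u) + \mathscr{B}(u) &= 0, \tag{1}\\
  \frac{\partial u}{\partial t} &= \nabla \cdot \left[ g(x,y,t)\, \nabla\!\left( u + \tau \frac{\partial u}{\partial t} \right) \right], \tag{2}\\
  \mathbf{w} &= -\, g(x,y,t)\, \nabla\!\left( u + \tau \frac{\partial u}{\partial t} \right), \tag{3}\\
  \mathbf{w} \cdot \mathbf{n} &= 0 \quad \text{on } \partial{\varOmega}. \tag{4}
\end{align}
```

With this convention, (2) reads \(\partial u / \partial t + \nabla \cdot \mathbf{w} = 0\), and setting \(\tau = 0\) recovers a classical diffusion equation.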
For the numerical modeling of the discrete pseudo-parabolic operator we employ a uniform rectangular mesh and cell-centered finite differences. To construct this scheme, we use ideas learned from the mixed finite element method in [2, 3]. The mixed finite element framework has two advantages [13]: it yields a very natural and physical discretization of the boundary conditions; and it gives a consistent way of defining a gradient flux.
Now, we will discuss a simple and effective finite difference approach for the pseudo-parabolic equation (2). Consider a uniform partition of \({\varOmega}\) into rectangular subdomains \({\varOmega}_{i,j}\), for \(i = 1 , {\dots } , m\) and \(j = 1 , {\dots } , l\), with dimensions \(\Delta x \times \Delta y\). The center of each subdomain \({\varOmega}_{i,j}\) is denoted by \((x_i, y_j)\). Given a final simulation time T, consider a uniform partition of the interval [0,T] into N subintervals. The time step \(\Delta t = T/N\) is usually defined by a stability condition. We denote the time instants as \(t_n = n \Delta t\), for \(n = 0 , {\dots } , N\).
Let \(U_{i,j}^{n}\) be a finite difference approximation for \(u(x_i, y_j, t_n)\). A discretization of (2) by the finite difference method is given by
the approximation of the diffusive flux is given by a centered difference formula:
The coefficients are chosen as the arithmetic mean at interfaces:
where \(G_{i,j}^{n}\) is an approximation for \(g(x_i, y_j, t_n)\). We chose to use the coefficients at time \(t_n\). This choice leads to an algebraic linear system over the unknowns \(U_{i,j}^{n+1}\). The algebraic equations may be written as
where,
and the coefficients are given by
Finally, to impose the boundary conditions, we have written the algebraic equations for the subdomains that intersect the boundary.
For each time step, we have to solve a linear system \(A^{n} U^{n+1} = b^{n}\), where \(U^{n+1}\) is the vector of unknowns (the numerical solution approximation at \(t_{n+1}\)) and \(b^{n}\) is the vector of right-hand side terms (9). The matrix \(A^{n}\) of the linear system is symmetric positive definite and sparse, thus efficient methods may be applied to solve the algebraic system. In this work, we use the Preconditioned Conjugate Gradient Method (PCG).
In this work, we consider \(g(x,y,t) \equiv 1\), so that (2) becomes a linear pseudo-parabolic equation. For an image to be processed, each subdomain \({\varOmega}_{i,j}\) corresponds to a pixel, i.e., the mesh parameters are \(\Delta x = \Delta y = 1\). We choose the time step \(\Delta t = \Delta x\) and the damping coefficient \(\tau = 5\).
Algorithms 1 and 2 show the computational routines involved in the described numerical scheme in a pseudo-code language. Tables 1 and 2 list the auxiliary notation used in the algorithms. Algorithm 2 refers to the inner subdomains. Those subdomains intersecting the boundary can be implemented similarly with only small adaptations. More details can be found in [69].
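As a worked instance of the scheme in this linear setting, assume a backward Euler discretization in time. This is an assumption of this sketch: the text states only that each step requires solving a linear system for the unknowns at the new time level. Writing \(\Delta_h\) for the five-point cell-centered Laplacian with zero-flux boundaries (notation introduced here), one step would read:

```latex
\begin{align*}
  \frac{U^{n+1} - U^{n}}{\Delta t}
    &= \Delta_h \left( U^{n+1} + \tau\, \frac{U^{n+1} - U^{n}}{\Delta t} \right), \\
  \left[ I - (\Delta t + \tau)\, \Delta_h \right] U^{n+1}
    &= \left[ I - \tau\, \Delta_h \right] U^{n}.
\end{align*}
```

Since \(-\Delta_h\) is symmetric positive semidefinite, the matrix on the left-hand side is symmetric positive definite, in agreement with the properties stated for the system matrix. With \(\Delta t = 1\) and \(\tau = 5\), the two operators become \(I - 6\Delta_h\) and \(I - 5\Delta_h\).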
![figure a](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs11042-022-12048-2/MediaObjects/11042_2022_12048_Figa_HTML.png)
![figure b](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs11042-022-12048-2/MediaObjects/11042_2022_12048_Figb_HTML.png)
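The time-stepping routine can be sketched in plain Python for the linear case g ≡ 1 with Δx = Δy = Δt = 1 and τ = 5. This is a minimal illustration and not the authors' Algorithms 1 and 2: it assumes a backward Euler step, so that each step solves (I − (Δt + τ)Δ_h) U^{n+1} = (I − τΔ_h) U^n, applies the five-point Laplacian without assembling a matrix, and uses an unpreconditioned conjugate gradient loop in place of PCG. All function names are hypothetical.

```python
def neg_laplacian(u):
    """Apply minus the five-point Laplacian (unit spacing, zero-flux boundary)."""
    m, l = len(u), len(u[0])
    out = [[0.0] * l for _ in range(m)]
    for i in range(m):
        for j in range(l):
            for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if 0 <= a < m and 0 <= b < l:  # boundary faces carry no flux
                    out[i][j] += u[i][j] - u[a][b]
    return out


def axpy(alpha, x, y):
    """Return alpha * x + y for images stored as lists of rows."""
    return [[alpha * xv + yv for xv, yv in zip(rx, ry)] for rx, ry in zip(x, y)]


def dot(x, y):
    """Euclidean inner product of two images."""
    return sum(xv * yv for rx, ry in zip(x, y) for xv, yv in zip(rx, ry))


def shifted(u, c):
    """Apply (I - c * Laplacian) to u."""
    return axpy(c, neg_laplacian(u), u)


def conjugate_gradient(c, b, x0, tol=1e-10, max_iter=500):
    """Solve (I - c * Laplacian) x = b by plain conjugate gradients."""
    x = [row[:] for row in x0]
    r = axpy(-1.0, shifted(x, c), b)       # r = b - A x
    p = [row[:] for row in r]
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = shifted(p, c)
        alpha = rs / dot(p, ap)
        x = axpy(alpha, p, x)              # x <- x + alpha p
        r = axpy(-alpha, ap, r)            # r <- r - alpha A p
        rs_new = dot(r, r)
        p = axpy(rs_new / rs, p, r)        # p <- r + beta p
        rs = rs_new
    return x


def pseudo_parabolic_step(u, dt=1.0, tau=5.0):
    """Advance the image u by one (assumed) backward Euler step."""
    rhs = shifted(u, tau)                  # (I - tau * Laplacian) U^n
    return conjugate_gradient(dt + tau, rhs, u)


if __name__ == "__main__":
    step_image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
    for row in pseudo_parabolic_step(step_image):
        print(["%.3f" % v for v in row])
```

Calling `pseudo_parabolic_step` repeatedly on its own output yields the family of evolved images; the zero-flux boundary keeps the sum of gray values unchanged from one step to the next, up to the solver tolerance.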
3.2 Local binary patterns
Local binary patterns (LBP) [52] are gray level texture descriptors developed mainly for texture classification purposes. In its original and simplest version, LBP assigns a code to each pixel (reference pixel) taking into consideration its gray value \(g_c\) and the gray values \(g_p\) of P neighbor pixels equally spaced on a circle centered at the reference pixel with radius R. Such code is provided by
where H is the step (Heaviside) function: H(x) = 1 if x ≥ 0 and H(x) = 0 otherwise, and the coordinates of \(g_p\) are given by
where \((x_c, y_c)\) are the coordinates of the central reference pixel. If \((x_p, y_p)\) are not integer values, the value of \(g_p\) is obtained by linear interpolation.
Here we use an improved version of (10), which ensures rotation invariance and is more discriminative. It keeps the so-called uniform patterns, i.e., those with at most two transitions between 0 and 1 along the circle, as separate codes and groups all remaining patterns under a single code. The new descriptor is given by
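In the notation of this section, the standard LBP code and neighbor coordinates from the LBP literature take the following form. The numbering as (10) follows the reference made in the text, and the minus sign in \(y_p\) reflects the usual image convention with the row index growing downward, which is an assumption here.

```latex
\begin{align}
  LBP_{P,R} &= \sum_{p=0}^{P-1} H(g_p - g_c)\, 2^{p}, \tag{10}\\
  x_p &= x_c + R\cos\!\left(\frac{2\pi p}{P}\right), \qquad
  y_p = y_c - R\sin\!\left(\frac{2\pi p}{P}\right). \notag
\end{align}
```

For example, with P = 8, if exactly the neighbors p = 0, 1, 2 satisfy \(g_p \ge g_c\), the code is \(2^0 + 2^1 + 2^2 = 7\).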
provided that
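The rotation-invariant uniform encoding can be sketched in plain Python. The sketch follows the standard riu2 definition from the LBP literature: the code equals the number of neighbors with \(g_p \ge g_c\) when the circular bit pattern has at most two 0/1 transitions, and P + 1 otherwise. The function names, the use of bilinear interpolation, the restriction to pixels at least ⌈R⌉ away from the border, and the small tolerance `eps` that guards ties against round-off are choices of this sketch, not details taken from the text.

```python
import math


def bilinear(img, y, x):
    """Bilinearly interpolate img (a list of rows) at real coordinates (y, x)."""
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1 = min(y0 + 1, len(img) - 1)
    x1 = min(x0 + 1, len(img[0]) - 1)
    fy, fx = y - y0, x - x0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom


def riu2_code(bits):
    """Map a circular list of 0/1 bits to its riu2 code in 0..P+1."""
    P = len(bits)
    transitions = sum(bits[p] != bits[(p + 1) % P] for p in range(P))
    return sum(bits) if transitions <= 2 else P + 1


def lbp_riu2(img, P=8, R=1.0, eps=1e-9):
    """Return the riu2 code map for pixels at least ceil(R) from the border."""
    margin = int(math.ceil(R))
    codes = []
    for yc in range(margin, len(img) - margin):
        row = []
        for xc in range(margin, len(img[0]) - margin):
            bits = []
            for p in range(P):
                angle = 2.0 * math.pi * p / P
                # Rounding removes round-off at positions that are exact pixels.
                xp = round(xc + R * math.cos(angle), 10)
                yp = round(yc - R * math.sin(angle), 10)
                gp = bilinear(img, yp, xp)
                bits.append(1 if gp >= img[yc][xc] - eps else 0)
            row.append(riu2_code(bits))
        codes.append(row)
    return codes


if __name__ == "__main__":
    vertical_edge = [[0, 0, 10, 10, 10] for _ in range(5)]
    print(lbp_riu2(vertical_edge))
```

Because the code depends only on the number of set bits and the number of transitions, a vertical edge and the same edge rotated by 90 degrees receive the same code at the center pixel.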
3.3 Proposed descriptors
Recalling the notation in Section 3.1, the texture descriptors developed here are obtained by evolving the pseudo-parabolic operator on the image \(u(\cdot,\cdot,t_0)\) over a range of times \(t_1, t_2, \cdots, t_N\). In this way we have a family of transformed images \(\{u(\cdot,\cdot,t_n)\}\) for \(n = 1, \cdots, N\). Empirically, we found that N = 50 provides a reasonable compromise between computational cost and recognition accuracy.
In the next step, we apply the \(LBP_{P,R}^{riu2}\) operator over each evolved image. We adopted the combinations of P,R values listed in Table 3.
These combinations were chosen both based on recent publications on LBP, e.g. [47], which demonstrated their suitability, and on empirical tests specifically designed for the problem that we are addressing here.
Finally, as usual in the LBP pipeline, we compute the histogram h for all possible values of \(LBP_{P,R}^{riu2}\) codes. We can formally summarize the descriptors by
We end up with a large feature vector for each image, with 4000 components. This requires some strategy to avoid issues like the curse of dimensionality. To cope with that, we apply the Karhunen-Loève transform [53] and select the number of uncorrelated features that provides sufficient discriminative capacity. Figure 1 shows a flow chart of the main steps involved in the proposed method, whereas Figs. 2, 3, and 4 visually illustrate the outcomes of those corresponding steps for three exemplar texture images. To facilitate the visualization, we show only some of the parameter values (\(t_n\), P and R) used to provide the descriptors.
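The assembly of the feature vector can be sketched as a concatenation of riu2 histograms over time steps and (P, R) settings. Each riu2 histogram has P + 2 bins, so with N = 50 the reported length of 4000 corresponds to 80 bins per evolved image. The settings used in the usage note below are hypothetical values that happen to add up to 80 bins; they are not taken from Table 3. The function names and the use of raw counts (no normalization) are also assumptions of this sketch.

```python
def riu2_histogram(codes, P):
    """Count occurrences of each riu2 code 0..P+1 in a 2-D code map."""
    hist = [0] * (P + 2)
    for row in codes:
        for code in row:
            hist[code] += 1
    return hist


def assemble_descriptor(code_maps):
    """Concatenate histograms over time steps and LBP settings.

    code_maps maps a key (n, P, R) to the 2-D riu2 code map computed
    on the evolved image at time step n with LBP parameters (P, R).
    """
    feature = []
    for key in sorted(code_maps):
        _, P, _ = key
        feature.extend(riu2_histogram(code_maps[key], P))
    return feature


def descriptor_length(num_steps, settings):
    """Length of the concatenated descriptor for a list of (P, R) pairs."""
    return num_steps * sum(P + 2 for P, _ in settings)
```

For instance, `descriptor_length(50, [(8, 1), (16, 2), (24, 3), (24, 4)])` evaluates to 50 × (10 + 18 + 26 + 26) = 4000, which matches the reported dimension under these hypothetical settings.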
3.4 Motivation
The numerical model employed here is inspired by works like [66], where the Buckley-Leverett equation is extended to a two-phase flow in porous media with dynamic capillary pressure, yielding the following general equation:
This model presents some important characteristics that are helpful for an application like the one described here. First, it is sufficiently flexible to represent a diversity of physical phenomena, partially due to the existence of a third order mixed derivative term. It can be demonstrated, for example, that the coefficient τ is critical for the type of solution profile with respect to the regularized Buckley-Leverett problem. Such parameter acts as a bifurcation threshold: if τ is within a certain range, the solution profile is similar to the classical shock waves; on the other hand, for higher values of τ, the profile changes abruptly and new types of shock waves take place in the solution. In some extreme cases, the solution may contain a non-monotone profile or even exhibit damped oscillation.
Another interesting characteristic is that pseudo-parabolic equations like that in [66] can also be interpreted as a regularization of the hyperbolic Buckley-Leverett equation [2, 3]. Nevertheless, unlike classical regularization schemes based, for example, on standard diffusion processes, the pseudo-parabolic term attenuates the smoothing effect typically associated with parabolic solutions. It is known, for instance, that jump discontinuities present at some time persist in the solution for the remainder of the time evolution. In practice, this implies that the model employed here is strongly characterized by regularity preservation. Whereas parabolic differential equations are known for their effect of increasing regularity regardless of the information conveyed by the image, here the edges and other important elements of the image expressed by discontinuities are not removed by the operator with the same aggressiveness. Preserving discontinuities is an important feature in scale-space approaches to image analysis and is well explored, for example, in the classical work of Perona and Malik [54].
Generally speaking, the numerical modeling employed here consists of a fully coupled space-time approach, capable of appropriately accounting for the diffusive flux naturally appearing in the pseudo-parabolic equation. In terms of its use as an image analysis tool, this implies controlled conditions for the regularization process. Following it with the local encoding of binary patterns makes the proposed descriptors robust in many aspects, both to the geometrical variations addressed by LBP and with respect to the edge localization preserved by the differential operator.
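The attenuated smoothing can be made explicit in the linear constant-coefficient case. Assuming that the model reduces to \(u_t = \Delta u + \tau \Delta u_t\) when g ≡ 1 (an assumption about the form of the equation), a Fourier mode with wave vector k evolves as:

```latex
\begin{align*}
  \hat{u}_t = -|k|^{2}\,\hat{u} - \tau |k|^{2}\,\hat{u}_t
  \quad\Longrightarrow\quad
  \hat{u}(k,t) = \hat{u}(k,0)\, \exp\!\left( -\frac{|k|^{2}}{1 + \tau |k|^{2}}\, t \right).
\end{align*}
```

The decay rate \(|k|^{2} / (1 + \tau |k|^{2})\) is bounded above by \(1/\tau\) for every frequency, whereas for the heat equation (\(\tau = 0\)) it grows like \(|k|^{2}\). High-frequency content such as edges is therefore damped at a bounded rate instead of being removed almost instantly.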
4 Experiments
The accuracy of the proposed method in texture classification is assessed over three benchmark data sets and compared with other approaches, some of them classically used for this purpose and others from the state-of-the-art in the literature.
UIUC [44] is a data set of gray level texture images. Each image has dimensions 256×256 and we have 40 images per class and 25 classes, totaling 1000 images. Those images are collected under uncontrolled conditions with variations in scale, illumination, perspective and albedo. The photographed materials include bark, wood, water, floor, pebbles, wall, brick, glass, carpet, upholstery, wallpaper, fur, knit, corduroy, and plaid. The training/testing split follows the classically used protocol, with half of the images from each class randomly selected for training and the remaining ones for testing. Such random division is repeated 10 times to compute the average accuracy of the classification process.
UMD [75] is another gray scale texture database that shares some similarities with UIUC, like the number of classes and images per class. The images are also acquired under uncontrolled conditions. Nevertheless, UMD images have high resolution: each sample has dimensions 1280×960. The intra-class variation in scale and viewpoint is also more intense, which adds difficulties to the class-wise discrimination. In terms of materials, we have representations of store shelves, pebbles, food, foliage, wall, fruits, fabrics, grass, wood, and others. The training/testing split follows the same protocol as UIUC.
Finally, KTHTIPS-2b [36] is a collection of color texture images (here we convert the samples to gray scale), comprising a total of 4752 images divided into 11 classes, each one corresponding to a particular material. The materials are aluminium foil, brown bread, corduroy, cork, cotton, cracker, lettuce leaf, linen, white bread, wood, and wool. The most important motivation behind the development of this database is the focus on the real material regardless of the conditions under which the object was photographed. Each class is divided into 4 samples, and each sample represents a particular setting of illumination, scale and pose. The training/testing split is the one classically used in the literature, where one sample is used for training and the remaining ones are employed for testing.
The feature vectors resulting from the concatenation of different transform parameters and LBP settings are large, and a Karhunen-Loève transform is used to reduce dimensionality. Finally, we tested three classifiers to process the descriptors, i.e., linear discriminant analysis (LDA) [24], support vector machines (SVM) [17], and random forests (RF) [37]. In our experiments, all the results from the literature compared with our approach were obtained on the same databases and with ideal configurations.
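The split protocol described above can be sketched as follows. The sketch draws half of the images of each class at random for training, repeats the draw 10 times, and averages the accuracy. The names `half_split` and `mean_accuracy`, the fixed seed, and the callback `fit_predict` (a placeholder for any classifier that is trained on the training part and returns predictions for the test part) are hypothetical.

```python
import random


def half_split(labels, rng):
    """Randomly pick half of the indices of each class for training."""
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    train, test = [], []
    for lab in sorted(by_class):
        idxs = by_class[lab][:]
        rng.shuffle(idxs)
        half = len(idxs) // 2
        train.extend(idxs[:half])
        test.extend(idxs[half:])
    return train, test


def mean_accuracy(features, labels, fit_predict, rounds=10, seed=0):
    """Average test accuracy over repeated random half splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(rounds):
        train, test = half_split(labels, rng)
        predictions = fit_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in test],
        )
        correct = sum(p == labels[i] for p, i in zip(predictions, test))
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)
```

With 25 classes of 40 images each, every round trains on 20 images per class (500 in total) and tests on the remaining 500.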
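The sample-based protocol can be sketched in the same spirit. With 4752 images, 11 classes and 4 samples per class, an even distribution gives 108 images per sample, so each split trains on 11 × 108 = 1188 images and tests on the remaining 3564. Iterating over all four choices of the training sample and the labels "a" to "d" are assumptions of this sketch, and `sample_splits` is a hypothetical name.

```python
def sample_splits(sample_ids):
    """Yield (held, train, test): train on one physical sample, test on the rest.

    sample_ids[i] identifies the physical sample that image i belongs to
    (the same identifier is reused across classes, e.g. "a", "b", "c", "d").
    """
    for held in sorted(set(sample_ids)):
        train = [i for i, s in enumerate(sample_ids) if s == held]
        test = [i for i, s in enumerate(sample_ids) if s != held]
        yield held, train, test


if __name__ == "__main__":
    ids = [s for _ in range(11) for s in "abcd" for _ in range(108)]
    for held, train, test in sample_splits(ids):
        print(held, len(train), len(test))
```

Training and testing never share a physical sample in this arrangement, which is what makes the protocol harder than a random split of the images.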
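For reference, the Karhunen-Loève transform used for dimensionality reduction can be summarized as a projection onto the leading eigenvectors of the sample covariance matrix. The symbols below (\(x_i\) for the feature vector of training image i, M for the number of training images, k for the number of retained components) are notation introduced for this sketch only.

```latex
\begin{align*}
  \mu &= \frac{1}{M} \sum_{i=1}^{M} x_i, \qquad
  C = \frac{1}{M-1} \sum_{i=1}^{M} (x_i - \mu)(x_i - \mu)^{\top}, \\
  C\, v_j &= \lambda_j v_j, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge 0, \\
  y_i &= \left[ v_1, \dots, v_k \right]^{\top} (x_i - \mu) \in \mathbb{R}^{k}.
\end{align*}
```

The components of \(y_i\) are uncorrelated over the training set, and k plays the role of the number of uncorrelated features selected for classification.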
5 Results and discussion
Figure 5 shows the classification accuracy when using LDA, SVM and RF. All these classifiers use the data in its original form, without any transformation. In this way they are expected to highlight the discriminative power of the descriptors. LDA outperforms SVM and RF in this task by a relevant margin. The nature of each classifier can explain such a difference. LDA was originally developed to handle multi-class problems, avoiding strategies like “one-against-all” employed by SVM. Especially in ideal situations where the classes are balanced, methods dealing with all classes on an equal footing tend to work more appropriately.
Figure 6 depicts the behavior of the classification accuracy when the parameter N (maximum iteration time of the PDE operator) is varied. This is actually the most relevant free parameter that we have to tune in this method. According to the plots, the results achieved by N ≥ 50 are quite similar. This is the critical point where the image is so smoothed that no more relevant information can be captured by the image descriptors. Based on this outcome, we employ N = 50 for the remaining experiments.
Table 4 lists the accuracy (percentage of images assigned by the classifier to the correct classes) for each database and compares it with results recently published in the literature on the same databases. Our approach outperforms several “handcrafted” descriptors and even some learning-based methods like those described in [15]. Whereas for UIUC and UMD several solutions have already been proposed that achieve accuracy close to 100%, KTHTIPS-2b is much more challenging, especially due to its training protocol, in which the algorithm should be able to recognize a material at different viewpoints, illuminations, etc. based on features representing the material only under an original condition. Even here, our proposal outperforms several “hand-engineered” descriptors like SIFT and LBP. It is also competitive with the modern CNN-based approaches.
Another important test in texture recognition is on how the classifier performance is affected by a reduced number of samples for training. In particular, results on UIUC varying the number of training images are fairly common in the literature, possibly because the usual protocol (20 images for training) can be considered as not sufficiently challenging. Figure 7 illustrates how the accuracy of our proposal is affected by changing the number of training samples in comparison with other methods whose corresponding results are also published in the literature. More specifically, we show the accuracy when using 10, 15 and 20 training samples and compare with MRS4 [68], MR8 [68] and affine spin [44]. It can be noticed that the proposed algorithm can be considered robust to the reduction of the training subset. Its accuracy was only a little smaller than affine spin for 10 training images but was larger than the compared approaches in other situations.
Figure 8 depicts the confusion matrices for the proposed method on the benchmark datasets. In that representation, a gray map is employed to visually express the number of samples assigned by the classifier to class A and actually pertaining to class B. An ideal matrix would present a solid diagonal and no gray point outside (Fig. 9). The pictures confirm the accuracy achieved for each database. UIUC and UMD, for example, have a nearly solid diagonal, as expected from their success rate in the classification close to 100%. Figure 10 illustrates two images from classes 3 (“bark”) and 13 (“wall”) where we have some significant confusion. This can be explained by similar directional patterns present in both materials. Even in this case, however, the error is below 5%. A similar illustration is presented in Fig. 11 for UMD. Again we notice that the presence of a regular pattern with similar directionality made the discrimination more challenging, even though here we also have a small error of ≈ 6%. KTHTIPS-2b, on the other hand, presents a much more complex picture. It is interesting to observe here the classes that were more difficult to classify. In this case, classes 3 (“corduroy”), 5 (“cotton”), 8 (“linen”), and 11 (“wool”) were the most frequently confused. All these materials are related, as they are types of fabric, which makes such confusion somewhat expected. Figure 9 exemplifies two images from classes 5 and 8. Notice how similar they look; discriminating them is a difficult task even by visual inspection.
In summary, the proposed methodology achieved an accuracy in the classification of benchmark texture databases that is competitive with the state-of-the-art in this topic. This was obtained by a method that requires neither much computational power nor large amounts of data for training. The use of a pseudo-parabolic PDE was expected to provide meaningful descriptors, as it filters out unnecessary details and noise while effectively preserving discontinuity regions. Here, this model confirmed its capacity to provide an alternative description of the way that pixels are locally related in the image. Such locality is quantified by the well-known LBP codes, and the combination demonstrated its power in the achieved results. In terms of limitations, the only relevant mention is the fact that the proposed approach has more hyper-parameters involved than some deep learning approaches. Nevertheless, it is well known that hyper-parameter tuning is not a big issue if a validation set is set aside for that purpose.
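The quantities read from such a confusion matrix can be sketched as follows. The convention that rows index the true class and columns the predicted class, integer class labels starting at 0, and the function names are assumptions of this sketch.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Entry [t][p] counts samples of true class t assigned to class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    total = sum(map(sum, matrix))
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def most_confused(matrix):
    """Return (true, predicted, count) for the largest off-diagonal entry."""
    best = (None, None, 0)
    for t, row in enumerate(matrix):
        for p, count in enumerate(row):
            if t != p and count > best[2]:
                best = (t, p, count)
    return best


if __name__ == "__main__":
    m = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], 3)
    print(m, accuracy(m), most_confused(m))
```

A perfectly diagonal matrix gives an accuracy of 1, and `most_confused` points to the class pair that deserves a visual comparison of the kind discussed above.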
5.1 Application
We also apply the proposed descriptors to a practical problem, namely, the identification of Brazilian plant species based on scanned images of the leaf surface. The samples are collected in vivo, washed, and aligned with the basal/apical axis. The analyzed data set is called 1200Tex [9] and comprises a total of 1200 images, previously converted to grayscale, corresponding to 20 species (classes). Each image has dimension 128 × 128.
Table 5 lists the accuracy for the proposed PDE descriptors on this task in comparison with other results recently published in the literature for the same database. It is particularly noticeable how deep learning approaches like FC-CNN and FV-CNN are outperformed by the proposed method.
Figure 12 depicts the corresponding confusion matrix for the proposed descriptors on 1200Tex. The matrix is nearly diagonal, as ideally expected, and the most relevant confusion takes place at class 8 (confused with class 6). Figure 13 shows a few samples from those two groups, illustrating the complexity of separating the respective classes. It is known in botany that the most important elements to distinguish species when looking at the leaf surface are the nervures, especially the primary nervure. In Fig. 13 we observe that those structures are quite similar in both classes, so that the limited performance was expected.
In general terms, we see a good classification performance, competitive with the most advanced methods in the state-of-the-art. This is quite interesting considering that the model is neither computationally expensive nor dependent on large amounts of data for training. It also confirms the potential of the developed methodology in practice, especially in problems where visual texture is a relevant attribute, a common situation, for example, in biological applications and materials science.
6 Conclusions
This work presented a methodology to recognize texture images based on the application of a multiscale operator derived from the pseudo-parabolic diffusion PDE model. The images are transformed by the simple and efficient discrete pseudo-parabolic differential operator, based on finite differences, introduced in this work. LBP codes are then extracted from the transformed images to compose the feature vector, which is finally used as input to a classifier. The method was tested on benchmark databases and in a practical application in botany. The classification accuracy was compared to classical and state-of-the-art texture recognition approaches. The results showed that our proposal was competitive even with the most recently published approaches on this topic. This also confirmed how such “hand-crafted” descriptors can still be useful, especially when the computational structure at hand is not sufficiently powerful or when large amounts of data are not available, a common situation in areas like medicine, for example. It is worth highlighting that the differential model introduced in this work, along with its discrete counterpart and its effectiveness in a texture recognition task, suggests the future investigation of even more complex models, such as those in [1,2,3]. More specifically, the numerical scheme adopted here can be coupled, for instance, with modern convolutional neural networks to provide a rich and robust description of textures or even support general object recognition tasks.
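The pipeline summarized above (diffusion over several time scales, LBP codes, histograms concatenated into a feature vector) can be sketched as follows. This is only an illustration under strong assumptions, not the operator proposed in this work: the diffusion step is replaced by a linear, isotropic pseudo-parabolic equation u_t = Δu + τΔu_t with periodic borders, solved in the Fourier domain, whereas the proposed operator is a finite-difference discretization. The function names `linear_pseudo_parabolic`, `lbp_histogram`, and `pp_lbp_features`, as well as the values of `times` and `tau`, are hypothetical.

```python
import numpy as np

def linear_pseudo_parabolic(u0, t, tau=1.0):
    """Evolve u_t = Lap(u) + tau * Lap(u_t) up to time t.

    ASSUMPTION: linear, isotropic stand-in with periodic borders;
    it is not the finite-difference operator of the paper.
    """
    rows, cols = u0.shape
    fy = np.fft.fftfreq(rows)[:, None]   # cycles per pixel, vertical
    fx = np.fft.fftfreq(cols)[None, :]   # cycles per pixel, horizontal
    # Fourier symbol of the 5-point discrete Laplacian (always <= 0).
    lam = -4.0 * (np.sin(np.pi * fy) ** 2 + np.sin(np.pi * fx) ** 2)
    # Each mode obeys (1 - tau*lam) * d/dt(u_hat) = lam * u_hat.
    decay = np.exp(lam * t / (1.0 - tau * lam))
    return np.real(np.fft.ifft2(np.fft.fft2(u0) * decay))

def lbp_histogram(img):
    """Normalized 256-bin histogram of original 8-neighbour LBP codes."""
    center = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy: img.shape[0] - 1 + dy,
                    1 + dx: img.shape[1] - 1 + dx]
        # Set this bit where the neighbour is not darker than the center.
        codes += (neigh >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def pp_lbp_features(img, times=(1.0, 2.0, 4.0), tau=1.0):
    """Concatenate LBP histograms of the image evolved to each time scale."""
    img = np.asarray(img, dtype=float)
    return np.concatenate(
        [lbp_histogram(linear_pseudo_parabolic(img, t, tau)) for t in times]
    )

# Usage on a synthetic image (hypothetical data).
rng = np.random.default_rng(0)
image = rng.random((32, 32))
features = pp_lbp_features(image)
print(features.shape)  # (768,) = 3 time scales x 256 bins
```

The resulting vector would then be passed to any standard classifier. In this linear stand-in, the zero-frequency mode is left untouched, so the mean gray level is preserved while the remaining modes are damped.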
References
Abreu E, Florindo JB (2021) A study on a feedforward neural network to solve partial differential equations in hyperbolic-transport problems. In: International conference on computational science. Springer, pp 398–411
Abreu E, Vieira J (2017) Computing numerical solutions of the pseudo-parabolic Buckley–Leverett equation with dynamic capillary pressure. Math Comput Simul 137:29–48
Abreu E, Ferraz P, Vieira J (2020) Numerical resolution of a pseudo-parabolic Buckley-Leverett model with gravity and dynamic capillary pressure in heterogeneous porous media. J Comput Phys 411:109395
Ahonen T, Matas J, He C, Pietikäinen M (2009) Rotation invariant image description with local binary pattern histogram Fourier features. In: Salberg AB, Hardeberg JY, Jenssen R (eds) Image analysis. Springer, Berlin, Heidelberg, pp 61–70
Barros Neiva M, Guidotti P, Bruno OM (2018) Enhancing LBP by preprocessing via anisotropic diffusion. Int J Modern Phys C 29(08):1850071
Bounik Z, Shamsi M, Sedaaghi MH (2020) Accurate coarse soft tissue modeling using FEM-based fine simulation. Multimed Tools Appl 79(11):7121–7134
Bruna J, Mallat S (2013) Invariant scattering convolution networks. IEEE Trans Pattern Anal Mach Intell 35(8):1872–1886
Brunton SL, Noack BR, Koumoutsakos P (2020) Machine learning for fluid mechanics. Annu Rev Fluid Mech 52:477–508
Casanova D, de Mesquita Sá Junior JJ, Bruno OM (2009) Plant leaf identification using Gabor wavelets. Int J Imaging Syst Technol 19(3):236–243
Catté F, Lions PL, Morel JM, Coll T (1992) Image selective smoothing and edge detection by nonlinear diffusion. SIAM J Numer Anal 29(1):182–193
Chan T, Jia K, Gao S, Lu J, Zeng Z, Ma Y (2015) PCANet: A simple deep learning baseline for image classification? IEEE Trans Image Process 24(12):5017–5032
Chatterjee AN, Ahmad B (2021) A fractional-order differential equation model of COVID-19 infection of epithelial cells. Chaos, Solitons Fractals 147:110952
Chavent G, Roberts J (1991) A unified physical presentation of mixed, mixed-hybrid finite elements and standard finite difference approximations for the determination of velocities in waterflow problems. Adv Water Resour 14(6):329–348
Cimpoi M, Maji S, Kokkinos I, Mohamed S, Vedaldi A (2014) Describing textures in the wild. In: Proceedings of the 2014 IEEE conference on computer vision and pattern recognition, CVPR ’14. IEEE Computer Society, Washington, DC, USA, pp 3606–3613
Cimpoi M, Maji S, Kokkinos I, Vedaldi A (2016) Deep filter banks for texture recognition, description, and segmentation. Int J Comput Vis 118(1):65–94
Condori RHM, Bruno OM (2021) Analysis of activation maps through global pooling measurements for texture classification. Inform Sci 555:260–279. https://doi.org/10.1016/j.ins.2020.09.058
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
Cottet GH, Germain L (1993) Image processing through reaction combined with nonlinear diffusion. Math Comp 61(1):659–673
Cuesta C, Hulshof J (2003) A model problem for groundwater flow with dynamic capillary pressure: stability of travelling waves. Nonlinear Anal Theory Methods Appl 52(4):1199–1218
Cuesta C, Pop I (2009) Numerical schemes for a pseudo-parabolic Burgers equation: discontinuous data and long-time behaviour. J Comput Appl Math 224:269–283
Dai X, Ng JY, Davis LS (2017) FASON: First and second order information fusion network for texture recognition. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR). https://doi.org/10.1109/CVPR.2017.646, pp 6100–6108
Donahue J, Jia Y, Vinyals O, Hoffman J, Zhang N, Tzeng E, Darrell T (2014) DeCAF: A deep convolutional activation feature for generic visual recognition. In: Proceedings of the 31st international conference on machine learning, JMLR.org, ICML’14, vol 32, pp I–647–I–655
Dong X, Zhou H, Dong J (2020) Texture classification using pair-wise difference pooling-based bilinear convolutional neural networks. IEEE Trans Image Process 29:8776–8790. https://doi.org/10.1109/TIP.2020.3019185
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen 7(2):179–188
Florindo JB (2020) DSTNet: Successive applications of the discrete Schroedinger transform for texture recognition. Inf Sci 507:356–364
Florindo JB, Abreu E (2021) An application of a pseudo-parabolic modeling to texture image recognition. In: International conference on computational science. Springer, pp 386–397
Florindo JB, Bruno OM (2016) Local fractal dimension and binary patterns in texture recognition. Pattern Recogn Lett 78:22–27
Florindo JB, Bruno OM (2017) Discrete Schroedinger transform for texture recognition. Inform Sci 415:142–155
Ghazouani H, Barhoumi W (2020) Genetic programming-based learning of texture classification descriptors from local edge signature. Expert Syst Appl 161:113667
Gonçalves WN, da Silva NR, da Fontoura Costa L, Bruno OM (2016) Texture recognition based on diffusion in networks. Inform Sci 364(C):51–71
Guidotti P (2009) A new nonlocal nonlinear diffusion of image processing. J Differ Equ 246(12):4731–4742
Guidotti P, Kim Y, Lambers J (2013) Image restoration with a new class of forward-backward-forward diffusion equations of Perona–Malik type with applications to satellite image enhancement. SIAM J Imaging Sci 6(3):1416–1444
Guo Z, Zhang L, Zhang D (2010b) Rotation invariant texture classification using LBP variance (LBPV) with global matching. Pattern Recogn 43(3):706–719
Haralick R, Shanmugam K, Dinstein I (1973) Texture features for image classification. IEEE Trans Syst Man Cybern 3(6)
Hassanizadeh S, Gray W (1993) Thermodynamic basis of capillary pressure in porous media. Water Resour Res 29:3389–3405
Hayman E, Caputo B, Fritz M, Eklundh JO (2004) On the significance of real-world conditions for material classification. In: Pajdla T, Matas J (eds) Computer vision - ECCV 2004. Springer Berlin Heidelberg, Berlin, Heidelberg, pp 253–266
Ho TK (1995) Random decision forests. In: Proceedings of the third international conference on document analysis and recognition (Volume 1) - Volume 1, ICDAR ’95. IEEE Computer Society, Washington, DC, USA, p 278
Jasionowska M, Przelaskowski A (2019) Wavelet-like selective representations of multidirectional structures: a mammography case. Pattern Anal Appl 22(4):1399–1408
Kannala J, Rahtu E (2012) BSIF: Binarized statistical image features. In: ICPR. IEEE Computer Society, pp 1363–1366
Karch G (1997) Asymptotic behaviour of solutions to some pseudoparabolic equations. Math Methods Appl Sci 20(3):271–289
Koenderink JJ (1984) The structure of images. Biol Cybern 50(5):363–370
Kollem S, Reddy KR, Rao DS (2021a) Improved partial differential equation-based total variation approach to non-subsampled contourlet transform for medical image denoising. Multimed Tools Appl 80(2):2663–2689
Kollem S, Reddy KR, Rao DS (2021b) An optimized SVM based possibilistic fuzzy c-means clustering algorithm for tumor segmentation. Multimed Tools Appl 80(1):409–437
Lazebnik S, Schmid C, Ponce J (2005) A sparse texture representation using local affine regions. IEEE Trans Pattern Anal Mach Intell 27(8):1265–1278
Li D, Deng L, Cai Z (2021) Research on image classification method based on convolutional neural network. Neural Comput Appl 33(14):8157–8167
Liu L, Zhao L, Long Y, Kuang G, Fieguth P (2012) Extended local binary patterns for texture classification. Image Vision Comput 30(2):86–99
Liu L, Fieguth P, Guo Y, Wang X, Pietikäinen M (2017) Local binary features for texture classification: Taxonomy and experimental study. Pattern Recogn 62:135–160
Liu L, Chen J, Fieguth PW, Zhao G, Chellappa R, Pietikäinen M (2019) From BoW to CNN: Two decades of texture representation for texture classification. Int J Comput Vis 127(1):74–109
Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110
Lu L, Meng X, Mao Z, Karniadakis GE (2021) DeepXDE: A deep learning library for solving differential equations. SIAM Rev 63(1):208–228
Naik DL, Khan R (2019) Identification and characterization of fracture in metals using machine learning based texture recognition algorithms. Eng Fract Mech 219
Ojala T, Pietikäinen M, Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
Pearson K (1901) LIII. On lines and planes of closest fit to systems of points in space. The Lond Edinb Dublin Philos Mag J Sci 2(11):559–572
Perona P, Malik J (1990) Scale-space and edge detection using anisotropic diffusion. IEEE Trans Pattern Anal Mach Intell 12(7):629–639
Perronnin F, Sánchez J, Mensink T (2010) Improving the Fisher kernel for large-scale image classification. In: Proceedings of the 11th European conference on computer vision: Part IV, ECCV’10. Springer-Verlag, Berlin, Heidelberg, pp 143–156
Safdar A, Khan MA, Shah JH, Sharif M, Saba T, Rehman A, Javed K, Khan JA (2019) Intelligent microscopic approach for identification and recognition of citrus deformities. Microsc Res Tech 82(9):1542–1556
Showalter R (1969) Partial differential equations of Sobolev-Galpern type. Pac J Math 31(3):787–793
Showalter R (1975) A nonlinear parabolic-Sobolev equation. J Math Anal Appl 50(1):183–190
Showalter R, Ting T (1970) Pseudoparabolic partial differential equations. SIAM J Math Anal 1(1):1–26
Song T, Li H, Meng F, Wu Q, Cai J (2018) LETRIST: locally encoded transform feature histogram for rotation-invariant texture classification. IEEE Trans Circ Syst Video Technol 28(7):1565–1579
Song Y, Zhang F, Li Q, Huang H, O’Donnell LJ, Cai W (2017) Locally-transferred Fisher vectors for texture classification. In: 2017 IEEE international conference on computer vision (ICCV). https://doi.org/10.1109/ICCV.2017.526, pp 4922–4930
Stecher M, Rundell W (1977) Maximum principles for pseudoparabolic partial differential equations. J Math Anal Appl 57(1):110–118
Tao Z, Wei T, Li J (2021) Wavelet multi-level attention capsule network for texture classification. IEEE Sig Process Lett 28:1215–1219. https://doi.org/10.1109/LSP.2021.3088052
Ting T (1969) Parabolic and pseudo-parabolic partial differential equations. J Math Soc Jpn 21(3):440–453
Tu B, Kuang W, Zhao G, He D, Liao Z, Ma W (2019) Hyperspectral image classification by combining local binary pattern and joint sparse representation. Int J Remote Sens 40(24, SI):9484–9500
van Duijn C, Peletier L, Pop I (2007) A new class of entropy solutions of the Buckley-Leverett equation. SIAM J Math Anal 39:507–536
Varma M, Zisserman A (2005) A statistical approach to texture classification from single images. Int J Comput Vis 62(1):61–81
Varma M, Zisserman A (2009) A statistical approach to material classification using image patch exemplars. IEEE Trans Pattern Anal Mach Intell 31(11):2032–2047
Vieira J, Abreu E (2018) Numerical modeling of the two-phase flow in porous media with dynamic capillary pressure. PhD thesis, University of Campinas, Campinas, SP, Brazil
Wang S, Xiang N, Xia Y, You L, Zhang J (2021) Real-time surface manipulation with C1 continuity through simple and efficient physics-based deformations. Vis Comput 1–13
Weickert J (1996) Anisotropic diffusion in image processing
Weickert J (1997) A review of nonlinear diffusion filtering. In: Proceedings of the first international conference on scale-space theory in computer vision, SCALE-SPACE ’97. Springer-Verlag, Berlin, Heidelberg, pp 3–28
Witkin AP (1983) Scale-space filtering. In: Proceedings of the eighth international joint conference on artificial intelligence, IJCAI’83, vol 2. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp 1019–1022
Xu X, Li Y, Wu QMJ (2021) A compact multi-pattern encoding descriptor for texture classification. Digit Sig Process 114. https://doi.org/10.1016/j.dsp.2021.103081
Xu Y, Ji H, Fermüller C (2009) Viewpoint invariant texture description using fractal analysis. Int J Comput Vis 83(1):85–100
Xue J, Zhang H, Dana K (2018) Deep texture manifold for ground terrain recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)
You L, Yang X, Pan J, Lee TY, Bian S, Qian K, Habib Z, Sargano AB, Kazmi I, Zhang JJ (2020) Fast character modeling with sketch-based PDE surfaces. Multimed Tools Appl 79:23161–23187
Yu Y, Acton S (2002) Speckle reducing anisotropic diffusion. IEEE Trans Image Process 11(11):1260–1270
Zhai W, Cao Y, Zhang J, Zha ZJ (2019) Deep multiple-attribute-perceived network for real-world texture recognition. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV)
Zhang H, Xue J, Dana K (2017) Deep TEN: Texture encoding network. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR). https://doi.org/10.1109/CVPR.2017.309, pp 2896–2905
Acknowledgements
J. B. Florindo gratefully acknowledges the financial support of the São Paulo Research Foundation (FAPESP) (Grant #2016/16060-0) and of the National Council for Scientific and Technological Development, Brazil (CNPq) (Grants #301480/2016-8 and #423292/2018-8). E. Abreu gratefully acknowledges the financial support of the São Paulo Research Foundation (FAPESP) (Grant #2019/20991-8), of the National Council for Scientific and Technological Development, Brazil (CNPq) (Grant #2 306385/2019-8), and of PETROBRAS, Brazil (Grant #2015/00398-0). E. Abreu and J. B. Florindo also gratefully acknowledge the financial support of Red Iberoamericana de Investigadores en Matemáticas Aplicadas a Datos (MathData).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Cite this article
Vieira, J., Abreu, E. & Florindo, J.B. Texture image classification based on a pseudo-parabolic diffusion model. Multimed Tools Appl 82, 3581–3604 (2023). https://doi.org/10.1007/s11042-022-12048-2
DOI: https://doi.org/10.1007/s11042-022-12048-2