Erratum to: Multimed Tools Appl

DOI 10.1007/s11042-012-1130-0

The Publisher regrets that the original version of this article incorrectly cited the references. Corrected citations to the references are presented below.

1 Introduction

As a new multimedia type, 3D models play an important role in many applications, such as movie production, entertainment, virtual reality, and so on. The popularization of 3D data has created an urgent demand for effective 3D model retrieval systems. Searching 3D databases for similar models can sometimes speed up design or research processes; thus, content-based 3D model retrieval methods have become a hot topic during the last few years [9]. Such methods can be used to discover geometric relationships in 3D data and to find the required data in local databases or on the Internet.

The main challenge for a content-based 3D model retrieval system is the extraction of the most representative features of 3D models [39]. Several retrieval methods for searching databases for similar 3D models were surveyed [7], and a comparative study of these methods was undertaken [8]. Various 3D shape searching techniques were classified and compared based on their shape representations, and further research directions were introduced [21]. The existing algorithms are mainly limited by two problems: one is dealing with the degeneracy of 3D objects, and the other is rotation normalization, since normalization for translation and scaling can easily be solved. Concerning the description of rotation invariance, two widely accepted solutions have been applied. One is the rotation normalization of 3D models prior to feature extraction. The existing rotation normalization methods are mainly based on principal component analysis (PCA), continuous PCA [40], and normal PCA [30]. Although descriptors based on rotation normalization may improve the discriminative power of the descriptors, some similar models cannot easily be normalized to the same coordinates. The other solution makes use of natively rotation-invariant 3D model descriptions, such as the D2 descriptor [29], spherical harmonics (SH) [22], and Shells [2]. A shell descriptor is easy to compute but results in low retrieval performance. SH can produce better results, but it is not robust against the degeneracy of 3D objects. In contrast, the D2 descriptor and its improved variants are robust against a 3D object’s degeneracy. It is therefore meaningful to do more work on the improvement of shape distribution-based methods.

Combined methods have also attracted great interest in recent years. An exterior shape feature, the ART-based elevation descriptor, has been combined with an interior shape feature, the shell grid descriptor, for 3D model retrieval [35]. The weights used in such combinations can be fixed [4] or adjusted [26]. Daras et al. presented a combination framework for three different types of descriptors and made an in-depth study of the factors that can further improve 3D model retrieval accuracy [14].

In order to improve the retrieval performance, the symmetrical plane of a 3D model is obtained by principal plane analysis [10]. The symmetrical plane is a distinctive characteristic of 3D models, and finding more principal planes is useful for representing more information about a 3D model.

The main contributions of this paper are listed as follows:

First, in-depth research on a 3D model’s symmetrical planes has been conducted, extending to the second and the third principal planes. In other words, the second and the third symmetrical planes of a 3D model are found based on sequential quadratic programming and geometric theory. Since most 3D objects are regular symmetrical models, 3D model retrieval can be made effective by adopting some characteristics of these principal planes. Two characteristics of the three principal planes are used in this paper: angle information and histogram classification.

Second, a novel and effective 3D model retrieval method using a combined shape distribution is proposed. The classical shape distribution methods are improved in three phases to obtain the proposed Combined Shape Distribution (CSD). First, the angles between the direction vectors and the normal vectors of the three principal planes are taken into account in the design of the shape distribution descriptors, and the Alpha/Eta/Distance (AED) and Beta/Eta/Distance (BED) histograms are presented. Second, both the proposed AED and BED histograms can be classified into one of three types (positive, negative, and crossed) by each principal plane. Based on this concept, two combined descriptors, Symmetric AED (SAED) and Symmetric BED (SBED), are proposed. Finally, the weighted compounding of these two descriptors yields the final descriptor, named the Combined Shape Distribution (CSD) descriptor. Both the average performance measures and the visual results show that the proposed CSD achieves good retrieval results.

The remainder of the paper is organized as follows: In section 2, the related work is presented. The novel 3D model retrieval is described in section 3. The experimental results are given in section 4. Finally, the conclusions are drawn in section 5.

2 Related work

2.1 State of the art methods

The commonly adopted methods are histogram-based [1, 2, 20, 24, 27, 29, 33, 44], transform-based [16, 22, 31, 40], and 2D view-based [3, 11–13, 17–19, 25, 28, 32, 36, 38, 41–43].

Histogram-based approaches rely on the idea of accumulating feature information to obtain a global shape description. The D2 descriptor [29], which is a probability distribution histogram of the distance between two randomly selected points on the object’s surface, is robust against the 3D object’s degeneracy. However, it sacrifices discriminative accuracy. A modified D2 descriptor has been proposed [20], in which each D2 histogram is classified into three types: in distance, mixed distance, and out distance. The assignment of these distances depends on whether the line segment connecting the two points lies completely inside, partially inside, or completely outside the object. However, the classification of these three cases is a complex task. The angle-distance (AD) and absolute angle-distance (AAD) descriptors have been proposed to compare 3D models [27]. AAD measures the distribution of absolute angles between the normal vectors of the two associated surfaces on which the randomly selected points are located, combined with the distance between the two selected points. It is a two-dimensional descriptor which contains both the distance and the angle information. An exhaustive study of second order 3D shape features has been carried out, and many combined shape descriptors have been proposed based on group integration, such as Beta/Distance (BD) and Alpha/Beta/Distance (ABD). Experiments showed that further improvements of the shape distributions can lead to better results than the well-known methods [33]. The concentric ABD (CABD) and symmetrical ABD were introduced to improve the ABD descriptor, because they can exploit more location information of the random points on the surface of 3D models [44].
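To make the basic shape distribution idea concrete, the following is a minimal Python/NumPy sketch of a D2-style histogram, assuming the surface points have already been sampled uniformly by area from the model; the function name, bin settings, and normalization are illustrative rather than those of the cited implementations.

```python
import numpy as np

def d2_histogram(points, n_pairs=100_000, n_bins=64, rng=None):
    """Sketch of a D2 shape distribution: histogram of distances between random point pairs."""
    rng = np.random.default_rng(rng)
    i = rng.integers(0, len(points), n_pairs)
    j = rng.integers(0, len(points), n_pairs)
    d = np.linalg.norm(points[i] - points[j], axis=1)
    # Normalize by the mean distance so the histogram is insensitive to uniform scaling.
    hist, _ = np.histogram(d / d.mean(), bins=n_bins, range=(0.0, 3.0), density=True)
    return hist
```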

Histogram-based methods collect numerical values from certain attributes of the 3D model, such as local or global features, and are efficient and easy to implement. Although most of them lack the fine-grained discrimination required for retrieval, the retrieval performance can be greatly improved by combining descriptors or by designing novel descriptors [1, 24]. Discovering untapped and distinctive characteristics to improve the distribution-based methods is therefore meaningful; hence, many distribution-based methods have been proposed in recent years.

Transform-based methods register the surface points onto a 3D voxel or spherical grid by using a mapping function, which is then processed by transform tools, and compact descriptors can be obtained by keeping the first few transform coefficients in the descriptor vector. Furthermore, pose invariance can be obtained by discarding the “phase” of the transform coefficients at the expense of some shape information. The Gaussian Euclidean Distance Transform (GEDT) descriptor and the Spherical Harmonic Descriptor (SHD) were proposed by Kazhdan et al. [22]. GEDT is a volumetric representation of the Gaussian Euclidean Distance Transform of a 3D object, expressed by the norms of the spherical harmonic frequencies, while SHD is a rotation-invariant representation of the GEDT. The Radialized Spherical Extent Function (REXT) [40] is a collection of spherical functions giving the maximal distance from the center of mass as a function of spherical radius and angle. A Concrete Radialized Spherical Projection (CRSP) descriptor was proposed by Papadakis et al. [31], which uses Continuous PCA (CPCA) along with Normals PCA (NPCA) to alleviate the rotation invariance problem and describes a 3D object by a volumetric spherical-function-based representation expressed by spherical harmonics.

2D view-based methods consider the 3D shape as a collection of 2D projections taken from different viewpoints of the 3D model, and each projection is then described by standard 2D image descriptors, such as Fourier descriptors or Zernike moments. These methods can obtain a good retrieval performance but have a large feature size and a high matching cost. Chen et al. [11] proposed the Light Field Descriptor (LFD), which is comprised of Zernike moments and Fourier coefficients computed on a set of projections taken from the vertices of a dodecahedron. In another approach, the views are captured using five groups of cameras, where each group of views is modeled by a Markov chain to measure the similarity between two 3D models [18]. Vranic proposed a shape descriptor where the features are extracted from depth buffers produced by six projections of the object, one for each side of a cube which encloses the object [41]. In the same work, the Silhouette-based (SIL) descriptor was proposed, which uses the silhouettes produced by the three projections taken from the Cartesian planes. Vranic [42] developed a hybrid descriptor called DESIRE, which consists of the Silhouette, Ray, and Depth buffer based descriptors, combined linearly with fixed weights. A 3D shape descriptor called PANORAMA was proposed [32], which uses a set of panoramic views of a 3D model to describe the position and orientation of the model’s surface in 3D space.

3D retrieval methods based on view selection have become a hot research field in recent years. A Bayesian 3D search engine using adaptive views clustering was proposed by Ansary et al. [3] to provide an optimal selection of 2D views from a 3D model, together with a probabilistic Bayesian method for 3D model retrieval from these views. A Compact Multi-View Descriptor (CMVD) was introduced, in which a set of 2D rotation-invariant shape descriptors is extracted from automatically generated multi-views [12]. It was further improved to support multimodal queries, e.g. 2D images, sketches, and 3D objects, in order to handle the different types of multimedia data [13]. Each 3D model can be represented by a free set of views, and the CCFV model is generated on the basis of the query Gaussian models by combining the positive matching model and the negative matching model [43]. A shape similarity system was presented based on the correspondence of visual 2D parts; these parts are obtained by a shape segmentation approach using the Curvature Scale Space (CSS) descriptor in order to solve scale problems, and the partial search method is then combined with a probabilistic approach [25]. For a 3D model and a corresponding query range image, a virtual camera with intrinsic and extrinsic parameters is sought that would generate an optimum range image, in terms of minimizing an error function that takes into account the salient features of the models, when compared with other parameter sets or other target 3D models; the optimal solution in the parameter space is found by hierarchical searching [38]. An efficient approach was proposed to learn a distance metric for the newly selected query view and the weights for combining all of the selected query views [19]. A retrieval method based on multi-scale local visual features was proposed, based on rendering a set of range images from multiple viewpoints; for each image, the well-known Scale Invariant Feature Transform (SIFT) is used to extract local features, and all the local features of the model are then integrated into a single feature vector by using the Bag-of-Features approach [28]. Shih et al. proposed a rotation invariant elevation descriptor, in which each elevation is represented by a gray-level image that is decomposed into several concentric circles [36]. A spatial structure circular descriptor (SSCD) was proposed by Gao et al., which describes the spatial structure of a 3D model by 2D images, where the attribute values of each pixel represent 3D spatial information [18].

2.2 Principal plane analysis

In the case of the 3D feature space S 3, the principal plane H 1 can be represented as [10]:

$$ {A_{{1}}}x + {B_{{1}}}y + {C_{{1}}}z = {D_{{1}}} $$
(1)

When the origin of the 3D space is translated to the centroid of the 3D model, D 1 becomes zero. A 1, B 1, and C 1 are the components of the unit normal vector of H 1, which satisfy the following relationship:

$$ A_{1}^{2} + B_{1}^{2} + C_{1}^{2} = 1 $$
(2)

The principal plane is the plane H 1 with a minimal value of \( \delta ({S^3},H) \):

$$ J = \min \delta ({S^3},H) = \sum\limits_{{(x,y,z) \in {S^3}}} {{{\left( {{A_1}x + {B_1}y + {C_1}z} \right)}^2}} $$
(3)

Differentiating J with respect to A 1, B 1, and C 1, and equating the derivatives to zero, gives

$$ \left\{ \begin{array}{l} {m_{{2,0,0}}}{A_1} + {m_{{1,1,0}}}{B_1} + {m_{{1,0,1}}}{C_1} = 0 \\ {m_{{1,1,0}}}{A_1} + {m_{{0,2,0}}}{B_1} + {m_{{0,1,1}}}{C_1} = 0 \\ {m_{{1,0,1}}}{A_1} + {m_{{0,1,1}}}{B_1} + {m_{{0,0,2}}}{C_1} = 0 \end{array} \right. $$
(4)

\( {m_{{s,t,u}}} \) is a 3D moment given by

$$ {m_{{s,t,u}}} = \sum\limits_{{\left( {x,y,z} \right) \in {S^3}}} {{x^s}{y^t}{z^u}} $$
(5)

From (4) and (5), it can be seen that

$$ \frac{{{A_1}}}{{{B_1}}} = \frac{{{m_{{0,2,0}}}{m_{{1,0,1}}} - {m_{{1,1,0}}}{m_{{0,1,1}}}}}{{{m_{{2,0,0}}}{m_{{0,1,1}}} - {m_{{1,1,0}}}{m_{{1,0,1}}}}} = {k_1} $$
(6)
$$ \frac{{{C_1}}}{{{B_1}}} = \frac{{{m_{{0,2,0}}}{m_{{1,0,1}}} - {m_{{0,1,1}}}{m_{{1,1,0}}}}}{{{m_{{0,0,2}}}{m_{{1,1,0}}} - {m_{{0,1,1}}}{m_{{1,0,1}}}}} = {k_2} $$
(7)

Combining (2), (6) and (7), the normal vector (A 1, B 1, C 1) of H 1 is obtained:

$$ \left( {{A_1},{B_1},{C_1}} \right) = \left( {\frac{{{k_1}}}{{\sqrt {{1 + k_1^2 + k_2^2}} }},\;\frac{1}{{\sqrt {{1 + k_1^2 + k_2^2}} }},\;\frac{{{k_2}}}{{\sqrt {{1 + k_1^2 + k_2^2}} }}} \right) $$
(8)

Thus, the principal plane of the 3D model is obtained, whose normal is n p1 = (A 1, B 1, C 1). It is regarded as the first principal plane in this research.
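As an illustration of Eqs. (4)–(8), the following Python/NumPy sketch accumulates the 3D moments over a set of surface sample points (already translated so that the centroid lies at the origin) and recovers the normal of the first principal plane. Like the ratio form of Eqs. (6)–(7), it assumes B 1 ≠ 0; the function name and inputs are illustrative.

```python
import numpy as np

def first_principal_plane(points):
    """Sketch of Eqs. (4)-(8): unit normal (A1, B1, C1) of the first principal plane.

    `points` is an (N, 3) array of surface samples with the centroid at the origin
    (so that D1 = 0 in Eq. (1)). Assumes B1 != 0, as in Eqs. (6)-(7).
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    m = lambda s, t, u: np.sum(x**s * y**t * z**u)            # 3D moment, Eq. (5)

    k1 = (m(0, 2, 0) * m(1, 0, 1) - m(1, 1, 0) * m(0, 1, 1)) / \
         (m(2, 0, 0) * m(0, 1, 1) - m(1, 1, 0) * m(1, 0, 1))  # Eq. (6)
    k2 = (m(0, 2, 0) * m(1, 0, 1) - m(0, 1, 1) * m(1, 1, 0)) / \
         (m(0, 0, 2) * m(1, 1, 0) - m(0, 1, 1) * m(1, 0, 1))  # Eq. (7)

    norm = np.sqrt(1.0 + k1**2 + k2**2)                        # Eq. (8)
    return np.array([k1, 1.0, k2]) / norm
```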

2.3 Group integration and ABD descriptor

Group integration is an effective method to obtain invariant features [34], and is a constructive approach. It is supposed that f(x) is a kernel function for the object; an invariant feature can then be constructed by integrating the kernel over the considered group.

$$ F = \int_{\xi } {f\left( {gx} \right)} dg $$
(9)

Shape distribution can be embedded into the theory of group integration, and the BD and ABD descriptors were proposed on this basis [33]. The feature extraction process for 3D models can be briefly explained as follows. Random points are sampled using Osada’s method [22] (a sketch of such surface sampling is given at the end of this subsection), and then the distance between the two points, the angle between the two associated surface normal vectors, and the angle between the distance vector and one of the surface normal vectors are measured. Finally, their probability distributions are computed and the ABD histogram is obtained.

The ABD histogram combines all three attributes (the distance, the angle between the surface normal vectors, and the angle between the distance vector and one of the surface normal vectors) in a three-dimensional histogram. It is defined by

$$ ABD\left( {d,\alpha, \beta } \right) = \int_{{({p_{{1,}}}{p_2}) \in S \times S}} {{\delta_d}} \left( {\left\| {{p_1} - {p_2}} \right\|} \right){\delta_{\alpha }}\left( {\left| {{n_1} \cdot {n_2}} \right|} \right){\delta_{\beta }}\left( {\frac{{\left| {{n_1} \cdot ({p_1} - {p_2})} \right|}}{{\left\| {{p_1} - {p_2}} \right\|}}} \right) $$
(10)
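The shape distribution descriptors above, as well as those proposed in Section 3, all start from surface points sampled uniformly by area together with the normals of the triangles they lie on. The following Python/NumPy sketch shows one common way to perform such sampling on a triangle mesh; the argument names are illustrative and this is not the authors’ implementation.

```python
import numpy as np

def sample_surface(vertices, faces, n_points, rng=None):
    """Area-weighted random sampling of points and face normals on a triangle mesh."""
    rng = np.random.default_rng(rng)
    v0, v1, v2 = (vertices[faces[:, k]] for k in range(3))
    cross = np.cross(v1 - v0, v2 - v0)
    area = 0.5 * np.linalg.norm(cross, axis=1)
    face_idx = rng.choice(len(faces), size=n_points, p=area / area.sum())

    # Uniform barycentric coordinates inside each selected triangle.
    r1, r2 = rng.random(n_points), rng.random(n_points)
    s = np.sqrt(r1)
    pts = (1 - s)[:, None] * v0[face_idx] \
        + (s * (1 - r2))[:, None] * v1[face_idx] \
        + (s * r2)[:, None] * v2[face_idx]
    normals = cross[face_idx] / np.linalg.norm(cross[face_idx], axis=1, keepdims=True)
    return pts, normals
```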

3 The proposed 3D model retrieval scheme

There are four steps to obtain the proposed combined shape distribution (CSD) for a 3D model. First, the second and the third principal planes are solved for. Second, the AED and BED descriptors are proposed based on group integration. Third, SAED and SBED are obtained using the histogram classification of AED and BED. Finally, the CSD is obtained by dynamically weighting SAED and SBED. The details are shown in Fig. 1.

Fig. 1
figure 1

The flow chart of the proposed CSD after identifying the three principal planes

3.1 Solving for the second and third principal planes

Let n p2 = (A 2, B 2, C 2) and n p3 = (A 3, B 3, C 3) be the normal vectors of the second and third principal planes. The second principal plane should satisfy the following conditions: it is perpendicular to the first principal plane, and the points on the 3D model’s surface have the least total distance to it. Thus the problem of solving for the second principal plane can be transformed into the following constrained quadratic program:

$$ f = \min \left\{ {{m_{{200}}}x_1^2 + {m_{{020}}}x_2^2 + {m_{{002}}}x_3^2 + 2{m_{{110}}}{x_1}{x_2} + 2{m_{{101}}}{x_1}{x_3} + 2{m_{{011}}}{x_2}{x_3}} \right\} $$
(11)
$$ s.t.\quad \left\{ \begin{array}{l} {A_1}{x_1} + {B_1}{x_2} + {C_1}{x_3} = 0 \\ x_1^2 + x_2^2 + x_3^2 = 1 \end{array} \right. $$
(12)

The solution of this problem is obtained by sequential quadratic programming [6], giving n p2 = (A 2, B 2, C 2) = (x 1, x 2, x 3).
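A possible Python/SciPy sketch of the constrained problem (11)–(12) is shown below, using the SLSQP solver, which is a sequential quadratic programming method. The construction of the moment matrix and the choice of starting point are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def second_principal_plane(points, n_p1):
    """Sketch of Eqs. (11)-(12): solve for n_p2 with sequential quadratic programming."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    m = lambda s, t, u: np.sum(x**s * y**t * z**u)
    # Symmetric second-moment matrix, so that f(v) = v^T M v reproduces Eq. (11).
    M = np.array([[m(2, 0, 0), m(1, 1, 0), m(1, 0, 1)],
                  [m(1, 1, 0), m(0, 2, 0), m(0, 1, 1)],
                  [m(1, 0, 1), m(0, 1, 1), m(0, 0, 2)]])

    cons = ({'type': 'eq', 'fun': lambda v: np.dot(n_p1, v)},     # perpendicular to n_p1
            {'type': 'eq', 'fun': lambda v: np.dot(v, v) - 1.0})  # unit length, Eq. (12)

    # Any unit vector orthogonal to n_p1 is a reasonable starting point (assumption).
    v0 = np.cross(n_p1, [1.0, 0.0, 0.0])
    if np.linalg.norm(v0) < 1e-8:
        v0 = np.cross(n_p1, [0.0, 1.0, 0.0])
    v0 /= np.linalg.norm(v0)

    res = minimize(lambda v: v @ M @ v, v0, method='SLSQP', constraints=cons)
    return res.x / np.linalg.norm(res.x)
```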

The normal vector of the third principal plane n p3  = (A 3, B 3, C 3) should be perpendicular to the normal vectors of the first and the second principal planes, and the centroid of the 3D model should be located on the third principal plane. Thus:

$$ \left\{ \begin{array}{l} {A_1}{A_3} + {B_1}{B_3} + {C_1}{C_3} = 0 \\ {A_2}{A_3} + {B_2}{B_3} + {C_2}{C_3} = 0 \\ A_3^2 + B_3^2 + C_3^2 = 1 \end{array} \right. $$
(13)

Hence

$$ \frac{{{A_3}}}{{{B_3}}} = \frac{{{B_1}{C_2} - {C_1}{B_2}}}{{{C_1}{A_2} - {A_1}{C_2}}} = {k\prime_1} $$
(14)
$$ \frac{{{C_3}}}{{{B_3}}} = \frac{{{A_1}{B_2} - {B_1}{A_2}}}{{{C_1}{A_2} - {A_1}{C_2}}} = {k\prime_2} $$
(15)

The above equations are combined, giving

$$ \left( {{A_3},{B_3},{C_3}} \right) = \left( {\frac{{{{k\prime}_1}}}{{\sqrt {{1 + k\prime_1^2 + k\prime_2^2}} }},\;\frac{1}{{\sqrt {{1 + k\prime_1^2 + k\prime_2^2}} }},\;\frac{{{{k\prime}_2}}}{{\sqrt {{1 + k\prime_1^2 + k\prime_2^2}} }}} \right) $$
(16)

Finally, the three principal planes are obtained.
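Since Eq. (13) only requires n p3 to be a unit vector orthogonal to both n p1 and n p2, Eqs. (14)–(16) coincide (up to sign) with the normalized cross product of the two normals, which gives a compact way to compute the third plane:

```python
import numpy as np

def third_principal_plane(n_p1, n_p2):
    """n_p3 from Eqs. (13)-(16): a unit vector perpendicular to both n_p1 and n_p2."""
    n = np.cross(n_p1, n_p2)
    return n / np.linalg.norm(n)
```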

3.2 The proposed AED and BED descriptors

The goal of this paper is to add more angle information to the distribution-based methods. Since most 3D models are symmetrical or partially symmetrical, the three symmetrical planes are important. In this work, the Eta angle is defined as the angle between the distance vector of the two selected points and the normal vector of a principal plane; it describes the orientation of the line connecting the random points with respect to that principal plane, and thus captures the position of the two points relative to the symmetrical plane. AED and BED are improved versions of AAD and BD, obtained by embedding the Eta angle information.

The AED histogram is a three-dimensional descriptor like ABD. It combines the distance between the two points, the cosine of the angle between the two associated surface normal vectors \( {n_1} \cdot {n_2} \), and the cosine of the angle between the direction vector and the normal vector of the i-th principal plane \( ({p_1} - {p_2}) \cdot {n_{{pi}}} \). It is defined by

$$ \begin{array}{*{20}{c}} \hfill {AE{{D}_{i}}(d,\alpha ,\eta ) = \int_{{({{p}_{{1,}}}{{p}_{2}}) \in S \times S}} {{{\delta }_{d}}} \left( {\left\| {{{p}_{1}} - {{p}_{2}}} \right\|} \right){{\delta }_{\alpha }}\left( {\left| {{{n}_{1}}\cdot {{n}_{2}}} \right|} \right){{\delta }_{\eta }}\left( {\frac{{\left| {({{p}_{1}} - {{p}_{2}})\cdot {{n}_{{pi}}}} \right|}}{{\left\| {{{p}_{1}} - {{p}_{2}}} \right\|}}} \right)} \\ \hfill {i = 1,\:2,3} \\ \end{array} $$
(17)

AED is defined by AED = (AED 1, AED 2, AED 3).

The BED histogram is another three-dimensional descriptor. It combines the distance between the two points, the cosine of the angle between one surface normal vector and the direction vector \( {n_1} \cdot ({p_1} - {p_2}) \), and the cosine of the angle between the direction vector and the normal vector of the i-th principal plane \( ({p_1} - {p_2}) \cdot {n_{{pi}}} \). It is defined by

$$ \begin{array}{*{20}{c}} \hfill {BE{{D}_{i}}(d,\beta ,\eta ) = \int_{{({{p}_{{1,}}}{{p}_{2}}) \in S \times S}} {{{\delta }_{d}}} \left( {\left\| {{{p}_{1}} - {{p}_{2}}} \right\|} \right){{\delta }_{\beta }}\left( {\frac{{\left| {{{n}_{1}}\cdot ({{p}_{1}} - {{p}_{2}})} \right|}}{{\left\| {{{p}_{1}} - {{p}_{2}}} \right\|}}} \right){{\delta }_{\eta }}\left( {\frac{{\left| {({{p}_{1}} - {{p}_{2}})\cdot {{n}_{{pi}}}} \right|}}{{\left\| {{{p}_{1}} - {{p}_{2}}} \right\|}}} \right)} \\ \hfill {i = 1,\:2,3} \\ \end{array} $$
(18)

The histogram of BED is also defined by BED = (BED 1, BED 2, BED 3).
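The following Python/NumPy sketch computes one AED i histogram of Eq. (17) from sampled points and their face normals (for example, those produced by a sampler such as the one sketched at the end of Section 2.3); BED i of Eq. (18) is obtained in the same way by replacing the alpha attribute with beta = |n 1 · (p 1 − p 2)| / ||p 1 − p 2||. The bin counts and distance normalization are illustrative assumptions.

```python
import numpy as np

def aed_histogram(pts, normals, n_pi, n_pairs=100_000, bins=(8, 8, 8), rng=None):
    """Sketch of one AED_i histogram, Eq. (17): joint distribution of (d, alpha, eta)."""
    rng = np.random.default_rng(rng)
    i = rng.integers(0, len(pts), n_pairs)
    j = rng.integers(0, len(pts), n_pairs)
    diff = pts[i] - pts[j]
    d = np.linalg.norm(diff, axis=1) + 1e-12

    alpha = np.abs(np.sum(normals[i] * normals[j], axis=1))   # |n1 . n2|
    eta = np.abs(diff @ n_pi) / d                             # |(p1 - p2) . n_pi| / ||p1 - p2||

    sample = np.column_stack([d / d.mean(), alpha, eta])
    hist, _ = np.histogramdd(sample, bins=bins,
                             range=[(0.0, 3.0), (0.0, 1.0), (0.0, 1.0)], density=True)
    return hist.ravel()
```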

3.3 Histogram classification

The 3D model is partitioned into two parts, a positive part and a negative part, by each principal plane. Thus the two sampled points fall into one of three cases: both in the positive part, as in Fig. 2(a); both in the negative part, as in Fig. 2(b); or in different parts, as in Fig. 2(c).

Fig. 2
figure 2

Three types of positions of the two points on a 3D model surface: both points in the positive part (a), both in the negative part (b), and in different parts (c). p 1 and p 2 are the two selected points, n 1 and n 2 are the normal vectors of the selected triangle surfaces, and n pi is the normal vector of the i-th principal plane (i = 1, 2, or 3)

The AED and BED histograms are each partitioned into three histograms by each principal plane: the positive, negative, and crossed histograms. These three histograms are then combined into one feature, named the symmetric AED (SAED) and symmetric BED (SBED) histograms respectively (a small sketch of this classification step is given at the end of this subsection). Taking the AED descriptor as an example, the SAED classified by the i-th plane is defined as

$$ SAE{D_i} = \left( {AE{D_{positive}},\;AE{D_{crossed}},\;AE{D_{negative}}} \right)\quad i = 1,2,3 $$
(19)

Thus a total SAED histogram can be defined as

$$ SAED = \left[ \begin{array}{ccc} AE{D_{1positive}} & AE{D_{1crossed}} & AE{D_{1negative}} \\ AE{D_{2positive}} & AE{D_{2crossed}} & AE{D_{2negative}} \\ AE{D_{3positive}} & AE{D_{3crossed}} & AE{D_{3negative}} \end{array} \right] $$
(20)

Likewise,the SBED is defined as

$$ SBE{D_i} = \left( {BE{D_{positive}},\;BE{D_{crossed}},\;BE{D_{negative}}} \right)\quad i = 1,2,3 $$
(21)

Thus the total SBED histogram can be defined as

$$ SBED = \left[ \begin{array}{ccc} BE{D_{1positive}} & BE{D_{1crossed}} & BE{D_{1negative}} \\ BE{D_{2positive}} & BE{D_{2crossed}} & BE{D_{2negative}} \\ BE{D_{3positive}} & BE{D_{3crossed}} & BE{D_{3negative}} \end{array} \right] $$
(22)
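A minimal sketch of the classification step is shown below: each point pair is assigned to the positive, negative, or crossed group according to the signs of the signed distances of p 1 and p 2 to the i-th principal plane (the plane passes through the centroid, so the signed distance reduces to a dot product with n pi). The SAED i and SBED i blocks are then the AED i and BED i histograms accumulated separately over these three groups; the labels used here are illustrative.

```python
import numpy as np

def classify_pairs(p1, p2, n_pi):
    """Sketch: label each point pair as 'positive', 'negative', or 'crossed' w.r.t. plane i.

    Points are assumed to be expressed with the centroid at the origin, so the
    signed distance to the principal plane is simply the dot product with its normal.
    """
    s1 = p1 @ n_pi
    s2 = p2 @ n_pi
    labels = np.full(len(p1), 'crossed', dtype=object)
    labels[(s1 >= 0) & (s2 >= 0)] = 'positive'
    labels[(s1 < 0) & (s2 < 0)] = 'negative'
    return labels
```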

3.4 Similarity matching and the proposed CSD

The similarity matching of the presented descriptors is based on distance measures; the most widely adopted is the L n distance. However, it is not a sufficient dissimilarity metric for histogram-based matching. Some more sophisticated metrics, such as the normalized distance [15] and the diffusion distance [23], have been proposed as similarity measures. The normalized distance is used as the distance measure for the proposed descriptors in this research.

Let a and b be two 3D models, and let h a and h b be their histograms (SAED or SBED). Since each 3D model has a separate coordinate system, the negative part of one model may be more similar to the positive part of another model than to its negative part. Hence, the similarity measure is taken as the maximum of two similarities: the similarity of h a and h b, and the similarity of \( h_a^{\prime} \) and h b, where \( h_a^{\prime} \) is the symmetrical histogram of h a. Taking the SAED histogram as an example, the symmetrical histogram of the 3D model a is \( h_{{SAEDa}}^{\prime} \), defined as

$$ h_{{SAEDa}}^{\prime} = \left[ \begin{array}{ccc} AE{D_{1negative}} & AE{D_{1crossed}} & AE{D_{1positive}} \\ AE{D_{2negative}} & AE{D_{2crossed}} & AE{D_{2positive}} \\ AE{D_{3negative}} & AE{D_{3crossed}} & AE{D_{3positive}} \end{array} \right] $$
(23)

where \( {h_{{SAEDa}}} \) is:

$$ {h_{{SAEDa}}} = \left[ \begin{array}{ccc} AE{D_{1positive}} & AE{D_{1crossed}} & AE{D_{1negative}} \\ AE{D_{2positive}} & AE{D_{2crossed}} & AE{D_{2negative}} \\ AE{D_{3positive}} & AE{D_{3crossed}} & AE{D_{3negative}} \end{array} \right] $$
(24)

The similarity measure of the SAED descriptor is computed using the following formula.

$$ SAE{D_{{similarity}}} = \left( {1 - \min \left\{ {\sum\limits_{{i = 1}}^n {\frac{{2\left| {{h_{{SAEDa}}}(i) - {h_{{SAEDb}}}(i)} \right|}}{{\;\left| {{h_{{SAEDa}}}(i) + {h_{{SAEDb}}}(i)} \right|}},\;\sum\limits_{{i = 1}}^n {\frac{{2\left| {h_{{SAEDa}}^{\prime}(i) - {h_{{SAEDb}}}(i)} \right|}}{{\;\left| {h_{{SAEDa}}^{\prime}(i) + {h_{{SAEDb}}}(i)} \right|}}} } } \right\}} \right) \times 100\% $$
(25)

Likewise, the similarity measure of the SBED descriptor is:

$$ SBE{D_{{similarity}}} = \left( {1 - \min \left\{ {\sum\limits_{{i = 1}}^n {\frac{{2\left| {{h_{{SBEDa}}}(i) - {h_{{SBEDb}}}(i)} \right|}}{{\;\left| {{h_{{SBEDa}}}(i) + {h_{{SBEDb}}}(i)} \right|}},\;\sum\limits_{{i = 1}}^n {\frac{{2\left| {h_{{SBEDa}}^{\prime}(i) - {h_{{SBEDb}}}(i)} \right|}}{{\;\left| {h_{{SBEDa}}^{\prime}(i) + {h_{{SBEDb}}}(i)} \right|}}} } } \right\}} \right) \times 100\% $$
(26)
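The sketch below evaluates Eq. (25) (Eq. (26) is identical with the SBED histograms): the bin-wise normalized distance is computed for h a against h b and for the mirrored histogram h′ a against h b, and the smaller dissimilarity defines the similarity score. The small epsilon guarding empty bins, and the assumption that the block histograms are stored as arrays of shape (3 planes, 3 parts, bins) with parts ordered positive/crossed/negative, are illustrative choices.

```python
import numpy as np

def normalized_distance(h_a, h_b, eps=1e-12):
    """Bin-wise normalized distance used inside Eqs. (25)-(26)."""
    return np.sum(2.0 * np.abs(h_a - h_b) / (np.abs(h_a + h_b) + eps))

def mirror(h):
    """h' of Eq. (23): swap the positive and negative parts (axis 1) of an array
    shaped (3 planes, 3 parts = positive/crossed/negative, bins)."""
    return h[:, ::-1, :]

def saed_similarity(h_a, h_b):
    """Sketch of Eq. (25): keep the better of the original and mirrored matchings."""
    d = min(normalized_distance(h_a, h_b), normalized_distance(mirror(h_a), h_b))
    return (1.0 - d) * 100.0
```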

The final score for the CSD is computed by using

$$ CS{D_{{similarity}}} = {w_1}SAE{D_{{similarity}}} + {w_2}SBE{D_{{similarity}}} $$
(27)

where w 1 and w 2 are the selected weights of SAED and SBED respectively, which satisfy

$$ {w_1} + {w_2} = 1 $$
(28)

They are dynamically adjusted by the method proposed in a previous work [26].

The purity is an estimate of the “goodness” of a shape descriptor in characterizing a category in a given database. It assumes that the database of 3D models is pre-classified into M classes. Let SD i denote one of the shape descriptors SAED and SBED (SD 1 = SAED, SD 2 = SBED). Let \( S_i^k \) be the set of models retrieved from a class C k, 1 ≤ k ≤ M, using the shape descriptor SD i. The purity, written purity(SD i, q, t), is computed as below for a shape descriptor SD i, a query q, and a positive integer constant t.

$$ purity\left( {S{D_i},q,t} \right) = \mathop{{\max }}\limits_{{1 \leqslant k \leqslant M}} \left( {\left| {S_i^k} \right|} \right) $$
(29)

In other words, the purity of SD i is higher if the descriptor returns more models from a single category C k within the top t retrievals, regardless of which category that is. Using purity(SD i, q, t), the adaptive weight \( {w_i} \) between the query q and the 3D models o ∈ U is computed as follows.

$$ \left\{ \begin{array}{l} {w_1} = purity\left( {SAED,q,t} \right) - 1 \\ {w_2} = purity\left( {SBED,q,t} \right) - 1 \end{array} \right. $$
(30)
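A sketch of the weighting scheme of Eqs. (28)–(30) is given below: the purity of each descriptor is the largest number of models from any single class among its top-t retrievals. Normalizing the raw values of Eq. (30) so that w 1 + w 2 = 1, as required by Eq. (28), and falling back to equal weights when both purities equal one, are assumptions about how the raw purities are used.

```python
from collections import Counter

def purity(ranked_labels, t):
    """Eq. (29): the largest count of any single class among the top-t retrieved models."""
    return max(Counter(ranked_labels[:t]).values())

def csd_weights(saed_ranked_labels, sbed_ranked_labels, t):
    """Sketch of Eq. (30), normalized to satisfy w1 + w2 = 1 (Eq. (28)) -- an assumption."""
    w1 = purity(saed_ranked_labels, t) - 1
    w2 = purity(sbed_ranked_labels, t) - 1
    total = w1 + w2
    if total <= 0:                      # both purities are 1: fall back to equal weights
        return 0.5, 0.5
    return w1 / total, w2 / total
```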

4 Experimental results

The Princeton Shape Benchmark (PSB) database [37] and the Shape Retrieval Contest 2009 (SHREC2009) partial 3D model dataset [5] were used in the experiments. The PSB provides a test data set that includes 907 models in 92 classes. The SHREC2009 3D partial model dataset consists of a query set and a target set: the target set is composed of 720 3D models classified into 40 categories, and the query set consists of 20 3D models.

The precision-recall diagram and the four quality measures described below were used for performance evaluation.

Nearest neighbor

The percentage of queries for which the closest match belongs to the same category as the query.

First-tier and Second-tier

These are the percentages of models belonging to the same category as the query that appear within the top (C−1) and 2*(C−1) matches respectively, where C is the size of the query's category.

Discounted cumulative gain (DCG)

A user is more likely to consider elements near the front of the ranked list. DCG is a statistical measure that weights correct results near the front of the ranked list more heavily than correct results appearing later in the list.

Precision-recall curve

For each query model in class C and any number K of top matches, Recall is the percentage of models in class C accurately retrieved within the top K matches. Precision represents the percentage of the top K matches which are members of class C. The precision-recall plot indicates the relationship between precision and recall in a ranked list of matches. Curves closer to the upper right corner represent superior retrieval performance. In all cases, an ideal score for these measures is 100 %, and higher scores represent better results.
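For reference, the sketch below computes the above measures for a single query from the ranked list of retrieved class labels (with the query itself excluded). The DCG discount 1/log2(rank + 1) and its normalization by the ideal ranking are one common variant, used here only for illustration.

```python
import numpy as np

def retrieval_measures(ranked_labels, query_class, class_size):
    """Sketch: NN, First/Second Tier, DCG, and precision/recall points for one query."""
    rel = np.array([lbl == query_class for lbl in ranked_labels], dtype=float)
    c = class_size - 1                                   # relevant models, query excluded

    nn = rel[0]                                          # nearest neighbor
    first_tier = rel[:c].sum() / c
    second_tier = rel[:2 * c].sum() / c

    # One common DCG variant: discount 1/log2(rank + 1), normalized by the ideal DCG.
    discounts = 1.0 / np.log2(np.arange(2, len(rel) + 2))
    dcg = np.sum(rel * discounts) / np.sum(discounts[:c])

    precision = np.cumsum(rel) / np.arange(1, len(rel) + 1)
    recall = np.cumsum(rel) / c
    return nn, first_tier, second_tier, dcg, precision, recall
```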

The proposed method was coded in Matlab R2011a and tested on a personal computer with an Intel Core 2 Quad 2.7 GHz CPU and 2 GB of RAM.

4.1 Comparison with shape distribution-based methods

Three shape distribution-based methods, D2, AAD, and ABD, were implemented by us in the experiment, and the normalized distance was used as the distance measure for these descriptors. The bar plots of First Tier (FT), Second Tier (ST), Nearest Neighbor (NN), and Discounted Cumulative Gain (DCG) for these descriptors and the proposed CSD are shown in Fig. 3. Obviously, all the performance measures are greatly improved by using the proposed CSD. For example, the FT value of the proposed CSD is improved by almost 90 % compared with D2, and by 30 % compared with ABD.

Fig. 3
figure 3

Bar plot of First Tier (FT), Second Tier (ST), Nearest Neighbor (NN), and Discounted Cumulative Gain (DCG)

Visual results were also obtained using the PSB dataset. When the 3D model ‘325.off’ from the ‘_Potted Plant_’ class is selected as the query model, the visual retrieval results of the CSD and the other methods are as shown in Fig. 4. Figure 4(a) is the result achieved by the D2 method, where only 3 models were retrieved; Fig. 4(b) shows the result produced by the AAD method.

Fig. 4
figure 4

Visual retrieval results of the ‘_PottedPlant_’ class using different methods

Figure 4(c) is the result produced by the ABD method, where 4 models were retrieved among the top 7 models; these retrieved models are more similar to the potted plant model than the others. In Fig. 4(d), 7 models were retrieved by the CSD method.

When the 3D model ‘26.off’ from the ‘_Biplane_’ class is selected as the query model, the results are shown in Fig. 5. Figure 5(a) is the result produced by the D2 method; only 2 models were retrieved. Figure 5(b) is the result produced by the AAD method; 3 relevant models were retrieved.

Fig. 5
figure 5

Visual retrieval results of the ‘_Biplane_’ class using different methods

Figure 5(c) is the result of the ABD method, where 5 models were retrieved among the top 7 models, whereas 7 models were retrieved by the CSD method. It is obvious that the CSD method is greatly superior in actual practice.

The speed of feature extraction depends on the number of vertices in the 3D model. The size, extraction time, and retrieval time of the shape distribution-based descriptors are shown in Table 1. The retrieval times of all these descriptors are very small, which guarantees fast online searching. Compared with ABD, the extraction time of CSD is increased by 20 %, but the retrieval performance is improved by more than 30 %, as shown in Figs. 3, 4, and 5. Thus, the proposed CSD is worthwhile in practice. The iterations are extremely time consuming in the current implementation, but an optimization method could be developed to improve the speed.

Table 1 The size, extraction time, and retrieval time of shape distribution-based descriptors

4.2 Comparison with state-of-the-art methods

The experimental results were also compared with those of the following methods: the Compact Multi-View Descriptor using depth images (CMVD-depth) [42] (a), Adaptive Views Clustering (AVC) [3] (b), Salient Hierarchy [38] (c), and the bag-of-features Scale Invariant Feature Transform (BF-SIFT) [28] (d). The precision-recall curves of these descriptors are borrowed from their published papers [3, 28, 38, 42]. The average precision-recall curves of these descriptors and the CSD on the PSB dataset are shown in Fig. 6. Obviously, the proposed CSD is better than AVC and Salient Hierarchy. When the recall values range from 0.7 to 1, the precision values of the CSD are better than those of CMVD and BF-SIFT, which means that the CSD has a better retrieval accuracy for the last 30 % of the relevant retrieved results. Although BF-SIFT performs better for the first 50 % of the relevant retrieved results, and CMVD performs better for the first 70 % of the relevant retrieved models, the proposed CSD has its own advantages. First, it is a rotation invariant descriptor, so pose normalization of 3D models is not needed. Second, since it is based on randomly selected points on the 3D surface, it is robust against degeneracy and partial distortion. Overall, it is concluded that CSD gives a good compromise between quality and cost for retrieving 3D models on the PSB dataset.

Fig. 6
figure 6

The average precision-recall curves on the PSB dataset

The average precision-recall curves of these descriptors and the CSD on the SHREC2009 3D partial dataset are shown in Fig. 7. When the recall values range from 0 to 0.5, the precision values of the CSD are higher than those of the other descriptors, which means that the CSD has a better retrieval accuracy for the first 50 % of the relevant retrieved results. Overall, it is concluded that CSD gives a good compromise between quality and cost for retrieving 3D models on the SHREC2009 3D partial dataset.

Fig. 7
figure 7

The average precision-recall curves on SHREC2009 partial shape retrieval track

5 Conclusion

This study involves research into principal plane analysis to extract meaningful characteristics of 3D models for further improvement of shape distribution-based methods. Firstly, based upon principal plane analysis, the second and the third principal planes are obtained. In order to ensure that the second principal plane identified is the most symmetrical plane that is perpendicular to the first principal plane, the problem of determining the second principal plane is transformed into nonlinear programming with constraint conditions, and sequential quadratic programming is used to solve this problem. The third plane is perpendicular to the first and the second principal planes. Secondly, the normal vectors of the three principal planes are adopted to obtain two novel, enhanced shape distributions, named the Alpha/Eta/Distance (AED) and Beta/Eta/Distance (BED) histograms. Thirdly, since a histogram can be classified into one of three types (positive, negative, and crossed) by each principal plane, the integrations of these three types of the AED and BED histograms are named Symmetric AED (SAED) and Symmetric BED (SBED) respectively, and a combined shape distribution (CSD) is then proposed based upon the synthesis and combination of SAED and SBED. As a consequence of the good characteristics of the shape distribution-based method, further work will be directed towards the extraction of more deeply embedded and distinctive characteristics of 3D models in order to improve the retrieval performance. In addition, intelligent learning algorithms, e.g. fuzzy logic rules, neural networks, and support vector machines, could be considered to dynamically adjust the weights of different descriptors for a better combination.