Simple primitive recognition via hierarchical face clustering

We present a simple yet efficient algorithm for recognizing simple quadric primitives (plane, sphere, cylinder, cone) from triangular meshes. Our approach is an improved version of a previous hierarchical clustering algorithm, which performs pairwise clustering of triangle patches from bottom to top. The key contributions of our approach include a strategy for priority and fidelity consideration of the detected primitives, and a scheme for boundary smoothness between adjacent clusters. Experimental results demonstrate that the proposed method produces qualitatively and quantitatively better results than representative state-of-the-art methods on a wide range of test data.


Introduction
Triangular meshes are one of the most popular representations of 3D shapes in computer graphics and 3D vision. In recent years, with the rapid development of 3D data acquisition techniques, it has become much easier to obtain precise geometric data. However, the obtained raw data has a large number of elements and lacks high-level information, so is difficult to use directly in downstream applications. For example, in industrial design and manufacturing applications, the original computer aided design (CAD) model might be unavailable or unsuitable for processing by current software for various reasons, leaving only a tessellated triangular mesh available. Hence the detection and recognition of high-level primitives in complex 3D data are required by many applications, such as reverse engineering [1,2], 3D printing [3,4], and other digital technologies.
Among existing methods, hierarchical clustering is the simplest and most efficient algorithm for primitive detection, especially for CAD models [18,19]. It performs pairwise clustering of adjacent clusters from bottom to top, according to designed merging criteria. In this paper, we propose an improved hierarchical clustering algorithm for extracting simple planar and quadric primitives from triangular meshes. Three major improvements are made upon previous algorithms, and an example result is shown in Fig. 1. Firstly, primitive priority is taken into account during clustering: plane > cylinder > sphere > cone, where > denotes "is preferred to". Secondly, the smoothness of the boundary of each cluster is characterized by an additional regularization term. Thirdly, fidelity is considered to avoid unnecessary merging even under acceptable fitting error, as shown in Fig. 3.
We have conducted extensive comparisons with existing representative approaches [11,18,19,20], and demonstrated the advantages of our approach using a wide range of test data. The main contributions of this paper include the following:
• a simple yet efficient algorithm for primitive recognition using hierarchical face clustering;
• simultaneous consideration of both shape priority and fidelity during clustering;
• a new boundary smoothing formulation for improved boundary regularization.
Fig. 1 Segmentation results: (a) Hierarchical Mesh Segmentation (HMS) [19], (b) Variational Mesh Segmentation (VMS) [11], (c) Feature-Aligned Segmentation (FAS) [20], and (d) our result. Colors are randomly assigned to segmented patches. (e) Error colormap. The color represents the distance from each point in the model to the fitted surface. Blue (value=0) indicates that the point lies exactly on the fitted surface, while red (value=1) indicates the largest distance between the fitted surface and the data point. Our result has better boundary smoothness and more meaningful extracted primitives, as highlighted.

Related work
Primitive extraction can be regarded as a mesh segmentation problem, which has been comprehensively studied in recent decades [21,22]. Various criteria have been proposed for different tasks. For example, approximation fidelity and patch smoothness are the major concerns in reverse engineering and shape approximation; 3D printing imposes printability and size constraints on each part; shape analysis and parsing segment shapes along concave lines; and scene understanding performs semantic labeling of high-level primitives. In the following, we focus on simple primitive extraction algorithms, and briefly consider major mesh segmentation approaches for different applications.

Shape approximation
Region growing is commonly used in many reverse engineering systems to extract smooth regions for scanned CAD meshes [1,15,23,24]. It is also the key component of other clustering-based algorithms. Starting from a seed point, it repeatedly groups neighboring unlabeled elements with similarity of local properties, e.g., normals or curvatures. Although region growing is very efficient, it is difficult to predict the number of patches and always needs a certain amount of post-processing. Variational approximation performs primitive fitting and region growing repeatedly to minimize an energy function with respect to different shape primitives. This optimization process is also known as Lloyd iteration [25]. Various metrics have been designed for extracting different primitives, e.g., planes [6], spheres and cylinders [7], ellipsoidal patches [9] and volumes [10], developable surfaces [8], and general quadric surfaces [11,16]. This type of approach works well for clean shapes with clear structures, but is often time consuming due to its need for optimization.
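The region-growing step described above can be illustrated with a minimal sketch. It assumes per-face unit normals and a face adjacency list are already available; the function name `region_growing` and the angle threshold `max_angle_deg` are hypothetical choices for illustration, not taken from any of the cited systems.

```python
import math
from collections import deque


def region_growing(normals, neighbors, max_angle_deg=10.0):
    """Label faces by flood-filling across neighbors with similar normals.

    normals:   list of unit normal vectors, one per face.
    neighbors: list where neighbors[f] lists the faces adjacent to face f.
    Returns a list of region labels, one per face.
    """
    cos_threshold = math.cos(math.radians(max_angle_deg))
    labels = [-1] * len(normals)
    region = 0
    for seed in range(len(normals)):
        if labels[seed] != -1:
            continue  # face already belongs to a region
        labels[seed] = region
        queue = deque([seed])
        while queue:
            f = queue.popleft()
            for g in neighbors[f]:
                similar = sum(a * b for a, b in zip(normals[f], normals[g])) >= cos_threshold
                if labels[g] == -1 and similar:
                    labels[g] = region
                    queue.append(g)
        region += 1
    return labels
```

The number of regions is a by-product of the threshold rather than an input, which matches the remark that the number of patches is hard to predict.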
RANSAC, which originated in the computer vision community, can be used to extract parametric primitives directly from raw data [26]. However, the extracted primitives cannot be guaranteed to form connected regions, and misclassification can occur. Hence, its use is restricted to pre-processing for other algorithms [27].
Hierarchical clustering performs pairwise clustering of adjacent elements from bottom to top. A priority queue is used to determine which pair should be clustered. A cost function is designed to measure the cost of merging a pair of clusters. For example, Garland et al. [18] measured the planarity and regularity of clusters. Attene et al. [19] extended the cost function to simple quadric primitives (spheres and cylinders) and considered the convexity of volumetric components. These approaches work very well for shapes consisting of simple primitives, but always lead to non-smooth segmentation boundaries for scanned data.

Shape decomposition
Part salience and minima rules [28] are widely used in many shape analysis tasks; the concavity of shapes is the main cue used to guide the segmentation. This type of approach is also known as part-based segmentation [29]. Katz and Tal [13] proposed a novel hierarchical mesh decomposition algorithm based on fuzzy clustering and graphcut. Lai et al. [30] solved the shape decomposition problem using a random walk formulation. Lafarge et al. [31] used a Markov random field (MRF) to label the vertices of the mesh. Lien et al. [5] explored an alternative partitioning strategy that decomposes a given model into approximately convex pieces for applications such as collision detection. Chen et al. [32] described a benchmark for evaluation of 3D mesh segmentation algorithms, which revealed the underlying theoretical concepts and classified segmentation algorithms. Due to the difficulties of automatic segmentation, some approaches allow user interaction to assist the segmentation process [14,33,34]. Instead of segmentation in Euclidean space, some approaches first transform the input mesh into the frequency domain and apply spectral clustering of the shape [35,36].
Instead of using minima rules, feature lines can also be used for segmentation. Such algorithms first extract ridges and valleys [37] from input meshes, and then remove small features by filtering and extend the major ones to form closed feature loops [20,38]. The enclosed regions are extracted as the final segmentation.
Mesh decomposition is also required in 3D printing applications: due to size and printability constraints, an input shape has to be decomposed into small pieces [3] or sub-components that can readily be printed [4,39] or packed [40].
More recently, machine learning techniques were introduced in the geometry processing community [41,42]. Guo et al. [43] presented a novel approach for 3D mesh labeling using deep convolutional neural networks (CNNs); it proved to be more robust. On this basis, Kalogerakis et al. [41] combined image-based fully convolutional networks (FCNs) and surface-based conditional random fields (CRFs) to yield coherent segmentations of 3D shapes. Simultaneously, Charles et al. [44] designed a novel type of neural network directly applicable to point clouds. It is invariant under rigid transformations of point positions in the input, and showed strong performance. Xu et al. [45] proposed a 3D shape representation learning approach, a directionally convolutional network (DCN), to extend convolution operations from images to the surface meshes of 3D shapes. However, such approaches still cannot provide an exact segmentation for surface approximation purposes.

Semantic segmentation
Apart from the above-mentioned low-level segmentation tasks, many techniques have been proposed recently for high-level primitive segmentation for indoor or outdoor scenes. For example, Kim et al. [12] exploited the special structure of indoor environments to accelerate 3D acquisition and recognition with a low-end hand-held scanner. Nan et al. [46] also presented an algorithm for recognition and reconstruction of scanned 3D indoor scenes, reinforcing classification by a template fitting step to provide a scene reconstruction. Dai et al. [47] designed an easy-to-use and scalable RGB-D capture system that achieved good performance on several 3D scene understanding tasks, including 3D object classification, semantic voxel labeling, and CAD model retrieval. Nguyen et al. [48] built a robust annotation tool that effectively and conveniently enabled segmentation and annotation of massive 3D data.

Our approach
There are many other mesh segmentation techniques that are not directly related to our work. The reader is referred to survey papers for more details [21,22]. Our approach falls into the category of low-level primitive extraction by hierarchical clustering. Instead of considering only fitting errors, we also take shape priority and boundary regularity into account, which leads to better segmentation results than representative competing methods.

Overview
Our goal is to partition the input triangular mesh $M$ into a set of non-overlapping components, or clustering regions, denoted $R = \{R_i\}_{i=1}^{n}$, such that each region $R_i$ can be approximated by a best-fitting simple primitive $P_i$ (a plane, cylinder, sphere, or cone). Hence $R_i$ consists of a set of connected triangles $t_{i,k}$ ($k = 1, \ldots, n_i$). We assign a cluster id $C_i$ to each region $R_i$, and also assign this cluster id $C_i$ to every triangle $t_{i,k}$ inside this region.
Our segmentation algorithm is an improved version of the well-known hierarchical face clustering (HFC) approach [19]. There, at the beginning, each triangle $t_i$ is considered to be a clustering region, with cluster id set to the index of the triangle. A cost function is defined for pairs of adjacent clusters, which measures the error of fitting a single primitive to the two clusters. All pairs of adjacent clusters are fed into a priority queue, such that the pair with the smallest fitting error is at the head of the queue. On each iteration, the pair at the head of the queue is removed, and the two clusters in this pair are merged into a new cluster. The cost values of all pairs of clusters affected by the merging operation are updated in the queue. The algorithm terminates when all clusters are well represented, for example, stopping when the total error increases as the number of clusters decreases, or when the error of some cluster exceeds a user-specified threshold. Figure 2 illustrates an example of the hierarchical clustering process. Our cost function for merging neighboring clusters also considers both primitive priority and boundary smoothness in a unified framework, as detailed in the next section.
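The merge loop described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this paper: `merge_cost` is a hypothetical callback standing in for the cost function of the next section, termination uses a simple target cluster count, and queue updates are handled by lazy deletion (merged clusters receive fresh ids, so stale queue entries are recognised and skipped).

```python
import heapq
from itertools import count


def hierarchical_clustering(num_faces, adjacency, merge_cost, target_clusters):
    """Greedy bottom-up merging of adjacent clusters.

    adjacency:  iterable of (a, b) pairs of adjacent face indices.
    merge_cost: hypothetical callback (faces_a, faces_b) -> float.
    Returns a dict mapping cluster id -> set of face indices.
    """
    adjacency = list(adjacency)
    clusters = {i: {i} for i in range(num_faces)}
    neighbors = {i: set() for i in range(num_faces)}
    for a, b in adjacency:
        neighbors[a].add(b)
        neighbors[b].add(a)

    tie = count()  # tie-breaker so the heap never compares cluster ids by cost ties alone
    heap = []
    for a, b in adjacency:
        heapq.heappush(heap, (merge_cost(clusters[a], clusters[b]), next(tie), a, b))

    next_id = num_faces
    while heap and len(clusters) > target_clusters:
        _, _, a, b = heapq.heappop(heap)
        if a not in clusters or b not in clusters:
            continue  # stale entry: one of the clusters was merged earlier
        merged = clusters.pop(a) | clusters.pop(b)
        merged_neighbors = (neighbors.pop(a) | neighbors.pop(b)) - {a, b}
        c = next_id
        next_id += 1
        clusters[c] = merged
        neighbors[c] = merged_neighbors
        for n in merged_neighbors:
            neighbors[n] -= {a, b}
            neighbors[n].add(c)
            # Re-insert every pair affected by the merge with its updated cost.
            heapq.heappush(heap, (merge_cost(merged, clusters[n]), next(tie), c, n))
    return clusters
```

Lazy deletion avoids the need for a decrease-key operation, at the price of leaving obsolete entries in the heap until they are popped.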

Cost function
In this section, we present the details of the cost function used for merging two adjacent clusters. The following principles are key to the design of our cost function: (i) try to find a simple primitive with the highest priority that best fits the merged clusters, (ii) take boundary smoothness into consideration before and after merging, and (iii) consider fidelity to avoid unnecessary merging. The cost function for merging the $i$-th and $j$-th clusters is defined as follows: [Eq. (1); display missing]. The meaning of each term is described next.

Fitting energy
We first consider use of a simple primitive $P$ to approximate the merged $i$-th and $j$-th clusters, leading to a fitting energy defined as [Eq. (2); display missing], where the term $E^{P}_{\mathrm{fit}}$ evaluates the approximation of the merged clusters by primitive $P$, and the parameter $\alpha^{P}$ takes the priority and fidelity of primitive approximation into account. Note that both terms are defined with respect to the $i$-th and $j$-th clusters, but we omit $i, j$ here for brevity. These two terms are analysed further below.
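The displayed definition of the fitting energy is not available in this text. One form consistent with the surrounding description, in particular with the later remark that the chosen primitive is the one giving the lowest $\alpha^{P} E^{P}_{\mathrm{fit}}$, would be the following. It is offered as an assumption, not as the exact published formula; the symbol $E^{i,j}_{\mathrm{fit}}$ and the set name $\mathcal{P}$ are introduced here only for illustration.

```latex
E^{i,j}_{\mathrm{fit}} \;=\; \min_{P \in \mathcal{P}} \; \alpha^{P} \, E^{P}_{\mathrm{fit}},
\qquad
\mathcal{P} = \{\text{plane},\ \text{cylinder},\ \text{sphere},\ \text{cone}\}
```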

Fitting by a fixed primitive
The approximation error of the merged clusters by a fixed primitive $P$ can be evaluated by [Eq. (3); display missing], where $\{p_k\}_{k=1}^{m}$ includes the vertices, barycenters of triangles, and midpoints of all edges belonging to the $i$-th and $j$-th clusters, and $\mathrm{dist}(p_k, P)$ is the Euclidean distance from data point $p_k$ to the fitted primitive $P$. Note that since our primitives are all simple quadrics, the distance function $\mathrm{dist}(\cdot)$ has an explicit representation.
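To make the remark about explicit distance functions concrete, the sketch below gives closed-form point-to-primitive distances for the four primitive types. The parameterisations (unit normal or axis vectors, an infinite cylinder, a one-sided infinite cone given by apex and half-angle) and all function names are assumptions made for illustration. The aggregate `mean_squared_distance` is likewise only one plausible way to combine the distances, since the exact form of the error is not reproduced here.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _axial_radial(p, origin, axis):
    """Split p - origin into height along the unit axis and radial distance from it."""
    v = _sub(p, origin)
    h = _dot(v, axis)
    r = math.sqrt(max(_dot(v, v) - h * h, 0.0))
    return h, r


def dist_plane(p, point, normal):
    """Distance to the plane through `point` with unit `normal`."""
    return abs(_dot(_sub(p, point), normal))


def dist_sphere(p, center, radius):
    d = _sub(p, center)
    return abs(math.sqrt(_dot(d, d)) - radius)


def dist_cylinder(p, origin, axis, radius):
    """Distance to an infinite cylinder with unit `axis` through `origin`."""
    _, r = _axial_radial(p, origin, axis)
    return abs(r - radius)


def dist_cone(p, apex, axis, half_angle):
    """Distance to a one-sided infinite cone (half_angle in radians, in (0, pi/2))."""
    h, r = _axial_radial(p, apex, axis)
    # In the (h, r) half-plane the cone is a ray from the apex at angle half_angle.
    along = h * math.cos(half_angle) + r * math.sin(half_angle)
    if along <= 0.0:
        return math.hypot(h, r)  # the apex is the closest point
    return abs(r * math.cos(half_angle) - h * math.sin(half_angle))


def mean_squared_distance(points, dist_fn):
    """Assumed aggregate: mean of squared distances over the sample points."""
    return sum(dist_fn(p) ** 2 for p in points) / len(points)
```

A caller would bind the primitive parameters first, for example `mean_squared_distance(samples, lambda p: dist_plane(p, origin, normal))`, where `samples` holds the vertices, barycenters, and edge midpoints of the two clusters.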

Priority and fidelity
Since the target primitive $P$ can be a plane, cylinder, sphere, or cone, we adopt a parameter $\alpha^{P}$ to control the priority and fidelity of the choice, defined as [Eq. (4); display missing]. Here, $\alpha^{P}_{\mathrm{pri}}$ is based on the priority of the primitive type, i.e., plane > cylinder > sphere > cone, and is set to [Eq. (5); display missing]. The term $\alpha^{P}_{\mathrm{fide}}$ controls the fidelity of the merging, and is defined as [Eq. (6); display missing], where $\alpha^{P_i}_{\mathrm{pri}}$ is the priority value of the current primitive approximating the $i$-th cluster. Equation (6) works as follows. Suppose that the current $i$-th and $j$-th clusters are approximated by different types of primitives, e.g., a plane and a cylinder, and the area of the $i$-th cluster is far bigger than that of the $j$-th cluster. Then $\alpha^{P}$ ensures we choose a plane to approximate the merged area rather than a cylinder. See Fig. 3(b) for an illustration.
We show how $E^{P}_{\mathrm{fit}}$ and $\alpha^{P}$ work together in Fig. 4. In Fig. 4(a), two clusters (red and blue) are to be merged; the final choice of primitive to approximate the merged area is a cylinder, as shown in Fig. 4(b). Figures 4(c)-4(f) show the approximation result if instead a plane, cylinder, sphere, or cone is used, respectively, without taking $\alpha^{P}$ into account; numerically, a cone gives the smallest $E^{P}_{\mathrm{fit}}$. However, when taking $\alpha^{P}$ into account, since the cone has the lowest priority, a cylinder gives the lowest $\alpha^{P} E^{P}_{\mathrm{fit}}$. Note that in this example, since the red cluster and the blue cluster in Fig. 4(a) have comparable areas, the term $\alpha^{P}_{\mathrm{fide}}$ has little effect.
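The interplay between raw fitting error and priority in the Fig. 4 example can be mimicked with a small sketch. The numeric weights in `PRIORITY_WEIGHT` are hypothetical placeholders that only respect the ordering plane > cylinder > sphere > cone, and combining the priority and fidelity factors by multiplication is an assumption; neither is taken from the paper's equations.

```python
# Hypothetical priority weights: a smaller weight means a more preferred primitive.
PRIORITY_WEIGHT = {"plane": 1.0, "cylinder": 1.5, "sphere": 2.0, "cone": 3.0}


def select_primitive(raw_errors, fidelity_weight=None):
    """Pick the primitive minimising (priority weight * fidelity weight * raw error).

    raw_errors:      dict primitive name -> unweighted fitting error.
    fidelity_weight: optional dict primitive name -> extra factor (default 1.0),
                     standing in for the fidelity term.
    Returns (best primitive name, its weighted score).
    """
    fidelity_weight = fidelity_weight or {}
    scores = {
        name: PRIORITY_WEIGHT[name] * fidelity_weight.get(name, 1.0) * err
        for name, err in raw_errors.items()
    }
    best = min(scores, key=scores.get)
    return best, scores[best]
```

With raw errors such as `{"plane": 0.9, "cylinder": 0.10, "sphere": 0.5, "cone": 0.08}`, the cone has the smallest unweighted error, yet the weighted selection returns the cylinder, mirroring the behaviour described for Fig. 4.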

Boundary smoothness
Smooth segmentation boundaries are necessary in CAD models.
Existing work always processes irregular boundaries in a post-processing step. Here we take boundary smoothness into consideration when deciding whether to merge the $i$-th and $j$-th clusters. This is done by evaluating [Eq. (7); display missing], where $\theta^{(i,j)}_{k}$ is the clockwise angle between the $k$-th and $(k+1)$-th boundary edges in the merged region formed by the $i$-th and $j$-th clusters, $\theta^{(i)}_{l}$ is the clockwise angle between the $l$-th and $(l+1)$-th boundary edges of the $i$-th cluster, and similarly for $\theta^{(j)}_{n}$. Equation (7) encourages merging of two clusters with rough boundaries into a region with smoother boundaries. Very rough boundaries with zigzags will yield a large sum of the turning angles of the edges (see Fig. 5). $\mathrm{Rate}(B/D)$ = (total boundary length)/(box diagonal length) is a penalty, which reflects the complexity of the clustering results relative to the original input model.
The parameter $\beta$ is set to balance the magnitude of the energy terms $E^{i,j}_{\mathrm{smth}}$ and $E^{i,j}_{\mathrm{pri}}$: [Eq. (8); display missing].
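The two ingredients of the smoothness term, the sum of turning angles along a boundary and the ratio of boundary length to bounding-box diagonal, can be sketched as below. This is a simplified illustration: it uses unsigned turning angles rather than the clockwise angles of the text, it treats each boundary as a closed loop of 3D points, and the function names are hypothetical. It does not reproduce how the terms are combined in Eq. (7).

```python
import math


def turning_angle_sum(loop):
    """Sum of unsigned turning angles (radians) along a closed 3D polyline."""
    n = len(loop)
    total = 0.0
    for k in range(n):
        a, b, c = loop[k], loop[(k + 1) % n], loop[(k + 2) % n]
        u = [b[i] - a[i] for i in range(3)]
        v = [c[i] - b[i] for i in range(3)]
        dot = sum(x * y for x, y in zip(u, v))
        cross = (
            u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0],
        )
        cross_norm = math.sqrt(sum(w * w for w in cross))
        total += math.atan2(cross_norm, dot)  # angle between consecutive edges
    return total


def boundary_rate(loops, bbox_min, bbox_max):
    """Total boundary length divided by the bounding-box diagonal length."""
    length = 0.0
    for loop in loops:
        n = len(loop)
        length += sum(math.dist(loop[k], loop[(k + 1) % n]) for k in range(n))
    return length / math.dist(bbox_min, bbox_max)
```

For a convex planar loop the unsigned turning angles sum to 2π, and any zigzag pushes the sum above that value, which is the behaviour the smoothness term penalises.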

Background
The proposed algorithm was implemented using the open-source platform Graphite (http://alice.loria.fr/software/graphite). We have validated our algorithm on a wide range of input meshes including both tessellated CAD models and scanned mechanical/organic shapes. All results shown in this section were produced automatically without user interaction. The results were produced on a machine with an Intel Core i7-7700 CPU, 16 GB RAM, and the Windows 10 operating system. In the following, we perform a detailed analysis of our method, as well as comparing it to several representative approaches: hierarchical face clustering (HFC) [18], hierarchical mesh segmentation (HMS) [19], variational mesh segmentation (VMS) [11], and feature-aligned segmentation (FAS) [20].

Analysis & comparison
First, we compare our method with the competing algorithms HMS [19], VMS [11], and FAS [20] in Fig. 6. All are able to extract complex primitives rather than only planar structures. All the algorithms were tested on different types of input: the Chess and the Sword models are tessellated CAD models, the Blade is a scanned mechanical model, and the Bone model is a scanned freeform shape. The segmentation results shown have the same number of clusters for each model. All competing algorithms generally work well for models with simple quadric primitives such as Chess.
However, HMS and FAS cannot segment the base of Chess successfully because two primitives are smoothly connected. Our algorithm obtains the optimal result, as does VMS, but it is more efficient since only hierarchical clustering is performed. The Sword and Blade models have more complicated structures, for which the other methods either produce wrong clusters (HMS and VMS) or cannot segment simple primitives with smooth blending regions (FAS). Our approach outperforms the others as we consider both approximation error and regularity of the segmentation boundary simultaneously.
When processing organic models, our algorithm can detect better, more meaningful segmentation boundaries than the other methods. For example, in the simple Bone model, HMS, VMS, and our method produce correct segmentations using five clusters, while FAS gives unsatisfactory output. Our results exhibit better boundary smoothness in such examples. Our method achieves better output because we jointly perform primitive detection and boundary regularization, even though the segmented patches are not regular primitives. Further results of our approach are shown in Fig. 7. Earlier algorithms either extract only simple primitives (plane, sphere, and cylinder) [18,19], which is too limited, or fit general quadrics [11], which introduces unwanted primitives (such as hyperboloids of one or two sheets). In our approach, we include the cone surface as a basic primitive, which is an improvement over the HFC framework. Benefits of providing conical surfaces can be seen in Fig. 8. HMS cannot detect the cone at the top of the Screw and outputs incorrect clusters at the connection between the two primitives (red box). While VMS works as well as our method, it sometimes becomes stuck in local minima due to its random initialization, so success of the algorithm cannot be guaranteed. In such cases, user interaction is required to indicate where to insert or delete clusters.
Fig. 6 Comparisons, left to right, to HMS [19], VMS [11], FAS [20], ours, and a color map. Top to bottom: tessellated CAD models Chess and Sword, scanned mechanical model Cup, and organic shape Bone. Segmentation patches are randomly coloured.

Effectiveness of boundary smoothing
To demonstrate the effectiveness of our newly introduced boundary regularization term, we carried out additional comparisons with HFC [18] and HMS [19] algorithms, which are also based on hierarchical clustering. HFC only detects planar primitives and forces each cluster to be as nearly circular as possible.
To make a fair comparison with HFC [18], we only enabled planar primitive fitting and disabled the other higher order primitives. We tested both HFC and our method on the Venus model, which has a very irregular low resolution triangulation.
As shown in Fig. 9, although HFC tends to avoid sharp changes in boundary direction, its clusters (top row) are poorly grouped and unsatisfactory. In contrast, the result of our algorithm (bottom row) is more consistent with the structure of the human body. We can also detect more meaningful segmentation boundaries. At the same time, our results exhibit better boundary smoothness.
Fig. 9 Boundary regularization comparison to the HFC method [18] using the Venus Body model with 5672 faces. Top: HFC results, bottom: our results, each method using its own boundary optimization. Left to right: with 18, 13, and 8 clusters.
Next, we compared to HMS by only enabling plane, cylinder, and sphere primitives. Figure 10 shows several intermediate results of both methods with the same number of clusters. We found that during the whole iteration, our algorithm always produces better results, avoiding incorrect clusters and performing boundary regularization. In the process of cylinder forming and merging, HMS produces many irregular results while ours are much better. We give a quantitative comparison in terms of boundary smoothness in Table 1, by evaluating the total length of boundary edges of the corresponding segmentation.
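The boundary-length measure used for the quantitative comparison can be computed directly from a labelled mesh. The sketch below assumes an indexed triangle mesh and one cluster label per triangle; the function name `total_boundary_length` is hypothetical. An edge counts as a segmentation boundary when the triangles sharing it carry different labels, so edges on the open border of a mesh are not counted.

```python
import math
from collections import defaultdict


def total_boundary_length(vertices, triangles, labels):
    """Sum the lengths of mesh edges whose incident triangles have different labels.

    vertices:  list of (x, y, z) tuples.
    triangles: list of (i, j, k) vertex-index triples.
    labels:    list of cluster labels, one per triangle.
    """
    edge_labels = defaultdict(set)
    for tri, label in zip(triangles, labels):
        for i in range(3):
            edge = tuple(sorted((tri[i], tri[(i + 1) % 3])))
            edge_labels[edge].add(label)
    return sum(
        math.dist(vertices[a], vertices[b])
        for (a, b), seen in edge_labels.items()
        if len(seen) > 1
    )
```

Because both methods are evaluated at the same number of clusters, a smaller total indicates shorter and therefore less jagged segmentation boundaries.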

Effectiveness of primitive priority
To demonstrate the effectiveness of our new priority term, we first compare our approach to HMS [19] using a simple CAD model, the Joint, as shown in Fig. 11. Because of the priority parameter, the order of the clustering changes significantly during iteration. Our algorithm tends to extract larger clusters of simple primitives as early as possible.
To explore the effectiveness of our method in terms of fidelity, we tested both HMS and our method on the Anchor model and the Fandisk model, as shown in Fig. 12. For the Anchor, HMS cannot segment the structure formed by a plane and a cylinder, which is a common transitional design in industrial CAD models. When the final number of primitives to be fitted is specified, it splits a well-grouped cylinder to add a new cluster, rather than splitting the grooved structure formed by a pair of orthogonal planes. For the Fandisk model, the same situation occurs for a more complicated structure comprising a plane smoothly blended with two cylinders. Because these two models are composed of exact simple quadric primitives, our method can recover the original ground truth structure exactly. To summarize, our method avoids producing undesired clustering in transition regions, and so provides more reasonable results.

Comparison with deep learning methods
Deep learning has been widely used in various graphics and geometry processing applications. To demonstrate the effectiveness and understandability of our results, we compare our approach with work on component segmentation (LMVCNN) [50] and human body segmentation (STC) [51] using two simple models, the Ant and the Bracket, which have clear branching structures for semantic and patch-based segmentation. As shown in Fig. 13, deep learning relies heavily on functional relationships within the model. These methods are good at handling functional parts, such as the body and aircraft wings in their papers, but cannot recognize the many legs and antennae of the Ant model. For standard CAD models, they also fail to produce meaningful results. In contrast, our algorithm, based on primitive surface fitting, segments semantic structure well and obtains very reasonable results. This also shows that our algorithm can be applied to a wider range of models without learning.

Global regularization
Although our algorithm can ensure that each cluster uses the closest quadric for fitting and parametric form for storage, in practical industrial production, global relationships between disconnected parts are often considered to be an important issue, such as multiple parts forming subcomponents, and different parts aligned with each other via symmetry (e.g., parallel or orthogonal axes). We use the symmetry detection method in Ref. [52] to identify all components with the same transformation (translation, rotation), and then determine whether a subcomponent is formed according to the connection relationship. We also consider that objects with geometric properties such as discrete cylindrical and coaxial circular surfaces should have global symmetry.
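As a small illustration of the kind of global relationship mentioned above (parallel or orthogonal axes), the sketch below classifies the relation between the axes of two fitted primitives. It is not the symmetry detection method of Ref. [52]; the function name `axis_relation` and the tolerance `tol_deg` are hypothetical choices for illustration.

```python
import math


def axis_relation(a, b, tol_deg=2.0):
    """Classify two axis directions as 'parallel', 'orthogonal', or 'general'.

    Axes are treated as undirected lines, so opposite vectors count as parallel.
    Both vectors must be non-zero; they need not be normalised.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    angle = math.degrees(math.acos(abs(cosine)))  # angle between lines, in [0, 90]
    if angle <= tol_deg:
        return "parallel"
    if 90.0 - angle <= tol_deg:
        return "orthogonal"
    return "general"
```

A regularization pass could use such a test to group cylinders with parallel axes before checking whether they are also coaxial.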
As shown in Fig. 14, each part is obtained by fitting a corresponding quadric primitive. However, the result appears too fragmentary and cumbersome to be used easily and conveniently. In Fig. 14(c), we detect that the cylinder and plane on the edge exhibit the same rotational transformation. From their spatial relationships, we judge that two adjacent cylinders and a plane can form a tooth subcomponent, and in this model all parts on the rim simply repeat this tooth subcomponent. The gaps between adjacent teeth, which we consider to be a discrete partition of a complete cylinder, form a hollow cylinder together with the internal entity. The regularized result is more compact and concise, and so is more convenient for subsequent engineering processing and applications.

Performance
The running time required for all demonstrated models is presented in Table 2. The time taken is affected by both the size of the input model and the number of final clusters. Our algorithm is slightly slower than Attene's HMS [19] due to the additional computation, but our results are improved significantly. Compared to optimization-based approaches [11,20], our algorithm is much faster. However, our implementation is not fully optimized, and we hope to further improve the performance in future work.

Limitations
Although our method can produce acceptable results for various inputs, it depends on the detected primitives themselves having certain geometric properties. This means that a segmented cluster should, to some extent, be properly approximated by a simple quadric shape. Figure 15 shows an example that exhibits no clear structure, for which our algorithm does not work well and produces unsatisfactory output. Other competing algorithms cannot deal with such input either.
Fig. 15 An unsatisfactory example. Left to right: results of HMS [19], VMS [11], FAS [20], our method, and colormap, for 28 clusters.

Conclusions & future work
We have proposed a simple primitive detection algorithm, which is based on hierarchical clustering [18,19]. We improve the original algorithm in three ways. Firstly, we add cones as a new primitive, which improves the approximation flexibility of the framework. Next, we introduce primitive priority to encourage simpler primitives. Finally, we propose a new boundary regularization term which improves the results significantly. Our algorithm works well on a wide range of inputs. However, we cannot produce satisfactory output for shapes without clear structure, as shown in Fig. 15. This is a common drawback of many mesh segmentation algorithms. There are several promising ways in which this work could be extended.
Firstly, the current algorithm performs only local processing with little consideration of global relationships between nonadjacent clusters. We plan to take more global constraints (angular relationships, symmetry, and so on) into account to further improve the regularity of the detected primitives. Secondly, although we can extract correct primitives for many CAD meshes, we do not convert them into a solid model that can be processed in modeling systems such as SolidWorks. Such conversion is very important for many reverse engineering applications. In the future, we hope to put more effort into completing the whole pipeline for reverse engineering.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.