1 Introduction

In object-based image analysis, vectorization of image data is the most essential concept for compressing and abstracting image information. It has numerous applications, e.g., digitizing calligraphy, creating topographic databases, or deriving semantic representations of image content. An important intermediate step of the latter application is the classification of image regions [11, 18]. Our main application area is the creation of virtual cityscapes, in which buildings and roads are the objects with the highest recognition value. Vectorization of outlines or (medial) axes of these objects is the principal task in urban terrain reconstruction [4].

Preliminaries. A plausible boundary of a classified object is the pixel-aligned polygonal chain that separates all in-object pixels from the outside pixels. To outline exterior and interior contours, we apply a modification of the Moore contour-tracing algorithm [14] to the mask of the detected object and to its complement. Our methods are designed to generate outlines of buildings and medial axes of road courses, but all procedures can be applied directly to other objects whose edges follow a few dominating directions. The contours are pre-simplified by means of Douglas-Peucker [6] with an extremely tight threshold (1 to 1.5 pixels). This eliminates only the axis-parallel, stair-shaped polygonal chains produced by contour tracing and leaves the initial shape unchanged. The tight threshold also avoids topological inconsistencies and the dependence on the initial vertex [16] in closed polygonal chains. To identify the medial axes of objects such as roads, we instead thin the corresponding classification mask and vectorize the result according to [17]. This preserves the topology of junctions and branch points. The final step is again a pre-simplification with a small threshold value. The outlines of four classified objects (one building, two building complexes, and a road course) are shown as examples in Fig. 1.

Fig. 1. Test data extracted from a test site in Munich, Southern Germany [15]. D1: single building outline with orthogonal directions. D2: building complex with several directions. D3: building complex with several directions and multiple atriums (holes). D4: medial axes of a road segment.

Related Work. Complicated pipelines are designed to obtain polygonal chains which approximate a classification result, and are robust against data noise and outliers. Additionally, for many man-made structures, constraints of symmetry must be imposed. The resulting polygonal chains should capture the object’s topological properties and have a small number of vertices. It has been pointed out in [12] that the generic meshes obtained by crust algorithms [1], ball-pivoting methods [2] or alpha-shapes [7] do not always satisfy these criteria.

To reduce the number of vertices, one can apply a simplification routine, such as the one presented in [6]. However, this approach depends on the choice of the starting vertex and is sensitive to noise. Furthermore, decreasing the number of vertices does not necessarily decrease the number of edge directions. To compensate for this drawback, [10] searched for distinctive directions before polygonization. In [9], straight lines within the images are used to add and remove rectangles that form polygons. Furthermore, [13] attempts post-processing of the directions obtained by concave hulls [12]. Given the hull of a building footprint, two directions are to be derived from the histogram formed by the edges of the polygonal chain weighted by their lengths. Additionally, it was suggested to work with orthogonal projections and to limit the search direction for the concave hull to multiples of \(\pi /2\). If this range is too coarse, multiples of smaller values that are typical for man-made objects can be chosen. This approach is also transferable to one of the standard polygonization algorithms, such as [1] based on medial axes, or to the Moore neighbor-tracing approach [14].

Contribution. For a variety of reasons, segmentation results may be noisy or flawed. Hence, the pre-simplified polygonal chains may wriggle with many vertices and directions; e.g., straight segments of building outlines are not obtained properly. The task of this paper is to explicate methods to reduce the number of vertices in a polygonal chain while simultaneously enforcing dominant directions of the polygonal chain’s edge expansion. These dominant directions are also called distinctive directions.

Although man-made objects feature curvy structures, too, we focus on (complexes of) buildings and roads in metropolises, where streets are roughly laid out in piecewise straight segments and where building outlines are mostly characterized by orthogonality or parallelism to several distinctive directions. Segments of distinctive directions stand out by occurring multiple times. To extract them, we present and compare two approaches. The first, intuitive approach transfers the observed edge directions into a histogram weighted by the edges’ lengths, followed by an analysis of histogram peaks and non-maxima suppression. Alternatively, distinctive directions can be determined by fitting parametric distributions to the edge directions. This can be done effectively by applying the expectation-maximization (EM) algorithm for mixture models in which the number of components, i.e., distinctive directions, is usually unknown a priori.

We apply our methods also to a special issue: Some polygons formed by closed polygonal chains contain empty areas, called holes, which are for their part limited by an interior polygonal chain. Some holes are difficult to detect from image data or could be represented by noisy or imprecise observations. A simultaneous treatment of interior and exterior polygonal chains, on the one hand, stabilizes the detection of holes and, on the other hand, guarantees the outlines to be of similar or parallel shape. Furthermore, this method enforces parallelism of segments in exterior and interior polygonal chains.

We demonstrate the usability of our procedures on outlines of buildings (or building complexes) and on a road course.

2 Distinctive Directions and Simplification

Our task is to reduce the number of vertices in a polygonal chain while keeping the most prevalent directions of its edges. In case of building outlines, most European buildings feature two distinctive rectangular directions; however, there are many exceptions. Nevertheless, these man-made structures include just a few distinctive directions. Likewise, roads often feature few distinctive directions and sometimes even rectangular structures. There are definitions for what a road course is in [5]. For our purpose, it is sufficient to consider a road course as a connected sequence of road segments. An example is shown in Fig. 1(D4).

In this section, we present two different approaches for selecting distinctive directions in a polygonal chain with n vertices \(\{\mathbf {v}_1, \mathbf {v}_2, \ldots , \mathbf {v}_n\}\), \(\mathbf {v}_i = (x_i, y_i)^\top \) and corresponding edge vectors \(\mathbf {e}_i = \mathbf { v}_{i+1} - \mathbf {v}_i\). All methods rely on the counterclockwise angles \(\alpha _i\) enclosed by the positive \(\mathrm x\)-axis and \(\mathbf {e}_i\), weighted by the edges’ lengths \(\ell _i\). Each \(\alpha _i\) represents an edge direction. Parallel edges in the polygonal chain are supposed to reinforce the same direction but are, due to the vertex order of the chain, often oriented oppositely. Hence, the edge direction \(\alpha _i\) is set irrespective of the orientation of \(\mathbf {e}_i\) to \(\alpha _i \leftarrow \alpha _i\bmod \pi \).
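The direction convention above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the experiments; the function name `edge_directions` and the `closed` flag are assumptions introduced here.

```python
import math

def edge_directions(vertices, closed=False):
    """Return (alpha_i mod pi, l_i) for every edge of a polygonal chain.

    vertices: list of (x, y) tuples; closed=True adds the edge v_n -> v_1.
    """
    n = len(vertices)
    edge_count = n if closed else n - 1
    result = []
    for i in range(edge_count):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # duplicate vertices carry no direction
        # atan2 yields (-pi, pi]; the modulo maps opposite orientations together
        alpha = math.atan2(dy, dx) % math.pi
        if alpha >= math.pi:  # guard against rounding up to pi
            alpha = 0.0
        result.append((alpha, length))
    return result
```

For an axis-parallel rectangle traversed counterclockwise, the two horizontal edges both map to direction 0 and the two vertical edges both map to \(\pi /2\), although they are oriented oppositely.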

2.1 Hill-and-Valley Decomposition

Similarly to [13], we create a length-weighted histogram for each polygonal chain with \(\pi /180\) bin discretization for \(\alpha _i\) (abscissa) and \(q_j = \sum _s \ell _s\) for all \(\mathbf {e}_s\) assigned to bin j (ordinate). If the polygon contains holes, values of \(\alpha _i\) and \(\ell _i\) are collected over all polygonal chains.
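A length-weighted histogram of this kind can be sketched as follows. The function name and the input format, a list of \((\alpha _i, \ell _i)\) pairs with \(\alpha _i \in [0, \pi )\), are assumptions made for illustration. For a polygon with holes, the pairs of all polygonal chains would simply be concatenated before calling the function.

```python
import math

def length_weighted_histogram(directions, n_bins=180):
    """Accumulate edge lengths per direction bin.

    directions: iterable of (alpha, length) with alpha in [0, pi).
    n_bins=180 corresponds to a bin width of pi/180 (one degree).
    """
    width = math.pi / n_bins
    q = [0.0] * n_bins
    for alpha, length in directions:
        j = int(alpha / width) % n_bins  # bin index of this edge direction
        q[j] += length                   # q_j = sum of assigned edge lengths
    return q
```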

Extracting Distinctive Directions. We assume that a distinctive direction occurs often in the polygonal chain and therefore corresponds to a local maximum of the histogram. A local maximum is either a histogram bin of larger value than the incident bins on both sides, or the medial bin of a maximum plateau, i.e., a set of successive bins of equal value that is larger than the closest different-valued bins on both sides. If the number of bins belonging to a plateau is even, we choose the bin of smaller angle. (Accordingly, local minima are defined the other way around.)
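The definition of local maxima, including plateaus, can be sketched for a circular histogram as follows. The function name is hypothetical, and for a plateau that wraps around the end of the histogram the "smaller angle" is taken along the circular walk, which is an assumption not fixed by the text.

```python
def circular_local_maxima(h):
    """Return the bin indices of local maxima of a circular histogram h."""
    n = len(h)
    if n == 0 or all(v == h[0] for v in h):
        return []  # a constant histogram has no local maximum
    # Start at a bin whose predecessor differs, so no run of equal
    # values is split by the wrap-around.
    start = next(i for i in range(n) if h[i] != h[i - 1])
    maxima = []
    i = 0
    while i < n:
        a = (start + i) % n
        run = 1  # length of the plateau beginning at bin a
        while i + run < n and h[(a + run) % n] == h[a]:
            run += 1
        left = h[(a - 1) % n]
        right = h[(a + run) % n]
        if h[a] > left and h[a] > right:
            # medial bin; for an even plateau the bin of smaller angle
            maxima.append((a + (run - 1) // 2) % n)
        i += run
    return sorted(maxima)
```

Local minima follow from the same routine applied to the negated histogram.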

In the initial histogram, many local maxima are present. In order to extract distinctive directions that represent the shape of polygonal chains with adequate accuracy and to suppress weak local maxima, we smooth the initial circular histogram. As a Gaussian filter lends itself to (histogram) smoothing, we apply its simplest approximation, the binomial filter, several times in succession. If the number of smoothing steps \(\#\mathcal {S}\) is known, the composed filter can be pre-calculated as a stand-alone approximation of a Gaussian filter, so that smoothing takes place in one step. Note that the bin width and the number of observations influence \(\#\mathcal {S}\). We found a bin discretization of \(\pi /180\) useful for the simplification of polygonal chains of roads or buildings. In general, however, the bin width may be adapted to the application and the required accuracy.
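The following sketch illustrates repeated binomial smoothing on a circular histogram and the equivalent pre-calculated filter. It assumes the three-tap kernel \([1, 2, 1]/4\) as the binomial filter, which the text does not specify; all function names are hypothetical.

```python
import math

def binomial_smooth(h, steps):
    """Apply the circular [1, 2, 1]/4 filter `steps` times."""
    n = len(h)
    out = [float(v) for v in h]
    for _ in range(steps):
        out = [(out[i - 1] + 2.0 * out[i] + out[(i + 1) % n]) / 4.0
               for i in range(n)]
    return out

def binomial_kernel(steps):
    """Composition of `steps` [1, 2, 1]/4 filters: binomial row 2*steps."""
    m = 2 * steps
    return [math.comb(m, k) / 4.0 ** steps for k in range(m + 1)]

def circular_convolve(h, kernel):
    """One-step circular smoothing with a symmetric, odd-length kernel."""
    n = len(h)
    half = len(kernel) // 2
    return [sum(w * h[(i + k - half) % n] for k, w in enumerate(kernel))
            for i in range(n)]
```

Both routes give the same result, and the total histogram weight is preserved. Counting the local maxima after each step yields a curve of the kind plotted against the number of smoothing steps.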

The challenge is to estimate \(\#\mathcal {S}\) to get the optimal result. It is hard to guess the ideal number of desired local maxima in the histogram without knowing the true number of distinctive directions in the polygonal chain. Since the binning is quite fine \(\#\mathcal {S}\) should be bounded away from zero (\(\#\mathcal {S}\ge 30\)) but far less than 100. Plotting the number of local maxima as a function of smoothing steps, we obtain a decreasing graph with many plateaus as shown in Fig. 2a. There are some strategies to determine \(\#\mathcal {S}\) optimally, such as taking the closest point from all coordinate tuples initiating a plateau to the point of origin. But to the best of our knowledge, the optimal number is chosen best heuristically by taking all available prior information into account.

Fig. 2. Number of local maxima of the length-weighted histogram during the smoothing process (data set D4 in Fig. 1).

Following the assumption above, the local maxima that remain after smoothing correspond to distinctive directions of the polygonal chain. This strategy is plausible and useful. To make the procedure more robust against the choice of bin width, we recommend a small modification. Enlarging the bin width may shift local maxima because bins coincide. To compensate for this effect, each distinctive direction \(\varphi _f\) is extracted from the centroid of a histogram hill (that is, from all bins between two successive local minima, the valleys) instead of from its local maximum, so that most \(\alpha _i\) are covered.
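A hill centroid can be sketched as the weighted mean of the bin centers between two successive valleys. The sketch below makes two assumptions that the text leaves open: each hill includes its left valley bin and excludes its right one, and hills crossing the histogram end are unwrapped before averaging. The function name is hypothetical.

```python
import math

def hill_centroids(h, minima):
    """Centroid direction of each hill of a circular histogram over [0, pi).

    h: list of bin weights; minima: sorted bin indices of the valleys.
    """
    n = len(h)
    width = math.pi / n
    centroids = []
    for k, lo in enumerate(minima):
        hi = minima[(k + 1) % len(minima)]
        span = (hi - lo) % n or n  # number of bins walked from lo towards hi
        weight = moment = 0.0
        for step in range(span):
            j = (lo + step) % n
            weight += h[j]
            # unwrapped bin centre, so hills across the end average correctly
            moment += h[j] * (lo + step + 0.5) * width
        if weight > 0.0:
            centroids.append((moment / weight) % math.pi)
    return centroids
```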

Elimination of Shallow Extrema. The certainty of a distinctive direction depends on how pronounced its max-to-min ratio is, which involves both the positional distances between neighboring local maxima (and minima) and the value differences between successive local maxima and minima. Irrespective of whether distinctive directions are extracted from local maxima or from centroids of hills, these directions are of larger certainty if the hill features a steep ascent and a sharp shape. Despite smoothing to eliminate side maxima, shallow extrema with an unfavorable max-to-min ratio may remain. To get rid of them without further smoothing, we apply a post-processing step: If two successive minima are too close and both absolute differences between their values and the value of the enclosed maximum are too low, the maximum is removed. The same is done the other way around for successive maxima that are too close. It turns out that critical values for “too close” and “too low” may be determined as one third of the valley position distances and of the local maximum to local minimum value differences present in the histogram, respectively.

Extrema that are part of an oscillation in the bottommost \(10\%\) of the data of the smoothed histogram are removed as well.

Removing extrema may leave maxima or minima that directly follow each other. We merge each such group of extrema into an extremum plateau and take its medial bin as the extremum.

Working with the modification, the distinctive directions can afterwards be extracted from the centroids of the remaining hills.

2.2 Parametric Distribution Analysis

A disadvantage of hill-and-valley decomposition is the large number of degrees of freedom. Varying the bin size influences \(\#\mathcal {S}\). The choice of this number is rather heuristic and per-object. The more prior knowledge about the underlying object is available the more robust the simplification of the polygonal chain works. But since we expect a few distinctive directions and since the shape of the smoothed histogram indicates the presence of an underlying compound distribution, we pursue a statistical approach. The following method models distinctive directions and background noise with parametric distributions by means of expectation maximization.

Modeling. The statistical theory of directional data usually considers distributions on the unit circle. For the statistical analysis, we transform the observed chain directions \(\alpha _i \bmod \pi \) by doubling them, estimate distinctive directions \(\varphi _f\) among other distribution parameters, and back-transform the results.

Besides the distinctive directions \(\varphi _f\), we expect background clutter due to imperfect data. The parametric approach allows us to model the background explicitly by means of the uniform distribution \(p(\alpha ) = 1/(2\pi )\), \(0\le \alpha \le 2\pi \). Thus, all directions \(\alpha \) between 0 and \(2\pi \) are equally likely. For the dominant directions, we utilize the von Mises distribution, which is in many respects the “natural” analogue on the circle of the normal distribution on the real line [8]. The probability density function reads

$$\begin{aligned} p(\alpha | \varphi , \kappa ) = \frac{1}{2\pi I_0(\kappa )} \exp \left\{ \kappa \cos ( \alpha -\varphi ) \right\} , \qquad 0 \le \varphi \le 2\pi , \quad 0 \le \kappa \le \infty \end{aligned}$$
(1)

where \(I_0(\kappa )\) is the modified Bessel function of order zero, \(\varphi \) is the mean direction, and \(\kappa \) is the so-called concentration parameter. As \(\kappa \rightarrow 0\), the distribution converges to the uniform distribution; as \(\kappa \rightarrow \infty \), it tends to the point distribution concentrated in the direction \(\varphi \). The maximum likelihood estimate \(\widehat{\kappa }\) for the concentration parameter \(\kappa \) can be seriously biased when the sample size n and R / n are small, where R is the length of the resultant \(\begin{pmatrix} \sum _i \cos \alpha _i,&\sum _i\sin \alpha _i \end{pmatrix}\) [8]. In that case, \(\widehat{\kappa }\) can substantially over-estimate the true value of \(\kappa \). Correction formulas are provided in [3].
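The density (1) and the angle doubling can be illustrated with the standard library alone. The power series for \(I_0\) is a standard identity but only a sketch suitable for moderate \(\kappa \); the helper names `to_circle`, `from_circle`, `bessel_i0`, and `von_mises_pdf` are assumptions introduced here.

```python
import math

def to_circle(alpha):
    """Map a direction in [0, pi) to the full circle by doubling."""
    return (2.0 * alpha) % (2.0 * math.pi)

def from_circle(theta):
    """Back-transform a doubled angle to a direction in [0, pi)."""
    return (theta / 2.0) % math.pi

def bessel_i0(x):
    """I_0(x) = sum_k (x/2)^(2k) / (k!)^2, summed until terms are negligible."""
    total, term, k = 0.0, 1.0, 0
    while term > 1e-17 * total:
        total += term
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
    return total

def von_mises_pdf(alpha, phi, kappa):
    """Density (1) with mean direction phi and concentration kappa."""
    return math.exp(kappa * math.cos(alpha - phi)) / (2.0 * math.pi * bessel_i0(kappa))
```

For \(\kappa = 0\) the function returns the uniform density \(1/(2\pi )\), in line with the limit stated above.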

For uncertainty analysis of each estimated mean direction \(\varphi _f\), two measures are essential: Given the concentration \(\kappa _f\) for a direction \(\varphi _f\), the circular standard deviation \(\sigma _{\!f}^{} = \sqrt{ -2\log \rho _f}\), \(\rho _f = I_1(\kappa _f)/I_0(\kappa _f)\) can be computed to measure the spread of the data. For distinctive directions, we expect circular standard deviations of less than, say, \(20^\circ \). For testing the potential orthogonality of distinctive directions \(\varphi _f\), we utilize the precision \(\sigma _{\widehat{\varphi }_f} = 1/\sqrt{R\widehat{\kappa }_f}\) for each \(\varphi _f\).
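Both uncertainty measures can be evaluated numerically as sketched below. The Bessel functions are computed by their power series, which is adequate for moderate \(\kappa \) only; the function names are hypothetical.

```python
import math

def bessel_i(order, x):
    """Modified Bessel function I_order(x), order 0 or 1, via its power series."""
    term = (x / 2.0) ** order / math.factorial(order)
    total, k = 0.0, 0
    while k == 0 or term > 1e-17 * total:
        total += term
        k += 1
        term *= (x / 2.0) ** 2 / (k * (k + order))
    return total

def circular_std(kappa):
    """sigma_f = sqrt(-2 log rho_f) with rho_f = I_1(kappa) / I_0(kappa)."""
    rho = bessel_i(1, kappa) / bessel_i(0, kappa)
    return math.sqrt(-2.0 * math.log(rho)) if rho > 0.0 else math.inf

def direction_std_error(resultant_length, kappa):
    """Precision of an estimated mean direction: 1 / sqrt(R * kappa)."""
    return 1.0 / math.sqrt(resultant_length * kappa)
```

The circular standard deviation decreases as \(\kappa \) grows and approaches \(1/\sqrt{\kappa }\) for strongly concentrated data.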

For the application at hand, we expect several distinctive directions and background clutter, modeled by a mixture of F von Mises distributions and a uniform distribution, where the number of components F is unknown a priori. For fitting this mixture, we apply the EM algorithm and determine the number of components by considering information theoretic criteria such as the Bayesian Information Criterion.
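A compact EM iteration for such a mixture might look as follows. This is a sketch under several assumptions that are not taken from the text: all observations have equal weight, the number of components F is fixed by the length of the initial guess `phis`, the concentration update uses the common closed-form approximation \(\kappa \approx \bar{R}(2-\bar{R}^2)/(1-\bar{R}^2)\) instead of an exact maximum likelihood solution, and \(\kappa \) is capped for numerical safety. Model selection (e.g., by the Bayesian Information Criterion) is not included.

```python
import math

def bessel_i0(x):
    """I_0(x) via its power series (adequate for moderate x)."""
    total, term, k = 0.0, 1.0, 0
    while term > 1e-17 * total:
        total += term
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
    return total

def em_von_mises_uniform(thetas, phis, iterations=50, kappa_max=200.0):
    """Fit F von Mises components plus one uniform background component.

    thetas: doubled directions in [0, 2*pi); phis: initial mean directions.
    Returns (phis, kappas, weights); weights[-1] belongs to the background.
    """
    F, n = len(phis), len(thetas)
    phis = list(phis)
    kappas = [1.0] * F
    weights = [1.0 / (F + 1)] * (F + 1)
    for _ in range(iterations):
        # E-step: responsibility of every component for every observation
        norm = [2.0 * math.pi * bessel_i0(k) for k in kappas]
        resp = []
        for t in thetas:
            dens = [weights[f] * math.exp(kappas[f] * math.cos(t - phis[f])) / norm[f]
                    for f in range(F)]
            dens.append(weights[F] / (2.0 * math.pi))  # uniform background
            s = sum(dens)
            resp.append([d / s for d in dens])
        # M-step: update mixing weights, mean directions, concentrations
        for f in range(F):
            nf = sum(r[f] for r in resp)
            weights[f] = nf / n
            if nf < 1e-12:
                continue  # component died out; keep its old parameters
            c = sum(r[f] * math.cos(t) for r, t in zip(resp, thetas))
            s = sum(r[f] * math.sin(t) for r, t in zip(resp, thetas))
            phis[f] = math.atan2(s, c) % (2.0 * math.pi)
            rbar = min(math.hypot(c, s) / nf, 1.0 - 1e-9)
            kappas[f] = min(rbar * (2.0 - rbar ** 2) / (1.0 - rbar ** 2), kappa_max)
        weights[F] = sum(r[F] for r in resp) / n
    return phis, kappas, weights
```

The estimated mean directions live on the doubled circle and have to be halved to obtain the distinctive directions \(\varphi _f\). As with any EM procedure, the result depends on the initialization.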

For the outlines of buildings, we expect the distinctive directions not to vary too much, i.e., the dispersion \(\sigma _{\!f}\) should be small for each direction \(\varphi _f\). Thus, dominance is not only defined by frequently occurring directions, but also by directions featuring a small variation.

Enforcing Orthogonal Directions. To detect orthogonality, we perform statistical parametric tests for all pairs of estimated distinctive directions. The corresponding test statistic for a pair of distinctive directions (\(\varphi _g^{}\), \(\varphi _h^{}\)) is

$$\begin{aligned} T_{gh} = \frac{\sin \left( \left| \widehat{\varphi }_g -\widehat{\varphi }_h \right| -\pi \right) }{{\sigma }_{\widehat{\varphi }_g\widehat{\varphi }_h}^{}} \quad \sim \mathrm {N}(0,1) \end{aligned}$$
(2)

with the standard deviation \( {\sigma }_{\widehat{\varphi }_g\widehat{\varphi }_h}^{} = \sqrt{ {\sigma }_{\widehat{\varphi }_g}^2 + {\sigma }_{\widehat{\varphi }_h}^2 } \) of the estimated difference \(\widehat{\varphi }_g -\widehat{\varphi }_h\). Under the hypothesis of orthogonality, the test statistic \(T_{gh}\) follows a standard normal distribution.
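The test can be sketched as follows. The sketch assumes that (2) is evaluated on the doubled angles, where orthogonal directions differ by \(\pi \); this reading is an interpretation, not stated explicitly in the text. The two-sided critical value 1.96 (5 % significance level), the function names, and the standard errors in the usage example are likewise assumptions for illustration.

```python
import math

def orthogonality_statistic(theta_g, theta_h, sigma_g, sigma_h):
    """Test statistic (2) for two doubled mean directions (radians)."""
    sigma = math.hypot(sigma_g, sigma_h)  # std. dev. of the difference
    return math.sin(abs(theta_g - theta_h) - math.pi) / sigma

def is_orthogonal(theta_g, theta_h, sigma_g, sigma_h, critical=1.96):
    """True if the hypothesis of a right angle is not rejected."""
    t = orthogonality_statistic(theta_g, theta_h, sigma_g, sigma_h)
    return abs(t) <= critical

# Usage with two directions enclosing 87.9 degrees (175.8 degrees after
# doubling) and hypothetical standard errors of 0.05 rad each.
g = math.radians(2 * 108.9)
h = math.radians(2 * 21.0)
print(is_orthogonal(g, h, 0.05, 0.05))
```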

Once a pair of distinctive directions is a candidate for orthogonality, the constraint can be enforced by applying the EM algorithm again with a bimodal von Mises distribution featuring the three parameters \(\varphi _g^{}\), \(\kappa _g^{}\) and \(\kappa _h^{}\), where the second mean direction is given by \(\varphi _h^{} = \varphi _g^{} +\frac{\pi }{2}\). Alternatively, the estimated directions \(\widehat{\varphi }_g\) and \(\widehat{\varphi }_h\) can be adjusted according to their uncertainty provided by the estimated variances \(\sigma _{\widehat{\varphi }_g}^2\) and \(\sigma _{\widehat{\varphi }_h}^2\). The statistically optimal corrections for the directions \(\varphi _g\) and \(\varphi _h\) are

$$\begin{aligned} \left( \begin{array}{c} \widehat{\varDelta \varphi }_g^{}\\ \widehat{\varDelta \varphi }_h^{} \end{array} \right) = \frac{ \widehat{\varphi }_g^{} -\widehat{\varphi }_h^{} + \pi /2}{ \sigma _{\widehat{\varphi }_g}^{2}+\sigma _{\widehat{\varphi }_h}^{2} } \left( \begin{array}{c} -\sigma _{\widehat{\varphi }_g}^{2}\\ \sigma _{\widehat{\varphi }_h}^{2} \end{array} \right) \end{aligned}$$
(3)

i.e., uncertain distinctive directions will be shifted more than certain directions to fulfill the orthogonality constraint.
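The corrections (3) can be checked numerically with a small sketch. The function name is hypothetical, and the variances in the usage example are illustrative values, not results from the experiments.

```python
import math

def orthogonal_corrections(phi_g, phi_h, var_g, var_h):
    """Corrections (3) that enforce phi_h = phi_g + pi/2 (radians)."""
    misclosure = phi_g - phi_h + math.pi / 2.0
    scale = misclosure / (var_g + var_h)
    return -scale * var_g, scale * var_h

# Usage: 21.0 and 108.9 degrees with equal (hypothetical) variances share
# the misclosure of 2.1 degrees equally.
d_g, d_h = orthogonal_corrections(math.radians(21.0), math.radians(108.9), 1e-4, 1e-4)
phi_g = math.radians(21.0) + d_g
phi_h = math.radians(108.9) + d_h
print(math.degrees(phi_h - phi_g))  # approximately 90 degrees
```

With unequal variances, the direction with the larger variance receives the larger share of the misclosure.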

2.3 Simplification of Polygonal Chains

The extracted distinctive directions are now used to simplify the initial polygonal chain as described in [13]. The polygon simplification is divided into two steps: The assignment of edges and their adjustment.

For assignment, we label each edge \(\mathbf {e}_i\) according to \(\widehat{f}_i = \underset{f}{\mathrm {arg\,min}} (\vert \varphi _f - \alpha _i\vert )\).
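The labeling rule can be sketched as below. As a labeled deviation from the formula as written, the sketch measures \(\vert \varphi _f - \alpha _i\vert \) as a circular distance modulo \(\pi \), so that an edge direction of \(179^\circ \) counts as close to a distinctive direction near \(0^\circ \); whether the original implementation does the same is an assumption. The function name is hypothetical.

```python
import math

def assign_labels(alphas, phis):
    """Label each edge direction with the index of the closest distinctive direction.

    alphas, phis: directions in [0, pi), in radians.
    """
    def direction_distance(a, b):
        d = abs(a - b) % math.pi
        return min(d, math.pi - d)  # directions are periodic with pi

    return [min(range(len(phis)), key=lambda f: direction_distance(phis[f], a))
            for a in alphas]
```

This is the winner-takes-all assignment: every edge receives exactly one label, regardless of how close the second-best direction is.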

For adjustment, all vertices belonging to successive edges of the same label f are replaced by a line through the centroid of this group of vertices along the vector \(\begin{pmatrix} \cos \varphi _f,&\sin \varphi _f \end{pmatrix}\). Each line is limited by its intersections with the neighboring lines. In open polygonal chains, the first and the last line are limited on the “open” side by projecting the first vertex \(\mathbf {v}_1\) and the last vertex \(\mathbf {v}_n\), respectively, onto the line.
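For a closed polygonal chain, the adjustment step can be sketched as follows. The sketch assumes that edge i connects vertex i to vertex i+1 (cyclically) and carries `labels[i]`; the projection of end vertices for open chains is omitted. All function names are hypothetical.

```python
import math

def line_intersection(p, d, q, e):
    """Intersect the lines p + s*d and q + t*e; None if (nearly) parallel."""
    det = d[0] * e[1] - d[1] * e[0]
    if abs(det) < 1e-12:
        return None
    s = ((q[0] - p[0]) * e[1] - (q[1] - p[1]) * e[0]) / det
    return (p[0] + s * d[0], p[1] + s * d[1])

def simplify_closed_chain(vertices, labels, phis):
    """Replace runs of equally labeled edges by lines and intersect neighbors."""
    n = len(vertices)
    # begin at an edge whose predecessor has a different label
    start = next((i for i in range(n) if labels[i] != labels[i - 1]), None)
    if start is None:
        return []  # a single direction cannot bound a closed polygon
    lines = []
    i = 0
    while i < n:
        a = (start + i) % n
        run = 1  # number of successive edges sharing the label of edge a
        while i + run < n and labels[(a + run) % n] == labels[a]:
            run += 1
        pts = [vertices[(a + k) % n] for k in range(run + 1)]  # touched vertices
        cx = sum(p[0] for p in pts) / len(pts)
        cy = sum(p[1] for p in pts) / len(pts)
        phi = phis[labels[a]]
        lines.append(((cx, cy), (math.cos(phi), math.sin(phi))))
        i += run
    corners = []
    for k in range(len(lines)):
        (p, d), (q, e) = lines[k - 1], lines[k]
        x = line_intersection(p, d, q, e)
        if x is not None:
            corners.append(x)
    return corners
```

Applied to a slightly noisy rectangle with six vertices and the directions \(0\) and \(\pi /2\), the sketch returns four corners of an exact axis-parallel rectangle. Distinctive directions that are nearly parallel produce intersections far away from the original vertices.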

3 Results and Discussion

We demonstrate the usability of our methods by means of four data sets D1 to D4, extracted from a test site in Munich, Southern Germany (Fig. 1, [15]). As closed polygonal chains we chose three building structures D1 to D3 and as an open polygonal chain a road example D4. One can see that D1 and D4 have a quite simple shape with only two perpendicular directions. To show the behavior of our algorithm on polygonal chains with multiple directions, we chose the building complexes D2 and D3. Note that D3 also highlights the usability on polygons with holes. All bordering polygonal chains have some influence on the computation of distinctive directions.

Fig. 3. Color coding and polygonal chain simplification (black underlay). The color coding results from both algorithms.

Results. The color coding of the edge labels of D1’s outline resulting from hill-and-valley decomposition (Sect. 2.1) is shown in Fig. 3a. The initial polygonal chain includes 88 different edge directions, which can be assigned to only two resulting distinctive directions (Table 1). Hence, it is possible to simplify the polygonal chain using only edges of these two directions. Using the parametric approach (Sect. 2.2), we obtain the estimates \(\widehat{\varphi }_1 = 108.9^\circ \) (proportion \(30.9\,\%\)) and \(\widehat{\varphi }_2 = 21.0^\circ \) (\(69.1\,\%\)), which enclose an angle of \(87.9^\circ \). The circular standard deviations of the directions are of the expected magnitude, approximately \(22^\circ \). The hypothesis of a right angle is not rejected. Thus, the distinctive directions can be adjusted in proportion to their estimated variances according to (3), which are almost equal for this data set.

Fig. 4. Color coding of edge labels and extraction of distinctive directions visualized on the histogram for test set D2. (Color figure online)

The adapted boundaries of D2 are depicted in Fig. 4a. Both hill-and-valley decomposition and the EM-based algorithm lead to four extracted directions, as can be seen in Fig. 4b and Table 1. The results of both approaches do not differ very much (Fig. 4b), but the determined directions vary enough to result in a different edge labeling and therefore in a slightly different fit to the original outline (Fig. 4a). Three directions obtained by hill-and-valley decomposition appear as sharp hills in the upper histogram of Fig. 4b. The fourth maximum is blurred and of much lower value than the other ones. This indicates a higher uncertainty. A harder constraint concerning the value differences between local maxima and minima as well as their positions (Sect. 2.1) could eliminate this maximum, which intuitively appears redundant. However, the algorithms are developed to work on large data sets with, for instance, hundreds of buildings. The calculation rule for a constraint needs to be global enough to work for every building in the data set.

Fig. 5. Results for a data set of a building outline containing atriums (D3). The corresponding polygonal chains enclose a polygon with holes. (Color figure online)

The polygon D3 includes holes. The color coding in Fig. 5a shows that hole edges are assigned to the same directions as parallel edges of the exterior polygonal chain, which enforces a simplification to parallel edges. Apart from some isolated “wrongly labeled” edges in between, all main edge directions of the polygonal chain are exposed: the green and gray edges code for the nearly rectangular parts of the outline (green: “side” parts, gray: “bottom/top” parts), and the orange edges code for the non-rectangular walls in the upper part of the building outline. By hill-and-valley decomposition, we reduce the initial number of 251 edge directions to four (Fig. 5a). With the EM-based algorithm, it is even possible to reduce the number of main directions to three (Fig. 5b, Table 1).

Table 1. Numbers of directions present in the polygonal chains of D1–D4.

As an example of an open polygonal chain, we consider the course of the road D4. The resulting labeling and the simplified polygonal chain of the road course are shown in Fig. 3b. The results of hill-and-valley decomposition and EM are very similar. Both approaches detect the two distinctive directions (out of initially 61, see Table 1) that one intuitively assumes.

Discussion. Each of the two introduced methods for detecting distinctive directions has advantages over the other. In contrast to the EM-based algorithm, hill-and-valley decomposition is rather robust. It always yields at least one direction, even if the histogram is sparse or nearly uniform. Prior knowledge about the number of distinctive directions supports the procedure and makes it quite robust. On the other hand, the approach has several important thresholds: It yields unsatisfactory results if the bin size or \(\#\mathcal {S}\) is chosen inappropriately or if the thresholds to eliminate shallow extrema (Sect. 2.1) do not fit the distribution. Enlarging the bin width may cause neighboring bins to coincide and thus a possibly tilted result. For compensation, we chose the average direction of a hill as main direction instead of the local maximum. The EM-based approach works directly with the polygonal edges. Besides predefining the number of components, no other parameters are required. The algorithm requires a clear mixture distribution. It is preferred if no or only little prior knowledge is available. However, it has problems with a lack of observations or major outliers, as well as with nearly uniform distributions. So, the decision which algorithm is “best” for the computation of distinctive directions depends on the background information, i.e., how many different observations are present, whether the number of free parameters can be decreased by prior knowledge, or whether the manifestation of several distinctive directions can be assumed.

Please note that the extraction of distinctive directions lying close to each other may lead to extrapolations in the simplification step if these directions are assigned to incident edges of the polygonal chain. Elimination of “isolated” edge labels will not solve that problem because it may lead to collapsing parts in the polygonal chain. This deficit should be avoided initially by adapting the number of smoothing steps (Sect. 2.1) or components (Sect. 2.2).

4 Conclusion and Future Work

In this article, we presented two methods to extract distinctive directions in order to simplify polygonal chains. The number of edges and their corresponding directions can be reduced as shown, e.g., for the example of road courses. Simultaneous evaluation of exterior and interior polygonal chains of building outlines leads to a simplification with the same determined directions for both the polygonal outline and the hole outlines, and therefore to a parallelization of edges in segments of interior and exterior polygonal chains.

Simplified polygonal chains feature few distinctive directions and depict a shape similar to the initial polygonal chain. This reduces the amount of data to be stored and also provides a more intuitive outline when it comes to man-made structures, e.g., straight walls or straight courses. Hence, both methods generate polygonal chains with fewer vertices but an appropriate fit to the underlying classified image object. The choice of method depends on the number of edges in the polygonal chain and on whether more background information about the underlying object is available. If \(90^\circ \) angles are expected, the detected directions may be shifted according to their certainty afterwards.

Our aspiration was to keep the algorithm flexible for any type of cityscape. If data sets are laid out in a grid pattern, it would be recommended to include the data of, e.g., all buildings at once into the procedure. Alternatively, determined distinctive directions may successively be included into a maximum a posteriori optimization.

For future work, the (un)certainty of distinctive directions may be used to suppress unnecessary directions. Further, the assignment of polygonal edges to a distinctive direction can be changed from a winner-takes-all strategy to a probability-based method. The assignment of edges close to multiple distinctive directions should be influenced by incident edge labels in order to advance edge propagation.