Automated Classification of Building Roofs for the Updating of 3D Building Models Using Heuristic Methods

In the Bavarian Surveying Administration, remote sensing methods are applied in the context of nationwide airborne surveys for the acquisition of aerial photographs and airborne laser scanning for the derivation of the digital terrain model (DTM). At the Bavarian Agency for Digitisation, High-Speed Internet and Surveying (LDBV = Landesamt für Digitalisierung, Breitband und Vermessung), image-based digital surface models (bDOM) and digital orthophotos without building lean (trueDOP) are produced using the dense image matching (DIM) method. Buildings and their roofs are displayed in the trueDOP in the correct position to the cadastral ground plan. Based on these data, an expert system was developed for the investigation of construction cases and for the updating of 3D building models, which automatically calculates change notices and makes them available to the Agencies for Digitisation, High-Speed Internet and Surveying (ADBV = Amt für Digitalisierung, Breitband und Vermessung). The reliable detection of buildings plays a decisive role here. Representative reference classes for the classification of building roofs in the RGB colour space are formed and frequencies are calculated across the boundaries of the photo-flights. The classification is carried out with the aid of a normalized digital surface model (nDOM), which is calculated from the height differences between the bDOM and the DTM, and with heuristically defined threshold values for the colours in the representative RGB colour spaces. The presented method is transferable to all federal states.


Introduction
According to a resolution of the Working Committee of the Surveying Authorities of the Laender of the Federal Republic of Germany (AdV), 3D building models with standardized roof shapes were recorded throughout Germany under the responsibility of the federal states (AdV 2016). The initial recording of the 3D building models with ten standardized roof shapes in the so-called Level-of-Detail 2 (LoD2) has almost been completed nationwide (AdV 2018). The distribution of the LoD2 building models via the Central Authority for House Coordinates and House Layouts (ZSHH = Zentrale Stelle Hauskoordinaten und Hausumringe) located in Bavaria could then start in 2020. In some federal states, the initial capture of LoD2 was carried out by an automated derivation, e.g. in North Rhine-Westphalia and Baden-Württemberg, or by a semi-automatic derivation in which buildings are interactively edited, e.g. in Bavaria, Bremen, and Hessen. The classical digital orthophoto (DOP), in combination with point clouds from airborne laser scanning (LiDAR) or the regular point grid of an image-based surface model (bDOM) derived with the dense image matching method, serves as the basis for acquisition. According to a further AdV resolution, by the beginning of 2023 the classic digital orthophoto (DOP) will be replaced by the true orthophoto (trueDOP) (AdV 2017); see Fig. 1. In those cases where the trueDOP could already be derived from the bDOM by the federal states, the classic DOP was replaced. The mapping of buildings and their roofs in a trueDOP coincides with the cadastral ground plans, although the roof areas may be slightly larger than the building ground plans in the cadastre depending on the roof overhang. Thus the trueDOP is ideally suited for the nationwide classification of building roofs.
While some publications focus on building recognition including the classification of roof structures from satellite data or aerial photographs (Alidost 2018; Marmanis et al. 2016; Valinger 2015; Shibiao et al. 2018), in Germany this task can be considered as completed for the LoD2 building models. After completion of the initial data capture, the updating of the LoD2 buildings must be ensured (Roschlaub 2013, 2014). The updating of the LoD2 building models is not yet established in all federal states. In Bavaria, for example, the updating is carried out from the cadastral two-dimensional ground plans of buildings as well as the terrestrial three-dimensionally measured special building points (ridge and eaves points) and the documented ridge lines. In combination with bDOM or LiDAR data, the 3D building models can be updated in a simple way (Hümmer and Roschlaub 2014).
In many federal states, there is no obligation to measure buildings or their modifications. However, to ensure that LoD2 continues to be as complete as possible, information on the changes made to the building is required. In the following, a procedure is presented that helps federal states detect changes in buildings from aerial photographs that have not yet been mapped in the cadastral system.

Colour and Intensity Differences at the Borders of the Photo Flight Projects
For the segmentation or classification of objects from aerial photographs, imagery that is as unchanged as possible is necessary, because every interference by image processing methods leads to a decrease of the initial quality and often to a loss of information. Even in the trueDOP, which is derived from the "Bayernbefliegung" (photo flight of Bavaria) at a ground resolution of 20 cm, the underlying images of the image flight are only very carefully radiometrically processed, so that radiometric differences of varying intensity can be seen at the lot boundaries of the image flights, and flight strips can be seen in the mosaic of the trueDOP (see Fig. 2). Influencing factors are for example:
• Atmospheric light absorption and scattering of light caused by air molecules and particles (aerosols) suspended in the air: the greater the flying height, the larger the volume of air the light has to pass through.
• The shadow direction (sun azimuth), which changes during the day, and the increasing haze, which can be counteracted by yellow filters in panchromatic images but not in colour images.
• The shade length, which is determined by the elevation of the sun.
• The position of objects in relation to the flight axis, where objects appear darker (backlighting) on the sun-facing image side and brighter (front lighting) on the far side.
• The seasonally different vegetation, which can lead to a violet undertone for damp soils in spring, while the rather dry soils at the photo flight borders at the end of summer often lead to brown-green transitions.
• Summer sun at noon leads to dark drop shadows and harsh contrasts, while a high, thin cloud layer leads to soft, diminished contrasts with brightened shadows.
• Motion blur due to different flight speeds.
Fig. 1 a trueDOP with a mapped building floor plan, a ridge line and ridge and eaves points, b automatically derived LoD2 building displayed together with the grid points of the bDOM (green points, partly above and partly below the roof of the yellow garage building) c with LiDAR data (here green points are above the roof of the garage). The red dots belong to the main building which is not shown behind it. b, c for visual inspection
In the mosaics derived from the different flight epochs, unchanged building roofs can appear very different for the reasons mentioned above. This makes it difficult to classify objects in the trueDOP mosaic based on colour values and requires a representative and comprehensive evaluation of all lots to define a comparison class.

Classification Reference in RGB Colour Cube
For an automated classification of objects in raster images, training areas with the most representative properties of the objects to be classified are always necessary. The necessary training areas for the classification of building roofs in a trueDOP mosaic can be easily obtained by intersecting the building ground plans from the official digital cadastral map (DFK) with the trueDOP mosaic. The RGB colour values of all pixels of the trueDOP mosaic lying within the building ground plans serve as a reference for the automated classification of the buildings across the boundaries of the photo flight projects. The pixels of the training areas are now processed as follows. The number of possible combinations of the grey values for R, G and B is calculated by multiplying the number of grey levels of the individual channels, i.e. at 16 bits $65{,}536^3 = 281{,}474{,}976{,}710{,}656$. This number is too large for meaningful further processing. Therefore, the original bandwidth of each individual colour channel is reduced from 16 bits (0-65,535) to 8 bits (0-255). The number of possible combinations is $256^3 = 16{,}777{,}216$. Then each pixel of the training areas is assigned to one of the combinations and the number per combination is added up. After the summation, the frequency of each RGB combination is known. The mask of the building floor plans now cuts out not only pixels from roofs, but also from overhanging vegetation, especially from trees. Therefore, a method is sought that separates the roofs from the vegetation.
Fig. 2 a Colour deviations between the Bavarian photo flight projects. b Section in RGB. c Section in infrared
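The counting step described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the implementation used at the LDBV. The function name `build_rgb_cube`, the array layout and the bit shift used for the reduction from 16 to 8 bits are assumptions, since the text does not state how the reduction is carried out.

```python
import numpy as np

def build_rgb_cube(image16, footprint_mask):
    """Count how often each 8-bit RGB combination occurs inside the mask.

    image16:        (H, W, 3) uint16 array, a hypothetical trueDOP tile.
    footprint_mask: (H, W) bool array, True inside building ground plans.
    Returns a (256, 256, 256) array of frequencies indexed as [R, G, B].
    """
    # Reduce 16 bits (0-65,535) to 8 bits (0-255) by dropping the low byte.
    # Assumption: any other monotone rescaling could be used instead.
    image8 = (image16 >> 8).astype(np.uint8)
    pixels = image8[footprint_mask].astype(np.int64)   # (N, 3) training pixels
    # Encode each (R, G, B) triple as one index in 0 .. 256**3 - 1.
    codes = (pixels[:, 0] << 16) | (pixels[:, 1] << 8) | pixels[:, 2]
    # Add up the number of pixels per combination.
    counts = np.bincount(codes, minlength=256 ** 3)
    return counts.reshape(256, 256, 256)
```

The dense cube holds all 16,777,216 combinations, so a lookup `cube[r, g, b]` returns the frequency of a colour directly. For very large areas the counts of several tiles can simply be added.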
As a previous investigation has shown, simple statistical tests on the confidence intervals in the colour values of roofs hardly allow a separation of vegetation from roofs, since the confidence intervals describe only a cuboid in the RGB colour cube and therefore reflect the characteristics of objects such as roofs without sufficient differentiation (Geßler et al. 2019; Roschlaub 1992). A better method is based on a representative point cloud of roof pixels, the description of which is not limited to a cuboid, but rather describes the envelope of this point cloud. All pixels that lie within this envelope would then represent the colour values characteristic for roofs and all deviating colour values would represent any other object. In this model, however, all cavities in the point cloud enclosed by the envelope are also classified as roofs, which must be avoided.
However, previous investigations have shown that such cavities rarely occur, as aerial photographs show quite similar colours. This is shown in the illustrations in Figs. 3, 4, 5 and 6, which show the RGB combinations found in the aerial photographs. The value ranges are limited in all cases to a rather compact body in the middle of the value ranges. The illustrations show one dot per colour combination independent of the frequency.
Neither method, the cuboid calculated from confidence intervals or the envelope figure enclosing the point cloud, allows a reliable separation of roof and vegetation. A method that does lead to the goal uses the frequency of the individual RGB combinations and their distribution, and separates the roof pixels from the vegetation by an empirically determined threshold value. It also eliminates outliers. Section 4.1 will particularly focus on the significance of the threshold value. For illustration purposes, the RGB colour values of the characteristic roof pixels plotted in the three-dimensional RGB colour cube can be projected into the three colour planes: blue-green, blue-red and green-red. In the projected colour planes, too, roofs that occur several times in the RGB colour cube with almost identical RGB colour values will appear at the same frequency in the three colour planes.
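The projection into the three colour planes can be illustrated with a short sketch. It assumes a frequency cube indexed as [R, G, B]; the function name `project_cube` and the choice of summing the frequencies along the dropped axis are assumptions made for illustration.

```python
import numpy as np

def project_cube(cube):
    """Project an RGB frequency cube indexed [R, G, B] onto three planes.

    Each plane is obtained by summing the frequencies along the channel
    that is dropped.
    """
    return {
        "blue-green": cube.sum(axis=0),  # drop R, result indexed [G, B]
        "blue-red": cube.sum(axis=1),    # drop G, result indexed [R, B]
        "green-red": cube.sum(axis=2),   # drop B, result indexed [R, G]
    }
```

A plot with one dot per colour combination, independent of the frequency, corresponds to the occupancy `plane > 0` of each projected plane.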
The images of Fig. 4 show the corresponding point arrangements of a test tile in the RGB colour cube and its projections into the three colour planes in the columns: blue-green, blue-red and green-red. In the left column, the calculations were performed on the pixels of the entire test tile; in the middle column, only the roof pixels were examined; and in the right column, the pixels resulting from the difference between the two images were examined. The same investigations were carried out in a second test tile in a rural area without visualizing the results here. Figure 4 describes a heavily built-up area of commercial and residential buildings with the following peculiarities:
• The point clouds in the RGB cubes are very compact; there are no outliers outside the point clouds, as the aerial images contain only natural and no synthetic colours.
• The point clouds projected into the three colour planes have a diagonal characteristic; they differ essentially only in their basic values with regard to rural and built-up areas.
• The point arrangements of the buildings (middle column) show a strong similarity with the point arrangements of the other object pixels (right column), so that a classification of roofs from an aerial photograph exclusively on the basis of RGB values will be very difficult.
The more roofs there are in a test tile, the more often the characteristic colour properties of the roofs are reflected, due to the standardization to 256 gradations per colour channel. If, for example, all DFK building layouts for the whole of Bavaria were intersected with the trueDOP mosaic, then the colour and intensity differences of the photo flight projects would also be taken into account in the training areas of the resulting RGB colour values of the roof pixels. The calculation of such a reference data set in the RGB colour cube is computationally very intensive, but allows the determination of stable and representative reference classes, for example for roofs or vegetation. To speed up the classification process, a reduction in the ground resolution of the trueDOP from 20 to 40 cm has proven effective.
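One simple way to halve the ground resolution, for example from 20 to 40 cm, is to average non-overlapping blocks of 2 x 2 pixels. The text does not state which resampling method was used, so the following sketch and its function name `halve_resolution` are assumptions for illustration only.

```python
import numpy as np

def halve_resolution(image):
    """Average non-overlapping 2 x 2 pixel blocks of an (H, W, C) image.

    Odd trailing rows or columns are cropped. The result has shape
    (H // 2, W // 2, C) and the dtype of the input.
    """
    h = image.shape[0] // 2 * 2
    w = image.shape[1] // 2 * 2
    channels = image.shape[2]
    cropped = image[:h, :w].astype(np.float64)
    # Split both image axes into (block index, position within block).
    blocks = cropped.reshape(h // 2, 2, w // 2, 2, channels)
    return blocks.mean(axis=(1, 3)).round().astype(image.dtype)
```

Each output pixel then covers four input pixels, so the number of pixels to classify drops to a quarter.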

Classification of the Original Images Based on a Threshold Value for the RGB Reference Colour Cube
For the development of a reference class for the automated classification of buildings, only the RGB colour values of the roof pixels and the resulting frequencies in the RGB colour cube are considered first. The frequency of each of the RGB combinations represents a fourth dimension besides the values for R, G and B, and therefore cannot be visualized. By defining a threshold value, it is possible to classify building objects in each image.
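The threshold classification can be sketched as a per-pixel lookup in the reference cube. The function name `classify_by_frequency` and the use of "greater than or equal" for the comparison are assumptions; the text only states that a threshold on the frequency is defined.

```python
import numpy as np

def classify_by_frequency(image8, cube, threshold):
    """Mark pixels whose RGB combination is frequent in the reference cube.

    image8:    (H, W, 3) uint8 image to classify.
    cube:      (256, 256, 256) frequency cube indexed [R, G, B].
    threshold: minimum frequency for a pixel to count as a roof candidate.
    Returns an (H, W) bool array.
    """
    r = image8[..., 0].astype(np.intp)
    g = image8[..., 1].astype(np.intp)
    b = image8[..., 2].astype(np.intp)
    frequency = cube[r, g, b]      # frequency of each pixel's colour
    return frequency >= threshold
```

With a threshold of 1, every colour that occurs at least once in the training areas is accepted; higher values accept only colours that are common among the training pixels.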
In addition to the images to be examined, the original images of the buildings used to define the training areas are also classified to check the quality of the classification procedure.

Classification of the trueDOP
For further investigations, only the buildings lying in a test tile are used as training areas for a representative reference of RGB colour values of roof pixels. Depending on the selected frequency threshold value, the classification of building objects of a test tile leads to different results, as the pictures of Fig. 7 show:
• When classifying the entire original image and using a threshold value of 1 or 2 (the number of occurrences of the RGB combination), significantly more pixels are classified as supposed buildings than there are actually buildings in the test tile. This is particularly true for roads that have similar colour values to roofs.
• Conversely, the higher the threshold value, the fewer building pixels are recognized as buildings in the original image of the test tile. Thus, at a threshold value of 30, all large factory buildings with a homogeneous colour structure are detected, but hardly any residential buildings. Due to their very different roof coverings and ages, residential buildings have a much more varied colour structure. Their pixels are therefore distributed over several RGB combinations close to each other.
An arbitrarily chosen section of the test tile from Fig. 7 illustrates this in Fig. 8. In the left column, only the pixels classified within the cut building ground plans for the respective threshold value are plotted and placed over the initial image. It becomes clear that the higher the selected threshold value, the fewer of the pixels originally located within the building perimeters are reclassified as building pixels. In the right column, the classification procedure was applied not only to the buildings, but to the entire initial image. Here, the section shows that the classification procedure distinguishes vegetation very well from buildings and that there are hardly any misclassifications between vegetation and buildings. At the same time, as already mentioned, misinterpretations of pixels occur outside the building ground plans, especially in the street space.
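The effect of the threshold described for Figs. 7 and 8 can be quantified with a small sketch. For each threshold it reports the share of footprint pixels that are reclassified as building pixels and the number of pixels classified as buildings outside the footprints. The function name `evaluate_thresholds` and both measures are assumptions chosen for illustration; the paper evaluates the results visually.

```python
import numpy as np

def evaluate_thresholds(frequency, footprint_mask, thresholds):
    """Compare classification results for several frequency thresholds.

    frequency:      (H, W) array with the reference frequency of each pixel.
    footprint_mask: (H, W) bool array, True inside building ground plans.
    Returns {threshold: (share_inside, count_outside)}.
    """
    n_inside = int(footprint_mask.sum())
    results = {}
    for t in thresholds:
        detected = frequency >= t
        inside = int((detected & footprint_mask).sum())
        outside = int((detected & ~footprint_mask).sum())
        share = inside / n_inside if n_inside else 0.0
        results[t] = (share, outside)
    return results
```

A low threshold typically gives a high share inside the footprints but many detections outside, for example on roads; a high threshold reverses this.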

Calculation of an nDOM Mask for trueDOP Classification
The normalized digital surface model (nDOM) forms the basis for the calculation of a mask for the classification of the trueDOP. It contains height values that are calculated from the difference between the bDOM and the DTM. Misclassifications of the road space in a trueDOP can be easily avoided by using the nDOM, in which only those pixels are considered which lie above a minimum height of e.g. 2.30 m (see Fig. 9). In the street space, only bridges would then still be classified as buildings; these are also of importance as structures. At the same time, the use of an nDOM accelerates the classification of roofs in the trueDOP to a very considerable degree, because only those image sections which are above the selected minimum height in the nDOM are considered for classification. This means that a significantly smaller part of the original image is subjected to classification and the scope of classification is reduced accordingly. The use of data in binary format is essential for the processing routines of large amounts of data. For example, LAStools (rapidlasso GmbH) are available to the LDBV. LAStools are suitable for extremely high-performance processing of point clouds. For large parts of Bavaria, the nDOM can be calculated very efficiently with them. After calculating the nDOM with LAStools, other software must be used for the image interpretation.
To accelerate the classification of the trueDOP, the nDOM is placed as a mask over the trueDOP. The trueDOP is then cut accordingly, for example with the FME software (Safe Software Inc.), and only the pixels of the trueDOP superimposed by the nDOM are considered for further processing.
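The nDOM mask and its application to the trueDOP can be sketched on co-registered rasters. The production workflow uses LAStools and FME; the NumPy functions below, their names and the choice of setting masked-out pixels to zero are assumptions for illustration.

```python
import numpy as np

def ndom_mask(bdom, dtm, min_height=2.30):
    """Return True where the surface lies more than min_height above terrain.

    bdom, dtm: (H, W) float arrays of heights in metres on the same grid.
    """
    ndom = bdom - dtm            # normalized digital surface model
    return ndom > min_height

def apply_mask(image, mask):
    """Keep only the image pixels covered by the mask; set the rest to 0."""
    masked = image.copy()
    masked[~mask] = 0
    return masked
```

Only the pixels where the mask is True then need to be classified, which removes the road space and most ground-level objects from the problem.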

Transfer to a Larger Test Area
To generalize the previous tests, the RGB colour cube is recalculated for all roofs in an extended test area. The test area covers 21,000 km², which corresponds to approximately a third of the Bavarian state area. For further investigations, three reference classes are calculated: the first two for the classification of the roofs and the third for the determination of the vegetation. In the following, the calculation of the comparison classes is explained and the achieved classification results are presented.

Threshold Value for Building Classification Using the RGB Reference Colour Cube for a Third of Bavaria
To classify the roofs, the reference class "roofs" is determined from the superposition of the DFK with the trueDOP. In addition, a comparative class "roofs with a reduced margin", in which the ground plans are reduced by a margin of 3 m width, is calculated to minimize the influence of overhanging trees (see Fig. 5).
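On a rasterized footprint mask, the reduction by a 3 m margin can be approximated by morphological erosion. The sketch below uses repeated 4-neighbour erosion, which shrinks by a city-block distance and is therefore only an approximation of a true 3 m inward buffer on the DFK polygons. The function names and the default resolution of 0.2 m are assumptions.

```python
import numpy as np

def erode(mask, pixels):
    """Shrink a boolean mask by `pixels` using repeated 4-neighbour erosion."""
    out = mask.copy()
    for _ in range(pixels):
        # Pad with False so that pixels at the image border are eroded too.
        p = np.pad(out, 1, constant_values=False)
        out = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
               & p[1:-1, :-2] & p[1:-1, 2:])
    return out

def inner_roof_mask(footprint_mask, margin_m=3.0, resolution_m=0.2):
    """Remove a margin of margin_m metres from rasterized building footprints."""
    return erode(footprint_mask, int(round(margin_m / resolution_m)))
```

At 20 cm ground resolution the 3 m margin corresponds to 15 pixels, so buildings narrower than about 6 m disappear completely from the reduced class.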
Due to the significantly larger data volume, the RGB colour cube, which is limited to 256 grey levels per colour channel, produces significantly higher frequencies for the colour values of the roofs compared to the previous investigations, which were limited to one test tile. The frequencies of identical RGB colour values for the 16 million (exactly 16,777,216) different RGB value combinations are now on average 350, minimum 1 and maximum 350,000.
If the classification for a test tile with the RGB colour cube calculated for a third of Bavaria and a threshold value of 250 is applied to the trueDOP covered by the nDOM mask, trees and bridges are still classified as roofs in the built-up area (see Fig. 10). A threshold value of 1000 reduces the misclassification of vegetation as roofs. With an even higher threshold value of 3000, building detection is also significantly reduced.
The pictures in Fig. 10 show that the determination of a suitable threshold value has a considerable influence on the quality of the classification and is one of the challenges in this classification procedure. A threshold value that is too high reduces the number of roofs identified; a threshold value that is too low leads to more misinterpretations.
The misclassification of trees as roofs is mainly due to the fact that the building roofs considered for determining the RGB colour cube in the trueDOP contain many RGB colour values of the vegetation, such as of trees that rise above the roofs. These vegetation components distort the reference for building roofs in the RGB colour cube, so that in the subsequent classification of the trueDOP covered by the nDOM mask, vegetation components are wrongly classified as building roofs. Even if an inner margin of 3 m is applied to the ground plans and only the RGB values of the "inner" roofs are used to determine the RGB colour cube, there are no significant differences (see Fig. 11). Especially in the shadow areas, the colour values of the vegetation and the dark roofs are similar.
Fig. 10 a nDOM mask with two new buildings in the upper right corner, whose buildings are not measured in the DFK and whose RGB values are not in the RGB colour cube. b Classification of the RGB values of the roofs in the trueDOP taking into account the nDOM mask with a threshold value of 250. New buildings are detected, but the vegetation is only slightly reduced. c At a threshold value of 1000, new buildings are not detected, but the misclassification of vegetation decreases significantly. d At a threshold value of 3000, many roofs are no longer detected

Indirect Building Classification Using an RGB Reference Colour Cube for Vegetation
To avoid misinterpretations, a conceptual change of the classification procedure takes place, in which the RGB values for the vegetation are determined instead of those for the building roofs. However, the creation of the RGB reference colour cube is then much more computationally intensive, since the aerial photographs contain considerably more vegetation areas than roof areas. The nDOM mask of Fig. 10 continues to serve as the basis for calculation. The building ground plans of the DFK are cut out of the nDOM mask. Because the roof overhangs were not measured by cadastral survey, the outlines of the buildings were extended outwards by a margin of 3 m. Thus, it is almost guaranteed that hardly any roof pixels are still contained in the database.
In Fig. 12b only the vegetation is obtained as a result, with the exception of a few new buildings which still remain in the clipped nDOM mask. It is not to be expected that many unmeasured new buildings will appear over a large data set, which would significantly distort the reference of the RGB colour cube for the vegetation. However, due to the selected height threshold of 2.30 m, the bridges in the nDOM mask remain when determining the reference class for the vegetation and are included in the reference of the RGB colour cube as a source of error. The average number of identical RGB colour values for the vegetation class of the 2.7 million different RGB values is 13,453. The maximum value is 2.3 million. This occurs in the shadow area with almost black colour.
Once the reference of the RGB colour cube for the vegetation has been calculated, the entire trueDOP is classified according to a predetermined threshold value. Then all vegetation pixels lying on the nDOM mask are subtracted. The remaining pixels of the nDOM mask represent the searched roofs (see Fig. 13). As a generalization, by subtracting the vegetation from the nDOM, all those objects are obtained that have the same colour as building roofs (Figs. 14, 15).
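The indirect classification can be sketched as follows: pixels whose colour is frequent in the vegetation cube are marked as vegetation and removed from the nDOM mask, and what remains are the roof candidates. The function name `roofs_by_vegetation_removal` and the "greater than or equal" comparison are assumptions for illustration.

```python
import numpy as np

def roofs_by_vegetation_removal(image8, ndom_mask, veg_cube, threshold):
    """Subtract classified vegetation from the nDOM mask.

    image8:    (H, W, 3) uint8 trueDOP tile.
    ndom_mask: (H, W) bool array, True above the minimum height.
    veg_cube:  (256, 256, 256) vegetation frequency cube indexed [R, G, B].
    Returns an (H, W) bool array of remaining roof candidates.
    """
    r = image8[..., 0].astype(np.intp)
    g = image8[..., 1].astype(np.intp)
    b = image8[..., 2].astype(np.intp)
    is_vegetation = veg_cube[r, g, b] >= threshold
    # Keep elevated pixels that do not look like vegetation.
    return ndom_mask & ~is_vegetation
```

Because the roofs themselves are never matched against a roof reference, new buildings with roof colours that were not present in the training data can still remain in the result.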
Roofs that are completely covered by trees, for example of garages, cannot be identified as roofs by the removal of vegetation in the nDOM mask. Elsewhere, however, part of the actual vegetation is filtered out of the nDOM mask. In addition, the lower the threshold value selected, the larger the range that is classified in the trueDOP, so that a correspondingly larger amount of the actually existing vegetation can be filtered out of the nDOM mask. Conversely, the higher the threshold value, the smaller the changes to the nDOM mask. To further optimize the vegetation classification, very bright values can be subtracted from the vegetation colour cube. White, almost white and grey colour values often originate from modifications of the Earth's surface such as gravel pits and the like. Bridges with their grey values are also largely contained in the nDOM. These colour values occur very frequently and would be detected as vegetation points and thus removed from the mask. This must be prevented in order to find white and grey roof surfaces.
Technically, this is ensured by excluding from the vegetation points not only the white pixels with the RGB values (255, 255, 255), but also all pixels whose sum is R + G + B > 700. The disadvantage of this optimization is that gravel pits remain in the mask. However, they can easily be recognized as misinterpretations.
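The brightness criterion can be applied directly to the vegetation cube by clearing all entries with R + G + B > 700. The cut-off of 700 is taken from the text; the function name `drop_bright_values` and the decision to set the affected frequencies to zero are assumptions for illustration.

```python
import numpy as np

def drop_bright_values(veg_cube, max_sum=700):
    """Return a copy of the cube with all entries R + G + B > max_sum set to 0.

    veg_cube: (N, N, N) frequency cube indexed [R, G, B], N = 256 for 8 bits.
    """
    levels = np.arange(veg_cube.shape[0], dtype=np.int32)
    # Broadcasting builds the channel sum for every cell of the cube.
    channel_sum = (levels[:, None, None] + levels[None, :, None]
                   + levels[None, None, :])
    out = veg_cube.copy()
    out[channel_sum > max_sum] = 0
    return out
```

After this step, white and light grey pixels are no longer matched as vegetation, so bright roofs stay in the nDOM mask, as do gravel pits and bridges.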

Conclusion and Outlook
The investigations with the extended RGB colour cubes for a third of Bavaria show the following results:
• The classification of the roofs using the two RGB roof colour cubes, with and without margin, results in
- short calculation times when creating the reference classes, since the evaluation only takes place within the DFK floor plans;
- shorter calculation times, if a classification only takes place on the nDOM mask;
- no significant differences in the recognition of building roofs in the evaluation with and without margin;
- no significant reduction of vegetation in the nDOM mask; and
- holes in the roofs, caused by an incomplete classification of the roofs in the trueDOP.
• With the indirect classification of the roofs via the RGB vegetation colour cube, on the other hand, the following results are obtained:
- very high calculation times when creating the reference classes;
- shorter calculation times, if a classification only takes place on the nDOM mask;
- a significantly improved recognition rate of the roofs of new buildings;
- a reduction of vegetation with a suitable threshold value; and
- no holes in the roofs of the nDOM mask.
• Potential to avoid misinterpretations exists if the bridges in the nDOM mask could be removed. For this purpose, the bridge objects from the digital landscape model of the Authoritative Topographic-Cartographic Information System (ATKIS) would have to be available geometrically exact as polygons.
• What remains unresolved is the heuristic definition of threshold values, which cannot be standardized and which probably has to be determined individually for each photo flight project.
The presented classification procedure is transferable to all federal states and can be used nationwide as soon as the federal states have processed the trueDOP. In Bavaria, the classified building roofs of the new buildings are transmitted to ten Agencies for Digitisation, High-Speed Internet and Surveying in order to identify buildings that have not yet been surveyed for the cadastre; they are then used to update the real estate cadastre and thus also the LoD2 building models. Further investigations using AI methods will follow to develop and optimize the results and processes further.
Acknowledgements Open Access funding provided by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Fig. 15 a Found vegetation points without optimization; b found vegetation points without white values (middle); and c found vegetation values whose sum is R + G + B < 700 (right)