A Concept of Quality Management of 3D City Models Supporting Application-Specific Requirements

In this paper, a novel approach to specify application-specific requirements for 3D City Models is proposed. A modular set of geometric and semantic requirements that are based on the OGC CityGML Quality Interoperability Experiment (Coors and Wagner in Fernerkundung und Geoinformation eV 24:288–295, 2015) has been specified. Depending on the purpose of the model, not all requirements are mandatory. For example, if the model is used for visualization only, solid geometry is not required. However, if the same model is to be used for analytic purposes such as heating demand simulation, solid geometry is mandatory. A formal definition of a validation plan is proposed in this paper to specify the application-specific set of requirements. This gives city model manufacturers the possibility to prove that their model is usable in certain applications and to certify a certain level of quality. The concept is evaluated with the definition of a validation plan for heating demand simulation. It has been successfully implemented using the software tools CityDoctor and SimStadt.

Besides industry and transportation, heating and cooling of buildings is one of the main sources of CO2 emissions. Reducing this heating and cooling demand will have a significant impact on climate protection worldwide.
To achieve this aim and to develop integrated energy concepts in urban districts, it is necessary to have an insight into the energetic performance of the areas of interest. Forecasting the future energy demand for heating and cooling of buildings at district level and beyond is essential for the development of climate protection strategies for municipalities worldwide. This requires methods to simulate the impact of future developments such as refurbishment of buildings, as well as reliable data on the existing building stock.
The availability of 3D building models has increased tremendously. Most of the models are available in CityGML (Kolbe 2009; Gröger and Plümer 2012). Some urban simulation tools such as SimStadt (Nouvel et al. 2015) and CitySim (Robinson et al. 2009) support CityGML as an input format for heating and cooling demand simulation. The simulation results strongly depend on the quality of the input data. As an example, small errors in the building geometry can have a big impact on the calculated volume of a building (Biljecki et al. 2018).
To illustrate this, a simple experiment has been set up. A building with a rectangular footprint (3 m × 5 m) and a saddle roof with 3 m eaves height and 4.5 m ridge height is modelled in CityGML with LoD 2 solid geometry. Each polygon of the building geometry is defined by a sequence of points in counterclockwise order. An FME workbench is created to read the model and calculate the volume using the transformer VolumeCalculator (Fig. 1). The resulting volume is 56.25 m³, which is correct. An error is then introduced into the model: the orientation of one polygon is changed by defining it with a sequence of points in clockwise order. The geometry is still the same, as no coordinates have been modified. Nevertheless, calculating the volume of this model yields 18.75 m³. As the heating and cooling demand of a building depends on the volume, this will lead to wrong simulation results.
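The effect can be reproduced with a minimal sketch that sums signed tetrahedron volumes over all polygons. This is an illustration only: the coordinates, the polygon names and the choice of the origin at a footprint corner are assumptions, and nothing here claims to mirror how the FME VolumeCalculator works internally.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
Polygon = List[Point]


def cross(a: Point, b: Point) -> Point:
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a: Point, b: Point) -> float:
    """Scalar product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def signed_volume(polygons: List[Polygon]) -> float:
    """Sum the signed volumes of the tetrahedra spanned by the origin and a
    fan triangulation of each polygon (polygons must be planar and convex)."""
    six_v = 0.0
    for poly in polygons:
        for i in range(1, len(poly) - 1):
            six_v += dot(poly[0], cross(poly[i], poly[i + 1]))
    return six_v / 6.0


def saddle_roof_building() -> Dict[str, Polygon]:
    """3 m x 5 m footprint, 3 m eaves, 4.5 m ridge. Every polygon is listed
    counterclockwise when seen from outside (closing point not repeated).
    Coordinates and names are hypothetical."""
    return {
        "ground": [(0, 0, 0), (0, 5, 0), (3, 5, 0), (3, 0, 0)],
        "wall_x0": [(0, 0, 0), (0, 0, 3), (0, 5, 3), (0, 5, 0)],
        "wall_x3": [(3, 0, 0), (3, 5, 0), (3, 5, 3), (3, 0, 3)],
        "gable_y0": [(0, 0, 0), (3, 0, 0), (3, 0, 3), (1.5, 0, 4.5), (0, 0, 3)],
        "gable_y5": [(0, 5, 0), (0, 5, 3), (1.5, 5, 4.5), (3, 5, 3), (3, 5, 0)],
        "roof_left": [(0, 0, 3), (1.5, 0, 4.5), (1.5, 5, 4.5), (0, 5, 3)],
        "roof_right": [(3, 0, 3), (3, 5, 3), (1.5, 5, 4.5), (1.5, 0, 4.5)],
    }


correct = saddle_roof_building()
# Introduce the error: one gable is now given in clockwise order.
flipped = dict(correct, gable_y5=list(reversed(correct["gable_y5"])))

volume_correct = signed_volume(list(correct.values()))
volume_flipped = signed_volume(list(flipped.values()))
print(volume_correct, volume_flipped)  # 56.25 18.75
```

With a consistently oriented closed surface the result does not depend on the origin. Once one polygon is flipped, the result depends on where the origin lies and which polygon is affected, so other setups would give other wrong values.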
In this paper, a general methodology to define application-specific requirements for a 3D City Model is proposed. This methodology is independent of a specific software solution, but an implementation is of course needed to validate existing models. In this paper, the software CityDoctor is used for this purpose.
The paper is organized as follows: Sect. 2 will give a brief summary of the state of the art in quality management of 3D City Models. In Sect. 3, an overview of a monthly energy balance to calculate the heating and cooling demand of a building is given. A general methodology to validate 3D City Models is introduced in Sect. 4, with a focus on geometry validation in Sect. 5. This methodology is applied in Sect. 6 to validate if a 3D City Model is suitable for heating and cooling demand simulation. Section 7 shows an implementation of this validation process in a use case in the city district "Stadtgärtnerei" in Mainz, Germany. The paper concludes with discussion of the proposed methodology and the achieved results in Sect. 8.

State of the Art
Fig. 1 Data quality matters: changing just the orientation of one polygon in the building geometry reduces the calculated volume from 56.25 m³ (correct) to 18.75 m³ using the FME volume calculator

In 2015, Biljecki et al. summarized applications that make use of 3D City Models, ranging from interactive visualization, urban planning, and shadow and viewshed analysis to urban analytics and simulation (Biljecki et al. 2015). These applications have very different requirements for the input data. For interactive visualization, it is sufficient to represent a building geometry by a set of non-overlapping polygons, with no further constraints. In contrast, urban analytics and simulation usually include the calculation of building volumes. In this case, a solid geometry of the building is mandatory. These different requirements have to be taken into account in quality management of 3D City Models. As CityGML is an XML format, any CityGML document can be validated against the XSD schema. However, this neither includes any validation of the geometry nor takes application-specific requirements into account. Ledoux (2013) has proposed a methodology to validate solid geometry. Wagner et al. (2013a, b) take into account not only geometry, but also include some semantics such as BoundarySurface in the validation process. Both approaches have laid the foundations for the OGC CityGML Quality Interoperability Experiment (QIE) to define a unified method for the validation of 3D City Models (Coors and Wagner 2015). The result of this activity was the specification of a set of validation rules that can be used to validate CityGML models and conformance requirements as defined in the CityGML standard. However, application-specific requirements are not taken into account. In 2016, Biljecki et al. surveyed the quality of existing CityGML models (Biljecki et al. 2016). However, the purpose of the model was not taken into account in this study.
This is fundamental: for example, a building geometry that consists only of a MultiSurface geometry made of triangles is valid according to the CityGML standard, because the standard requires a MultiSurface or a solid geometry in all levels of detail.
The Working Committee of the Surveying Authorities of the Laender of the Federal Republic of Germany (AdV) defined a CityGML profile for a nation-wide CityGML building model in 2016 (Landesamt 2019). This profile defines some restrictions, such as the rule that a building has to have a solid geometry, and requires some mandatory attributes such as building function. Based on the results of the CityGML Quality Interoperability Experiment, the AdV published a validation plan for its profile in 2017 to enable quality management of a nation-wide CityGML 3D building model in LoD 1 and LoD 2.
To summarize, a lot of work has been done to validate solid geometry. However, a systematic approach to take application-specific requirements into account in the validation process is not yet in practice. On the other hand, many existing models are suited for visualization, but not necessarily for urban analytics and simulation applications, as these usually require a valid solid geometry.

Balance Equations
By applying the first law of thermodynamics to a given building (see Fig. 2), we have:

Solar gain + internal gains + heating − ventilation losses − conduction losses = change in the internal energy of the building. (1)

Assuming a constant temperature inside the building (thanks to an idealized heating system), we get:

Heating = ventilation losses + conduction losses − solar gain − internal gains

Building Simulation in SimStadt
SimStadt is an urban energy simulation tool (Nouvel et al. 2015). Several workflows are available, including a Monthly Energy Balance simulation based on DIN V 18599 (Din 2007).
The geometry is imported and then analyzed to determine building type, volume, external area and shared walls area. Additional attributes are required for the simulation, e.g. building function and year of construction. A coordinate reference system also needs to be defined in order for weather and irradiance calculations to be possible.

Influence of Geometry on the Energy Balance
The building geometry has an influence on each of the terms included in Eq. (1). As an example:
• Polygon orientation must be correct to calculate solar gain.
• Building volume is used to estimate the internal area, which impacts internal gains and specific heat demand.
• The total area of exterior surfaces is used to estimate ventilation and conduction losses.
Trying to calculate the volume of buildings with a faulty geometry, as described in the introduction, can lead to wrong simulation results.
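As a small worked example of the balance rearranged for the heating term, the following sketch evaluates it for one month. The function name and all numeric values are hypothetical and only illustrate the arithmetic; they are not taken from SimStadt or DIN V 18599.

```python
def heating_demand(ventilation_losses: float, conduction_losses: float,
                   solar_gain: float, internal_gains: float) -> float:
    """Heating term of the balance for a constant indoor temperature:
    heating = ventilation losses + conduction losses - solar gain - internal gains.
    All arguments share one unit, e.g. kWh per month."""
    return ventilation_losses + conduction_losses - solar_gain - internal_gains


# Hypothetical monthly values in kWh for a single building.
january = heating_demand(ventilation_losses=300.0, conduction_losses=900.0,
                         solar_gain=250.0, internal_gains=150.0)
print(january)  # 800.0
```

Every term on the right-hand side is derived from the geometry, so a wrong volume or surface area propagates directly into the heating term.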

Methodology
To validate whether a city model, or more precisely an XSD-valid CityGML document, fulfills the requirements of a specific application, the following approach is proposed in this paper. First of all, a formal system to specify such requirements has been developed. In addition, an algorithm is needed to check whether a CityGML document fulfills a requirement or not. A validation software implements these algorithms.
To ensure interoperability, the specification of the validation plan as well as the requirements have to be agreed upon. Based on the CityGML QIE, a modular set of requirements is proposed. For a specific application, a subset of these requirements is chosen to define a validation plan. This approach will be evaluated by a validation plan for heating demand simulation. The results of the model validation include a reference to the validation plan to report what has been validated, and the validation results of any CityObject. The entire process from the CityGML document to the simulation results is shown in Fig. 3. Please note that the improvement of the 3D City Model usually requires some semi-automatic iterations. The entire process has been evaluated with a use case in the City of Mainz. To calculate the heating energy demand, the software SimStadt has been used. Data were provided by the City of Mainz. The focus of this study is the validation process, not the simulation itself.

CityGML Validation
As CityGML is based on XML, each CityGML file is an XML document.

Definition 1 A CityGML document is schema-conformant if it is validated against the CityGML XML Schema Definition and no errors are found.
However, an XSD valid CityGML file is not always suited for simulation purposes, as CityGML itself allows many different options to model a building (Coors and Wagner 2015). Several additional requirements have to be fulfilled for this purpose.
Definition 2 A requirement r is a verifiable criterion that says something about the content of a CityGML document or the data described therein.
As an example, a building has to have a valid solid geometry and the attributes yearOfConstruction and usage are mandatory in heating demand simulation. Even if these attributes and the building geometry are missing, the CityGML document is XSD valid, as all these elements are optional in the standard.
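Definition 3 Let $D$ be the set of valid CityGML documents and $d \in D$. A check $c_r : D \to \text{Boolean}$ is a function to validate a given CityGML document against the requirement $r$:

$$c_r(d) = \begin{cases} \text{true} & \text{if } d \text{ fulfills } r \\ \text{false} & \text{otherwise} \end{cases}$$

If $c_r$ returns false, a specific error code including additional parameters can be stored.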
Fig. 3 Iterative process to validate and repair a 3D City Model against an application-specific validation plan. Both the validated (and repaired) model and the validation plan are used as input in SimStadt to calculate monthly heating energy demand

Definition 4 A validation plan is a set of requirements $R = \{r_0, r_1, \dots, r_n\}$ together with a set of checks $C = \{c_{r_0}, c_{r_1}, \dots, c_{r_n}\}$ that shall be used to validate these requirements.
It is possible that some requirements are necessary in every validation plan for every application. To be as general as possible, no set of requirements is defined that must be applied to all city models. If there is such a set, those requirements are simply included in every validation plan.
Definition 5 A validation software is an implementation of algorithms to perform the checks of the validation plan.
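Definitions 2 to 5 map naturally onto a small data structure. The sketch below is one possible reading, not the CityDoctor implementation: the class names, the toy document format (a list of dictionaries) and the requirement ids `REQ-YEAR` and `REQ-FUNCTION` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Stand-in for a parsed CityGML document: a list of building records (assumption).
Document = List[dict]
Check = Callable[[Document], bool]


@dataclass(frozen=True)
class Requirement:
    """A verifiable criterion r (Definition 2)."""
    id: str
    description: str


@dataclass
class ValidationPlan:
    """A set of requirements with one check c_r per requirement (Definition 4)."""
    requirements: List[Requirement]
    checks: Dict[str, Check]

    def validate(self, document: Document) -> Dict[str, bool]:
        """Run every check and report the result per requirement id."""
        return {r.id: self.checks[r.id](document) for r in self.requirements}


def has_year_of_construction(document: Document) -> bool:
    """Check c_r: every building carries a yearOfConstruction value."""
    return all("yearOfConstruction" in b for b in document)


def has_function(document: Document) -> bool:
    """Check c_r: every building carries a function value."""
    return all("function" in b for b in document)


plan = ValidationPlan(
    requirements=[
        Requirement("REQ-YEAR", "yearOfConstruction is given for each building"),
        Requirement("REQ-FUNCTION", "function is given for each building"),
    ],
    checks={"REQ-YEAR": has_year_of_construction, "REQ-FUNCTION": has_function},
)

# Hypothetical model: building B2 lacks the function attribute.
model = [
    {"id": "B1", "yearOfConstruction": 1975, "function": "1000"},
    {"id": "B2", "yearOfConstruction": 1990},
]
report = plan.validate(model)
print(report)  # {'REQ-YEAR': True, 'REQ-FUNCTION': False}
```

Keeping requirements and checks separate means the same requirement id can be backed by different validation software, which is the point of agreeing on the requirements first.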
The requirements and the related checks have to be defined and agreed upon by data suppliers, data producers and data consumers in order to develop data sets that can be used for multiple purposes. The aim of the OGC CityGML QIE (Coors and Wagner 2015) is to come up with such definitions. In the following section, a validation plan is proposed for a 3D building model that is used to calculate the heating demand of a set of buildings with a monthly energy balance in the simulation software SimStadt. The validation plan is based on the CityGML QIE and can also be used for input data to similar simulation software such as CitySim.

Validation Plan for Heating Demand Simulation Using CityGML Building Models
The requirements of a CityGML building model for heating demand simulation using a monthly energy balance method can be summarized as follows, depending on the level of detail of the building model:
In case of LoD 1:
• the CityGML document has to be XSD valid
• element yearOfConstruction is mandatory for each building
• element function is mandatory for each building
• each building and building part has to have a valid lod1Solid geometry
Remark 1 If a building part has no element yearOfConstruction or function, the value from the parent building shall be used in SimStadt.
In case of LoD 2:
• the CityGML document has to be XSD valid
• element yearOfConstruction is mandatory for each building
• element function is mandatory for each building
• each building and building part has to have a valid lod2Solid geometry
• each valid building and building part has to have valid Roof-, Wall-, Ground-, OuterFloor-, OuterCeilingSurfaces with an unambiguous azimuth and tilt
Remark 2 According to the CityGML standard, boundary surfaces such as Wall-, Roof-, and GroundSurface shall not be used in LoD 1. In SimStadt, they are required, but can be automatically derived from a valid solid geometry.
Some of these requirements can be expressed in a formal language such as XQuery or Schematron (Wagner et al. 2014). In Coors and Wagner (2015), Schematron is proposed to validate CityGML conformance requirements. This approach is used to formalize the above-mentioned requirements as well. Each requirement will be identified by a given id and a related error code if the requirement is not fulfilled.
The requirement that the element yearOfConstruction is given per building in the CityGML document can be expressed in Schematron as follows:
[...] <value-of select="@id"/> </assert> </rule>
Similar rules can be defined for other mandatory elements such as function and lod1Solid. For simplicity, the name of the mandatory element is a parameter of this requirement. It is named SE-bldg:BU-0001 following the naming conventions of Coors and Wagner (2015). The same requirement for building parts is called SE-bldg:BP-0001.
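The same mandatory-element requirement can be sketched outside Schematron. The following example uses Python's standard XML parser and assumes CityGML 2.0 namespaces; the function name, the sample document and the building ids `B1` and `B2` are hypothetical, and the sketch is not the rule used in the QIE or in CityDoctor.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

# Namespace URIs of CityGML 2.0 and GML 3.1.1 (assumed version of the input).
NS = {
    "bldg": "http://www.opengis.net/citygml/building/2.0",
    "gml": "http://www.opengis.net/gml",
}
GML_ID = "{%s}id" % NS["gml"]

SAMPLE = """<CityModel xmlns="http://www.opengis.net/citygml/2.0"
    xmlns:bldg="http://www.opengis.net/citygml/building/2.0"
    xmlns:gml="http://www.opengis.net/gml">
  <cityObjectMember>
    <bldg:Building gml:id="B1">
      <bldg:yearOfConstruction>1975</bldg:yearOfConstruction>
    </bldg:Building>
  </cityObjectMember>
  <cityObjectMember>
    <bldg:Building gml:id="B2"/>
  </cityObjectMember>
</CityModel>"""


def buildings_missing_element(citygml_text: str, element: str) -> List[Optional[str]]:
    """Return the gml:id of every bldg:Building without the given direct
    child element; the element name is the parameter of the requirement."""
    root = ET.fromstring(citygml_text)
    missing = []
    for building in root.iter("{%s}Building" % NS["bldg"]):
        if building.find("bldg:" + element, NS) is None:
            missing.append(building.get(GML_ID))
    return missing


print(buildings_missing_element(SAMPLE, "yearOfConstruction"))  # ['B2']
print(buildings_missing_element(SAMPLE, "function"))            # ['B1', 'B2']
```

An empty result list means the requirement is fulfilled; each returned id would be reported together with the error code of the requirement.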
However, geometry validation cannot be expressed as a Schematron statement, as several geometric constraints have to be fulfilled to prove that a collection of polygons is a valid solid. A solid is defined in ISO 19107 (ISO 2003).

Ring Checks
Definition 6 An ordered set or sequence is an ordered list of elements. Unlike a set, order matters, and the exact same elements can appear multiple times at different positions in the sequence. A finite sequence $a$ with $n + 1$ elements is denoted as $a = (a_0, a_1, \dots, a_n)$. The empty sequence $a = ()$ has no elements.
Definition 7 A finite sequence of points $R = (P_0, P_1, \dots, P_n)$ is a valid linear ring if
I $R$ has at least four points: $n \ge 3$.
II All points of the sequence besides the first and last point are different: $P_i \neq P_k$, $i = 0 \dots n-1$, $k = 0 \dots n-1$, $i \neq k$.
III The first point $P_0$ and the last point $P_n$ are the same: $P_0 = P_n$.
IV Two edges $(P_i, P_{i+1})$ and $(P_k, P_{k+1})$, $i = 0 \dots n-1$, $k = 0 \dots n-1$, $i \neq k$, only intersect in one start-/endpoint. No other intersection is allowed.
If all points of the sequence are co-planar, the linear ring is planar. Planarity of a linear ring is not required, but is defined at polygon level.
Usually, the coordinates of a point $P \in R$ are given as floating point numbers. A parameter $\varepsilon > 0$ and a norm have to be introduced to define equality of two points. The default norm is the $\ell_2$ norm.
Definition 8 Two points $P$ and $Q$ are the same if $\|P - Q\|_2 < \varepsilon$. $\varepsilon > 0$ is called the just notable difference (JND) of two points.

Remark 3 The JND of two points is not the same as the precision of the point location. For example, the precision of a measured point location can be 10 cm, but two different points might have a distance of 1 cm. These points are still two different points, even though there is some uncertainty in the location of the points.
The impact of such a definition is illustrated by the following example of a real-world CityGML building model. The building model with gml:id=DENW22AL10000c8S of the CityGML document LoD1_362_5700_1_NW.gml contains a very small polygon in the LoD 1 geometry. In two places, two consecutive points of that polygon differ by just 1 mm in the x-coordinate, while the y- and z-values are exactly the same. If the model is validated against the linear ring requirements with a minimal point distance $\varepsilon = 0.0005$, the geometry is valid. With a minimal point distance of $\varepsilon = 0.0011$, the linear ring is no longer valid. Two GE_R_CONSECUTIVE_POINTS_SAME errors will be thrown during validation. This will also lead to GE_R_SELF_INTERSECTION and GE_R_COLLAPSED_TO_LINE errors, as the polygon degenerates to a line in this case (Fig. 4).
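The dependence on the JND can be sketched with a few lines of code. The ring below is a hypothetical sliver polygon with the same characteristic as the example (two pairs of consecutive points that are 1 mm apart in x); it is not the geometry of the building cited above, and the function name is an assumption.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def consecutive_points_same(ring: Sequence[Point], jnd: float) -> List[int]:
    """Return every index i for which P_i and P_(i+1) are 'the same',
    i.e. their Euclidean distance is smaller than the JND."""
    return [i for i in range(len(ring) - 1)
            if math.dist(ring[i], ring[i + 1]) < jnd]


# Hypothetical closed ring (first point repeated at the end), units in metres.
sliver = [
    (0.0, 0.0, 0.0),
    (0.001, 0.0, 0.0),   # 1 mm away from the previous point
    (0.001, 5.0, 0.0),
    (0.0, 5.0, 0.0),     # 1 mm away from the previous point
    (0.0, 0.0, 0.0),
]

print(consecutive_points_same(sliver, jnd=0.0005))  # []
print(consecutive_points_same(sliver, jnd=0.0011))  # [0, 2]
```

With the larger JND, both short edges are reported. If the affected points were merged, only two distinct points would remain, which is why follow-up errors such as a collapse to a line appear.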
As the AdV specifies the use of three digits after the decimal separator in its CityGML profile for LoD 1 and LoD 2 building geometry (Landesamt 2019), $\varepsilon = 0.0005$ is a good threshold value in this case and the model is valid. The JND parameter is essential for the validation process and should be added to the metadata of the CityGML document.

Requirements and error codes for linear rings:
Requirement ID | Requirement | Error code
… | Two consecutive points shall not be the same | GE_R_CONSECUTIVE_POINTS_SAME
GE-gml:LR-0003 | First and last points have to be the same | GE_R_NOT_CLOSED
GE-gml:… | No self-intersection | GE_R_SELF_INTERSECTION
GE-gml:LR-0005 | Linear ring shall enclose a non-empty area | GE_R_COLLAPSED_TO_LINE

Polygon Checks
Definition 9 A set of planar linear rings $S = \{R_0, R_1, \dots, R_n\}$, $S \neq \emptyset$, is a valid polygon if
I The exterior linear ring $R_0$ and all interior linear rings $R_1, \dots, R_n$ are co-planar.
II The interior linear rings are completely included in the area defined by the exterior linear ring. Interior linear rings must not overlap or be included in another interior linear ring.
III Interior linear rings and the exterior linear ring touch each other only in a finite number of points.
IV The interior of the polygon, defined as the interior of the exterior ring excluding the interior of the interior rings, is connected.
V The order of points of the exterior linear ring defines the orientation of the polygon. The interior linear rings have to have the opposite orientation.
The interior linear rings define holes in the polygon. Table 2 gives an overview of the related requirements for polygons. Requirement GE-gml:PO-0002 ensures that (I) is fulfilled. Planarity of a polygon (within a given tolerance) can be defined using the distance of all points to a regression plane or by the deviation of the normal vectors. For the deviation algorithm, the polygon needs to be tessellated. Each of the resulting triangles has a normal vector $n$, resulting in a set of normal vectors for the polygon $N = \{n_0, n_1, \dots, n_m\}$, $N \neq \emptyset$. The polygon is planar if the normal vectors deviate from each other by less than a given threshold $\varepsilon$, i.e. for unit normals the angle between any two of them satisfies $\forall n_i, n_k \in N: \arccos \langle n_i, n_k \rangle < \varepsilon$. GE-gml:PO-0001, GE-gml:PO-0004 and GE-gml:PO-0005 correspond to (II) and (III), GE-gml:PO-0003 to (IV) and GE-gml:PO-0006 to (V). If exterior and interior rings have the same orientation, the error GE_P_ORIENTATION_RINGS_SAME is reported.
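A minimal sketch of a planarity check by normal deviation is given below. It assumes a convex polygon given without the repeated closing point, a simple fan triangulation as tessellation, and a threshold expressed as the maximum angle between two triangle normals in degrees. These choices are assumptions for illustration, not the CityDoctor algorithm.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triangle_normals(polygon: Sequence[Vec]) -> List[Vec]:
    """Unit normals of the fan triangles (P0, Pi, Pi+1); degenerate
    triangles with zero area are skipped."""
    normals = []
    for i in range(1, len(polygon) - 1):
        n = _cross(_sub(polygon[i], polygon[0]), _sub(polygon[i + 1], polygon[0]))
        length = math.sqrt(sum(c * c for c in n))
        if length > 0.0:
            normals.append((n[0] / length, n[1] / length, n[2] / length))
    return normals


def is_planar(polygon: Sequence[Vec], max_angle_deg: float) -> bool:
    """True if the angle between every pair of triangle normals is
    smaller than max_angle_deg."""
    for n_i, n_k in combinations(triangle_normals(polygon), 2):
        cos_angle = sum(a * b for a, b in zip(n_i, n_k))
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        if math.degrees(math.acos(cos_angle)) >= max_angle_deg:
            return False
    return True


flat = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.5), (0.0, 1.0, 0.0)]
print(is_planar(flat, max_angle_deg=1.0))  # True
print(is_planar(bent, max_angle_deg=1.0))  # False
```

For concave polygons a fan triangulation can produce triangles with flipped normals, so a real implementation would need a proper tessellation before comparing normals.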

Solid Checks
Definition 10 The set $W = \{S_0, S_1, \dots, S_n\}$, $n \ge 4$, of polygons is a solid geometry if:
I The intersection of two polygons $S_a \in W$, defined as a set of planar linear rings $S_a = \{R^a_0, R^a_1, \dots, R^a_n\}$, $S_a \neq \emptyset$, and $S_b \in W$, defined as a set of planar linear rings, is either empty or contains only a set of points $Q \neq \emptyset$ and a set of edges $E \neq \emptyset$ that are part of both sets of linear rings.
II $e = (p_i, p_j)$ is an edge in $R_1 \in S_j$, $j = 0 \dots n$ $\Longrightarrow$ (the reversed edge $(p_j, p_i)$ is an edge in $R_2 \in S_k$, $k = 0 \dots n$, $k \neq j$ $\wedge$ $e = (p_i, p_j)$ is not an edge in any other polygon $R \in S_k$, $k = 0 \dots n$, $k \neq j$).
III All polygons $R \in S_j$, $j = 0 \dots n$, are oriented such that the normal vector of each polygon points to the outside of the solid.
IV The dual graph $G_W = (V_W, E_W)$ of $W$ is connected. It consists of a set of nodes $V_W$ and a set of edges $E_W$. Every node $v \in V_W$ represents one polygon $S \in W$. An edge $e = (q_i, q_j)$ shared by two polygons $S_a \in W$ and $S_b \in W$ is represented by an edge $e_{ab}$ in $E_W$.
V For every point $P$ of a linear ring $R$ of a polygon $S \in W$, the graph $G_P = (V_P, E_P)$ that is built by the polygons and edges that touch $P$ is connected. Each node $v \in V_P$ represents a polygon in which a linear ring contains $P$. Two nodes are connected by an edge $e \in E_P$ if the two polygons represented by the nodes have a common edge that touches $P$.
The statement of (I) formulates the intersection of the two polygons $S_a$ and $S_b$. The intersection results in an edge that is an existing edge in both of the polygons. Polygons should only ever "touch" along existing edges. The intersection must not be a new edge. The statement of (IV) formulates the requirement that the dual graph is connected.
From (I) and (II) it follows that the surface defined by $W$ has no holes. Together with conditions (IV) and (V), it follows that the interior of the solid is connected.
If all these requirements of a solid's geometry have been fulfilled, the solid is valid (Table 3).
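Condition (II) of the solid definition can be sketched as a count over directed edges. The example below works on vertex ids instead of coordinates and assumes polygons without holes. The mapping of the three situations to the error codes GE_S_NOT_CLOSED, GE_S_POLYGON_WRONG_ORIENTATION and GE_S_NON_MANIFOLD_EDGE is an assumption for illustration and may differ from how CityDoctor assigns them.

```python
from collections import Counter
from typing import List, Sequence


def check_solid_edges(polygons: Sequence[Sequence[int]]) -> List[str]:
    """Check that every directed edge (a, b) is used exactly once and that
    its reverse (b, a) is used exactly once by another polygon."""
    directed = Counter()
    for poly in polygons:
        ring = list(poly)
        # Pair every vertex with its successor, wrapping around at the end.
        for a, b in zip(ring, ring[1:] + ring[:1]):
            directed[(a, b)] += 1

    errors = set()
    for (a, b), count in directed.items():
        opposite = directed.get((b, a), 0)
        if count + opposite > 2:
            errors.add("GE_S_NON_MANIFOLD_EDGE")          # more than two polygons share the edge
        elif count + opposite == 1:
            errors.add("GE_S_NOT_CLOSED")                 # edge belongs to one polygon only
        elif opposite == 0:
            errors.add("GE_S_POLYGON_WRONG_ORIENTATION")  # used twice in the same direction
    return sorted(errors)


# Consistently oriented tetrahedron with vertex ids 0..3.
tetrahedron = [(0, 2, 1), (0, 1, 3), (1, 2, 3), (0, 3, 2)]

print(check_solid_edges(tetrahedron))                                 # []
print(check_solid_edges(tetrahedron[:3]))                             # ['GE_S_NOT_CLOSED']
print(check_solid_edges(tetrahedron[:2] + [(3, 2, 1)] + tetrahedron[3:]))
# ['GE_S_POLYGON_WRONG_ORIENTATION']
print(check_solid_edges(tetrahedron + [(0, 1, 2)]))                   # ['GE_S_NON_MANIFOLD_EDGE']
```

The count only establishes that the orientation is consistent across polygons. Whether all normals point outwards, as required by (III), needs an additional test such as the sign of the enclosed volume.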

Scalability
All geometric requirements have been implemented as checks in the CityDoctor software, and tests have been conducted to ensure that large amounts of data can be processed in a reasonable time frame.
For this test, a PC with an i5-2400 @ 3.1 GHz was used.
The file was a DLM model of Niedernhall with a file size of 3.3 GB. The batch process was completed in 4 min 50 s while utilizing one core and less than 2 GB of RAM, producing a PDF and an XML report as output files.

Order of Requirements
Requirements depend on each other. In general, polygon requirements depend on the ring requirements, and solid requirements depend on the polygon and ring requirements. Every polygon has to be a valid polygon before the solid requirements can be checked; otherwise, false positive or false negative errors may be reported, which only lead to confusion. For a detailed view of the dependencies, refer to the wiki of the CityDoctor Validation Plan for SimStadt, where all requirements and checks are listed and described. A dependency graph can be found in Coors and Betz (2019).
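Such an ordering can be sketched with a topological sort, skipping every check whose prerequisites did not pass. The three group names and the placeholder check results are hypothetical; the actual dependency graph is the one referenced above. The sketch uses `graphlib`, which is part of the Python standard library from version 3.9 on.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Optional, Set

# Hypothetical groups of checks and the groups they depend on.
DEPENDS_ON: Dict[str, Set[str]] = {
    "ring checks": set(),
    "polygon checks": {"ring checks"},
    "solid checks": {"ring checks", "polygon checks"},
}


def run_in_order(checks: Dict[str, Callable[[], bool]],
                 depends_on: Dict[str, Set[str]]) -> Dict[str, Optional[bool]]:
    """Run checks so that prerequisites come first. A check is skipped
    (result None) if any of its prerequisites failed or was skipped."""
    results: Dict[str, Optional[bool]] = {}
    for name in TopologicalSorter(depends_on).static_order():
        if all(results[dep] is True for dep in depends_on[name]):
            results[name] = checks[name]()
        else:
            results[name] = None
    return results


# Placeholder results: the rings are fine, one polygon is not planar.
outcome = run_in_order(
    {
        "ring checks": lambda: True,
        "polygon checks": lambda: False,
        "solid checks": lambda: True,
    },
    DEPENDS_ON,
)
print(outcome)
# {'ring checks': True, 'polygon checks': False, 'solid checks': None}
```

Reporting skipped checks as "not executed" rather than as passed keeps it visible that further errors may still be hidden behind the first failure.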

Storing the Validation Results as Metadata
The results of the performed checks and the validation plan itself need to be stored as metadata. Other applications can use the metadata to ensure the validity of their calculations. If errors are known, the result of a calculation may be wrong; if no error has been reported, one can assume that the calculation is correct. The metadata need to contain the validation plan to know which requirements were checked, the parameters used for the checks (e.g. point equality distance), the validation results, the date and the file or model that was validated. For this purpose, a reference between metadata and model is possible.

Requirement | Error code
All solid geometries have to be valid | One or many of the above-defined error codes
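The metadata items named in this section (validation plan, check parameters, results, date and validated file) could be collected in a record like the following sketch. All field names, the plan identifier and the file name are hypothetical; no metadata schema of CityDoctor or CityGML is implied.

```python
import json
from datetime import date
from typing import Dict, List


def build_validation_metadata(model_file: str, plan_id: str,
                              parameters: Dict[str, float],
                              errors: Dict[str, List[str]],
                              validated_on: date) -> dict:
    """Collect plan reference, parameters, per-object results, date and
    model reference in one JSON-serializable record."""
    return {
        "model": model_file,
        "validationPlan": plan_id,
        "parameters": parameters,
        "validatedOn": validated_on.isoformat(),
        "results": [
            {"cityObject": gml_id, "valid": not codes, "errors": codes}
            for gml_id, codes in sorted(errors.items())
        ],
    }


metadata = build_validation_metadata(
    model_file="example.gml",              # hypothetical file name
    plan_id="heating-demand-lod2",         # hypothetical plan identifier
    parameters={"jnd": 0.0005},
    errors={"B1": [], "B2": ["GE_S_NOT_CLOSED"]},
    validated_on=date(2020, 1, 1),
)
print(json.dumps(metadata, indent=2))
```

A consuming application can then decide per city object whether its calculation result can be trusted, based on the `valid` flag and the plan that was applied.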

Validation Plan for Monthly Energy Balance
Not every requirement is useful for every use case. Each use case must specify a set of requirements and parameters to ensure that it works correctly.
To simplify the definition of a validation plan, an additional requirement GE-gml:SO-0009 (valid solid geometry) is introduced to group all the necessary requirements.
The validation plan for building models with LoD 2 geometry is more complex as each building and building part has to have valid Roof-, Wall-, and GroundSurfaces with an unambiguous azimuth and tilt. A boundary surface has a lod2MultiSurface geometry with no further constraints.
Similar to the solid geometry, a grouping requirement GE-gml:MS-0001 is introduced to validate multisurface geometry. A MultiSurface geometry M is valid if all polygons S ∈ M are valid. To ensure that BoundarySurfaces such as Roof-, Wall-, and GroundSurface have an unambiguous azimuth and tilt, all polygons S ∈ M of the lod2MultiSurface geometry M of the BoundarySurface shall be coplanar within a given tolerance.
For the monthly energy balance in SimStadt the requirements for LoD 1 are listed in Tables 4 and 5. For LoD 2, more requirements are needed because of the necessary boundary surfaces which need to be referenced by the solid. Table 6 has those additional requirements listed. Additionally for both LoD 1 and LoD 2, a coordinate system needs to be defined in the CityGML file. This is necessary to retrieve weather data for the city, which is needed for the simulation.
The validation plans for SimStadt have already been implemented in CityDoctor to verify CityGML files for the usage in SimStadt.

Example Stadtgärtnerei Mainz
As a use case, the heating demand of the city district Stadtgärtnerei in the City of Mainz, Germany, is simulated using SimStadt based on a 3D model of the existing building stock. As input data, a CityGML file containing the LoD 2 geometry of the buildings was provided by the City of Mainz. The first step before the simulation can be performed is to validate the model according to the specified validation plan. CityDoctor v3.2.0 was used. As a result, the application found several hundred GE_R_CONSECUTIVE_POINTS_SAME errors. An overview of the initial error distribution can be seen in Fig. 5. These kinds of errors are not necessarily detrimental to the simulation process, but they do inhibit some algorithms in the calculation. If two points have the same or nearly the same coordinates, they create an edge of zero or nearly zero length. Such an edge can result in extreme values in some calculations. The checks are organized hierarchically, so each check depends on the result of the previous one. However, in some cases it is necessary to perform a check without having completed the previous checks. In this case, the error code GE_R_CONSECUTIVE_POINTS_SAME "hides" any further errors that would only be detected at a later stage. Consequently, those later checks are not executed and it is, therefore, unknown whether they would have found further errors (Fig. 6).
Fixing the model was an iterative process starting with the GE_R_CONSECUTIVE_POINTS_SAME errors. After removing those, further errors were found. Most of them were GE_S_NON_MANIFOLD_EDGE, GE_S_NOT_CLOSED and GE_S_MULTIPLE_CONNECTED_COMPONENTS errors. GE_S_POLYGON_WRONG_ORIENTATION errors were found more rarely.
The non-manifold edge errors were the result of polygons located inside the geometry. While the geometry looks correct from the outside, these superfluous polygons have an impact on the calculations, specifically the volume calculation. The volume calculation takes every available polygon into account, which consequently leads to wrong results.
Another error that has a major impact on the volume calculation is the wrong orientation of a polygon.
After the repair of the CityGML model, SimStadt was able to calculate the heat demand of the city quarter.

Conclusion
The workflow of validating the GML models before the simulation works very well as there is now a possibility to judge the validity of the simulation results. The validation plan will be used in the simulation software SimStadt in the workflow to automatically create an indicator on the uncertainty of the resulting values.
To make the validation of models more practicable, it is imperative that the creator of a CityGML file records the accuracy of the vertices in the file as a mandatory parameter. Otherwise, choosing a useful JND becomes guesswork, or the chosen JND may not be useful. As an example, if a file has a vertex accuracy of three decimal places and two different validation plans use JNDs of 0.0001 m and 0.00001 m, respectively, both yield the same result, as the input data do not have the accuracy needed to distinguish those JNDs.
The validation plans can also lead to a categorization of the 3D models. Not every CityGML file is suitable for every application. To check whether a file can be used or not, the validation plans and the validation software are applied. On the basis of our approach, manufacturers can specify that their model can be used for some applications, while they may provide a different model for others. This may lead to a situation in which each application requires its own, different model. However, this is not the intention of the concept.
Our investigation has rather revealed a hierarchy of categories where the lower categories are specializations of the higher ones. Consequently, a city model manufacturer has the choice of selecting a higher category with its usability for all applications or a lower category, which serves specific requirements.

Future Work
It is planned to extend CityDoctor with a repair feature that allows faulty buildings to be repaired in a (semi-)automatic process. This is not a simple task, as there are many ways to fix seemingly simple errors, with different results and repercussions.
A working group is implementing validation plans for further applications to ensure that the methodology provided in this paper is applicable to multiple applications.