Introduction

In urban planning, the seamless integration of Building Information Modeling (BIM) and Geographic Information Systems (GIS) remains a major challenge. Sustainable urban planning and management is important, and integration is necessary to ensure the sustainability of smart cities. However, the loss of geometric and semantic data when transforming from BIM to GIS is a significant barrier to achieving sustainable integration. The fact that BIM and GIS are defined by different data structures causes inconsistencies between these data and hinders interoperability between the two technologies. The motivation behind our work is to address this gap by proposing a novel solution that preserves semantic information and geometric integrity, thus facilitating more coherent urban planning and development strategies.

Rapid and unchecked urbanization is a major challenge facing both developed and developing countries (Madlener and Sunak 2011; Uduporuwo 2020). Migration from rural to urban areas has been the main driver of this uncontrolled urban growth (Tian et al. 2017). This increase due to migration complicates the processing of urban information and future planning efforts. In response, the “smart city” concept has emerged to facilitate effective and streamlined urban planning with the goal of sustainability (Toli and Murtagh 2020). Over the past two decades, the idea of a smart city has gained considerable traction (Sharifi et al. 2021). Beyond planning, smart cities also provide essential tools for informed infrastructure management and development decisions (Stępniak et al. 2021).

Three-dimensional (3D) city models are typically used for visualization purposes, but they can also be effectively used for planning and development purposes. Therefore, interoperability between Building Information Modeling (BIM) and Geographic Information Systems (GIS) in the creation of 3D city models is essential. BIM-GIS integration combines two environments that are rich in building information and useful for urban planning (Fig. 1). BIM-GIS integration can be used in spatial analysis such as natural disaster planning, emergency response, transportation network planning, etc., and more precise and accurate results can be obtained (Zhu and Wu 2022).

Fig. 1

Integrating BIM and GIS technologies can improve the overall planning, analysis, and management of urban environments

The rich structural information in BIM may have no exact counterpart in GIS, and the same holds for other BIM objects and their geographic significance (Arroyo Ohori et al. 2018). 3D models created with BIM can represent detailed building information on a large scale and in a local coordinate system. They support application areas such as indoor navigation (Boguslawski et al. 2022; Liu et al. 2020) and building energy modeling (Pereira et al. 2021). 3D models created with GIS are capable of representing urban areas on a large scale and in global coordinate systems. GIS supports various applications such as hydrology and spatial and spatiotemporal analysis (Biljecki et al. 2015).

When building information is created in 2D format, there are problems in transferring and using it in a GIS environment. For example, since the AutoCAD file format has no equivalent in GIS (GIS can manipulate the CAD data, but the operator must first edit the data), information can be lost during the conversion phase (Karimi and Akinci 2009). Today, a similar situation exists for 3D city models. The 3D models produced by BIM are stored in different file formats, with Industry Foundation Classes (IFC) being the most used (Sani and Abdul Rahman 2018; BuildingSmart 2024). The 3D models produced by GIS are also stored in different data formats: Shapefile (spatial vector data format developed by Esri) (Esri 2024), CityGML (City Geography Markup Language) (CityGML 2024), and CityJSON (CityJSON 2024) are commonly used. In this study, the IFC-CityJSON transformation is investigated to ensure the transfer of both geometric and semantic information.

Research on the integration and interoperability of BIM and GIS has progressed significantly. However, many challenges remain. There is a content and georeferencing mismatch between BIM and GIS. BIM excels in content but lags in georeferencing, while the reverse is true for GIS (Diakité and Zlatanova 2020). However, there are also studies where BIM models can perform georeferencing (Zhu and Wu 2021). Conflicts arise due to the different data formats used by each technology, with BIM using IFC and GIS using formats such as Shapefile, CityGML, and CityJSON (Sani and Abdul Rahman 2018). Comparing IFC vs. CityJSON is challenging due to the differences in their level of detail (LOD) (Tan et al. 2023). During the conversion process, both geometric and semantic information may be lost (Noardo et al. 2020). While progress has been made in converting data from BIM to GIS, the loss of semantic detail remains an issue. The lack of continuous, bi-directional data transformation between BIM and GIS prevents full interoperability. In addition, achieving data integration is complex, primarily due to the inherent challenges of interoperability.

Related work

The benefits of integrating BIM and GIS have been demonstrated in various studies (Zhu et al. 2018; Zhu and Wu 2022; Stouffs et al. 2018). GIS focuses on the geographic information of structures and building components. BIM allows the detailed storage of structural elements and their semantic information in the project file. It plays an important role in economic and construction planning, which is important for structural engineers and architects (Liu et al. 2017). BIM is preferred in architecture, engineering, and construction (AEC) due to its suitability for planning and the ability to view and manipulate structures in 3D (Vilutiene et al. 2019).

With the increasing demand to combine indoor and outdoor applications, current efforts are focused on bringing together building models and spatial models based on building information (Liu et al. 2017). The integration of GIS and BIM enables effective information management at different stages of a project’s life cycle, such as planning, design, construction, operation, and maintenance (D’Amico et al. 2020). Effective management becomes a life-saving support (Xie et al. 2017). In addition, integrated capabilities support the development of smart and sustainable cities through technology applications, data integration, and urban solutions (Ma and Ren 2017).

The interoperability of BIM and GIS depends on the full integration of the 3D models created from the two data formats (Sani and Abdul Rahman 2018). It contributes significantly to construction management, cultural heritage preservation, disaster management, urban energy management, climate adaptation, etc. (Wang et al. 2019).

To classify the integration between GIS and BIM, several levels of integration have been proposed (Amirebrahimi et al. 2016). These are process, ontology, schema, system, and service-based approaches (Sani and Abdul Rahman 2018). Although studies exist at the process level (Wu et al. 2014; Park et al. 2014) and at the data level (Isikdag and Zlatanova 2009; Rafiee et al. 2014), successful results are difficult to achieve at the application level, because this level requires both successful integration and data interoperability (Sani and Abdul Rahman 2018). Despite several studies on the integration of BIM and GIS (Zhu et al. 2018; Liu et al. 2017; Song et al. 2017), many challenges remain (georeferencing, semantic transformation, different data types, etc.) because BIM and GIS were originally developed for different purposes (Liu et al. 2017). This paper focuses on the data level of BIM and GIS integration and develops a new code-based approach to geometric and semantic transformation.

Selection of data formats for BIM and GIS integration

The stages of GIS and BIM integration need to be described in detail, and we start with the data formats used. Since interoperability and integration between GIS and BIM are essential, compatible data formats should be chosen. This research uses IFC and CityJSON. CityGML and CityJSON are both 3D urban data modeling formats, but they differ significantly in approach and usability. CityGML, an XML-based format that is part of the Open Geospatial Consortium (OGC) standards, is known for its rich and complex data model, which makes it well suited to detailed representations but potentially cumbersome because of its complexity and larger file sizes. In contrast, CityJSON uses JSON (JavaScript Object Notation), which is better suited to web visualization and results in smaller file sizes and better performance. CityGML’s adherence to OGC standards ensures broad support in GIS software and systems, making it the preferred choice for projects that require detailed urban models and interoperability, whereas CityJSON is becoming increasingly popular in web-based applications and among developers for its ease of use and rapid editing. Although CityGML has generally been used as GIS output data in previous studies (Tauscher 2019), we used CityJSON in our research. The main reason is that CityJSON is based on JSON, which makes it easier to process in programming languages (Ledoux et al. 2019): IFC data can be parsed to JSON, so IFC and CityJSON content can be mapped to each other in a common JSON representation. The convenience of the available Python libraries also favored CityJSON. In addition, the fact that CityJSON is innovative and updated concurrently with CityGML played a role in our selection of this format as output (CityJSON 2024).

Industry Foundation Classes (IFC)

IFC is a 3D data standard that was created out of necessity for the AEC industry (Vilutiene et al. 2019). As an open BIM standard (along with BIMXML and COINS), IFC is the most widely used data model for information transfer within AEC (Deng et al. 2016) and is governed by buildingSMART International. Its information schema is object-oriented and defined in the EXPRESS language. IFC’s handling of Level of Development (LOD) values is flexible: it can convey a wide range of LOD specifications, so the LOD can be tailored to specific project needs. The LOD definitions can be summarized as follows: LOD0 consists of semantic information with no physical representation, LOD100 shows basic block models, LOD200 contains roof shapes and various 3D building information, LOD300 contains building related interior and exterior structural information, and LOD400 contains all building interior and exterior structural information and all architectural content such as furniture and other content (Fig. 2).

Fig. 2

IFC building model with different levels of development (LOD100 to LOD500) (TrueCADD 2024)

CityJSON

Although CityGML has been in use for a long time and has been standardized, CityJSON, a JSON-based encoding, was created in recent years to make the data usable for purposes such as web-based visualization (Ledoux et al. 2019). CityJSON is an Open Geospatial Consortium (OGC)-approved version of CityGML 3.0.0, encoded using an open source standard. Like CityGML, this encoded data model expresses a set of urban objects such as buildings, roads, rivers, bridges, vegetation, and their relationships (Fig. 3). It also defines different LODs, allowing different objects to be expressed at different resolutions (Ledoux et al. 2019). While CityGML and CityJSON are fundamentally similar, there are some differences. For example, CityGML is XML-based while CityJSON is JSON-based, which improves the usability of CityJSON in web applications. While CityGML has a complex structure, CityJSON has a simpler structure because it is based on JSON, so it is easy to read by humans and computers. This means that while CityGML conveys much more complex data expressions and models, CityJSON provides a simplified version of these, which makes it easier to store and implement.

Fig. 3

Schematic representation of CityJSON data structures (Ledoux et al. 2019)

Research question

The integration of BIM and GIS is a new and rapidly developing topic that will take urban planning to the next level. The main challenges in realizing GIS-BIM integration lie at the geometric and semantic information levels of the two data types. At the geometric level, the main challenges are that BIM and GIS use different reference systems, that 3D geometry and LODs do not match, and that BIM data is difficult to georeference (Sani and Abdul Rahman 2018). In addition, semantic information is lost during GIS-BIM conversion because of the information mismatch between the two data standards. For example, a door that may be called “door” in CityJSON is called “IfcDoor” in IFC. Even when the data is geometrically matched, the attribute information of the objects may be lost during the conversion. This loss occurs because data schemas based on inheritance in IFC are not fully equivalent in CityJSON. Therefore, while geometric data and vertices can be retrieved through type mapping, only geometry-based semantic information can be passed to CityJSON through this mapping. For this reason, in addition to the type mapping, a mapping based on unique IDs should also be performed so that the semantic information of the data associated with inherited IDs can also be passed. Because BIM has more data types than GIS, some BIM data types have no counterpart in the CityJSON data standard, which focuses more on geographic information. In this study, semantically linked data is used to establish this connection and to ensure that both the data and its semantic attributes are transferred correctly.

Materials and methods

One of the main problems to be solved is the loss of geometric and semantic information during transformation. Although various efforts (Table 1) have been made to prevent the loss of information, there are still shortcomings in terms of seamless data transfer. Since the two technologies, BIM and GIS, store both geometric and semantic data in different ways, the incompatibility between two different technologies is also an issue (Beck et al. 2021). In addition, most of the research on this topic has not proposed a fully automated transformation approach (Donkers et al. 2016). Although research in this area has increased in recent years, there are still gaps. Semantic Web technology is the most prominent and promising of the proposals for BIM and GIS interoperability. BIM and GIS data, which are difficult to reconcile, can be accurately transferred by redefining them thanks to the Semantic Web.

Table 1 Short summary of some studies on BIM-GIS integration

Our approach to developing the conversion algorithm was guided by principles from both semantic web technologies and geometric conversion tools. We first analyzed the structural and semantic differences between the IFC and CityJSON formats, and identified key challenges in preserving data integrity during conversion. Using Python and open source libraries, we iteratively designed and refined our algorithm to ensure it met our pre-defined goals. This process included continuous testing with different datasets to validate our approach.

Data

Open source data is essential for academic studies today, because it lets researchers continue their work without restrictions. This study aims to convert IFC data into CityJSON; for this reason, we used open source IFC data that is freely available on the Internet. This eliminates the risk of disclosing the details of an architectural project. The choice of open source data provider is also important, as it makes it easier to find and fix errors in the data.

Accordingly, the IFC data shared by BuildingSmart, Karlsruhe Institute of Technology (KIT), and BIM Whale was used primarily to create the transformation code. In this way, the risk of information about the data content not conforming to the IFC schema was avoided. The data used and detailed information about these data are shown in Table 2.

Table 2 IFC data used in the conversion process and its open-source citations

Apart from this information about the data, another important aspect is the version of the data. The most common versions of IFC are IFC2x3 (BuildingSmartc 2024) and IFC4 (BuildingSmartd 2024). This approach aims to convert not just one version but multiple versions to CityJSON, so the proposed approach has been tested with different IFC versions. Choosing the correct version of the target format is also essential for the results of the study. For this purpose, CityJSON version 2 (CityJSONb 2024), the latest version released by CityJSON, was used.

Method

The proposed method aims to automate the transformation process that will enable BIM-to-GIS integration. In this way, a reorganization process will not be necessary to convert IFC data from different sources. In addition, the method aims to avoid the loss of semantic (some of the attribute information of components) and geometric (opening elements for this study) data. The process flow of the method is shown in Fig. 4.

Fig. 4

Flow diagram of the whole conversion process and describing geometric and semantic transformation

For the transformation to occur, the IFC and CityJSON data models were first analyzed and matching building objects (walls, windows, doors, etc.) were identified. The priority here is to make the correct semantic mapping. This way, objects of different data types can be mapped and semantic and geometric information can flow between them. Accordingly, the objects to be matched were identified based on the data we used and a type mapping table was created (Table 3). The criterion here is that identical or similar structural units are matched to each other in the code. For example, when the code encounters an “IfcWindow” object in the IFC data, it transfers the related semantic and geometric information to the corresponding “Window” object in the CityJSON data. While doing this, the method assigns a Unique ID to every object and definition to avoid data loss. Reusing the GlobalId as the Unique ID when generating CityJSON objects ensures that the original, unique identifiers from the IFC model are preserved in the conversion process. This approach addresses potential data consistency and entity tracking issues across different data formats and stages of a project.

Table 3 Data type mapping between IFC and CityJSON for the object matching
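The type mapping and the reuse of the GlobalId described above can be sketched as follows. This is a minimal illustration only: the dictionary entries, the `map_element` helper, and the attribute names `ifcType` and `name` are assumptions made for this sketch and do not reproduce Table 3 or the authors' code.

```python
# Hypothetical excerpt of an IFC -> CityJSON type mapping.
# The entries are illustrative and do not reproduce Table 3.
IFC_TO_CITYJSON = {
    "IfcWindow": "Window",
    "IfcDoor": "Door",
    "IfcFurnishingElement": "BuildingFurniture",
}

# Element types that are skipped during conversion (openings, see text).
EXCLUDED_TYPES = {"IfcOpeningElement"}


def map_element(ifc_type, global_id, name=None):
    """Return (object_id, city_object) for one IFC element, or None if skipped."""
    if ifc_type in EXCLUDED_TYPES:
        return None
    target_type = IFC_TO_CITYJSON.get(ifc_type)
    if target_type is None:
        return None  # unmapped types are simply ignored in this sketch
    city_object = {
        "type": target_type,
        "attributes": {"ifcType": ifc_type, "name": name},
        "geometry": [],  # filled in later by the geometric transformation
    }
    # The IFC GlobalId is reused as the key of the CityJSON object,
    # so the original identifier survives the conversion.
    return global_id, city_object
```

Keying the output by GlobalId means that any later lookup (for example, attaching properties or linked-data entries) can go from an IFC element to its CityJSON object without a separate lookup table.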

The developed method works in two steps: first, geometric transformation and georeferencing are performed, and then semantic transformation is performed.

It is necessary to be able to manipulate the IFC data. In the method we developed, we used the ifcopenshell library, which already does this successfully. ifcopenshell allows opening IFC data and performing a series of operations on the data (IfcOpenShell 2024). In our code, we used the ifcopenshell library to access IFC data, set up geometric iteration, and configure geometry settings. Using ifcopenshell, we read the geometry of the IFC data and made it suitable for geometric transformation. The goal here is to perform the transformation process by using a formulation to iterate the solid geometries of the IFC data and make them suitable for the geometry definition of the CityJSON format.

In addition to geometry transformation, semantic data transfer is also an important aspect. The key point of our approach is to create “unique id” values in the code to map and associate entities with each other. The process converts IFC file elements, excluding openings, to a CityJSON format by mapping their unique GlobalId, geometry (vertices and faces), and semantic properties to CityJSON attributes. This conversion provides continuity, traceability, and enhanced interoperability with unique identifiers, including additional unique IDs for semantic properties, facilitating data integration and analysis in urban modeling. In addition, JSON-LD (JSON Linked Data) is used to link the IFC and CityJSON files using these unique IDs, further improving data connectivity and semantic richness across different file formats. Using ifcopenshell, data is accessed and manipulated with specific settings to create a link between IFC and CityJSON using unique IDs. After this integration, the data is finally stored in JSON-LD format for enhanced connectivity and interoperability. JSON-LD is a method of encoding linked data using JSON (JSON-LD 2024). Therefore, the IFC data saved as JSON-LD is made interoperable using the “json” library. Then, using the “Unique ID” we created with “uri” and the linked data connection, the geometric and semantic information in the IFC data is linked to the CityJSON data created with its definitions, and the data transfer is complete.

Data transformation from IFC to CityJSON

In this study, we use JSON-LD to implement a comprehensive method for transforming an IFC file into a CityJSON file. CityJSON provides a simplified version of the CityGML standard, a common framework for describing 3D urban objects. The developed Python code relies on three libraries: ifcopenshell provides utilities for working with IFC files, json handles the JSON data, and uuid generates unique identifiers for the objects. In addition, the code creates a basic CityJSON file to store the transformed data, an empty vertex dictionary to store the vertices produced by the geometry transformation, a content dictionary to store the semantic information to be transferred, and a graph list to store the graph units. Since the developed method aims to perform both the geometric and the semantic data transformation correctly, this section is divided into two phases: geometric transformation and semantic transformation.
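The containers listed above could be initialized roughly as in the following sketch. The function name `new_conversion_state`, the dictionary keys of the returned state, and the chosen `transform` values are assumptions for illustration; only the top-level CityJSON members (`type`, `version`, `transform`, `CityObjects`, `vertices`) follow the CityJSON 2.0 layout.

```python
def new_conversion_state(epsg=None):
    """Create the empty containers used during a conversion run (sketch)."""
    cityjson = {
        "type": "CityJSON",
        "version": "2.0",
        # Vertices are stored as integers; real coordinates are
        # vertex * scale + translate. The values here are placeholders.
        "transform": {
            "scale": [0.001, 0.001, 0.001],
            "translate": [0.0, 0.0, 0.0],
        },
        "CityObjects": {},
        "vertices": [],
    }
    if epsg is not None:
        cityjson["metadata"] = {
            "referenceSystem": f"https://www.opengis.net/def/crs/EPSG/0/{epsg}"
        }
    return {
        "cityjson": cityjson,   # output document
        "vertex_index": {},     # (x, y, z) tuple -> integer index
        "content": {},          # semantic information to transfer
        "graph_data": [],       # linked-data entries for JSON-LD
    }
```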

Geometric transformation

The main logic of this section is to correctly define the IFC geometry rules in the developed code and to determine the parameters required to convert them (Fig. 5). To this end, the data should first be analyzed and the geometry types of IFC should be determined. Our study used the “IFC2x3” and “IFC4” versions of IFC. The geometry definitions of these versions are available in the IFC standards (Link-1). Because the study is based on building transformation, we found that the definitions used for these two versions are similar. IFC2x3 and IFC4 share similarities in their basic definitions of building structures, ensuring consistency across different versions. Therefore, to achieve accurate geometry and semantic data transfer simultaneously, semantic data mapping was performed between the original data and the data to be transformed, as mentioned above (Table 3). The exclusion of ‘IfcOpeningElement’ is due to its nature as a space or void within a building, which has no physical presence in the architectural structure.

Fig. 5

Flow diagram for step-by-step geometric transformation process

The next step is to extract coordinate system information from the IFC file, specifically in the form of an EPSG (European Petroleum Survey Group Geodetic Parameter Dataset) code. EPSG codes are commonly used to define Coordinate Reference Systems (CRS). The process starts by checking if the IFC schema version is “IFC4”. If it is, the code searches for ‘IfcProjectedCRS’ objects within the IFC file and, if such an object exists, extracts the EPSG code from the object’s name attribute. If the EPSG code is not available, the code resorts to extracting geolocation data from ‘IfcSite’ objects within the IFC file.
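The lookup order described above (schema check, then ‘IfcProjectedCRS’, then ‘IfcSite’) might look like the sketch below. It assumes that `model` behaves like a file opened with ifcopenshell, that is, it exposes a `schema` string and a `by_type(...)` method; the function names and the `EPSG:<code>` pattern expected in the name attribute are assumptions of this sketch, not taken from the authors' code.

```python
import re


def find_epsg(model):
    """Return the EPSG code as an int if the model declares one, else None."""
    if model.schema == "IFC4":
        for crs in model.by_type("IfcProjectedCRS"):
            match = re.search(r"EPSG:\s*(\d+)", crs.Name or "", re.IGNORECASE)
            if match:
                return int(match.group(1))
    return None


def compound_angle_to_degrees(angle):
    """Convert (degrees, minutes, seconds[, millionths of a second]) to decimal degrees."""
    parts = list(angle) + [0] * (4 - len(angle))
    degrees, minutes, seconds, micro = parts[:4]
    return degrees + minutes / 60 + (seconds + micro / 1e6) / 3600


def site_location(model):
    """Fallback: return (latitude, longitude) of the first georeferenced IfcSite."""
    for site in model.by_type("IfcSite"):
        if site.RefLatitude and site.RefLongitude:
            return (
                compound_angle_to_degrees(site.RefLatitude),
                compound_angle_to_degrees(site.RefLongitude),
            )
    return None


def find_reference(model):
    """Prefer an EPSG code; otherwise fall back to the site location."""
    epsg = find_epsg(model)
    if epsg is not None:
        return {"epsg": epsg}
    return {"location": site_location(model)}
```

Note that the fallback yields a geographic position rather than a projected CRS, so it can place the building but cannot by itself supply an EPSG code for the CityJSON metadata.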

Another important step in the geometric transformation is to store the geometries in IFC in the right place in CityJSON. For this reason, we stored the IFC building elements in a dictionary in JSON format, since the target format is JSON-based. We used the GlobalId attribute in the data to store the elements as independent records. Then, in the fourth step, all the building elements except ‘IfcOpeningElement’ are collected from the IFC file. Due to limitations in the CityJSON format, geometric objects identifiable in the IFC data - specifically window and door openings - are not directly translatable. As a result, when converted to CityJSON, these openings are represented as solid shapes, effectively obscuring the windows and doors in the 3D visualization. This results in the representation of building elements in the dataset with no identifiable openings.
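Collecting the elements into a dictionary keyed by GlobalId, while leaving out openings, can be sketched as below. As before, `model` is assumed to behave like an opened ifcopenshell file; querying ‘IfcElement’ as the common supertype is a choice made for this sketch and may differ from the authors' selection.

```python
def collect_elements(model):
    """Map GlobalId -> element for every element that is not an opening."""
    elements = {}
    for element in model.by_type("IfcElement"):
        # IfcOpeningElement describes a void, not a physical component,
        # so it is excluded from the conversion.
        if element.is_a("IfcOpeningElement"):
            continue
        elements[element.GlobalId] = element
    return elements
```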

After the initial attempts to obtain identifiers and type mapping, the code focuses on making the necessary adjustments to the geometry settings. Geometry settings are adjusted to apply standard components (wall, window, building furniture, etc.) to the elements and to use world coordinates for the geometric data, facilitating the transformation of the geometric data into a global context. A dictionary is created that maps global identifiers (GlobalId) of elements to the elements themselves. We also define a mapping from IFC types to CityJSON types (Table 3). In addition to the mappings, several helper functions are designed and implemented to further simplify the conversion process. The code iterates through and processes IFC geometries into meshes for compatibility with CityJSON, using conversion functions to map and reindex geometries for accurate representation.

Next, we initialize a CityJSON object with appropriate metadata and prepare a vertex list to hold the 3D coordinates of all vertices in the geometry. The vertex list is initialized to map each unique (x, y, z) vertex tuple to an integer index, streamlining the conversion of IFC geometries to CityJSON by ensuring each vertex is uniquely indexed and referenced efficiently, minimizing redundancy and aligning with CityJSON’s indexing requirements. A core function in this methodology is ‘process_shape’ (Fig. 6). This function iteratively processes each shape in the IFC file, populating the ‘cityjson’ dictionary and ‘graph_data’ list accordingly. During each iteration, it extracts the geometry of a shape, converts it to the CityJSON format, gathers associated properties, and builds a CityJSON object for that specific shape. This object includes the shape’s type, attributes, and geometry, which are then added to the CityJSON dictionary.

Fig. 6

The function that processes the shapes, which is one of the main functions of the developed code
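The vertex indexing idea described above, where each unique (x, y, z) tuple is mapped to one integer index, can be illustrated independently of the function in Fig. 6. The sketch below assumes a triangulated mesh given as flat lists (coordinates as x0, y0, z0, x1, ... and faces as three local indices per triangle) and quantizes coordinates with an assumed `scale` of 0.001; the helper name `add_mesh` is hypothetical.

```python
def add_mesh(flat_verts, flat_faces, vertex_index, vertices, scale=0.001):
    """Add one triangle mesh to the shared vertex list.

    Returns CityJSON-style MultiSurface boundaries: a list of surfaces,
    each holding one ring of global vertex indices.
    """
    local_to_global = []
    for i in range(0, len(flat_verts), 3):
        # Quantize to integers so equal points collapse to the same key.
        key = tuple(round(c / scale) for c in flat_verts[i:i + 3])
        if key not in vertex_index:
            vertex_index[key] = len(vertices)
            vertices.append(list(key))
        local_to_global.append(vertex_index[key])

    boundaries = []
    for j in range(0, len(flat_faces), 3):
        ring = [local_to_global[k] for k in flat_faces[j:j + 3]]
        boundaries.append([ring])
    return boundaries
```

For example, two triangles that share an edge but list the shared points twice end up referencing four stored vertices instead of six.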

Semantic transformation

The attribute information received from the unit elements was stored as CityJSON objects under the type, attributes, and geometry attribute table layers. This object stores data types from previous functions and various attribute information from entity elements (Fig. 7). For example, unit name, description, and other semantic information are stored under the attribute heading. First, “Unique ID” values were created for the semantic transformation. This was done to register city objects as “uri”. The created uri was saved in CityJSON format under the graph data heading and then mapped to the previously saved CityJSON object. In addition, the detailed data mapped using the uri was saved in the graph data at this stage.

Fig. 7

Flow diagram for step-by-step semantic transformation process
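The registration of a city object under a “uri” and the matching graph data entry can be sketched as follows. The `urn:uuid:` form of the identifier, the helper name `link_object`, and the field names of the graph entry are assumptions of this sketch; the source only states that uuid is used to generate unique identifiers.

```python
import uuid


def link_object(city_object, global_id, ifc_type, properties):
    """Attach a fresh URI to a city object and build its graph entry."""
    uri = f"urn:uuid:{uuid.uuid4()}"
    # The same URI is stored on the CityJSON side ...
    city_object.setdefault("attributes", {})["uri"] = uri
    # ... and on the linked-data side, together with the IFC details.
    entry = {
        "@id": uri,
        "globalId": global_id,
        "ifcType": ifc_type,
        "cityjsonType": city_object.get("type"),
        "properties": dict(properties),
    }
    return uri, entry
```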

Therefore, using “uri” first creates unique ids for the data units. These unique id values are then used as an intermediary in the semantic definition of the data. In this way, IFC and CityJSON, which are mapped to each other, meet on a common ground in JSON-LD format. JSON-LD is a method for encoding linked data using JSON. It allows data to be highly connected through URLs and has context, giving the data a level of semantics. JSON-LD allows the script to add “context” information to the data, which provides additional semantic information about the data itself, allowing the data to be better understood and linked to other data.

This context information provides additional semantics to the data, such as mapping attribute names to URIs that can link the data to definitions in vocabularies or ontologies. This allows the JSON-LD processor to understand the semantics of the properties in the data. The graph_data list is then converted to JSON-LD format by appending the context dictionary. The inclusion of JSON-LD in this script serves as a bridge between IFC and CityJSON, passing semantic information from IFC to CityJSON data. This is especially important when sharing this data between different technologies or for future Web discovery, as it provides interoperability by linking the dataset to other datasets on the Web. JSON-LD also facilitates the merging of different data sets by defining a common context.
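A minimal sketch of how a context dictionary can be attached to the graph data to form a JSON-LD document is given below. `@context` and `@graph` are standard JSON-LD keywords; the namespace URLs and term names in `CONTEXT` are placeholders invented for this example, not the vocabulary used in the study.

```python
import json

# Hypothetical context: maps short attribute names to URIs.
CONTEXT = {
    "ifc": "https://example.org/ifc#",
    "cj": "https://example.org/cityjson#",
    "globalId": "ifc:globalId",
    "ifcType": "ifc:type",
    "cityjsonType": "cj:type",
}


def to_jsonld(graph_data, context=None):
    """Wrap the graph entries and the context into one JSON-LD document."""
    return {"@context": dict(context or CONTEXT), "@graph": list(graph_data)}


def serialize(document):
    """Return the document as an indented JSON string."""
    return json.dumps(document, indent=2)
```

Because the context is kept separate from the entries, the same graph data could be re-published against a different vocabulary by swapping only the `context` argument.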

Because the linked data is machine-readable, all the semantic information in the IFC can be linked to the newly created, related CityJSON geometry. The aim is to transfer semantic and geometric information to CityJSON without data loss. Finally, the new CityJSON object created in the code is stored as a CityJSON file via the json library. In addition, the JSON-LD file that supports the semantic transformation is also stored using the json library.

Results

The conversion process was largely successful, with the Python code developed effectively translating all geometric components from the IFC models to CityJSON, with one minor exception. Although CityJSON can visualize geometric components such as walls, windows, and doors, it does not define openings as IFC does (Fig. 8). As a result, window and door openings are missing from the converted CityJSON model. Objects that do not have detailed definitions in CityJSON, such as building furniture, are grouped into a single element (BuildingFurniture). All other geometric components have been accurately converted. To ensure the integrity and correctness of this conversion, a CityJSON validator was used to rigorously check the geometric validity of the exported CityJSON data (3D City Database 2024). This step is important to verify that the transformation adhered to CityJSON specifications and standards, and to confirm the absence of geometric anomalies such as incorrect vertex positions, invalid polygon formations, or inconsistencies in the representation of building elements.
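One of the anomalies mentioned above, a boundary that points to a vertex that does not exist, can be checked with a few lines of code. The sketch below is only an illustration of that single check and is not a substitute for the CityJSON validator used in the study; the function name `find_bad_indices` is hypothetical.

```python
def find_bad_indices(cityjson):
    """Return (object_id, index) pairs for vertex references out of range."""
    vertex_count = len(cityjson["vertices"])
    problems = []

    def walk(node, object_id):
        # Boundaries are nested lists whose leaves are vertex indices.
        if isinstance(node, int):
            if not 0 <= node < vertex_count:
                problems.append((object_id, node))
        else:
            for child in node:
                walk(child, object_id)

    for object_id, city_object in cityjson["CityObjects"].items():
        for geometry in city_object.get("geometry", []):
            walk(geometry.get("boundaries", []), object_id)
    return problems
```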

Fig. 8

(a) Door objects with openings in IFC data, (b) door objects with no openings in CityJSON data, (c) transparent windows with opening data in IFC data, (d) non-transparent window object in CityJSON data

In terms of semantic integration, the results show a high level of accuracy. Approximately 95% of the semantic data was correctly transferred from the IFC files to the CityJSON model (Table 4). To quantify this accuracy, a detailed analysis of the semantic data accuracy was performed. The accuracy analysis was performed manually, and due to the high level of detail present in each dataset, the instance of the data model with the highest level of detail was specifically selected for in-depth analysis.

Table 4 Accuracy analysis of semantic information transfer
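The accuracy analysis in this study was performed manually. Purely as an illustration of the underlying ratio (correctly transferred attribute values divided by all source attribute values), a comparison of this kind could be expressed as in the sketch below. The input layout, with attributes grouped per GlobalId, and the function name `transfer_accuracy` are assumptions and do not describe how Table 4 was produced.

```python
def transfer_accuracy(source_attrs, converted_attrs):
    """Share of source attribute values found unchanged in the converted data.

    Both arguments map an object id (e.g. a GlobalId) to a dict of attributes.
    """
    total = 0
    matched = 0
    for object_id, attributes in source_attrs.items():
        target = converted_attrs.get(object_id, {})
        for key, value in attributes.items():
            total += 1
            if key in target and target[key] == value:
                matched += 1
    if total == 0:
        raise ValueError("no source attributes to compare")
    return matched / total
```

With four source attributes of which three arrive unchanged, the function returns 0.75.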

Figure 9 provides a side-by-side comparison of the original IFC data (with roof components) and the converted CityJSON data (with the roof removed for better visualization). These figures demonstrate the accurate transformation of geometric information despite the limited number of openings.

Fig. 9 Comparative analysis of original IFC and converted CityJSON through visuals

The Python code used in this study successfully converted IFC files to the CityJSON format, preserving most of the semantic data. The script maps IFC building elements and their corresponding properties to CityJSON counterparts through a series of functions. The transformation of semantic data into linked data also leveraged “context” data, enriching CityJSON models with additional semantic information.
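The mapping step described above can be sketched as follows. This is not the script from the study: the mapping table is illustrative only (the actual mapping is the one in Table 3), the input records are plain dictionaries with hypothetical keys, and placing the JSON-LD "@context" at the document root is an assumption of this sketch. Geometry and parent-child links are omitted.

```python
# Illustrative mapping only; the study's actual mapping is given in Table 3.
IFC_TO_CITYJSON = {
    "IfcBuilding": "Building",
    "IfcBuildingStorey": "BuildingStorey",
    "IfcWall": "BuildingConstructiveElement",
    "IfcFurnishingElement": "BuildingFurniture",
}


def build_cityjson(elements, context_url):
    """Map element records to CityJSON city objects and attach a linked-data context."""
    city_objects = {}
    for element in elements:
        cj_type = IFC_TO_CITYJSON.get(element["ifc_type"])
        if cj_type is None:
            continue  # unmapped IFC types are skipped in this sketch
        city_objects[element["id"]] = {
            "type": cj_type,
            # IFC properties are carried over as CityJSON attributes.
            "attributes": dict(element.get("properties", {})),
            "geometry": [],
        }
    return {
        "type": "CityJSON",
        "version": "2.0",
        "@context": context_url,  # assumed placement of the JSON-LD context
        "transform": {"scale": [1.0, 1.0, 1.0], "translate": [0.0, 0.0, 0.0]},
        "CityObjects": city_objects,
        "vertices": [],
    }
```

Keeping the mapping in a single dictionary makes it straightforward to extend the converter to further IFC types without touching the conversion logic.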

Discussion

Our research has made a valuable contribution to the use of BIM-GIS integration in urban planning. Our algorithm not only addresses a critical gap in BIM-GIS integration, but also lays the foundation for future advancements in smart city technologies. While our work focuses on the conversion from IFC to CityJSON, the methodology and findings have broader applicability and pave the way for further research on data interoperability in urban informatics.

When comparing this study to similar research, several similarities and differences emerge. Several other efforts have been made to translate IFC building models to other data models, the most comparable being the conversion to CityGML (Donkers et al. 2016). CityGML and CityJSON serve similar purposes but are structured differently, with CityJSON offering a more compact and user-friendly data structure. In the literature, both manual comparative accuracy analysis and automated comparisons are available. Here, we decided that a manual comparison would be more appropriate since a transformation is performed on individual buildings.

In studies focusing on IFC to CityGML transformation, researchers have faced challenges due to differences in how the two formats handle certain building components (Sani and Abdul Rahman 2018). For example, there are difficulties in converting complex structures such as interior building installations and certain types of windows due to differences in schema between IFC and CityGML. In the case of our study, the main challenge was dealing with openings for windows and doors, which are not defined in the CityJSON schema.

Despite these challenges, this study and the research above demonstrate the potential for converting IFC data into other formats to increase interoperability and improve the utility of building models. IFC data extends beyond the initial building design phase and plays a critical role in building management and energy efficiency analysis when translated into GIS formats such as CityGML and CityJSON.

While the effort to integrate Building Information Modeling (BIM) with Geographic Information Systems (GIS) through the conversion of IFC to CityJSON represents a promising avenue for improving data interoperability in urban planning, it also underscores the complexity of achieving lossless translation between these different data models. The challenge of accurately conveying architectural details, particularly the representation of openings in CityJSON, highlights a limitation of the current approach and suggests a need for continued methodological development. Rather than a detriment, this recognition serves as an impetus for future research that will drive advances in the seamless integration of BIM and GIS technologies for more comprehensive and accurate urban modeling.

Limitations and future works

The successful conversion of IFC building models to CityJSON in this study opens several opportunities for further research. The Python code developed here can serve as a foundation for more advanced converters that address the limitations identified above. Future research could focus on methods for representing components that CityJSON does not define, such as openings. The limitation with openings stems from the code we created: because it converts the surfaces in IFC to CityJSON as solid models, IfcOpeningElement entities are stored in CityJSON as solid geometry, which produces anomalous geometry when the CityJSON model is rendered. Future work could focus on transferring opening elements and on supporting transparent objects such as windows.
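One possible direction, shown here only as a sketch and not as part of the study's code, is to separate void elements from physical elements before geometry is exported, so that openings are never written as solids. The record layout (dictionaries with an "ifc_type" key) and the name `split_openings` are hypothetical.

```python
# Entity types that describe voids rather than physical building parts.
VOID_TYPES = {"IfcOpeningElement"}


def split_openings(elements):
    """Separate void elements from the elements that should be exported as solids."""
    solids, openings = [], []
    for element in elements:
        if element["ifc_type"] in VOID_TYPES:
            openings.append(element)
        else:
            solids.append(element)
    return solids, openings
```

The returned openings could then be handled by a dedicated step, for example by subtracting them from their host walls, instead of being rendered as anomalous solid geometry.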

Future developments could explore more advanced converters that address the limitations identified in this study, such as the representation of undefined components like openings in CityJSON. In addition, the study opens up new possibilities for applying the conversion process to different domains. For example, by integrating IFC building models into urban planning, GIS technologies could be optimized using the CityJSON format.

Another limitation of the study is the type mapping shown in Table 3. The main reason is that the study converts only basic building elements. A much more comprehensive type mapping is needed for more complex structures and elements. In future studies, the scope can be expanded to include additional object types so that complex data can be converted. A further limitation is georeferencing. For the IFC2x3 version, the code tries to georeference the model using the local placement values. When coordinate information is available in an IFC4 file, the code can obtain it easily. Future work will attempt to overcome the limitations of the code for the IFC2x3 version.
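To illustrate why explicit coordinate information simplifies georeferencing, the sketch below applies map conversion parameters to a local point. It assumes the parameters follow the IFC4 IfcMapConversion entity (Eastings, Northings, OrthogonalHeight, XAxisAbscissa, XAxisOrdinate, Scale); the study does not state which entity its code reads, so this choice, the function name `local_to_map`, and the application of the scale to the height are assumptions.

```python
import math


def local_to_map(point, eastings, northings, orthogonal_height,
                 x_axis_abscissa=1.0, x_axis_ordinate=0.0, scale=1.0):
    """Transform a local (x, y, z) point into map coordinates (E, N, H)."""
    x, y, z = point
    # The two axis values define the direction of the local X axis in the map grid.
    norm = math.hypot(x_axis_abscissa, x_axis_ordinate)
    cos_t = x_axis_abscissa / norm
    sin_t = x_axis_ordinate / norm
    east = eastings + scale * (x * cos_t - y * sin_t)
    north = northings + scale * (x * sin_t + y * cos_t)
    height = orthogonal_height + scale * z  # assumption: scale also applies to z
    return east, north, height
```

With only local placements, as in IFC2x3 files, none of these parameters is given explicitly, which is why georeferencing such files is harder.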

In addition to improving the code, further work could explore the application of this conversion process in different areas. For example, by integrating IFC building models into urban planning, GIS technologies could be streamlined using the CityJSON format. Converting geometric and semantic information in one model to another could support more efficient decision-making processes in urban planning.

Finally, this study’s approach to integrating semantically linked data with CityJSON offers a promising extension to this area of research. A more holistic representation of building elements and their properties can be achieved by combining structured building data with additional semantic information. Future studies could explore other schemas for semantic enrichment and evaluate their benefits for different applications.

Conclusions

This study explored the conversion of IFC building models to the CityJSON data model, with a focus on the integration of semantically linked data within the process. Both processes were implemented using a Python-based script, focusing on the conversion of geometric elements and the extraction of semantic data.

This study successfully transformed IFC to the CityJSON data model, using semantically linked data to transfer semantic information. Despite some limitations, particularly the lack of a definition for openings in CityJSON, the Python script successfully translated geometric components and extracted semantic data. The conversion process resulted in an accuracy rate of approximately 95% for transferring semantic data from the IFC files to the CityJSON model. This demonstrates the effectiveness of the script in converting building data models and provides a strong foundation for future development and enhancements. Not all comparable transformation studies report an accuracy analysis. As one example, the linguistic-based text mining approach used by Cheng et al. correctly transferred 53% of the data, so the performance of our approach appears to be better. However, because the studies use different models, the results are relative and a direct comparison may not be appropriate.

In addition, the study highlights the potential of converting IFC data into different formats, such as CityJSON, to enhance the interoperability and usability of building models. As discussed, the conversion of building models and their subsequent integration into GIS formats such as CityJSON extends their applications beyond the initial design phase to include critical building management and energy efficiency analysis tasks. This study differs from similar research by using JSON-LD to add semantic detail to CityJSON models, allowing for a more complete and meaningful representation of building components and their characteristics. This integration of semantically linked data is a step forward in promoting a richer understanding of building models, and highlights the potential of using similar schemas for semantic enrichment in future research.

In conclusion, this study contributes to the broader effort to increase the interoperability and richness of building models using CityJSON and linked semantic data. The Python script developed provides a robust foundation for further advances in this area and promotes a comprehensive understanding of the physical and semantic properties of buildings, creating promising opportunities for future research.