Categorisation of building data in the digital documentation of heritage buildings

The documentation of heritage buildings is the preliminary action to deal with any problem related to the built heritage. The procedure of documentation requires a very diverse range of data (quantitative and qualitative) to be obtained and investigated in order to produce an accurate digital representation of the building. This work of data capture and interpretation is often conducted in isolation by different stakeholders and for a range of purposes, leading to a lack of communication between different data types, repeated effort and incomplete documentation. Heritage Building Information Modelling (H-BIM) is set to play a key role in the digital documentation of heritage buildings, as it can combine quantitative and qualitative data and facilitate the integration of different stakeholders and specialised data into the digital management of the different phases of dealing with heritage buildings. This paper aims to review the multitude of data types that could be included in the documentation and investigation process of the built heritage, in order to assess the breadth and depth with which heritage buildings can be documented. Four main categories that span the whole range of documentation data are suggested: outer geometry surveys, subsurface materials and structural integrity investigations, data concerning the building performance, and the historic records concerning the building’s morphology over time. Together, these can help to create more in-depth knowledge about the heritage building’s status and performance and can provide a solid base for any required restoration and retrofitting processes (Khalil and Stravoravdis 2019a).


Introduction
The documentation of heritage buildings has long been discussed as a means of safeguarding valuable built heritage. It is usually the most fundamental and crucial process that can affect and facilitate any required procedures to preserve heritage buildings for the next generations or enhance their performance in order to enable them to achieve their current or future functions. Thus, documentation plays a key role in a building's future lifespan. The documentation of heritage buildings also supports the development of a better understanding of the building's history: its historic socioeconomic context, the building technologies employed, construction materials and, on a larger scale, our knowledge concerning its historic period and ancient societies. Many investigation tools can be used and combined to document and investigate the fabric of historic buildings. New technologies emerging over the years have helped to improve the speed, quality and accuracy of the documentation process.
The heritage sector is not usually seen as a thriving economic sector, so the efforts to document heritage buildings fall short of the need to document every one of them. This leads to many heritage buildings not being documented or lacking comprehensive, accurate documentation that covers every aspect of the building: its history, pathology and performance. Heritage buildings are usually characterised by fragile fabric, inefficient old systems and outdated safety measures, which puts them at an ever-increasing risk of damage and accidents that can lead to fabric loss or even the loss of the building itself. In these unfortunate cases, unless comprehensive and reliable documentation exists, the building's history and its legacy could be lost forever. A similar situation occurs when a heritage building undergoes necessary conservation or renovation works that result in fabric loss or replacement without prior documentation. In all these cases, the cost and time of the restoration works would increase due to the lack of sufficient and reliable information.
Considering every aspect of heritage buildings' documentation, current state-of-the-art documentation technologies are already available and capable of delivering accurate and reliable information if accompanied by careful planning and consideration of the documentation process and its objectives. These technologies cover a wide range of targeted data that are useful in various aspects of a heritage building and its required interventions. However, these up-to-date methods and technologies are often isolated from each other and not connected.
Current BIM (Building Information Modelling) tools enable the combination of diverse documentation data into one comprehensive model of the building and promote the collaboration of different stakeholders within the same workflow. In practice though, the inherited isolation of different workflows still dominates the sector. This can be seen in the research pattern concerning the documentation of heritage buildings, which is usually discipline-oriented and rarely discusses the issue of heritage building documentation within its wider holistic view (Acierno et al. 2017) (Khalil and Stravoravdis 2019a). Extensive literature exists in distinct areas such as building performance, geometry capture and pathology, but there is not much work on all these areas combined, which is what is needed for heritage building documentation and preservation. Moreover, there is no framework that can combine all these technologies in a meaningful way for the purpose of digital preservation of historic buildings.
Thus, combining the various data sources involved in the documentation of heritage buildings into one holistic framework would facilitate the full realisation of BIM's potential and open the door to the integration of the digital twin concept, which aims to truly represent the building and all its characteristics in the digital environment.
This paper is a detailed follow-up of a previously published conference paper by Khalil and Stravoravdis (2019a) and it attempts to put into perspective the holistic view towards the digital documentation of heritage buildings. It will review distinct documentation areas, their respective data types and their related technologies, as well as discuss their potential interrelations and combination.

Digital documentation of heritage buildings
Heritage documentation is seen as "the systematic collection and archiving of tangible and intangible elements of historic structures and environments. Its purpose is to supply accurate information that will enable correct conservation, monitoring and maintenance for the survival of an artefact" (Dore and Murphy 2017). Documentation is the first phase of a heritage building's analysis, conservation, retrofitting, renovation and management. It can incorporate both quantitative assets (geometric data, performance data) and qualitative assets (historic photographs, oral histories, music) (Fai et al. 2011). Acquisition of all possible data is the first step towards fundamental modelling for building recording and documentation (Cheng et al. 2015).
The heritage buildings sector, alongside the whole AEC (Architecture, Engineering and Construction) industry, witnessed its early attempts at digitalisation following the third industrial revolution (known as the digital revolution) that started during the 1980s (Techopedia n.d.). This helped reform work processes through the technological development of new tools and methods. During the 1990s, the digital reforms started with the transition from hand drawing to technical drawing with the use of CAD (computer-aided design) software (Banfi 2019), working as a representational tool to improve precision and expand the limits of creativity. Consequently, computers and automation began to bring design into a virtual environment (Sebastian et al. 2018), followed in the 2000s by the transition from 2D CAD representation to 3D and Building Information Modelling (BIM) that enabled the digital representation of heritage buildings in 3D space and the creation of digital repositories and databases (Dore and Murphy 2017). This transition permitted the move from the concept of static representation of the building (2D CAD drawings) to the concept of an information process (digital models that can support the long life cycle of the building, LLCB) (Brumana et al. 2018). Multidisciplinary attempts at the digital documentation of heritage buildings took place even before BIM was known; however, full integration was limited by the capabilities of hardware and software. For example, computer hardware did not have the processing, memory and visualisation capabilities required, and surveying equipment did not have the accuracy and speed required. Software capabilities were limited, as was interoperability between packages due to different data formats. Since then, a great deal of hardware and software innovation has occurred, with BIM trying to address many of the aforementioned issues.
H-BIM (Heritage Building Information Modelling) emerged in the late 2000s as a tool that can help in the management of the conservation, renovation and retrofitting of heritage buildings. It introduced the benefits of BIM into the heritage sector while tackling the challenges that characterise it, chiefly the inevitability of starting at an intermediate point in the asset's life cycle, which can be much more complex than the relatively straightforward cradle-to-grave model that describes new build construction (Historic England 2017a). More challenges are usually present in the processing of historic buildings: irregular geometry, non-homogeneous materials, variable morphology, undocumented changes, damage and the various stages of construction. These challenges put more weight on the surveying, documentation, modelling and visualisation phase in the process of H-BIM. Because it requires more complex and sophisticated documentation/data capture technologies as well as data models, H-BIM only became reliable and operational by the early 2010s (Logothetis et al. 2015).
The survey and documentation process is unsurprisingly the first process that benefited from this digital transition. Advanced 3D imaging technologies allow the capture of complex structures. State-of-the-art computer systems provide the capacity to federate and analyse large datasets. Cloud-based IT infrastructure can ease the use and transfer of data and provide more secure and reliable storage of data.
A recent trend is to produce a "digital twin" of the building, which is linked to the 4th industrial revolution (widely known as "Industry 4.0"). It aims to merge physical, digital and biological worlds (Sebastian et al. 2018). The digital twin in the AEC industry is typically connected with BIM (Building Information Modelling), building simulation, XR (cross reality) and IoT (Internet of Things) concepts in order to build a digital replica of the building, usually with near real-time update, that can help in optimising the decision-making process. The creation of the digital twin requires intensive research and accuracy in the building's documentation in the first place to accommodate all the necessary and updated data concerning the status of the building and its performance.
Internet of Things (IoT) hardware and applications can represent a significant contribution towards documentation and monitoring of heritage buildings and can provide the digital twin of a building with regularly updated data about the building's performance (energy, weather data, light, movement, etc.) and pathology (status of components of the building, etc.) that can help in maintenance and preservation. IoT can be achieved by the installation of various sensors, usually wirelessly connected and non-invasively installed, that can monitor the building and feed live updated data to the cloud (or local server) and then the H-BIM model in order to create a more accurate representation of the building.
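The flow described above, from sensors to a store of monitoring data linked to the model, can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the class names, the element identifier and the metric names are all hypothetical, and a plain dictionary stands in for the H-BIM model, since no specific platform or data schema is prescribed here.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class SensorReading:
    sensor_id: str    # hypothetical sensor identifier
    element_id: str   # hypothetical ID of the monitored H-BIM element
    metric: str       # e.g. "temperature_c" (illustrative name)
    value: float
    timestamp: float  # seconds since an arbitrary epoch


@dataclass
class ElementRecord:
    """Monitoring data attached to one element of a stand-in model."""
    element_id: str
    history: dict = field(default_factory=dict)  # metric -> [(timestamp, value)]

    def add(self, reading: SensorReading) -> None:
        self.history.setdefault(reading.metric, []).append(
            (reading.timestamp, reading.value)
        )

    def latest(self, metric: str) -> float:
        # Most recent value by timestamp, regardless of arrival order.
        return max(self.history[metric])[1]

    def average(self, metric: str) -> float:
        return mean(value for _, value in self.history[metric])


def ingest(readings, model):
    """Route each reading to the record of the element it monitors."""
    for reading in readings:
        record = model.setdefault(
            reading.element_id, ElementRecord(reading.element_id)
        )
        record.add(reading)
    return model
```

Keying the records by element rather than by sensor reflects the idea that the monitored data should enrich the representation of the building component itself; a real deployment would replace the dictionary with whatever store the chosen H-BIM environment provides.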

Digital documentation phases
The process of digital documentation of heritage buildings can be viewed as a two-phase procedure. It begins with the acquisition of all the necessary data on various levels using a variety of data capture tools. Subsequently, the phase of data interpretation is carried out in order to convert the surveyed raw data of a heritage building into useful information that can help to build an understanding of the building, its history, how it was constructed, how it works, potential structural deterioration and performance deficiencies. This understanding is the keystone in decision-making, planning and managing any needed intervention of conservation, renovation or retrofitting. Interpretation of captured data requires different tools and methods, but these share the same idea of extracting and analysing the related information concerning the building.
This duality of data capture/data interpretation within the documentation process can be noticed in many types of data and survey methods, such as "Geometry Survey", where geometry data is gathered, versus "Modelling", where the surveyed data is used to create a digital model of the building. The same can be noticed in "Performance Monitoring", where the building performance metrics are surveyed, versus "Building Simulation", where the monitored data are used to predict the performance of the building in different scenarios. It is also similar to "Archaeological Survey", where data and evidence about the building's past are gathered, versus "Historic Analysis", where these data are analysed to understand the building's history. Thus, the dual-phased process can be seen as a universal characteristic of the digital documentation of heritage buildings.
While the data capture phase is concerned with gathering, surveying and monitoring all the available raw data of the heritage building from the building itself or from external sources in order to create a pool of primary data that describe the building in all its aspects, data interpretation is the phase that introduces these recorded data to the later processes of design and decision-making of the different potential interventions to the heritage building. Data interpretation can include several processes and several steps to transform the captured data into meaningful structured information that represents the base for further procedures. It can include data analysis, such as research and analytical studies of archaeological and historic data; modelling and visual representation of surveyed geometry of the building; simulation of the performance of the building in different scenarios; and visualisation of the geometry and other documented data. Figure 1 summarises the proposed breakdown of the documentation process and its relations with other project phases. The cycle starts with the data capture phase, which represents the collection of all relevant data of the heritage building. This is followed by data processing, which leads to the analysis, modelling and interpretation of the captured data and transforms them into structured information. This structured information forms the base that enables the decision-making and design process to create an action plan for any needed works. Within the design phase, suggestions and alternatives could be fed back to the data interpretation phase for analysis, modelling and simulation processes to explore more optimised solutions. After the production of the action plan comes the execution phase for any required works such as conservation, renovation and retrofit.
Through the execution of the required works, new findings could be discovered, as well as the recording of the executed interventions and as-built surveys that should be added to the body of the captured data for documentation. Then, within the facility management, operation and maintenance phase, regular monitoring of the different aspects of the building could be fed back to the modelling and simulation processes; moreover, feedback from the different operations will contribute to the captured data that could be used for documentation and for optimisation of the next steps of the lifecycle of the heritage building. The concept of data management spans the whole lifecycle to manage the flow of data and the integration of all stakeholders throughout the various phases.

Documentation data
The documentation process of heritage buildings incorporates a diverse range of data formats that span from quantitative to qualitative and from tangible to intangible (Fai et al. 2011) (Di Mascio et al. 2013). It also covers numerous types of data, depending on purpose, such as historic data, geometric data and performance data. Different stakeholders are usually interested in specialised types of documentation data (Acierno et al. 2017); these distinctively different data types collectively represent the documentation of the building.
Following a review (Khalil and Stravoravdis 2019a) on the variety of documentation purposes, stakeholders' interests, data typology, data capture and interpretation methods for heritage buildings, and after reviewing the current literature within the different areas of documentation, the authors of this paper propose the following four distinct categories that reflect the unique characteristics of every type of documentation data (Fig. 2):
- Archaeological and historical data: aiming towards archaeological investigations, understanding of the historical context of the building and determining the morphology of the building's fabric and function over time. They are within the scope of archaeologists, architectural historians, listing authorities, museums and public dissemination.
- Geometry: aiming to record, survey and visualise the exact shape and characteristics of the building's fabric in its current state. Geometry is an important element for many stakeholders but is the major element in the scope of architects, structural engineers and the construction sector in general.
- Pathology: aiming to discover and survey any potential damage or decay of the fabric of the historic building over time, be it material decay or structural deterioration. Those who work in the conservation industry are the major stakeholders that benefit from pathology data, but it can also be useful for stakeholders interested in the archaeological and historic analysis of the building.
- Performance data: aiming to understand and analyse the current status of the building's operability and performance in its various aspects, such as energy performance, thermal performance, users' comfort, systems' performance, safety and security performance. Performance data is at the core of the interests of architects, MEP engineers, lighting and acoustic engineers, energy auditors and other building scientists.
Table 1 relates the different stakeholders to these data categories. This relationship is assumed to be at three levels: full interest, where the stakeholder is assumed to want to have or be able to see the relevant category of data; basic interest, where the stakeholder is assumed to have a partial interest in the data; and no interest, where the stakeholder is assumed to have no practical interest in a data category. There are situations where the suggested relationship is different, but in this table, the most common scenario is explored.
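The four categories and the three interest levels lend themselves to a simple lookup structure. The Python sketch below is a hedged illustration only: it encodes just the stakeholders that the category descriptions name explicitly, it does not reproduce Table 1, and the identifier names, the stakeholder labels and the default levels for stakeholders not mentioned in the prose are assumptions made for the example.

```python
from enum import Enum


class Interest(Enum):
    FULL = "full"
    BASIC = "basic"
    NONE = "none"


# Stakeholders named in the prose as the main audience of each category.
# Treated here as "full interest"; an illustrative reading, not Table 1.
PRIMARY = {
    "archaeological_and_historical": {
        "archaeologist", "architectural historian",
        "listing authority", "museum",
    },
    "geometry": {"architect", "structural engineer"},
    "pathology": {"conservation professional"},
    "performance": {
        "architect", "MEP engineer", "lighting engineer",
        "acoustic engineer", "energy auditor",
    },
}

# Secondary audiences mentioned in the prose (assumed "basic interest").
SECONDARY = {
    "pathology": {"archaeologist", "architectural historian"},
}

# Assumed fallback for stakeholders the prose does not mention. Geometry is
# described as important to many stakeholders, hence BASIC by default.
DEFAULT = {"geometry": Interest.BASIC}


def interest(stakeholder: str, category: str) -> Interest:
    """Return the assumed interest level of a stakeholder in a category."""
    if category not in PRIMARY:
        raise KeyError(f"unknown category: {category}")
    if stakeholder in PRIMARY[category]:
        return Interest.FULL
    if stakeholder in SECONDARY.get(category, set()):
        return Interest.BASIC
    return DEFAULT.get(category, Interest.NONE)
```

A structure of this kind could be used, for instance, to filter which data layers of a model are shown to a given stakeholder, although the actual assignments should come from the table rather than from these illustrative defaults.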
The following sections will present the suggested categorisation of documentation data and the documentation phases in more detail. More specifically, historic and archaeological data can be found in the "Historic and archaeological data" section, geometry in the "Geometry" section, pathology in the "Pathology" section and performance in the "Performance" section. In each section, relevant data capture methods for each category are discussed first, analysing their potential and challenges, followed by a review of the main data interpretation methods associated with each category. Finally, the interrelations between the different categories of data are discussed in the "Documentation methods integration" section, and a brief review of the factors affecting the planning decisions concerning the use of the various methods in the digital documentation of heritage buildings takes place in the "Documentation planning factors" section.

Historic and archaeological data
The historic and archaeological documentation data category focuses on the analytical study of the building's history, its historical context and the variation of its form and function through its lifetime. This not only helps archaeologists and historians understand the history of the building and its context but also leads to a better understanding of the architectural ideologies and styles, construction technologies, building materials and structural systems of the building's era that were incorporated into its fabric. It also shows how the building functioned and served its various roles through its lifespan (Historic England 2006).
Historic documentation can combine the tangible geometry of the building (including previous drawings, form changes, building materials) with many of its intangible aspects such as historic texts, archaeological figures, oral histories, sketches and photos. These data sources can create a better understanding of the building in its current status as well as its historic morphology over time. They can also contribute towards understanding the construction systems of the building and their development through the building's history, as well as building an idea about the materials and technologies used in its construction. They can also be used to disseminate the building's historic development to a wider audience through visualisation of the different phases of its history. In this sense, more advanced visualisation and presentation can be achieved using XR (cross reality) technologies (Osello et al. 2018).

Historic and archaeological data capture
Acquisition of historic and archaeological data can span a variety of tools and investigation methods studying different aspects of the building's history. After a review of the literature, the authors suggest the following sub-categories that can capture all possible data sources within the archaeological and historic data category.

Fig. 2 Proposed categorisation of the documentation data of heritage buildings and their respective data capture tools

Past drawings
Past drawings and sketches, when available, could be a crucial element in facilitating the documentation and survey of heritage buildings. Previous architectural drawings could be a preliminary step for creating a digital twin of the building. Furthermore, they can provide valuable information concerning changes in form or function that the building has witnessed.

Historic records
Old and current sources of historic information such as official registers, historic maps, building accounts, inventories, sale particulars, census records, trade directories or any other form of records, when available, could be an important source of information concerning the history of the building, its owners, construction, function and variation over time. The extent to which more detailed research is necessary or desirable will depend on the level of record, and the merits of available documentary sources. The range, scope and survival of these sources will vary considerably (Historic England 2006).

Historic texts
Historical texts make it possible to understand the historic context of the building and the socio-economic changes that affected the building and its function over time. They can also provide information on the building techniques, materials and architectural styles associated with its era.

Archaeological findings
Archaeological investigations and findings are major sources of evidence that provide information about the history of the building, its materials, building techniques, original form and function (Historic England 2006).

Historic photographs
Historic photographs, if available, provide a great opportunity to identify any changes to the heritage building in comparison with its current state. Photographs can also give a glimpse of the historic social and economic contexts of the building, its function and its users' behaviour over its lifespan.

Oral histories
Oral histories of local residents and users of the buildings (especially older generations that witnessed the changes of the building and its context over time) can provide some evidence of the role of the building within its local society and help to understand the socio-economic changes it witnessed that could have affected its form and function over time.

Multimedia
Old videos, voice recordings, music and any other form of media that is associated with a heritage building can play a role in understanding the building's cultural environment and function. Old films and videos can also provide useful information about the changes in the building's form over time.

Historic and archaeological data interpretation
Interpretation of historic and archaeological data aims to develop an understanding of the history of the heritage building, its morphology in form and function over time, and the building technologies and materials used in its construction. It can even lead to a better understanding of the building's historic era in architectural, social, economic and political contexts. It is therefore usually within the core interest of archaeologists, historians and listing authorities.
An example of this can be seen in the case of the church of St. Maria in Scaria d'Intelvi, Italy, surveyed by Brumana et al. (2013). They produced a stratigraphic study that considers the changes the church has undergone through the centuries, based on their analysis of the 3D scanning and the historic evidence they derived from it about the construction procedure of the vaults covering the church (Fig. 3).
Historic and archaeological data represent a major contribution towards the understanding of the heritage building's history and development over time. They comprise various data sources and investigation tools and can be represented in a variety of data formats, such as textual, graphical or database formats, which makes it more challenging to merge such data into BIM models.

Geometry
The determination of position, size, shape and identity of the components of a heritage building is a fundamental part of any project related to the conservation or renovation of built heritage (Historic England 2018). Geometry capture of heritage buildings has witnessed a lot of research and development in recent years. However, it is still a particularly challenging process due to the irregular geometry, non-homogeneous materials, variable morphology, undocumented changes, damage and various stages of construction that typically characterise heritage buildings. This section will discuss the geometry documentation process, first analysing the data capture process and its available technologies, then the geometry data interpretation and its main applications.

Geometric data capture
Geometry capture and representation of heritage buildings can be conducted by means of various techniques with a wide range of accuracy, cost efficiency and time consumption. Common methods range from manual surveying techniques to photogrammetry to laser scanning.

Manual survey
Manual measurement using simple tools such as measuring tapes is the traditional method of surveying. It is the least expensive option, yet also the least accurate and a potentially time-intensive way to capture the geometry of a building. It can provide dimensions and relative positions of small and less complicated objects, but it lacks detail and accuracy. For large structures, it quickly becomes uneconomic. Laser distance meters can facilitate hand measurement, increase its accuracy and provide more flexibility in confined spaces. However, they still suffer from the same limitations as manual measurement.
A recent development in this area is the use of low-cost smartphone applications based on the computer vision algorithms "structure from motion" (SFM) and "simultaneous localisation and mapping" (SLAM), which utilise uncalibrated smartphone camera images. Some mobile phone manufacturers took this even further by adding a ToF (time-of-flight) camera using an infrared projector or laser meters to their smartphones (Historic England 2018). These tools are not yet aimed at professional application and lack the required accuracy, but this is an area of fast development and serious potential.

Total station survey
Modern robotic total stations (TSs) can be used for data collection as well as the survey of a local control network, which is typically the initial step to precisely identify necessary control and check points for photogrammetry and laser scanning-based techniques (Backes et al. 2014) (Historic England 2018). State-of-the-art robotic TSs incorporate a range of sensor technologies including reflectorless laser ranging and photogrammetric methods. They provide a highly accurate tool for collecting geometry data. However, measurements are time consuming and lack the completeness provided by other 3D imaging methods (Backes et al. 2014).

Photogrammetry
Photogrammetry is defined as the acquisition of accurate measurements and three-dimensional data from photographs (Matthews 2008). It is based on using images taken from different viewpoints to record the 3D geometry of a building (Dore and Murphy 2017). Photogrammetry includes aerial photogrammetry and close-range photogrammetry, which share the same key principle of triangulation, where lines of sight (rays) from two different camera locations are joined to a common point on the object. The three-dimensional location of the point is determined by the intersection of these rays (Dore and Murphy 2017) (Figs. 4 and 5).
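The triangulation principle can be made concrete with a short Python sketch. It assumes that the two camera centres and the ray directions towards the same object point are already known, which in practice is the outcome of camera calibration and image orientation and is not covered here. Because measured rays rarely intersect exactly, the sketch returns the midpoint of the shortest segment between the two rays together with the length of that segment as a simple quality indicator. The function name and its parameters are hypothetical, and this midpoint method is one common textbook approach rather than the procedure of any particular photogrammetry package.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triangulate(c1, d1, c2, d2, eps=1e-12):
    """Estimate a 3D point from two rays c1 + s*d1 and c2 + t*d2.

    Returns (point, gap): the midpoint of the shortest segment joining
    the rays, and the length of that segment (0 for a true intersection).
    """
    w0 = [p - q for p, q in zip(c1, c2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w0), _dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are (nearly) parallel; no unique solution")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = [p + s * k for p, k in zip(c1, d1)]
    p2 = [q + t * k for q, k in zip(c2, d2)]
    point = [(u + v) / 2 for u, v in zip(p1, p2)]
    return point, math.dist(p1, p2)


# Two camera stations 4 m apart, both sighting the same facade point.
point, gap = triangulate((0, 0, 0), (2, 1, 5), (4, 0, 0), (-2, 1, 5))
```

The parallel-ray check reflects a practical constraint of the method: a very short baseline between camera positions makes the rays nearly parallel and the computed depth unreliable.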

Laser scanning
Laser scanning represents the most common and efficient tool in the field of as-built survey and documentation. Back in 2002, Boehler and Marbs (2002) defined laser scanning as "any device that collects 3D coordinates of a given region of an object's surface automatically, in a systematic pattern at a high rate and achieving the results in near real time". More recently, Grussenmeyer et al. (2018) stressed the non-contact and active nature of the process in their definition of laser scanning as "an active, fast and automatic acquisition technique using laser light for measuring, without any contact and in a dense regular pattern, 3D coordinates of points on surfaces".
Operational scanning systems include a wide range of static and kinematic methodologies which deploy different 3D scanning principles. The generated point clouds have different sample densities, accuracies and characteristics. Laser scanners are based on one of three ranging principles: triangulation, pulse or time-of-flight (ToF), and phase-comparison (Historic England 2018), with differences in the range and accuracy capabilities of each method (Dore and Murphy 2017). The most common and accurate laser scanners used for building documentation are called terrestrial laser scanners (TLS), which conduct 360 degree scans from a static position on a tripod. Kinematic techniques include handheld scanners (e.g. Faro Freestyle), Mobile Mapping Systems (MMS), which can be mounted on vehicles, trolleys or backpacks, and airborne systems on drones, helicopters or planes (usually referred to as Lidar, derived from Light Detection and Ranging) (Thomson et al. 2013) (Figs. 6 and 7).
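The three ranging principles can be summarised by their basic textbook relations, sketched below in Python. This is a simplified illustration, not a model of any specific instrument: it uses the speed of light in vacuum and ignores atmospheric correction, and all function and parameter names are hypothetical. Time-of-flight ranging halves the round-trip travel time of a pulse, phase-comparison ranging converts the phase shift of a modulated signal into distance (unambiguous only within half a modulation wavelength), and triangulation derives distance from a known baseline and two measured angles.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s (air correction ignored)


def tof_range(round_trip_time_s):
    """Pulse / time-of-flight: the pulse travels to the surface and back."""
    return C * round_trip_time_s / 2.0


def phase_range(phase_shift_rad, modulation_freq_hz):
    """Phase comparison: distance from the phase shift of a modulated beam.

    Unambiguous only within ambiguity_interval(modulation_freq_hz).
    """
    return C * phase_shift_rad / (4.0 * math.pi * modulation_freq_hz)


def ambiguity_interval(modulation_freq_hz):
    """Largest range measurable without phase wrap-around."""
    return C / (2.0 * modulation_freq_hz)


def triangulation_range(baseline_m, angle_emitter_rad, angle_sensor_rad):
    """Triangulation: perpendicular distance from the baseline to the spot.

    The angles are measured between the baseline and the emitted and
    observed rays at each end of the baseline.
    """
    return (
        baseline_m
        * math.sin(angle_emitter_rad)
        * math.sin(angle_sensor_rad)
        / math.sin(angle_emitter_rad + angle_sensor_rad)
    )
```

These relations hint at why the principles differ in range and accuracy: triangulation depends on a physical baseline and so suits short ranges, phase comparison is bounded by its ambiguity interval, and pulse-based ranging is limited mainly by timing resolution and returned signal strength.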

Geometry capture methods comparison
Manual survey methods can be the least costly and most accessible methods in small simple buildings. However, they lack the accuracy and speed to be practical or suitable for documenting larger or more complex buildings (Fig. 8).
Photogrammetry can be less expensive compared with laser scanning as it can be conducted using less expensive digital cameras. The use of modern digital photogrammetry in indoor environments is often impractical and the entire process requires a lot of time to post-process large image blocks. The deployment, particularly in confined spaces, is difficult and increases cost indirectly.
A disadvantage of laser scanning, beside the cost factor, has been data pre-processing which includes scan registration and cleaning. This phase can be very time consuming in order to achieve a high-quality point cloud. However, best practice procedures and rapid progress to automate this step lessen this factor. Photogrammetry also requires a high computational effort for post processing. TLS are not as versatile or flexible as cameras regarding data capture. It still takes longer time to take 360 degree scan at each position, especially if higher resolutions and qualities are required. This contrasts with the instantaneous camera shot and the ability to use a camera in difficult locations (Historic England 2018).
3D laser scanners can vary in range from under a metre to several kilometres and vary in accuracy from a fraction of a millimetre to 300 mm, depending on the site requirements (Fig. 8, Table 2).
Several methods can be used in parallel to record different aspects, such as using total stations (TSs) to survey a site control network in order to precisely identify the scanning positions for other survey techniques. Photogrammetry can also be used for mapping surface colours and details, while 3D laser scanning is used for accurate modelling.

Geometric data interpretation
Interpretation of geometric data aims to create 3D models that reflect the spatial character and qualities of the heritage building and represent it visually, in order to help architects and engineers plan and execute any conservation, renovation or retrofitting works needed. It can also be conducted purely for documentation purposes, to support any future works.
Modelling and visualisation are also crucial for the interpretation and representation of archaeological, historic and pathologic data, and they form the basis for depicting performance data. They can also be used in education, public awareness and cultural dissemination, where modelling and visualisation combined with XR applications can play a significant role.

Parametric modelling
3D imaging and survey methods such as 3D laser scanning and photogrammetry enable the capture of complex buildings as highly dense point clouds. The most challenging part of geometry modelling is to produce a parametric model that combines the geometric information with its parametric information and merges them into an H-BIM (Cheng et al. 2015).
One of the major challenges in modelling existing and historic buildings is the lack of pre-defined parametric objects compared with the extensive libraries used to model new buildings. This requires the development of methodologies and algorithms that use survey data to model within BIM software (Murphy et al. 2013) (Chevrier et al. 2010). These models should adopt a level of detail and simplification suitable for conservation projects, while offering the possibility to modify the shape parameters of the architectural elements, in particular of historical objects that are often irregular, in an isotropic manner (Brumana et al. 2013) (Fig. 9).
Many attempts were made to build parametric object libraries for heritage buildings within various contexts (Wazeri 2014) (Murphy et al. 2013) (Baik 2017). However, this area still needs more research to address different building elements and different historical eras, to create extensive parametric object libraries as well as automated object recognition tools, which can facilitate the parametric modelling process.

Semantics
A new area of research in the field of geometry documentation is the process of converting point cloud data into semantically rich BIM models. This is conducted by creating algorithms able to learn the unique features of different types of surfaces and the contextual relationships between them, and then to use this knowledge to automatically label patches as walls, ceilings or floors (Xiong et al. 2013). This technology, while still in its early stages, has great potential for facilitating the automated conversion of raw point cloud data into useful semantic BIM models in one step, which can contribute towards time and effort saving (Fig. 10). However, a lot of research is still needed in this area.
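As a rough illustration of what labelling patches as walls, ceilings or floors involves, the sketch below uses a simple rule-based heuristic on the orientation and height of planar patches. It is a stand-in and not the learning-based approach of Xiong et al. (2013), which also exploits contextual relationships between surfaces. The `Patch` type, the thresholds and the assumption that floor and ceiling heights are already known are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Patch:
    """A planar patch already segmented from the point cloud (assumed input)."""
    normal: tuple[float, float, float]  # unit normal vector, z pointing up
    centroid_z: float                   # height of the patch centroid in metres


def label_patch(patch: Patch, floor_z: float, ceiling_z: float,
                height_tol: float = 0.3,
                horizontal_min_nz: float = 0.9,
                vertical_max_nz: float = 0.1) -> str:
    """Label a patch from its normal direction and height (heuristic only)."""
    nz = abs(patch.normal[2])
    if nz >= horizontal_min_nz:  # surface is roughly horizontal
        if abs(patch.centroid_z - floor_z) <= height_tol:
            return "floor"
        if abs(patch.centroid_z - ceiling_z) <= height_tol:
            return "ceiling"
        return "other"
    if nz <= vertical_max_nz:    # surface is roughly vertical
        return "wall"
    return "other"               # sloped or curved surfaces, e.g. vaults


if __name__ == "__main__":
    patches = [Patch((0.0, 0.0, 1.0), 0.02), Patch((1.0, 0.0, 0.0), 1.5)]
    print([label_patch(p, floor_z=0.0, ceiling_z=3.0) for p in patches])
```

Irregular heritage geometry such as vaults or leaning walls falls into the "other" class under these rules, which hints at why learned features and context are needed in practice.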
The geometric data category can be considered the most important data category, as a geometric model is usually needed for the representation of other data forms concerning heritage buildings. It is also the primary requirement for modelling the building in an H-BIM environment. Unsurprisingly, geometry capture has seen the fastest technological development in heritage building documentation in recent years. However, the automatic translation of point clouds into parametric BIM models remains challenging and needs more research and development.

Pathology
Investigating and documenting the pathology of heritage buildings has a significant impact on the decision-making and process of their conservation. Pathological investigations focus on studying the quality of the materials and structural system of the building; they also study original materials and construction methods, material degradation, historic fabric developments (Historic England 2017a) and structural decay that can result from design errors, erroneous interventions or neglect (Theodossopoulos and Sinha 2008). Therefore, pathological investigation can be categorised into two areas: material pathology and structural pathology. It can be conducted using various tools; however, geometry capture tools remain the most widely used for investigating building pathology, unless subsurface investigations are required.

Pathology data capture
After a review of the literature, the authors suggest the material survey and structural survey as sub-categories that can capture all possible data sources within the pathology data category.

Materials survey
Material pathology aims to investigate material characterisation and properties, damage and temporal decay (Pocobelli 2015). An outer skin material survey can be achieved using photogrammetry or laser scanning. However, these methods do not help in subsurface material survey, which needs different investigation tools, whether invasive or non-invasive, such as wet chemistry, which clarifies the pathology type; optical microscopy, to define the pathology origin; ultraviolet (UV) lighting and infrared (IR) imaging, to detect organic matter; and Fourier transform infrared (FTIR) spectroscopy, to identify materials (Pocobelli 2015). Although destructive techniques offer a means of extending understanding, through sampling materials or revealing hidden fabric (such as the removal of areas of plaster, the opening up of blocked features or inaccessible voids, or the lifting of floorboards to examine floor structures), it is important to evaluate and consider the resulting loss to the fabric of the building (Historic England 2006). An example of a material pathology survey is the project of Pocobelli et al. (2018), who surveyed the façades of the Jewel Tower in London (Fig. 11), which dates back to the fourteenth century, using 3D scanning and photographic survey. They produced 2D weathering façades using AutoCAD, as Revit could not produce the required level of detail in 2D drawings, and then inserted them into the BIM model as a new rendering material (Pocobelli et al. 2018).
An innovative use of thermal scanning can help in surveying historic buildings and contributes to the H-BIM modelling process. As a non-destructive technology, it can be useful in investigating the building's envelope and identifying the structural system and near-surface properties of material composition, decay, damage and moisture (Stober et al. 2018). These data enable the detection of near-surface areas with different material properties, which in turn helps in planning any material sampling needed or any detailed inspection of structural and non-structural parts of the building. Stober et al. (2018) used passive infrared thermography to identify the hidden materials and structural system of the atrium façade in their case study of the Palace of the Slavonian General Command in Osijek in northern Croatia, which was built in the eighteenth century and underwent many changes until the early twentieth century. Their investigations combined modelling of the existing 2D drawings of the current state of the building with laser scanning of the baroque entrance of the building, which was integrated into the BIM model. They then performed a thermal energy assessment of the atrium wall surfaces to identify the materials, construction system and thermal bridges of the twentieth century reconstruction of the atrium (Fig. 12). The last phase was the interpretation of historical documentation, working backwards in a reverse engineering process to model the building over different periods of time.

Structural survey
Structural pathology represents a great challenge and a main aspect to shape the conservation requirements of heritage buildings. Geometric survey could help to indicate structural pathology, but, in many cases, more in-depth structural investigations would be needed.
An example of a structural survey can be seen in the work of Banfi et al. (2017), who conducted structural health monitoring for the documentation of the medieval bridge "Azzone Visconti" in Lecco, Italy. They combined 3D digital survey, parametric modelling and monitoring datasets to develop a system for archiving and visualising structural health monitoring data. The project began with a laser scanning survey to capture the irregular shape of the bridge. They then used photogrammetry to generate accurate orthophotos of the elevations, which provide a photorealistic visualisation and were used in different stages of the project, for instance for planning the location of destructive and non-destructive analysis and for a complete stratigraphic analysis. Finally, they performed geometric levelling to monitor vertical movements of the bridge, using a series of trucks and metallic coils to test the bearing capacity by alternately loading the different bridge spans and determining the deformation of the bridge under these loads (Fig. 13).

Pathology data interpretation
The interpretation of pathological data aims to identify potential solutions to preserve, consolidate or restore deteriorated heritage buildings. It is meant to translate the raw pathologic findings on material decay and structural deterioration into a concrete action plan for the conservation and protection of the building. This can be achieved using structural simulation of the building and optimisation to find the best solution.
Structural analysis of heritage buildings has long been a topic of research, using evolving techniques that include limit analysis, simplified methods, finite element method (FEM) macro- or micro-modelling and discrete element methods (DEM) (Roca et al. 2010). Structural analysis aims to better understand the genuine structural features of the building, to characterise its present condition and actual causes of existing damage, to determine the true structural safety for a variety of actions (such as gravity, soil settlements, wind and earthquake) and to conclude on necessary remedial measures (Roca et al. 2010).
The finite element method (FEM) originated in the early 1970s and is widely used as a generally applicable and straightforward modelling and analysis tool, especially when investigating complex, three-dimensional interactions between structural components (Atamturktur and Laman 2010). FEM falls within two main approaches. The first approach, "macro-modelling", sometimes referred to as "Continuum Mechanics finite element models", is the most common approach because of its lower computational demands in the analysis of large structural members or full structures. It represents the material as a fictitious homogeneous orthotropic continuum, which makes these models simpler, since they do not have to accurately describe the internal structure of masonry and the finite elements can have dimensions greater than the single brick units. This type of modelling is most valuable as a compromise between accuracy and efficiency. Its drawback is that it describes damage as a smeared property spreading over a large volume of the structure rather than in localised areas, which provides a rather unrealistic description of damage and may result in predictions that are either inaccurate or difficult to associate with real observations (Roca et al. 2010) (Fig. 14). The second approach is the "micro-model", which describes the units and the mortar at joints using continuum finite elements, while the unit-mortar interface is represented by discontinuous elements accounting for potential crack or slip planes. Detailed micro-modelling is probably the most accurate tool available to simulate the real behaviour of masonry, as it can realistically describe the local response of the material and the elastic and inelastic properties of both unit and mortar (Roca et al. 2010) (Fig. 15).
Fig. 12 Using thermal imagery to identify the structure system, different materials and thermal bridges (Stober et al. 2018)
Pathologic data is the most crucial data category for the preservation and survival of heritage buildings, as it aims to investigate potential damage and structural deficiency of the building's fabric and structural system in order to accurately plan the conservation and consolidation works needed. It is still challenging to store and visualise the multitude of pathological findings needed for preservation purposes in BIM models, which makes this an area in need of more research. Pathology data can greatly benefit from the realisation of the digital twin and IoT (internet of things) concepts, as they can provide crucial live monitoring data for any identified pathology, which can lead to more timely, realistic, accurate and reliable planning for the conservation of the building and for addressing urgent pathological risks.

Performance
Documentation and integration of the building's performance is a major contribution of digital documentation of heritage buildings, which can contribute towards the decision-making process at the design, retrofitting or facility management stages of heritage buildings.

Performance data capture
Monitoring and surveying the current status of the building and its performance capacity on various levels is the first step to help in the analysis and determination of any deficiency it could be facing in order to develop the objectives of any required rehabilitation or retrofitting works. It also acts as a valuable base for the management of the building. Building performance in its broad definition can represent many aspects. After a review of the literature, the authors suggest the following sub-categories that can capture all possible data sources within the performance data category:

Energy performance
Monitoring and documenting the energy use and energy efficiency of the building is a major objective in the retrofitting of heritage buildings. Through better management and building upgrades, energy performance targets can be met and running costs can be lowered. The monitoring and assessment of the energy performance of the building is related to the efficiency, control and management of building engineering services (heating and cooling; hot water supply and lighting; and equipment and appliances). Assessments should identify fuel sources and the type, size, age and condition of all the energy-consuming services and equipment. The way the engineering services are controlled and operated should also be reviewed. Any defects that need to be rectified and opportunities for improvement should be highlighted (McCaig et al. 2018).

Thermal performance
Surveying the thermal behaviour of the building helps in identifying the thermal characteristics of its fabric and potential thermal comfort issues that can affect its users. It helps in identifying the aims and objectives of planning any retrofitting works needed.
In an innovative case study, Wang and Cho (2015) combined laser scanning of an existing building with thermal imaging to help assess the thermal performance of the outer envelope of the building. For this purpose, they proposed a framework based on a hybrid 3D LIDAR system with an IR camera to measure the temperature of the building's surface, so that the temperature data are automatically fused with the corresponding scanned points during the data collection process and every point of the point cloud is defined by its x-y-z coordinates and corresponding temperature data (Figs. 16 and 17). An as-is BIM was then automatically created by a building envelope recognition algorithm. After converting the file format into gbXML, the as-is BIM was imported into energy analysis software to conduct building performance analysis that can assist in retrofit decision-making. Thus, the building performance analysis was based on actual thermal performance data collected from the fabric of the building itself (Wang and Cho 2015).
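The idea of a point cloud in which each point carries x-y-z coordinates and a temperature can be sketched as follows. This is not the fusion procedure of Wang and Cho (2015), whose details are not given here; it assumes a simple pinhole camera model, points already expressed in the IR camera frame, and hypothetical intrinsic parameters `fx`, `fy`, `cx` and `cy`.

```python
def fuse_temperature(points, thermal_image, fx, fy, cx, cy):
    """Attach a temperature to each 3D point that projects into the IR image.

    points:        iterable of (x, y, z) in the camera frame
                   (x right, y down, z forward, in metres)
    thermal_image: list of rows, each a list of temperatures in degrees Celsius
    fx, fy, cx, cy: pinhole intrinsics in pixels (assumed known from calibration)
    Returns a list of (x, y, z, temperature) tuples.
    """
    height = len(thermal_image)
    width = len(thermal_image[0])
    fused = []
    for x, y, z in points:
        if z <= 0:  # behind the camera, cannot be seen
            continue
        u = int(round(fx * x / z + cx))  # pixel column
        v = int(round(fy * y / z + cy))  # pixel row
        if 0 <= u < width and 0 <= v < height:
            fused.append((x, y, z, thermal_image[v][u]))
    return fused


if __name__ == "__main__":
    image = [[10.0, 11.0], [12.0, 13.0]]  # tiny illustrative IR frame
    cloud = [(0.0, 0.0, 2.0), (1.0, 1.0, 1.0), (5.0, 0.0, 1.0)]
    print(fuse_temperature(cloud, image, fx=1.0, fy=1.0, cx=0.0, cy=0.0))
```

Points that fall outside the thermal frame are dropped here; a real system would also need the rigid transformation between scanner and camera, which this sketch takes as already applied.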

Moisture survey
Moisture in building elements is one of the major causes of weathering. It can vary throughout the year depending on external climatic conditions, indoor environmental conditions and the construction and state of the fabric, causing stresses on building elements. It can affect both the integrity of the building's fabric and its thermal behaviour, making it an important performance area to survey and depict regularly. Moisture surveys can help in planning any conservation works or retrofit project.
Research by Pocobelli et al. (2018) focused on measuring moisture data on the façades of the Jewel Tower in London, which dates back to the fourteenth century. They integrated moisture data into Revit through the "Schedule" command.
These reading points had to be modelled as family masses "spheres" because Revit can create schedules only for families. They used the Dynamo platform to create an algorithm to depict moisture variation in walls, using the data that are stored in Revit through spreadsheets linked to smart masses (Fig. 18).
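The step of depicting moisture variation from scheduled readings can be illustrated outside Revit and Dynamo with a plain Python sketch that maps each reading to a colour. It does not use the Revit or Dynamo APIs and is not the algorithm of Pocobelli et al. (2018). The reading identifiers, the value range and the white-to-blue colour ramp are assumptions made for illustration.

```python
def moisture_to_rgb(value, vmin, vmax):
    """Map a moisture reading to an RGB colour from white (dry) to blue (wet)."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Normalise to 0..1 and clamp readings that fall outside the expected range.
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    fade = round(255 * (1.0 - t))
    return (fade, fade, 255)


def colour_readings(readings, vmin, vmax):
    """Return {reading id: RGB colour} for a schedule-like list of readings."""
    return {r["id"]: moisture_to_rgb(r["moisture"], vmin, vmax) for r in readings}


if __name__ == "__main__":
    # Hypothetical readings, e.g. rows exported from a schedule or spreadsheet.
    schedule = [
        {"id": "P01", "moisture": 0.0},
        {"id": "P02", "moisture": 20.0},
        {"id": "P03", "moisture": 100.0},
    ]
    print(colour_readings(schedule, vmin=0.0, vmax=100.0))
```

Keeping the readings as simple records with an identifier mirrors the idea of storing them against modelled objects, so that the same colours could later be applied to whatever elements represent the reading points.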

Lighting/visual performance
Visual and lighting aspects of the building, such as daylighting, artificial lighting, visual comfort and glare, are important metrics to monitor. They can inform the lighting design for the renovation and rehabilitation of heritage buildings and can also affect the energy performance of the building (Khalil et al. 2018) (Re and Lucas 2012) (Andersen and Guillemin 2013) (Garretón et al. 2018).

Acoustic performance
Acoustic performance plays a major role in some buildings such as theatres, concert halls and museums, and even in other building types, noise control plays a role to maintain the acoustic comfort for the users. So it is important to monitor and survey how the building functions in that area and to determine the major factors influencing acoustic performance and the causes of noise in the built environment (Habibi 2017) (Bo et al. 2015).

Indoor air quality
Indoor air quality refers to the environmental qualities within a building, used especially in relation to the health and comfort of building occupants. It can be affected by microbial contaminants (including mould and bacteria), gases (including carbon monoxide (CO), carbon dioxide (CO2), radon (Rn), volatile organic compounds (VOC)) and particulates (e.g. water), or any mass or energy stressor that can induce adverse health conditions (Historic Scotland 2011). Other factors are related to the indoor air quality such as temperature, humidity and HVAC systems. It is important to monitor the indoor air quality to spot any deficiency to be addressed in renovation projects in order to maintain the health and comfort of the users.

Systems
Different systems are vital parts of the operation of the building. These systems can include electrical systems, plumbing systems, HVAC systems, fire alarm systems, networks and any other system that could be implemented in the building. A detailed survey of the building's systems is essential to identify the performance and efficiency of existing systems in order to assist in the decision-making of any upgrade, renovation or replacement of the building systems within the process of renovating or retrofitting the heritage building.
Fig. 16 The laser scanning and thermal camera setting, and the framework proposed by Wang and Cho (2015)
Fig. 17 The point cloud of the building including its thermal information (Wang and Cho 2015)
Information about existing systems operation is also crucial for the process of facility management (FM) of the building and planning any required ongoing maintenance.

Furniture and equipment
Fixed and movable furniture and other specialised equipment are part of the assets of the building that need regular evaluation and survey. This would help in planning the renovation of the building as well as its maintenance management.

Users' behaviour
Users' behaviour largely affects the performance of the building. It should be studied to assist in the planning of the retrofitting of the building to achieve optimum performance and users' comfort.

Functionality
Assessment of how the building is suitable for its current function or its intended future function is useful to address any deficiency and how to deal with it in the design of any renovation of the building.

Accessibility
Accessibility analysis aims to study how the building can be approached and to assess the circulation system within it in order to enhance the users' experience and ensure adequate accessibility measures for disabled persons. This is especially helpful in the renovation of heritage buildings, which usually lack modern design standards for accessibility.

Safety and security performance
Measures for the security of the building and the safety of its users should be analysed and addressed in any renovation project. They should also be monitored and evaluated by the facility management team in order to keep the building secure from external threats and safe from potential hazards (BSI 2016) (Letellier et al. 2007).

Performance data interpretation
Interpretation of performance data is meant to provide crucial information about how the building is performing and operating for the decision-making and design stage. This is based on the accuracy of the performance metrics monitored during the performance data capture phase.
In the digital environment, building modelling and simulation play a vital role in predicting the building performance and can help in the optimisation of retrofitting decisions and design alternatives. BIM helps to integrate building simulation and performance data into heritage building modelling, which can facilitate collaboration between different stakeholders and support the retrofitting and design processes (Azhar and Brown 2009) (Azhar et al. 2011) (Habibi 2017). In the case of the St. Maria church in Scaria d'Intelvi, Italy, Brumana et al. (2013) surveyed and modelled the church and performed a building performance analysis through simulation, using a simplified version of the model. This simulation, however, relied on many parameters taken as assumptions just to start the process (Fig. 19). Several simulation tools can be linked to H-BIM models of existing and heritage buildings to achieve dynamic simulation of the performance of a building in several aspects, such as energy performance, thermal comfort (Fig. 20), weather analysis, daylight performance, acoustic performance and air flow analysis. A case study by Habibi (2017) proved the feasibility of conducting several performance simulations within the BIM environment of an existing building at the University of Ferrara, Italy (Fig. 21).
The performance data category represents a wide range of data concerning the operability and performance of the building. It aims to understand how the building is operating, investigate potential performance deficiencies in various aspects and predict its performance in a range of scenarios. This can help to optimise solutions for the various aspects of its operability and planning its maintenance. However, it can be still challenging to link various performance data into BIM formats. Performance data can greatly benefit from the digital twin and IoT concepts, especially within the operation and maintenance phase, as they can provide accurate information concerning the real-life performance of the building which can help in accurately planning its maintenance and operation.

Documentation methods integration
The aforementioned categories of data related to the documentation of heritage buildings are often captured and utilised by different stakeholders with different aims. This often leads to the isolation of information, with stakeholders working in silos. This is not always the case, however, as there are examples where data can be used interchangeably across categories. For instance, pathology data can be obtained from an accurate geometry survey and can benefit from available archaeological data. Performance data capture and interpretation also depend on geometry output and pathology interpretation.
In some cases, when historic data is scarce, a reverse process starting with the geometric survey and the development of 3D models of the heritage building can be useful for the interpretation of the monument itself and its historical construction and development over time. An example of this procedure can be seen in the modelling of St. Maria church in Scaria d'Intelvi in Italy, conducted by Brumana et al. (2013). They started with an accurate 3D scan of the entire church, which allowed the analysis and detailed interpretation of the geometry and morphology of the structural elements, such as building a hypothesis concerning the building techniques and construction periods of the vaults covering the church. They focused on the three spherical vaults covering the nave. As there is insufficient historical data on the era of construction of the apparently synchronous vaults, a structural analysis of their shapes from the 3D scan could help in identifying analogies and differences. Based on this analysis, they suggested a hypothesis that the three vaults were constructed in different periods but were all decorated in the same period, concealing their different shapes and dimensions (Fig. 22).
Fig. 19 The model spaces and some parameters of the BPA of St. Maria church (by Brumana et al. 2013)
Fig. 20 Modelling and simulation of the energy breakdown and comfort metrics of the Villa Antoniadis heritage building in Alexandria, Egypt

Documentation planning factors
Documentation and representation of the different data categories of heritage buildings can be conducted using a multitude of techniques that span a wide range of accuracy, cost, labour capacity, time consumption, ease of use and technology. Table 3 attempts to summarise and assess the different data categories and sub-categories with regard to these parameters in both data capture and data interpretation, as follows:
• Output accuracy: assesses how accurate and unquestionable the output data of the tool is compared with other tools producing the same information. It is classified as (1) least accurate, (2) medium accuracy and (3)
The assessments presented in Table 3 are based on the literature review and on the authors' past experience and assumptions, considering a range of factors for a typical heritage building case. However, individual cases can vary dramatically depending on the specifics of the building, the documentation objectives and the availability or challenges of specific methods. This table can be further refined on a project-by-project basis and can serve as a template to assist in decision-making.
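One way such a 1-to-3 assessment could feed into decision-making is a simple weighted score per candidate method, sketched below. The scoring scheme is an assumption and not a procedure proposed by the authors, and the method names and values are illustrative placeholders rather than entries from Table 3.

```python
# Parameters where a higher value is desirable; all others count as burdens.
BENEFIT_PARAMETERS = {"accuracy"}


def weighted_score(scores, weights):
    """Combine 1-3 parameter scores into one number; higher is better.

    Burden-type parameters (cost, time, ...) are inverted on the 1-3 scale,
    so a low-cost method contributes a high value.
    """
    total = 0
    for name, weight in weights.items():
        value = scores[name]
        if name not in BENEFIT_PARAMETERS:
            value = 4 - value  # 1 -> 3, 2 -> 2, 3 -> 1
        total += weight * value
    return total


def rank_methods(methods, weights):
    """Return method names ordered from best to worst weighted score."""
    return sorted(methods, key=lambda m: weighted_score(methods[m], weights),
                  reverse=True)


if __name__ == "__main__":
    # Placeholder scores for illustration only (NOT taken from Table 3).
    methods = {
        "manual survey": {"accuracy": 1, "cost": 1, "time": 3},
        "laser scanning": {"accuracy": 3, "cost": 3, "time": 1},
    }
    print(rank_methods(methods, {"accuracy": 3, "cost": 1, "time": 1}))
    print(rank_methods(methods, {"accuracy": 1, "cost": 3, "time": 1}))
```

Changing the weights to reflect a project's priorities changes the ranking, which is the sense in which a table of this kind can be refined from project to project.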
These parameters are very challenging in the documentation procedure and need to be carefully considered when planning the documentation strategy of the heritage building, in order to fulfil the required documentation objectives without wasting time, labour or budget. Therefore, identification and planning of the appropriate technique is a crucial initial process that depends on many factors such as:

The building
Scale
The scale of the building has a profound impact on choosing the appropriate documentation methods. In smaller buildings, basic techniques could be sufficient to cover the required documentation objective. In contrast, larger buildings need more advanced and time-efficient methods.

Complexity
The level of complexity of the building and the details that are required to be documented have a significant effect on the planning of the data capture methods for documentation. More detailed and complex buildings need more accuracy in documentation, especially in the geometry data capture process (Historic England 2006).

Accessibility
How the building can be accessed is a major aspect of planning the documentation process. Deteriorated or hard-to-access buildings usually rule out simple techniques and require more advanced and efficient survey methods.
Table 3 Parameters of different data categories (1 = low value, 2 = medium value, 3 = high value)

Significance and value
The value and significance of a building dictate the accuracy level of the documentation, which should match its importance to society. However, in some cases, the documentation itself can show that an assessment of significance needs to be revised, where new evidence emerges that changes our valuation of the building (Historic England 2006).

Building condition and structural integrity
The structural integrity and the condition of the building affect the objectives and strategies of the documentation. Buildings in deteriorated conditions, or those suffering complex conservation or performance issues, need more in-depth survey and high accuracy levels.

Budget
The available budget and the owners' intentions usually play the key role in planning the documentation strategies, identifying the level of accuracy and setting the scope of the documentation process (Historic England 2006).

Documentation objectives
The objectives of the documentation need to be addressed in the planning of the survey process and in the weight given to each category of the documentation (Historic England 2006). For instance, intended retrofitting works need more intense performance data acquisition and highly accurate geometry capture, in order to accurately identify the potential performance deficiencies and realistically plan the needed work on an accurate as-built model. Documentation aiming for conservation and consolidation processes will concentrate on pathology data and geometry survey to precisely depict the decay of the building's fabric and its structural deterioration. Likewise, documentation for the purpose of rehabilitation, renovation and adaptive reuse of the building will tend towards accurate geometry modelling and basic knowledge of the building's history. Documentation for the purpose of historic studies or for public dissemination and XR (cross reality) modelling will need minimal geometry survey but intensive historic documentation. Similarly, facility management-oriented documentation will stress performance data with basic geometry modelling. Table 4 presents the required and potentially needed documentation categories for each documentation objective.
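The relationship between objectives and data categories described in this paragraph can be captured as a small lookup structure, sketched below in Python. It encodes only the wording of the paragraph and does not reproduce Table 4. The key names, the emphasis labels and the helper function are hypothetical.

```python
# Emphasis labels paraphrase the prose above; they are not Table 4 values.
OBJECTIVE_NEEDS = {
    "retrofitting": {
        "performance": "intensive", "geometry": "high accuracy"},
    "conservation and consolidation": {
        "pathology": "focus", "geometry": "focus"},
    "rehabilitation, renovation and adaptive reuse": {
        "geometry": "high accuracy", "history": "basic"},
    "historic studies and public dissemination": {
        "history": "intensive", "geometry": "minimal"},
    "facility management": {
        "performance": "focus", "geometry": "basic"},
}

LOW_EMPHASIS = {"basic", "minimal"}


def primary_categories(objective):
    """Return the data categories that an objective emphasises, sorted by name."""
    needs = OBJECTIVE_NEEDS[objective]
    return sorted(cat for cat, level in needs.items() if level not in LOW_EMPHASIS)


if __name__ == "__main__":
    for objective in OBJECTIVE_NEEDS:
        print(objective, "->", primary_categories(objective))
```

Geometry appears under every objective, at varying levels of emphasis, which is consistent with the earlier observation that a geometric model underpins the representation of the other data categories.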

The role of H-BIM in the digital documentation process
Building Information Modelling (BIM) has proved to be a useful, efficient and practical tool to manage architectural and construction projects in the different phases of design, construction and operation (Carbonari et al. 2015). Its usefulness extends to the existing buildings retrofitting practice and building performance improvement (Habibi 2017). BIM represents a "shared digital representation of physical and functional characteristics of any built object which forms a reliable basis for decisions" as defined by the ISO Standards (ISO 29481-1:2016) (Jordan-Palomar et al. 2018) which makes it a tool that can comprise different levels of information and a shared medium between different stakeholders. It helps also with the integrity of design and visualisation, cost estimation, conflict detection, full planning implementation and improved stakeholder collaboration (Volk et al. 2014).
H-BIM inherits all these BIM characteristics, as it can combine multi-dimensional visualisation with comprehensive, parametric databases. It allows the integrated management of graphical and informational data flows and facilitates collaboration among project partners to develop strategies of design, construction and facility management (Fai et al. 2011). This helps to transform individual executors into teams and decentralised tools into complex solutions, so that individual tasks are implemented as complex processes and the life cycle operations of a construction project are performed more effectively, faster and at lower cost (Logothetis et al. 2015).
While these characteristics are shared between new-build BIM and H-BIM for heritage buildings, the prominent difference lies in the initial phases of documenting the existing and valuable fabric of the heritage building, which is usually coupled with irregular geometry, non-homogeneous materials, variable morphology, undocumented changes, damage and various stages of construction. These challenges put more weight on the surveying, documentation, modelling and visualisation phases in the H-BIM process.
H-BIM can facilitate the representation of changes to the building over time. It can incorporate both quantitative assets (intelligent objects, performance data) and qualitative assets (historic photographs, oral histories, music) (Fai et al. 2011). H-BIM data can also include historic texts, archaeological figures, architectural information, administrative data and past drawings, sketches, photos, etc. (Cheng et al. 2015). H-BIM offers a process of digitally documenting all the features that are made or incorporated into the heritage building over its lifespan, and thus affords unique opportunities for information preservation (Albourae et al. 2017). H-BIM is also useful for presenting the building and its historic development to a wider audience by modelling the different phases of the building's history. In this sense, more advanced visualisation and presentation can be achieved using augmented and virtual reality techniques (AR and VR) (Osello et al. 2018).
While some technical challenges arise in some particular areas, H-BIM can represent the four categories of data related to the digital documentation of heritage buildings. Geometric data can be represented in the parametric modelling of the surveyed point cloud either through manual modelling or automated object recognition algorithms. In the same sense, historic data can be represented through the modelling of the different phases of the building and through the multitude of data that could be linked to the model (Khalil and Stravoravdis 2019a). The same applies to the pathological data and its representation. On the level of performance data, H-BIM can be linked to various tools of building performance modelling and simulation to facilitate the depiction of the different aspects of the building performance (Khalil and Stravoravdis 2019b).

H-BIM data formats
Prior to the era of BIM, a range of software and a variety of data formats were used to describe data, such as CAD software with the DWG, DXF and 3DS formats for managing 2D/3D geometric data, PDF for 2D and textual information, and many other formats for raster and vector graphical data, as well as chart and database formats. However, the ability to interpret and integrate data across these software tools and formats was very limited, which was the primary motive for the development of the BIM concept.
BIM represents the potential to integrate and manage data from discrete sources within the same platform. However, many challenges arise in integrating such diverse data which is in and of itself a topic for research and development. Another challenge is in the usage of proprietary formats by BIM software developers which makes it more challenging to merge data between software.
Due to the distinct BIM data formats between software, integration solutions have been developed. Two approaches are used for data integration. The first approach is the standalone approach where all the stakeholders are working together on the same platform, while they can still use different software to create their own data that will be readable by the other users that have access to the same platform. However, this approach is not always applicable to a project because there is currently no single platform that is able to support all the data formats created across the whole lifecycle of a project. Thus, it will be necessary to use other tools external to the platform to add additional necessary data (Arayici et al. 2018). The second approach is the integrated approach which provides a more flexible integration as it uses a translator tool to convert the proprietary format into open data readable by any software that supports this standard (Fig. 23) (Arayici et al. 2018).
Standardised open formats were developed to act as universal formats that can be interpreted on various platforms. One example is the Industry Foundation Classes (IFC) format, an open specification data format developed by the International Alliance for Interoperability (IAI) and currently promoted by buildingSMART International. It facilitates interoperability between various software tools so that model information can be shared among operators in construction and engineering, for example for simulations and calculations (Nieto et al. 2019). It uses four layers (resources, core, interoperability and domain) to describe the geometry information, the material properties and the relationships in a BIM model. However, the IFC format does not capture how information is created and shared by practitioners, so some specific information will be missed in the exchange process (Arayici et al. 2018). Moreover, BIM software developers typically export to the IFC format in distinct ways, adding an extra layer of complexity to the interoperability between software.
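One practical consequence of an open specification is that the exchanged file can be inspected without the authoring software. IFC models are commonly exchanged as plain-text STEP physical files (ISO 10303-21), so a minimal Python sketch can list the entity types in a file using only the standard library. The fragment `SAMPLE_IFC` below is an abridged, illustrative snippet and not a schema-valid model (the identifiers and attribute lists are placeholders), and `count_entity_types` is a hypothetical helper name; a real project would rely on a dedicated IFC toolkit.

```python
import re
from collections import Counter

# Abridged, illustrative STEP-style fragment; NOT a schema-valid IFC model.
SAMPLE_IFC = """ISO-10303-21;
HEADER;
FILE_SCHEMA(('IFC4'));
ENDSEC;
DATA;
#1=IFCWALL('placeholder-id-1',$,'North wall',$,$,$,$,$,$);
#2=IFCWALL('placeholder-id-2',$,'South wall',$,$,$,$,$,$);
#3=IFCWINDOW('placeholder-id-3',$,'Lancet window',$,$,$,$,$,$,$,$,$,$);
ENDSEC;
END-ISO-10303-21;
"""

# Each data line has the form "#<id>=<ENTITYTYPE>(...);".
ENTITY_LINE = re.compile(r"^#\d+\s*=\s*([A-Z0-9_]+)\s*\(", re.MULTILINE)


def count_entity_types(step_text):
    """Count how many instances of each entity type appear in the text."""
    return Counter(ENTITY_LINE.findall(step_text))


if __name__ == "__main__":
    for entity_type, count in sorted(count_entity_types(SAMPLE_IFC).items()):
        print(entity_type, count)
```

A count of this kind only shows which entities survived an export; it says nothing about whether their properties or relationships were written consistently, which is where the export differences between software noted above arise.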
Another standard format is the Green Building XML (gbXML), which is a schema that facilitates the exchange of data between BIM and building performance simulation (BPS) tools (Jeong and Kim 2016). However, the gbXML format is not mature enough and has been limited to simple design solutions because of its inability to read complex geometries (Arayici et al. 2018; Wang and Cho 2015).
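As an illustration of how a BPS tool might consume such a file, the following Python sketch reads space areas and volumes from a gbXML-style document with the standard library XML parser. The sample document is a heavily simplified, hypothetical fragment (the space names and values are invented for the example), and `read_spaces` is an assumed helper name; real gbXML exports carry far more detail, including the surface geometry that causes the limitations noted above.

```python
import xml.etree.ElementTree as ET

# Namespace used by gbXML documents.
GBXML_NS = {"gb": "http://www.gbxml.org/schema"}

# Simplified, hypothetical fragment for illustration only.
SAMPLE_GBXML = """<gbXML xmlns="http://www.gbxml.org/schema">
  <Campus id="campus-1">
    <Building id="bldg-1">
      <Space id="nave"><Name>Nave</Name><Area>120.0</Area><Volume>960.0</Volume></Space>
      <Space id="vestry"><Name>Vestry</Name><Area>18.5</Area><Volume>55.5</Volume></Space>
    </Building>
  </Campus>
</gbXML>"""


def read_spaces(xml_text):
    """Return one dict per Space element with its id, name, area and volume."""
    root = ET.fromstring(xml_text)
    spaces = []
    for space in root.iterfind(".//gb:Space", GBXML_NS):
        spaces.append({
            "id": space.get("id"),
            "name": space.findtext("gb:Name", default="", namespaces=GBXML_NS),
            "area": float(space.findtext("gb:Area", default="0", namespaces=GBXML_NS)),
            "volume": float(space.findtext("gb:Volume", default="0", namespaces=GBXML_NS)),
        })
    return spaces


if __name__ == "__main__":
    for space in read_spaces(SAMPLE_GBXML):
        print(space["name"], space["area"], space["volume"])
```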
More research is needed to address the interoperability and standardisation challenges of BIM environments and their data formats, so that they can better integrate the various data and involve the different stakeholders that are crucial for the protection and survival of the built heritage. Some research efforts are currently under way, but they are not mature enough for wider adoption.
Another challenge that should be addressed with more research is the protection and longevity of heritage building data. Heritage documentation needs to be preserved, accumulated and used over a long period of time in order to address the alterations, usage and management practices heritage buildings will be exposed to over their long lifespans. This, coupled with the short lifespan and high development rate of digital software, means that documentation stored in current data formats is unlikely to be readily usable in future software, which will lead to the loss of such valuable data. This can easily be seen even now, when building data from only a few years ago is not easily accessible with today's software. Therefore, more effort in research and development should be put towards more stable, standardised and universal formats that can be easily interpreted and translated by future software.

Conclusion
While extensive literature concerning the documentation of heritage buildings exists, it is usually discipline-oriented and lacks a holistic and integrated view that combines and relates different areas into a universal framework.
Recent developments in BIM technology and in H-BIM research and practice promote the collaboration of different stakeholders in the documentation and preservation of heritage buildings, as well as the creation of a centralised digital model that can merge data from different areas. This is further supported by the concept of the "digital twin", which aims to merge physical, digital and biological worlds in order to build a digital replica of the building that can help in optimising the decision-making process (Arayici et al. 2018). This paper has attempted to put all this information together for the purpose of providing an overview of the current state of the art and to take the first step towards a framework for creating a digital twin for heritage buildings that can be used for documentation, conservation, renovation and maintenance purposes.
Four categories of documentation data were suggested that cover all the data required to document the different aspects of the heritage building. Each category incorporates subcategories of different investigation tools and methods. The interrelations between these categories were discussed as well as the different factors that affect planning the documentation procedure.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.