Virtual Reality Application of the Fortress Al Zubarah in Qatar Including Performance Analysis of Real-Time Visualisation

Technological advancements in the area of Virtual Reality (VR) in the past years have the potential to fundamentally impact our everyday lives. VR makes it possible to explore a digital world with a Head-Mounted Display (HMD) in an immersive, embodied way. In combination with current tools for 3D documentation, modelling and software for creating interactive virtual worlds, VR has the means to play an important role in the conservation and visualisation of cultural heritage (CH) for museums, educational institutions and other cultural areas. Corresponding game engines offer tools for interactive 3D visualisation of CH objects, which makes a new form of knowledge transfer possible with the direct participation of users in the virtual world. However, to ensure smooth and optimal real-time visualisation of the data in the HMD, VR applications should run at 90 frames per second (fps). This frame rate depends on several criteria, including the amount of data and the number of dynamic objects. In this contribution, the performance of a VR application has been investigated using different digital 3D models of the fortress Al Zubarah in Qatar with various resolutions. We demonstrate how the amount of data and the hardware equipment influence real-time performance, and that developers of VR applications should find a compromise between the amount of data and the available computer hardware to guarantee a smooth real-time visualisation with approx. 90 fps. CAD models therefore offer better performance for real-time VR visualisation than meshed models due to the significantly reduced data volume.


Introduction
Virtual reality (VR) enables a new form of presentation and visualisation of cultural monuments. This type of immersive experience offers interesting opportunities in heritage conservation to disseminate information and knowledge to a broad public. This development, thus, presents exciting new opportunities for cultural heritage institutions, such as museums, seeking to engage new audiences through their collections and archives. In particular, coupled with digital 3D reconstruction techniques, VR allows audiences to experience historical spaces at a 1-to-1 scale in a spatially immersive visual and auditory environment.
There are various definitions of VR that simplify understanding of this technology. VR is the representation and simultaneous perception of reality and its physical properties in a real-time computer-generated, interactive virtual environment. Furthermore, virtual reality is "a realistic and immersive simulation of a three-dimensional environment created with interactive software and hardware and experienced or controlled by movement of the body" (Dictionary.com 2021). Understood in this way, VR is an artificial environment experienced through sensory stimuli (such as perspective views and sounds) provided by a computer, in which one's actions partially determine what happens in the environment (Merriam-Webster Dictionary 2021). This very broad definition allows for most modern applications of VR to be taken into account. Additional definitions may be found in literature by Dörner et al. (2014), Freina and Ott (2015), and Portman et al. (2015). Lanier (1992) describes the equipment and technical requirements necessary to achieve the illusion of being in a virtual world. However, the term was first introduced by author Damien Broderick in his 1982 science fiction novel The Judas Mandala. Already in 1962, Morton Heilig developed the Sensorama, a machine that is one of the earliest known examples of immersive, multi-sensory (now known as multimodal) technology and which could be reasonably called the first VR system. In 1968, Ivan Sutherland, with the support of his students Bob Sproull, Quintin Foster and Danny Cohen, created the first prototype of a VR system that was connected to a computer and is similar to modern VR systems (Rheingold 1992; DuBose 2020). This history shows that VR technology is not new, though with the introduction of affordable VR headsets, e.g. the Oculus Rift in 2014 and the HTC Vive in 2016, this technology has become available to a wider public.
Hence, today's VR applications especially benefit from the visualisation including interaction potentials and sensor tracking using Head-Mounted Displays (HMDs), which increases the degree of immersion. A first empirical verification HMD vs. screen display was carried out by Hruby et al. (2020).
To use this technology for an immersive experience, the following requirements must be met: (a) a virtual 3D environment must be constructed and textured in an IDE (e.g. in a game engine), (b) the developed and executable VR application must be connected to a head-mounted display (HMD) via appropriate software (such as Steam VR), and (c) the user's movements must be controlled and tracked via controller and HMD.
An important factor in VR is immersion, which describes the effect produced by a VR environment that causes the user's awareness of being exposed to illusory stimuli to fade into the background to such an extent that the virtual environment is perceived as real. However, to ensure a smooth immersive visualisation of the VR application in the VR headset, the number of frames per second (fps) should ideally be at least 90. The lower the frame rate of the VR application in the HMD, the more likely it is that the user will experience motion sickness (or cyber sickness) due to latency, which can result in dizziness, nausea or general discomfort while behind the VR goggles.
Therefore, a major challenge in creating a VR application is the balance between visual representation and performance in real-time. This paper uses the example of the Al Zubarah fortress to investigate the performance of VR applications. For this purpose, 3D point clouds from terrestrial laser scanning (TLS) and structure-from-motion (SFM) photogrammetry were used to create 3D meshes with different resolutions and a CAD (computer-aided design) model. Important factors that may affect the performance of the VR application (measured in frames rendered per second) are the amount of data and the texturing of the generated models as well as different hardware equipment. Thus, the question arises: what kind of influence do the amount of data and the hardware equipment have on the performance of a VR application in real-time visualisation?
This paper summarises other related work in this context in the second chapter, while the third chapter presents the Al Zubarah fortress. After presenting the 3D modelling in the fourth chapter, the fifth chapter describes the creation of the VR application. The results of the investigations into the performance of the VR application are presented in the sixth chapter. Finally, a conclusion is drawn from the investigations and an outlook is given.

Previous Work
VR has been instrumental in the development of the field of virtual heritage. It opens up a new form of public and scientific communication, in particular for historical objects and monuments that are either already damaged, destroyed, or too far away from potential interested visitors (Addison 2000; Stone and Ojika 2000; Affleck and Thomas 2005). Polimeris and Calfoglou (2016) attempt to shed some further light on the potency of the digital medium virtual reality by conducting a small-scale study, comparing the effects of diverse modes of presentation of the cultural tourism product on respondents' choice of a cultural tourism destination. Medyńska-Gulij and Zagata (2020) evaluated the effect of immersion in a specific historical-geographical virtual space for experts and gamers using the stronghold in Ostrów Lednicki (Poland) as a case study. Bozorgi and Lischer-Katz (2020) present the Virtual Ganjali Khan Project, an ongoing research initiative using 3D and VR technologies for supporting cultural heritage preservation of the Ganjali Khan Complex, a vast historical landmark in the desert city of Kerman, Iran. Edler et al. (2019) present how the use of VR-based 3D environments (based on the game engine Unreal Engine 4) can be enriched to support the district development of a restructured post-industrial area, using the VR model of the area of "Zeche Holland" in Bochum-Wattenscheid as a representative former industrial area in the German Ruhr district.
At the HafenCity University Hamburg, several VR applications concerning cultural heritage have already been developed. The museum in Bad Segeberg, Germany, housed in a sixteenth-century townhouse, was digitally constructed in four dimensions for a VR experience using the HTC Vive Pro (Kersten et al. 2017b). Three historical cities in Germany (as well as their surrounding environments) have been developed as VR experiences: Duisburg in 1566, Segeberg in the year 1600 (Deggim et al. 2017; Kersten et al. 2018a), and Stade in 1620 (Walmsley and Kersten 2019). In addition, three religious and cultural monuments are also available as VR experiences: the Selimiye Mosque in Edirne, Turkey (Kersten et al. 2017a), a wooden model of Solomon's Temple (Kersten et al. 2018b), and the imperial cathedral in Königslutter, Germany, integrating 360° panorama photographs within an immersive real-time visualisation (Walmsley and Kersten 2020). Another example of an immersive and interactive VR presentation of a CH monument is the İnceğiz caves, located in the Çatalca district of Istanbul, Turkey, which were modelled in 3D using point clouds from terrestrial laser scanning and integrated within the Unity 3D game engine (Büyüksalih et al. 2020).
The amount of work specifically regarding the real-time VR visualisation of cultural heritage monuments is growing. Recent museum exhibits using real-time VR to visualise cultural heritage include Batavia 1627 at the Westfries Museum in Hoorn, Netherlands (Westfries Museum 2021), and Viking VR, developed to accompany an exhibit at the British Museum (Schofield et al. 2018). A number of recent research projects also focus on the use of VR for cultural heritage visualisation (Fassi et al. 2016; See et al. 2018; Skarlatos et al. 2016; Ramsey 2017; Dhanda et al. 2019), as well as on aspects beyond visualisation, including recreating the physical environmental stimuli (Manghisi et al. 2017). However, this is not an exhaustive list.
Publications regarding studies into the performance of real-time VR visualisations are more limited. Kharroubi et al. (2019) have analysed the influence on VR performance (in fps) relative to the size of the point cloud using the Unity game engine. They tested their approach on several datasets, including that of a point cloud composed of 2.3 billion points representing the heritage site of the castle of Jehay (Belgium). Their results underline the efficiency and performance of their solution for visualising classified massive point clouds in virtual environments with more than 100 frames per second. The presentation of digital elements in virtual reality systems requires very low latencies to generate a smooth VR experience free from motion sickness and nausea. To create the ideal conditions for VR, a "motion-to-photon time" (time between sensor detection of the movement and the reaction on the screen) of less than 20 milliseconds (ms), corresponding to 50 fps, is aimed for (McCaffrey 2017). To investigate the performance of the computer for real-time VR application, three scenes of the four-masted barque Peking (a historic ship) were selected as examples based on the different complexity of the scenes. When rendering a view of the harbour with only a few environmental props and a low level of detail, peak values of up to 54 fps (19 ms) were achieved for the VR visualisation. As the number of models and textures increases, the performance correspondingly decreases, as was the case when the entire ship was in view (∅: 21 fps/48 ms). Due to the simulation of physical processes (such as the wind in the sails), the required computing power increases greatly and, thus, the visualisation is delayed, to the point that when looking at the sails (∅: 15 fps/67 ms) the user can clearly perceive the latencies at close range due to the dynamic process in the scenery.
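The relation between frame rate and per-frame time used in the figures above is a simple reciprocal. The following minimal Python sketch (function names are illustrative and not taken from any of the cited works) converts between the two and can be used to check the quoted pairs, e.g. 20 ms and 50 fps.

```python
def fps_to_frame_time_ms(fps: float) -> float:
    """Return the time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def frame_time_ms_to_fps(frame_time_ms: float) -> float:
    """Return the frame rate that corresponds to a given per-frame time in milliseconds."""
    if frame_time_ms <= 0:
        raise ValueError("frame_time_ms must be positive")
    return 1000.0 / frame_time_ms


if __name__ == "__main__":
    # Frame rates mentioned in the text; the values are printed rounded to whole ms.
    for fps in (90, 54, 21, 15):
        print(f"{fps} fps -> {fps_to_frame_time_ms(fps):.0f} ms per frame")
```

At the 90 fps target discussed in this paper, the same relation leaves roughly 11 ms to render each frame.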
Therefore, the performance impact of different versions of the VR application of the fortress Al Zubarah was examined further to investigate the resolution limits of models suited for real-time applications such as VR.

The Fortress Al Zubarah in Qatar
To investigate the performance of a VR application under various data loads, the historic object Al Zubarah fortress (Fig. 1) was used. The fortress is located on the north-western coast of the Qatar peninsula in the Madinat ash Shamal municipality. With an area of 34 m × 34 m and a height of 9 m, this historic Qatari military fortress is one of the most well-known sights and tourist attractions in Qatar. The fortress was originally built by Sheikh Abdullah bin Jassim Al Thani in 1938 to serve as a coast guard station (Wikipedia 2021). However, its location bequeathed it strategic importance due to the constant conflict with the neighbouring state of Bahrain. High, compact, one-metre-thick walls of coral stone and limestone enclose the fortification. A protective roof of pressed clay provided shade and a cool environment for the fort's former inhabitants (soldiers). The fort has three corners with massive circular towers including various types of defences. The fourth corner was built as a concise rectangular tower with exquisite triangular-based ledges with slits. Eight rooms were set up on the ground floor to accommodate the soldiers. The fort has staircases inside that can be used to access the gallery and towers of the fort. In 1987, it was renovated into a museum to display diverse exhibits and artworks, particularly contemporary, topical archaeological findings from the nearby archaeological excavation in the former city of pearl fishermen. Four significant buildings (mosque, palace, towers, and market) of the ancient city Al Zubarah were already virtually reconstructed by Ferwati and El Menshawy (2021). Since 2013, the Al Zubarah archaeological site has been inscribed on the UNESCO World Heritage list (Thuesen and Kinzel 2011; UNESCO 2013, 2021).

3D Modelling
The Al Zubarah fortress was recorded and documented in September 2011 by terrestrial laser scanning using an IMAGER 5006h from Zoller + Fröhlich (Z + F) in Wangen, Germany, and by SFM photogrammetry as part of the Qatar Islamic Archaeology and Heritage Project for the Qatar Museums Authority (Kersten et al. 2015). The aim was to document the fort in 2D and 3D in order to provide plans and sections for the upcoming renovation and also to generate a 3D meshed model. The digital construction and 3D modelling of the Al Zubarah fortress was carried out using 3D point clouds obtained from both laser scanning and photogrammetry. The following variants of 3D models were generated: (1) meshed 3D models in three different resolutions (1, 5 and 10 million triangles, Fig. 2 left) based on terrestrial laser scanning data, (2) a meshed 3D model obtained from photogrammetry using 395 photos of a Nikon D70 (focal length = 35 mm) from the same campaign (Fig. 2 right) and (3) an optimised CAD model constructed using the laser scanning point cloud at the highest resolution (Figs. 3 and 4).
Fig. 1 The Arabic fortress Al Zubarah in Qatar: the location in the map (left), the exterior (centre) and interior with the terrestrial laser scanner (right)
Fig. 2 3D models of the Al Zubarah fortress: mesh with 10 million triangles from terrestrial laser scanning (left) and textured 3D model from SFM (structure-from-motion) photogrammetry with 10 million triangles (right)
In the point cloud, the points representing the fort were cut out and all outliers eliminated. In addition, all windows and doors were cut out so that moving objects such as opening doors and windows could be included in the VR application later. A triangular mesh was calculated from the point cloud, in which existing holes were filled manually using appropriate functions in Geomagic. Three variants with 1, 5 and 10 million triangles were derived from this original mesh (Fig. 2 left). As the second version, a point cloud was generated from the photos using the photogrammetry software Agisoft Metashape, from which an RGB-coloured triangular mesh with 10 million triangles was calculated after the elimination of outliers. The third variant is the CAD model constructed using the software AutoCAD. The basis for the construction of the objects as solids in the CAD model was the TLS point cloud (Fig. 3 left). First, for the construction of the ground plan, a horizontal cross section was calculated in Geomagic in the point cloud, which was then imported into AutoCAD and digitised as 2D lines (Fig. 3 centre). The measurements for the construction of the fortress in AutoCAD were performed in the software Z + F LaserControl, in which some of the functions can recognise planes and corners (Fig. 3 right). However, for the construction of the fortress, a certain degree of generalisation was necessary, since the fort is not always exactly rectangular and the walls are not always completely straight, nor do they always exhibit the same thickness.
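The idea of a horizontal cross section through a point cloud can be illustrated with a short Python sketch. This is not the algorithm used by Geomagic, which is not described in this paper; it only assumes that a thin height band around a chosen level is kept and projected onto the ground plane. The function name and the parameters `z_level` and `thickness` are hypothetical.

```python
from typing import Iterable, List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def horizontal_slice(points: Iterable[Point3D],
                     z_level: float,
                     thickness: float) -> List[Point2D]:
    """Keep points whose height lies within a band around z_level.

    The band is `thickness` units tall and centred on `z_level`.
    The surviving points are projected to 2D (x, y), which gives an
    outline that could then be digitised as ground-plan lines.
    """
    if thickness < 0:
        raise ValueError("thickness must be non-negative")
    half = thickness / 2.0
    return [(x, y) for x, y, z in points if abs(z - z_level) <= half]
```

In practice the band thickness would have to be chosen relative to the point spacing of the scan: too thin a band yields gaps in the outline, too thick a band blurs walls that are not perfectly vertical.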

Development of the VR Application
For the development of the VR application, the models of the fort described in chapter 4 were imported as fbx files into the game engine Unreal Engine version 4.24. A game engine is a simulation environment where 2D or 3D graphics can be manipulated through code. Developed primarily for the video games industry, game engines provide ideal platforms for the creation of VR experiences for other purposes, as many of the necessary functionalities are already built in, eliminating the need to engineer these features independently. For this project, Unreal Engine was chosen for its advantage in the built-in blueprints visual coding system, which allows users to build in simple interactions and animations without any prior knowledge of C++, the programming language on which the engine is built. The general workflow for the creation and visualisation of a VR application is shown schematically in Fig. 4. However, the 3D models were not textured in the 3DS software as usual, but first in the game engine. Only the meshed 3D model from the photogrammetric data was already textured in the Metashape software.
Fig. 4 Workflow for the development of the VR application fortress Al Zubarah
The area surrounding the Al Zubarah fort consists of a flat desert landscape. Therefore, the landscape around the fort was constructed in the game engine as a simple flat plane. The environment was furnished with some shrubbery and trees at one corner of the fort. The movement and navigation of the user within the VR application is done by teleportation (Fig. 5 left and centre) controlled by the hand controllers. To limit the user's movement in the application, an invisible boundary was set around the fort, allowing the user to move within the fort itself and the immediate vicinity. To prevent the user from teleporting through the wall of the fort, a complex collision mesh was produced for each element. For movable objects such as doors, the collision mesh changes allowing the user to teleport through open doors (Fig. 5 centre and right). The doors open automatically when the user moves within a trigger box located in front of and behind the door. In addition to their own movement within the application, the user has several possibilities to interact with the application via the programmed controllers, for example the switching of various light sources that can be used in the fort to operate corresponding lamps in the interior rooms and above the pictures in the exhibition. This interaction is triggered by moving the controller to the light switch. Another interaction within the VR application is the trigger box-activated teleportation of the user in each tower via a ladder to the level above.
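The trigger-box and boundary logic described above was built with Unreal Engine blueprints; the following Python sketch is only a language-neutral illustration of the underlying containment test, not the engine's API. It assumes axis-aligned boxes, and the names `Box`, `door_should_open` and `teleport_allowed` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its minimum and maximum corners."""
    min_corner: Vec3
    max_corner: Vec3

    def contains(self, point: Vec3) -> bool:
        """Return True if the point lies inside the box or on its surface."""
        return all(lo <= value <= hi
                   for lo, value, hi in zip(self.min_corner, point, self.max_corner))


def door_should_open(user_position: Vec3, front_box: Box, back_box: Box) -> bool:
    """A door opens while the user stands in the trigger box on either side of it."""
    return front_box.contains(user_position) or back_box.contains(user_position)


def teleport_allowed(target: Vec3, boundary: Box) -> bool:
    """A teleport target is accepted only inside the boundary around the fort."""
    return boundary.contains(target)
```

A real application would additionally test the teleport target against the collision meshes of the walls, which this sketch leaves out.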
The final VR application (Fig. 6) lets the user switch between the versions in real time and allows one to explore both the cultural aspect of the digital version of the fortress and the technological aspect, i.e. the impact of different 3D data sources and workflows used for creating digital models.

VR Performance Tests and Results
To obtain meaningful results for the performance tests, five different views representing and displaying different content (including an overview of the fortress, a simple view of the wall, a view with a waving flag, and a detailed interior view) were selected for the performance analysis (Fig. 7). The views were chosen for their different amounts of materials and polygons displayed. We suppose that the different content of each view will have an influence on the performance of the real-time VR visualisation, i.e. the more different materials, textures and number of polygons, the more influence on the VR performance.
The first view shows the front of the fort with the only entrance and the Qatari flag waving atop the rectangular tower. For the second view, a section was chosen in which no changes are visible, with only floor, sky and masonry in shot. To capture the full extent of the 3D model, another view (view 3) was created showing the entire model of the fort from an oblique angle. The fourth view was placed in the middle of the inner courtyard of the fort with a view of the exhibition, while the fifth view shows a series of paintings for the exhibition on the left within the narrow corridor.
To test the influence of the hardware on the VR performance, two VR applications (CAD model vs. photogrammetry mesh) were run on three different computers. However, the computer Lab 1 (Table 1), which is connected to a VR station in the laboratory at HafenCity University Hamburg and allows the user to experience the application in a virtual environment, was mainly used for all other tests. The following technical specifications of the three computers are summarised in Table 1: CPU (Central Processing Unit), GPU (Graphics Processing Unit), and Random-Access Memory (RAM). Each Nvidia graphics card used has eight gigabytes of Graphics Double Data Rate (GDDR) memory. The results of the VR performance tests with the three different computers using an application with a CAD model and a meshed model from SFM photogrammetry are illustrated in Fig. 8 (see dots as measured values). Due to the reduced amount of data of the CAD model compared to the meshed models, all three computers obtain more than 100 fps for the VR application using the five different perspective views, more than sufficient to run the VR application. The slight differences in the performance might be caused by the different CPUs. With the significantly higher data volume of the photogrammetric meshed model, however, the high-end graphics card (Nvidia RTX 2060 Super) of the Home computer pays off, as the fps achieved here are still over 100. In contrast, the large data volume (10 million triangles) reduces the performance of the two laboratory computers (Lab 1 and Lab 2), seen in the reading of only 50 fps in each case using the four views. View 5 was not available in the meshed model generated from SFM photogrammetry.
Fig. 7 Overview of five different test scenes including their perspective views
The second part of the tests consisted of a performance comparison of the different VR models using the same computer hardware (Lab 1). To eliminate as far as possible the influence of other factors, the fps measurements were carried out without textures and before calculating the lighting of the scenes. The results of these performance tests are summarised in Fig. 9, illustrated as dots. The VR applications with the CAD model and the TLS mesh with 1 million triangles run at more than 120 fps, while the other meshed models performed significantly worse. However, there are two interesting aspects in Fig. 9: (1) the TLS mesh with 1 million triangles obtains a higher fps rate than the CAD model and (2) the photogrammetric meshed model (10 million triangles) from Metashape performs slightly better than the TLS mesh with 10 million triangles. The reasons for both results are not clear. Moreover, the complexity of view 3 reduces the performance of the VR application, even when only a minor data volume is used. Figure 10 shows the influence of the amount of data (meshed TLS model in millions of triangles) on the fps rate for the views 1 and 3. For this analysis, the amount of data was artificially increased in the VR application in order to clearly visualise the effect of data growth in views 1 and 3. Thus, it could be demonstrated that there is currently a limit on the data volume of approx. 5 million triangles in the game engine to guarantee a smooth real-time visualisation of the VR application with the computer hardware and settings used.
Fig. 9 Frames per second (fps) as dots for the different 3D models in the VR application
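The way such a data-volume limit can be read off from a series of fps measurements is sketched below in Python. This is an assumed procedure for illustration, not the evaluation code of this study: the function name is hypothetical, the input is a mapping from model size (in millions of triangles) to measured fps, and no measured values from the paper are reproduced here.

```python
from typing import Dict, Optional


def largest_size_meeting_target(measurements: Dict[float, float],
                                target_fps: float = 90.0) -> Optional[float]:
    """Return the largest model size that still reaches the target frame rate.

    `measurements` maps model size (e.g. millions of triangles) to measured fps.
    Sizes are checked in ascending order and the search stops at the first size
    that falls below the target, so a later outlier above the target is ignored.
    Returns None if even the smallest model misses the target.
    """
    best: Optional[float] = None
    for size in sorted(measurements):
        if measurements[size] < target_fps:
            break
        best = size
    return best
```

Stopping at the first failing size reflects the assumption that performance does not recover as the mesh grows; the result is only valid for the hardware, view and engine settings under which the measurements were taken.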

Conclusions and Outlook
In this contribution, the development of a VR application of the Arab fortress Al Zubarah in the game engine Unreal Engine 4 is presented. Furthermore, the performance of the VR application was investigated on the basis of various criteria, with a benchmark of approximately 90 frames per second for optimal real-time visualisation in the HMD. For this test, five different 3D models were generated and integrated in a VR application: three meshed 3D models in different resolutions (1, 5 and 10 million triangles) based on terrestrial laser scanning data, one meshed 3D model obtained from photogrammetry and an optimised CAD model constructed using the laser scanning point cloud. The following variables were investigated in the performance tests: computer hardware, five different 3D models, and different TLS meshed models with increasing number of triangles. The performance investigations in the game engine Unreal Engine 4 showed that it is always important to find a compromise between the amount of data and the available computer hardware, in order to guarantee a smooth real-time visualisation with more than 90 fps. Therefore, it is an advantage to use CAD models instead of meshed models due to the reduced data volume. However, if a meshed model is used, the upper limit is 5 million triangles for an optimised real-time VR visualisation with the hardware, software and settings used. The higher the resolution of the meshed models, the lower the performance of the VR applications in terms of frames per second. This effect can be somewhat reduced by using hardware with a high-performance graphics card and more Random-Access Memory. Furthermore, it can be stated that the different content of each generated perspective view of the fortress has a significant influence on the performance of the real-time VR visualisation.
It should be noted that there are various other elements that can strongly influence the performance of a VR application, such as shadow quality and dynamic shadows, subdivision of a few large 3D models into many smaller ones, level of detail (LOD), light settings, environment settings, rendering pipeline of the chosen game engine, etc. However, this contribution focusses on the Unreal Engine and on comparison with meshes derived from point cloud data. The usual meshed point cloud is one single big mesh, which is unusual for real-time applications and comes with its own set of performance challenges, e.g. the large data volume of a high-resolution meshed point cloud. Optimisation methods for a similar application are presented, e.g. by Lütjens et al. (2019), who showed that large, high-resolution terrain datasets can be visualised in a virtual reality application by incorporating tiling, level streaming, and LOD algorithms. In the future, the performance of the computer hardware and the game engines is expected to increase significantly, so that the maximum data volume used for the VR application can be increased.
Funding Open Access funding enabled and organized by Projekt DEAL. This research has no funding.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Fig. 10 Frames per second (fps) as dots for the different TLS meshed models used in VR applications