1 Introduction

Virtual reality and augmented reality techniques are constantly evolving. Thanks to increasingly high-performance hardware, they make it possible to achieve results and levels of realism unthinkable just a few years ago. The increase in the computational power of GPUs makes it possible to create highly complex three-dimensional models, rich in vertices and polygons and with very high-resolution textures. As a result, virtual environments can immerse the user in photo-realistic worlds that can even trigger the suspension of disbelief: a psychological state in which the user stops considering the virtual world a fake and detached environment and instead treats it as an alternative reality that engages him or her psychologically. In the past two years, the whole world has been affected by a pandemic whose scale and intensity have caused endless pain and problems in many sections of the population. However, one positive effect can be ascribed to the pandemic: it has pushed all segments of the world’s population to make an impressive technological leap. This leap, combined with long periods of forced confinement and travel bans, has highlighted the importance of digitising cultural goods and works of art so that they can be enjoyed, and even studied, remotely.

This paper aims to identify a series of methodologies for creating virtual worlds through which art treasures can be made accessible to a vast population in a simple and relatively inexpensive manner. To this end, we present two use cases realised with two of the most successful approaches, based on modern virtual and augmented reality techniques. The first use case was realised through photogrammetry: the “Fontana Maggiore” in Perugia (PG, Italy), one of the main monuments in the city’s historic centre, was reconstructed. Photogrammetry made it possible, through appropriately taken photographs, to reconstruct the fountain in a highly reliable manner without manually modelling its polygonal shapes. We preferred relying solely on the results obtained by the neural networks and algorithms that analyse the images to produce the three-dimensional model. Photogrammetry yields very high-quality results that could hardly be achieved by producing the models and textures manually. Its limitation is the high number of vertices required for a faithful reconstruction, which restricts the visualisation of the artefact on low-performance hardware.

The second use case follows the manual modelling approach and concerns the reconstruction of the Republic square in Foligno (PG, Italy) and the surrounding areas, where the buildings and architectural works were reproduced. These environments were built using two popular 3D software packages. The first is Blender (Footnote 1), open-source software for the three-dimensional modelling of objects and the faithful representation of environments. Its modelling tools are of particular importance: they offer many functions that greatly facilitate the user’s work.

The second is Unity (Footnote 2), which is frequently used to build three-dimensional environments and interactive synthetic scenarios, in particular video games and serious games.

The article is structured as follows.

Section 2 provides an overview of recent studies conducted by the scientific community on the digitisation of cultural heritage and the implementation of virtual environments that enable the remote visitation of sites.

Section 3 describes the state-of-the-art techniques that are used for the realisation of materials and three-dimensional polygonal models with the aim of creating synthetic, interactive and navigable worlds for the visitor.

Section 4 illustrates the use of the technologies discussed and describes the most relevant aspects of our approach, so that this easily realisable, low-cost methodology can be replicated in other contexts. The discussion takes its cue from how the two use cases were realised, illustrating the salient steps and methodologies adopted.

Section 5 summarises the important aspects of our approach and provides some insights into possible future developments of this work.

2 Related Works

The reproduction of images through computer graphics and synthetic reconstruction has always involved a compromise between the quality of the result and processing performance [1, 2]. To be considered credible from the point of view of realism, a scene taken from a video or an image needs a particularly advanced level of detail, which inevitably requires complex underlying numerical processing [3]. Indeed, every image must be divided into numerous small adjacent polygons, and the quality and fidelity of the reproduction grow with their number [4,5,6]. The advent of higher-performance hardware such as GPUs has allowed a significant leap forward in the main sectors of computer graphics and virtual reality [7,8,9]. More recently, artificial intelligence, and more specifically convolutional neural networks, has made it possible to further raise the level of realism of images and video streams [10,11,12].

The goal becomes particularly complex when one wants to create 3D models of historical and archaeological items and monuments in their existing condition. A more sophisticated approach, capable of capturing and digitally modelling the precise geometry and morphological features of these places, is then required [13]. For years, close-range photogrammetry has been proposed for cultural heritage documentation and has worked very well. Owing to recent advances in computer and information technology, this conventional approach has been supplanted by digital close-range photogrammetry. This technology opens up new possibilities such as landscape laser scanning, automatic orientation and measurement operations, 3D vector data production, digital ortho-image generation, and digital surface model development [14]. Although many methodologies and sensors are now available, the correct strategy for obtaining a better and more realistic 3D model with the necessary level of detail is still a blend of multiple techniques and models, since no single approach can provide consistent results in all scenarios [15]. It is precisely this mix that enables a high-quality graphic reproduction and the possibility of manipulating the image to highlight and better understand its constituent details. The virtual environments present in the images can then be rotated and resized; one can enter them and explore them in their spatial depth, appreciating the aesthetic elements [16, 17]. Furthermore, each environment can be enriched with information of different kinds (data and metadata), such as texts and audio, useful for better describing the object under examination [18, 19].

This outstanding potential to create a diverse cognitive experience in humans is precisely what makes these modern technologies an effective way of communicating ideas and culture; they already pervade numerous domains ranging from education [20] to healthcare [21, 22], from recreation and entertainment [23] to tourism [24].

The growing interest in these technologies in tourism and cultural promotion is evident; by their nature, they make humankind’s immense cultural heritage better known, more valued and accessible to a vast public, even remotely.

3 Materials and Methods

This section describes the techniques used to create the materials and polygonal models with which the three-dimensional scenarios were realised.

3.1 Creating the Shapes with Blender

Blender is a cross-platform 3D modelling programme, i.e. it runs on various operating systems such as Linux, Windows and macOS, so developers can work independently of the computer being used. Blender is released under the GNU General Public License (GPL) and offers numerous creative functions. These include modelling three-dimensional objects, creating animations, rendering complex scenes and exporting the result as PNG or JPEG images. The latest versions of the programme have considerably improved the user interface, making it easier to learn and usable even by people with little experience of advanced software. Polygon models made with Blender can also be exported as OBJ or FBX files. These formats are highly compatible and can be used by many other complementary programmes, such as Unity.

Starting Modeling from Google Maps or Other Images  

Reproducing real-world objects can be challenging. In particular, it is not trivial to reproduce accurately scaled shapes and dimensions within computer graphics software. Specific techniques can help the developer maintain a high fidelity of the shapes realised. The technique we recommend is the use of reference images. First of all, a photograph must be taken of the object to be modelled, with the camera perpendicular to the object. Let us suppose that we want to acquire the frontal image of a building. Images of this type are sometimes difficult to obtain for buildings in an urban context, as there may be no suitable points from which to take them. For this reason, we recommend using a drone equipped with a remote-controlled camera. Once the drone is positioned perpendicular to the façade of the building, the shot can be taken. Particular attention must be paid to the focal length of the camera, as focal lengths that are too short, e.g. less than 50 mm, may cause perspective distortion that is difficult to correct. To limit this problem, it can be helpful to have the drone move within an imaginary grid positioned along the façade of the building, taking a photograph after each move of a predefined number of metres, as in the example shown in Fig. 1. This operation produces a collage of photographs which, when joined together, provides a faithful reproduction of the building façade without the problem of perspective distortion.

Fig. 1. Sample virtual grid applied to a building.
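The grid-based acquisition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `facade_grid`, the façade dimensions, the step and the margin are all hypothetical values that do not come from the surveys described in this paper, and the drone is assumed to stay at a constant distance from the wall.

```python
from math import ceil


def facade_grid(width_m, height_m, step_m, margin_m=0.0):
    """Return (x, z) camera positions, in metres, on a regular grid over a facade.

    x runs along the facade and z is the height above the ground. The rows are
    visited in serpentine order so that the drone never flies back empty.
    All parameters are illustrative assumptions, not measured values.
    """
    if step_m <= 0:
        raise ValueError("step_m must be positive")
    cols = ceil((width_m - 2 * margin_m) / step_m) + 1
    rows = ceil((height_m - 2 * margin_m) / step_m) + 1
    waypoints = []
    for r in range(rows):
        # Clamp the last row and column so that they stay inside the margins.
        z = min(margin_m + r * step_m, height_m - margin_m)
        xs = [min(margin_m + c * step_m, width_m - margin_m) for c in range(cols)]
        if r % 2 == 1:
            xs.reverse()  # serpentine path: odd rows go right to left
        waypoints.extend((x, z) for x in xs)
    return waypoints


# Hypothetical 20 m x 12 m facade, one shot every 4 m, 2 m margin on each side.
grid = facade_grid(width_m=20.0, height_m=12.0, step_m=4.0, margin_m=2.0)
for x, z in grid:
    print(f"x = {x:4.1f} m, z = {z:4.1f} m")
```

Each waypoint corresponds to one cell of the virtual grid; the step should be chosen together with the focal length so that adjacent shots overlap enough to be joined.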

3.2 Procedural Textures

The realisation of realistic three-dimensional objects involves, among other things, the creation of textures, i.e. images that are applied to the object so that it resembles its real-world counterpart. A properly realised texture can simulate the materials that make up the objects in the three-dimensional scene. There are various techniques for creating realistic textures. The first involves manual creation using photo editing software (such as GIMP, Footnote 3), while the second involves the mathematical definition of the structure that the resulting texture should have. Textures created using the latter technique are called procedural textures. They are generated mathematically to simulate complex materials such as marble, wood and granite. Blender allows procedural textures to be created via a block-based visual programming environment, an example of which is shown in Fig. 2.

Fig. 2. Visual program related to a procedural texture of the wall.

The various blocks we can see in Fig. 2 are abstractions of mathematical algorithms that perform operations on an initially empty texture. When these blocks are interconnected, the output of the previous block is given as input to the next block. Thanks to the concatenation of these blocks, extremely complex textures can be obtained, such as the one shown in Fig. 3, which is the result of the procedural texture used to create the façade of a building in Perugia’s central square next to the “Fontana Maggiore”.

Fig. 3. The procedural texture output simulating the real wall.

Another advantage of procedural textures is that they are created and managed by the software itself, eliminating the need for file system references to images. They also adapt to the object they are applied to and show no visible discontinuities. If such a texture is applied to an object, such as a building, changing the size of the building’s façade does not degrade the texture’s appearance: the software automatically places the texture on the façade and generates the missing parts. A procedural texture has no fixed resolution and adapts dynamically to the objects to which it is applied. Finally, it should be noted that creating a procedural texture is a time-consuming process that can occupy the CPU for several minutes.
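To make the idea of chained blocks and resolution independence concrete, the following minimal Python sketch defines a wall-like texture as a function of the (u, v) coordinates. It is not Blender’s node system: the functions `value_noise`, `brick_mask`, `mix` and `wall_texture`, as well as the colours and brick counts, are hypothetical choices made only for illustration.

```python
import math


def value_noise(u, v, cells=8, seed=0):
    """Smooth pseudo-random value in [0, 1] for texture coordinates (u, v)."""
    def lattice(i, j):
        # Deterministic integer hash of a lattice point, mapped to [0, 1].
        h = (i * 374761393 + j * 668265263 + seed * 144665) & 0xFFFFFFFF
        h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
        return (h ^ (h >> 16)) / 0xFFFFFFFF

    x, y = u * cells, v * cells
    i, j = math.floor(x), math.floor(y)
    fx, fy = x - i, y - j
    sx, sy = fx * fx * (3 - 2 * fx), fy * fy * (3 - 2 * fy)  # smoothstep weights
    top = lattice(i, j) * (1 - sx) + lattice(i + 1, j) * sx
    bottom = lattice(i, j + 1) * (1 - sx) + lattice(i + 1, j + 1) * sx
    return top * (1 - sy) + bottom * sy


def brick_mask(u, v, rows=6, cols=4, mortar=0.06):
    """1.0 inside a brick, 0.0 on mortar lines; odd rows shift by half a brick."""
    row = math.floor(v * rows)
    x = u * cols + (0.5 if row % 2 else 0.0)
    fx, fy = x - math.floor(x), v * rows - row
    return 0.0 if (fx < mortar or fy < mortar) else 1.0


def mix(a, b, t):
    """Linear blend between two RGB colours."""
    return tuple(ca + (cb - ca) * t for ca, cb in zip(a, b))


def wall_texture(u, v):
    """Chain the blocks: noise drives the brick tint, the mask selects the mortar."""
    brick = mix((0.55, 0.25, 0.18), (0.70, 0.38, 0.28), value_noise(u, v))
    return mix((0.80, 0.78, 0.72), brick, brick_mask(u, v))


def render(width, height):
    """Sample the same texture function at any resolution."""
    return [[wall_texture((x + 0.5) / width, (y + 0.5) / height)
             for x in range(width)] for y in range(height)]


small = render(16, 16)   # coarse preview
large = render(64, 64)   # same texture, four times the resolution per side
```

Because the texture is a function rather than a stored image, `render` can be called with any width and height without loss of quality, which is the property described above; the cost is that every pixel must be recomputed.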

3.3 Creating Relevant Details

The manual modelling of complex polygonal details may not be easy, and the final result may not be satisfactory. In this case, we recommend photogrammetry, which achieves a very high level of realism by delegating the realisation of the most complex parts of 3D models to artificial intelligence techniques. To use photogrammetry software, one first has to create an image dataset of the object to be reproduced. The dataset must consist of a sufficiently large number of images, as all parts and facets of the object must be photographed. Most of today’s cameras have an automatic mode capable of setting photographic parameters that are adequate in most cases.

However, knowing how to set the correct parameters in manual mode makes a significant difference to the quality and efficiency of the photogrammetric survey. All photographs in the dataset should be taken at the same focal length and with illumination levels as similar as possible.

In photogrammetry, similarly to human vision, if an object is captured in at least two images taken from different viewpoints, the different positions of the object in the images allow stereoscopic views to be obtained and three-dimensional information to be derived from the overlapping areas. To obtain a very detailed three-dimensional model, a large amount of overlap between the photographs is necessary; in general, approximately 70% of the details captured in one photograph should also appear in the next. Furthermore, the number of photographs required for the three-dimensional reconstruction of an object grows with its size: the larger the object, the greater the number of photographs required.
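The relation between overlap, object size and number of photographs can be illustrated with a small calculation. The sketch below assumes a simple pinhole-camera model and a straight pass in front of the object; the function names `footprint_m` and `shots_along`, the 36 mm sensor width, the 5 m shooting distance and the 20 m length are hypothetical values chosen only for the example.

```python
import math


def footprint_m(distance_m, focal_mm, sensor_mm):
    """Extent of the scene covered by one photograph (pinhole-camera model)."""
    return distance_m * sensor_mm / focal_mm


def shots_along(length_m, footprint, overlap=0.70):
    """Photographs needed to cover a straight pass with the given overlap."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if length_m <= footprint:
        return 1
    step = footprint * (1.0 - overlap)  # new ground gained by each extra shot
    return math.ceil((length_m - footprint) / step) + 1


# Assumed values: 24 mm lens, 36 mm wide sensor, camera 5 m from the object.
covered = footprint_m(distance_m=5.0, focal_mm=24.0, sensor_mm=36.0)
print(f"one photograph covers {covered:.1f} m")
print(f"shots for a 20 m pass at 70% overlap: {shots_along(20.0, covered)}")
print(f"shots for a 20 m pass with no overlap: {shots_along(20.0, covered, 0.0)}")
```

Under these assumptions, raising the overlap from 0% to 70% more than doubles the number of shots for the same pass, which is why large objects quickly require datasets of several hundred photographs.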

3.4 Creating an Interactive Environment with Unity

The creation of immersive environments is facilitated by the availability of software backed by a large community and kept in step with developers’ needs through constant updates. Unity makes it possible to build immersive environments by inserting models and assets created with Blender into its scenes. The scenes built in Unity can then be exported to the platforms on the market today, such as Windows desktop computers, consoles, and Android or iOS smartphones.

Issues on Importing Blender Shapes and Textures  

The export of models from Blender may initially present critical issues. In particular, first-time users of these programmes often find it difficult to export procedural textures from Blender to Unity. The correct procedure is to bake the texture directly onto the UV map of the object. After baking, the object uses an image applied directly onto it rather than the procedural texture. The procedural texture must therefore be understood as a mathematical model used in the development phase, which has to be stored in an image file applied to the object. Once this has been done, it is possible to export from Blender a zip file containing the three-dimensional model and the images that constitute its textures. Once the three-dimensional object has been imported into Unity, the various PNG or JPG textures must be applied manually to the corresponding components of the object in order to obtain a faithful reproduction.

4 Digitalization of Cultural Heritage Artifacts

In this section, we describe the methodologies that can be adopted to realise virtual worlds of monuments and works of art for the remote enjoyment of the objects of interest. To this end, we will first analyse the salient phases of the work that led to the virtual representation of the areas of the two Umbrian towns considered (Footnote 4).

The methodologies indicated can be reused for virtual representation and the creation of virtual journeys in other scenarios and contexts.

4.1 The Piazza IV Novembre

The “Piazza IV Novembre” is located in the centre of the city of Perugia and features several historical buildings such as the “Palazzo dei Priori” and the “Fontana Maggiore”.

The “Fontana Maggiore”  

The “Fontana Maggiore” in Perugia has three levels, is polygonal in shape and was built in the thirteenth century using alternating pink and white marble. The lowest level is set on a flight of steps from which rises a twenty-five-sided basin. Each of the twenty-five sides forms a diptych, i.e. two images joined by a central link, adorned with sculpted reliefs and small, slender columns. The twenty-five sides describe the 12 months of the year, each accompanied by its zodiac symbol. Each month is associated with moments of daily life and agricultural work. Thus, the bas-relief on each side is different from the others. The photogrammetric survey of the fountain required photographs taken at different heights, circling the fountain at each iteration so as to capture it from all sides. An example is shown in Fig. 4.

Fig. 4. Images taken around the “Fontana Maggiore”.

A total of 564 photographs were collected. All photographs were taken with the same parameters: ISO sensitivity 500, focal length 24 mm, exposure time 1/320 s and aperture f/9. Since the location in which the photographs are taken is open to the public, care must be taken to avoid people appearing in the shots. Shadows of animals such as birds, dogs or cats should also be avoided as much as possible.

Palazzo dei Priori  

Next, photographs were taken of the “Palazzo dei Priori”, located opposite the fountain. The palace is characterised by a staircase that connects the main door with the floor of the square. There are several arches supported by columns in the lower part of the palace, while several windows are arranged in the highest part of the façade. There are also aesthetic embellishments along the walls, signs of the wear and tear of time, and statues. The dataset, in this case, consists of 626 photographs, and the parameters used are the same as those used for the fountain dataset. The point cloud of the building is shown in Fig. 5.

Fig. 5. Images taken for reproducing the “Palazzo dei Priori”.

As can be seen, the photographs were taken at different heights, moving from left to right and trying to capture as much detail as possible of the building.

Photogrammetry  

The dataset is then processed with the Meshroom photogrammetry software (Footnote 5). Meshroom is released under an open-source licence and allows the reconstruction of three-dimensional models from photographic datasets. Once the dataset has been imported into Meshroom, it is necessary to define the pipeline, i.e. the set of operations that the programme must perform to generate the object’s three-dimensional model. The pipeline used is as follows: Camera Init, Feature Extraction, Image Matching, Feature Matching, Structure from Motion, Prepare Dense Scene, Depth Map, Depth Map Filter, Meshing, Mesh Filtering, Texturing. The fountain model that was produced consists of approximately 2.9 million vertices and approximately 5.9 million faces. The model of the “Palazzo dei Priori” consists of approximately 2.1 million vertices and approximately 4.3 million faces. The resulting models are then imported into Blender and placed within a scenario. The square floor is then created using a procedural texture, and several adjacent buildings are modelled to make the scene more immersive. Figure 6 shows a view of the “Piazza IV Novembre” rendered within the virtual scenario, in which the quality of the final result can be appreciated.

Fig. 6. Final result of the virtual representation of “Piazza IV Novembre”, Perugia (PG, Italy).
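The vertex and face counts reported for the two photogrammetric models give a rough idea of why such meshes are demanding for low-performance hardware. The following Python sketch estimates the size of the uncompressed geometry; it assumes triangular faces, 32 bytes per vertex (position, normal and UV coordinates stored as 32-bit floats) and 32-bit indices. These storage assumptions and the function name `mesh_memory_mb` are ours and are not stated by the tools used; textures are not included.

```python
def mesh_memory_mb(vertices, faces, bytes_per_vertex=32, bytes_per_index=4):
    """Rough size of an uncompressed indexed triangle mesh, in MB (10**6 bytes).

    Assumed layout: 12 bytes position + 12 bytes normal + 8 bytes UV per vertex,
    and three 32-bit indices per triangular face.
    """
    vertex_bytes = vertices * bytes_per_vertex
    index_bytes = faces * 3 * bytes_per_index
    return (vertex_bytes + index_bytes) / 1e6


# Approximate counts reported in the text for the two models.
models = {
    "Fontana Maggiore": (2_900_000, 5_900_000),
    "Palazzo dei Priori": (2_100_000, 4_300_000),
}

for name, (v, f) in models.items():
    print(f"{name}: about {mesh_memory_mb(v, f):.0f} MB of geometry, "
          f"{f / v:.2f} faces per vertex")
```

Under these assumptions the geometry alone amounts to roughly 164 MB for the fountain and 119 MB for the palace. The ratio of about two faces per vertex is what Euler’s formula predicts for a closed triangle mesh (F = 2V - 4), which is consistent with the reported figures.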

4.2 The Republic Square, Foligno

The Republic square in Foligno was reconstructed by manual 3D modelling with Blender. There are various buildings within the square, among them the “Palazzo Trinci”, the Town Hall, the Cathedral of San Feliciano and the “Palazzo delle Canoniche”. As in the reconstruction of “Piazza IV Novembre” in Perugia, preliminary work aimed at collecting images was crucial in this case too. Images are necessary for reconstructing rooms and buildings because they provide details and proportions and serve as a reference to be traced during modelling.

The “Palazzo Trinci”  

On the eastern side of the square is the “Palazzo Trinci”, formerly the residence of the Foligno seigniory and now home to the City Museum. Characteristic elements of “Palazzo Trinci” are the six large Corinthian-style columns resting on the central loggia of the palace façade, the windows, which have two distinct architectural forms, the ‘curbs’ under each row of windows, the row of small parallelepipeds positioned below the roof and the bricks that decorate the lower part of the building. Figure 7 shows the virtually reconstructed “Palazzo Trinci”. The picture was taken with a focal length of 35 mm.

Fig. 7. Virtual representation of “Palazzo Trinci” located in Republic square, Foligno (PG, Italy).

The Town Hall  

The Town Hall is set on a medieval tower called the “Pucciarotto”, which serves as a bell tower. The elements that characterise this building are described below. The façade is punctuated by six Ionic-style columns resting on a ledge from which five round arches open. The windows, of four different shapes, have a frame very similar to that of the “Palazzo Trinci”, which is why the same model has been reused. A characteristic element of the building is undoubtedly the tower, built in the thirteenth century, which collapsed following the 1997 earthquake and was then restored in 2007. The structure of the turret ends with an umbrella-shaped dome. The virtually reconstructed Town Hall is shown in Fig. 8. The perspective distortion seen in the image is due to the focal length used: the picture was taken with a focal length of 18 mm, which was necessary to fit the whole building into one shot.

Fig. 8. Virtual representation of the Town Hall located in Piazza della Repubblica, Foligno (PG, Italy).

The Cathedral of San Feliciano  

The Cathedral of San Feliciano, also called the Duomo, is the most important building in Republic square from both an artistic and a cultural point of view. Two façades face the square: the side one and the main one, which more precisely overlooks Largo Carducci. The main façade of the cathedral, restored in 1904, has three doors in the lower part, which were completely rebuilt during the restoration work. On the axis of the side portals, there are two mullioned windows surmounted by a rhombus and a circle, and a loggia articulated into eight small arches. In the second order, the façade has a central rose window, while the third level, introduced with the restoration of the sixteenth century, contains a mosaic depicting the Redeemer enthroned, Saints Feliciano and Messalina and a kneeling Pope Leo XIII. The minor façade of the Duomo presents, in the lower part, a multi-ringed portal adorned with a series of decorative motifs. In the upper order there is a large loggia with six openings, above which the two small side rosettes and the larger central one stand out. The main façade has a single rose window consisting of two rows of columns arranged radially from the central core and connected by small arches. The side façade, on the other hand, is characterised by three rose windows, a larger central one and two smaller lateral ones, each formed by a single row of columns. Three mullioned windows characterise the minor façade of the cathedral; they were modelled starting from the outer frame and then moving on to the internal details. On both sides of the church there are numerous highly detailed bas-reliefs and sculptures, many of which were not modelled but simulated through procedural textures. The Cathedral also features a bell tower and a large dome. The virtually reconstructed Cathedral of San Feliciano is shown in Fig. 9. The picture was taken with a focal length of 35 mm.

Fig. 9. Virtual representation of the Cathedral of San Feliciano (left side) and “Palazzo delle Canoniche” (right side) located in Piazza della Repubblica, Foligno (PG, Italy).

The “Palazzo delle Canoniche”  

The “Palazzo delle Canoniche” is built between the nave and the left arm of the main façade of the cathedral. It houses the Capitular and Diocesan Museum of Foligno. The aesthetic aspects that characterise this building are the arches that run along the base, the very elongated windows, and the roof, with various parallelepiped-shaped elements that run along its perimeter. The virtually reconstructed “Palazzo delle Canoniche” is shown in Fig. 10. The picture was taken with a focal length of 30 mm.

Fig. 10. Virtual representation of “Palazzo delle Canoniche” (left side) and the Cathedral of San Feliciano (right side), Piazza della Repubblica, Foligno (PG, Italy).

5 Conclusions and Future Works

We presented some guidelines for digitally reconstructing monuments and artefacts, which enable public administrations, museums, and organisations in charge of promoting art treasures to realise virtual exhibitions and to digitise such important cultural heritage. The theme is highly relevant in Europe, and particularly in Italy, where it is part of the National Recovery and Resilience Plan (NRRP).

Two virtual environments were created as case studies, following two different work paths. In the first case, photogrammetry was used, thanks to which the Palazzo dei Priori and the Fontana Maggiore in Perugia were reconstructed. The objects obtained with photogrammetry were then arranged in an interactive Unity virtual environment. In the second case, manual three-dimensional modelling of the buildings in the Piazza della Repubblica of Foligno was carried out. In this second case, the virtual scenario can be freely explored through virtual reality viewers or on common desktop computers, interacting with the environment via mouse and keyboard input.

Photogrammetry can produce very detailed three-dimensional models that immerse the user within the scenario.

As possible future developments, we will investigate how to use these techniques in the Metaverse launched by Meta, and how to make virtual visits to museums and cultural places, explored through virtual reality and augmented reality techniques, more and more engaging and effective. The aim is to allow even very distant people to view works of art and the architecture of historic buildings, and to appreciate the beauty of our cities from a distance.

Acronyms

The following acronyms are used in this manuscript:

AR: Augmented Reality

CPU: Central Processing Unit

GPL: General Public License

GPU: Graphics Processing Unit

UV: The u, v graphic coordinates

VR: Virtual Reality