1 Introduction

Fig. 1
figure 1

Extent of damage at Kumamoto Castle: a Main castle tower, viewed from the same position before and after the earthquake. b Sections from which stones were dislodged and fell. c Map of all structures that form part of Kumamoto Castle. d Restoration procedures. The two Steps 4 and 5 enclosed by the red frames were the focus of this study

Kumamoto Castle, a popular traditional asset in Japan, was built in the 17th century by the feudal lord Kiyomasa Kato. The site on which the castle is located measures 980,000 \(\text {m}^2\). The castle consists of two castle towers, 49 turrets, and 29 gates. These buildings are constructed of wood and are erected on stone retaining walls, which is hereafter referred to as “ishigakis,” as shown in Fig. 1a. In 1877, both castle towers and many turrets were destroyed by fire during the Seinan war O’Grady (2010). Although the destroyed buildings were reconstructed during the 20th century, the ishigakis was in the same condition as when it was built in the 17th century, and it has high cultural value.

1.1 Kumamoto Earthquake

On April 14 and 16, 2016, two strong earthquakes with a magnitude of 7.3, struck Kumamoto region, Japan. Many people were injured, and buildings and infrastructure were significantly damaged. The reconstruction of Kumamoto has been underway, supported by national and local public entities and other supporting organizations.

Many traditional cultural assets in Kumamoto have been damaged or destroyed. For example, among the structures that form part of Kumamoto Castle, many stone bridges and Aso Shrine were extensively damaged. The castle is a particularly important asset for the citizens of Kumamoto because it is a symbol of the city and a source of tourism revenue. The total restoration cost of Kumamoto Castle is estimated at USD 0.6 billion, and the cost of restoring the ishigakis is estimated to be USD 0.4 billion. Furthermore, the time required for complete restoration is estimated to be more than 20 years; therefore, it is essential to shorten the period.

An important task that forms a part of the restoration work is to identify the original positions of the stones before the ishigaki collapse because it is impossible to build a superstructure without a base. However, the identification of stones is a difficult task similar to building a large jigsaw puzzle with pieces that have a 3D shape, and some stones are chipped or fragmented, and the shapes are incomplete. In this study, we attempted to identify the original positions of stones that rolled down and away from the ishigaki. In addition, we aimed to develop a method based on computer vision and optimization technologies to accomplish the task as efficiently as possible.

Fig. 2
figure 2

Flowchart of our ishigaki retrieval system. A and B 3D-shape of the pre/post-collapse stones constructed from the images using computer vision techniques. C The shapes obtained before/after the collapse are matched, and correspondence candidates are proposed to the user. D Complete table of the stones that correspond

1.2 Shape and Size of the Stones

The stones are composed of andesite, ranging in size from 20 cm to 2 m, and in weight from 20 kg to over 1000 kg. Unlike the ishigakis of Osaka Castle, the shapes of the stones of Kumamoto Castle are not uniform with a fixed form, and each stone has a unique shape. However, some stones were similar in shape. Stones are often rectangular or trapezoidal with long sides.

1.3 Extent of Damage and Restoration Procedures

Figure 1b shows major examples of the collapsed ishigakis that needed to be restored: Kita-jyuhakken-yagura and Iidamaru (turrets), Hohoate-gomon and Akazu-no-mon (gates). Figure 1c shows a map of the entire site comprising Kumamoto Castle. In total, 973 ishigakis exist on the site, of which 517 ishigakis were damaged by the earthquake (a total area of 23,600 \(\text {m}^2\)). Thus, 29.9% of the total number of ishigakis were damaged, and these collapsed ishigakis were composed of more than 70,000 stones.

Given that each wooden building is supported by an ishigaki, which forms the basis, the ishigaki must first be restored. Furthermore, to avoid compromising the cultural value of the structures, the fallen stones need to be returned to the same position they occupied before the collapse. However, this incurs enormous cost and requires a long restoration period because of the large number of heavy stones weighing over 100 kg. These stones were carefully lifted by crane and relocated to temporary storage spaces by truck.

The procedures to restore the ishigakis, as defined by Kumamoto Prefecture, consist of the following six steps, as shown in Fig. 1d:

  • Step. 1. Status Check of Ishigakis Checking the extent of damage to each ishigaki. The number of stones used for restoration was estimated, and the difficulty of restoration was ranked.

  • Step 2. Recording This process entails precisely counting the number of fallen stones, recording videos, and capturing photographic images of the collapsed situation. Furthermore, the detailed global position (XYZ) of each stone was measured using GPS and electro-optical distance measuring (EDM) instruments. GPS data play an important role in identifying the original position (described in Sect. 4.4).

  • Step 3. Move to Storage Spaces After recording the data, the fallen stones were numbered and moved to storage spaces by crane and truck.

  • Step 4. Measurement The shape, actual size, and surface scratches of each fallen stone were measured and recorded.

  • Step 5. Create the Correspondence Table This process identifies the original position of each fallen stone. After identification, a correspondence table was created.

  • Step 6. Begin the Restoration The correspondence table forms the basis of the actual restoration, and the table is used for planning purposes.

This study focuses on Step 4: Measurement and Step 5: Create the Correspondence Table. Our contribution was to provide tools for the staff and masons of Kumamoto Prefecture to measure the shapes of the stones before and after the collapse and to establish the before and after correspondence of stones.

Hereafter, we refer to the “stone before and after the collapse” respectively as the “pre-collapse stone” and the “post-collapse stone” or “fallen stone,” for simplicity.

1.4 Overview of Our System

Figure 2 shows a flowchart of the overall system we developed. The system consists of the following four major parts:

  • A. Construction of a database of pre-collapse stones.

  • B. 3D-measurement of post-collapse stones.

  • C. Matching the stones between A and B above.

  • D. Complete the correspondence table based on the matching results.

We describe each part in detail in section “Related Work” and in later sections.

On the basis of the correspondence table created in part D, a restoration plan was devised and the restoration commenced. In this study, we collated the ishigaki of Iidamaru, which was scheduled for construction at the right time, and provided it to Kumamoto City. The results are discussed in section “Experimental Results”.

1.5 Challenge and Our Contributions

A limited number of photographic images taken before the collapse needed to be collated with those captured after the collapse. This required matching under asymmetric conditions. Image matching for stones is extremely difficult because of their similar shapes and textures. Initially, we tried to use matching methods that take local features into consideration, e.g., SIFT Lowe (2004) and AKAZE Alcantarilla et al. (2013), but these methods were ineffective because of the similar textures of the stones. After some consideration, we established that a comparison of the contour shapes was an effective approach for matching. Incidentally, the measurement was assumed to be highly accurate.

In addition, we needed to address the problem of image distortion owing to the unique slope of the ishigaki (which is known as “Musha-gaeshi”), and we had to measure a large number of fallen stones efficiently at low cost. Considering the limited budget and time constraint, we ruled out the use of expensive 3D scanners that would have provided extremely accurate 3D measurements.

The difference in the quality of the data on the pre- and post-collapse stones also poses a problem. Only a limited number of photographs of the pre-collapse stones are available, which are not sufficient to reconstruct the detailed 3D shape of their surfaces. Instead, 2D shape data on pre-collapse stones must be utilized . In contrast, the 3D shape of the post-collapse stones can be measured in detail by via close-up photography. As shown in Fig. 2A, B, the quality of the data differed between pre- and post-collapse stones, and it is not easy to match them.

Furthermore, the pre-collapse and post-collapse stones needed to be mapped on a one-to-one basis, and it was necessary to solve the problem similar to solving a jigsaw puzzle Ando et al. (2019).

Our contributions in terms of addressing these challenges are summarized as follows:

  • We propose an improved ICP algorithm to match asymmetrical 2D and 3D shapes. The proposed approach projects 3D shapes to 2D. Simultaneously, the stones are matched to each other so that they can be deformed to the same orientation even if the image were captured from different angles and aspect ratios.

  • We established a method for image measurement and contour matching of stones using computer vision. We also formulated and solved the one-to-one matching problem as an assignment problem as used in operations research.

  • We made the system we developed available to Kumamoto City, and used the identification results as the blueprint for the actual restoration work of the ishigaki of Iidamaru. Our method succeeded in identifying many stones that were previously indistinguishable using manual inspection, i.e., human eye. This project is a successful model case of industry-academia-government projects involving universities, companies, and government agencies.

  • The data set used in this study will be made public and can be used for future studies of cultural properties.

2 Related Work

In recent years, the number of fires, earthquakes, and the amount of age-related damage to buildings of cultural value, not limited to Kumamoto Castle, has become a serious problem worldwide. Many studies have been devoted to the reconstruction and restoration of cultural properties using techniques based on information technology such as image processing and pattern recognition. A research project named “e-Heritage,” which aims to digitally preserve cultural assets and restore damaged tourism resources by restoring them as 3D models using computer vision technology, is currently highly popular.

2.1 Cases of Restoring Cultural Assets

Frauenkirche Dresden

The Church of Our Lady in Dresden in eastern Germany is a famous heritage site constructed of stone. The church was bombed during World War II and was completely destroyed. The rubble was collected, after which the fragments were identified manually by hundreds of architects and the appearance was redesigned using computer graphics Pohle and Jager (2003); Collins et al. (1993).

Great Buddha Project

Ikeuchi et al. (2007) launched the “Great Buddha Project” to construct a detailed digital archive of cultural properties based on data measured using a laser scanner. They proposed a digital preservation technique using modern computer vision technology to prevent an extremely large Buddha statue, which is cultural property in Japan, from deteriorating, and recorded the 3D shape of the Buddha in detail. They succeeded in constructing a large-scale 3D model by combining multiple sets of partial 3D point clouds by laser scanning and using the iterative closest point (ICP) algorithm.

Notre-Dame Cathedral

Notre-Dame Cathedral is a famous historical cathedral in France and was built in the 13th century. In 2019, and the roof and frame were destroyed by fire.

Fortunately, a 3D-digital archive thereof had already been created. Tallon (2014) and Sandron and Tallon (2020) measured the inside and outside of the cathedral using a laser scanner, and these data are expected to be used in the reconstruction project.

OUR Shurijo Project

Shurijo (Shuri Castle) is a national cultural asset from the 14th century, located in Okinawa Prefecture, Japan. The ishigakis and foundations were destroyed during the war in 1945, but were almost completely reconstructed in 1992. However, in 2019, a fire broke out and destroyed Shurijo.

Fig. 3
figure 3

Construction of a database of pre-collapse stones. a Fortunately, 40,000 photographic images were captured before the earthquake for VR content. b Unique slope of ishigaki. c Flowchart of the restoration process. High-definition adjustments of scale, global positioning, and correction of distortion were performed

Kawakami et al. (2020) launched “OUR Shurijo Project,” which involved reconstructing 3D models of Shurijo using a method known as structure from motion (SfM) from photographs and videos recorded by tourists and local people who visited Shurijo in the past.

Restoration of Komine Castle

The Great East Japan Earthquake in 2011 was responsible for the collapse of the ishigakis of Komine Castle in Fukushima Prefecture, Japan. In this case, 7000 stones were dislodged. Because the castle is a national cultural asset, the stones had to be returned to their original positions. The restoration took place from 2013 to 2018.

Restoration of fractured objects

Important basic research has been conducted on the restoration of damaged or destroyed cultural artifacts by re-assembling pieces to discover their original shape Thuswaldner et al. (2009); Huang et al. (2006); Papaioannou et al. (2017). As a series of restoration processes, these methods are similar to our stone retaining wall restoration. Zhang et al. (2015) used a narrowing candidate search, and the entire target was matched when assembling the fragments. On the other hand, in our method, photographs of the individual pre-collapse stones were available and utilized to perform matching. Hence, the problem setups used by these authors were slightly different from that of the present work.

2.2 Technical Relevance to Our Study

The case of Komine Castle is highly similar to that of Kumamoto Castle described in this study. However, the operation during which the pre-collapse and post-collapse stones were matched was manually performed and visually confirmed by masons.

Technically, in this work, we used ICP as well as the approach that was followed in the Great Buddha Project, but we used ICP for stone retrieval. As in the OUR Shurijo Project, we used pre-collapse photographs for SfM, although a limited number of these photographs were available, and the accuracy of 3D construction varies from place to place. However, post-damage measurements have fewer limitations and can be made with higher accuracy. It was additionally necessary to match the data captured before and after the damage because of their different qualities.

3 Preliminary: Data Construction

Our ishigaki retrieval system requires information on the pre/post-collapse stones. This section describes the acquisition of the data.

3.1 Database of Pre-Collapse Stones

We constructed a database of stones and the positions they occupied before the collapse. The original position of a stone was identified by comparing the information of the pre/post-collapse stones. This database is the most important and essential element, and forms the basis of our system. The construction is, however, difficult because we needed an image of each stone in a state similar to that in which it was before the earthquake; in other words, the images we capture after the earthquake are not the same.

Fortunately, approximately 40,000 digital images were captured for the content of the virtual reality (VR) Kumamoto Castle, as shown in Fig. 3a. Although these images were available for stone identification, two technical problems remained.

  • (a) Scale Size The 2D images we captured did not include scale information. We need to convert the pixel size to an actual size scale, such as millimeters. This scale information is the most important factor for identification; therefore, high-precision conversion is required.

  • (b) Distortion of the Slope As shown in Fig. 3b, the front side of the ishigaki has a unique shape with the slope gradually becoming steeper from bottom to top. Owing to the curved slope, the vertical scale observed is compressed when the ishigaki is viewed from afar (orthographic projection view). This distortion caused a mismatch between the shape of each pre/post-collapse stone. To address these problems, we extracted an ortho face image of each stone using the following steps, as shown in Fig. 3c.

  • Step 1. 3D Model Construction A 3D model of the ishigaki with a unique slope was constructed using multi-view stereo (MVS) from images. We used the existing MVS software, Agisoft Metashape.

  • Step 2. Scale Adjustment MVS cannot handle the scale; therefore, we adjusted the scale of the restored ishigaki by comparing an existing object with an object of known size, such as an undestroyed part of the ishigaki.

  • Step 3. Position Adjustment MVS only defines the 3D points of the local coordinates. We provide the global coordinates (longitude, latitude, and altitude) of the 3D model by matching the existing 3D model and landmarks obtained by aerial photogrammetry. Actually, we adjusted the global position of the ishigaki using the longitude and latitude information on Google Earth. The position of each stone was set as the centroid of the stone surface.

  • Step 4. Texture Mapping Assign the texture of each image to the corresponding 3D model.

  • Step 5. Capturing by Virtual Camera To reduce the distortion caused by the slope of the ishigaki, we re-captured images virtually from the center position and toward the normal vector direction of the surface. Here, because of the variety of normal directions of the constructed rough ishigaki 3D model, we applied mesh smoothing to the stone surface, and we used the mean normal vector near the center of each stone surface.

  • Step 6. Contour Extraction We extracted the 2D contour of each stone from the reconstructed image. A tool that can interactively and efficiently extract contours has already been developed by Yamasaki et al. (2020).

Fig. 4
figure 4

Potential measurement methods that were considered: a In the stone storage space, the fallen stones were placed close to each other. b Measurements using a ruler and measured contours of the same stone from various viewpoints. c Marker method that enables frontalization. d Stereoscopic measurement can construct the 3D shape of the stone, which enables frontalization

Consequently, the extracted 2D contour of each stone has less distortion and has an actual size with millimeter-order accuracy. Incidentally, we were not able to extract a sufficiently precise and bumpy 3D model because of the limited number of images and the variation in position from which the images were captured. Therefore, we used the 2D contour of the pre-collapse stone for identification.

3.2 Measurement of Post-collapse Stones

We measured the shape of each post-collapse stone that fell after the collapse, similar to the construction of a database of pre-collapse stones. Unlike the pre-collapse measurement, fallen stones can be measured without any restriction of capture devices and any limitation on the number of photographs. However, because the temporal, physical, and monetary costs for measuring over 10,000 stones are limited, the measurement method must be chosen carefully. In addition, the relocated stones were densely placed at narrow intervals, as shown in Fig. 4a; hence, the camera angles from which a stone could be captured are limited. Therefore, it was not always possible to capture the front view of each stone.

We attempted the following three measurement methods in the order of (A), (B), and (C):

  • (A) Ruler Measurement As shown in Fig. 4b, a ruler was placed parallel to the front face of the stone, and pixel-to-millimeter conversion was performed based on the scale of the ruler. However, the contours extracted from the image varied depending on the photographing position. In addition, the precision of the scale was low and depended on the orientation of the ruler.

  • (B) Marker Measurement As shown in Fig. 4c, we designed a square marker (15 cm width) with four red circles on the corner positions. The marker was placed parallel to the front face of the stone, and a pixel-to-millimeter conversion was performed. The image can be frontalized by perspective transformation to convert the marker positions of the four red circles to a square position. An advantage of this measurement technique is its ease of use with a single camera. “Easy” is an important factor for site workers.

  • (C) Stereoscopic Measurement Capturing the 3D shape of the fallen stone reduces the variation in the captured view and enables more precise frontalization.

Consequently, we employed the stereoscopic measurement method (C) because of its acceptable cost and precision. Moreover, for easy measurement in the field, we used a compact stereo camera, as shown in Fig. 4d.

3.3 3D Surface Construction of Stones

We constructed a 3D surface model of the front side of the stone from a stereo pair of left and right images. The accuracy of the shape significantly affects the final retrieval results; therefore, a careful measurement method that achieves high precision and is robust to noise is required. We describe the procedures using Fig. 5.

  • Step 1. Input Data The images on the left and right (top row of the figure) are given. Using our contour extraction tool, the 2D contour of the stone in the image on the left was extracted. The contour is used to generate a mask image (see the image on the left in the middle row). The mask is used to distinguish the pixels of the stone and those of the background surrounding the stone and to avoid a mismatch with the latter pixels.

  • Step 2. Stereo Rectification The stereo images are rectified to be parallel using the camera parameters obtained from the pre-calibration of a stereo camera.

  • Step 3. Initial Stereo Matching An initial rough correspondence was made using patch stereo matching (PSM). PSM uses a local descriptor to match, e.g., SIFT. The descriptor needs to be robust to noise, for example, occlusions and shadows.

  • Step 4. Fine Stereo Matching The fine correspondence is made by using subpixel estimation, normalized cross-correlation (NCC) patch, and semi-global matching (SGM) around the initial rough correspondence. The subpixel estimation was performed by using the error cancellation and parabola fitting methods. The NCC patch does not count the pixels outside the mask to avoid drift by the background pixels.

  • Step 5. Mesh Triangulation The triangular mesh is generated by the Delaunay algorithm, which links the points (vertices) on the contour and grid points within the mask. The 3D position of each vertex with the actual scale is computed from the triangulation and internal/external parameters of the stereo camera.

  • Step 6. Frontalization That is, “generation of the frontal view image.” The stone is subjected to 3D rotation such that the averaged normal vectors of the triangles face to the front side, and the frontalized image can be rendered.

Fig. 5
figure 5

Example of 3D surface construction and frontalization of a stone surface from a pair of stereo images. A mask of the surface was used to reduce the influence of the background. The 3D surface can be converted to an image as if it were viewed from the front, removing the distortion in the projective image

Figure 5 (the bottom row) shows an example of the constructed 3D surface of a stone. In the example, we were not able to capture the left and right images from the directly opposite direction because the moved stones were placed at narrow intervals. Instead, we captured an image from another diagonal direction. Although the 2D contour was affected by the camera pose and had the perspective distortion, through 3D construction and frontalization, the correct contour with less distortion and actual scale was obtained. Figure 6 shows an example of the extracted and frontalized 3D faces of fallen stones.

Fig. 6
figure 6

Example of extracted and frontalized faces of fallen stones

3.4 GPS Position Data

In addition to the image data, the position data of each stone observed by GPS (global positioning system) are important. These data can be used to narrow down and identify the original position of the stone because a stone generally falls and rolls down in a straight line.

Fortunately, the staff of Kumamoto Prefecture recorded the accurate GPS position of each fallen stone using reference points and EDM instruments before the stones were moved to the stone storage space. The latitude, longitude, and altitude of each stone were measured and transformed into a plane orthogonal coordinate system XYZ in meters. Similarly, the GPS coordinates of each stone that formed part of the pre-collapse ishigaki were preliminarily recorded in the database.

Figure 7 shows an example of the GPS positions of fallen stones of Iidamaru superimposed on an image captured with Google Earth. Each white circle indicates the coordinates of a stone. The pre/post-collapse GPS data were used effectively for our retrieval system (detailed in Sect. 4.4).

Fig. 7
figure 7

Plot of global positions of fallen stones. The position of each stone was recorded using GPS and EDM before the stone was moved to the storage space. Each white circle in the figure represents the coordinates of a stone

3.5 Asymmetric 3D-to-2D ICP Algorithm

In the previous sections, we described the way in which we obtained the 2D contours of the pre-collapse stones and the 3D contours of the post-collapse stones.

In this section, we explain the method of contour matching.

For the 3D contour, let \(\mathbf {x}_m \in {\mathbb {R}}^3\) be an m-th 3D vertex included in the contour, and \({\mathcal {X}} = \{\mathbf {x}_{m}\}_{m=1}^{M}\) be the contour consisting of a set of M vertices. For the 2D contour, let \({\mathcal {Y}} = \{\mathbf {y}_n\}_{n=1}^{N}\), where \(\mathbf {y}_n \in {\mathbb {R}}^2\), be the contour and N the number of vertices, respectively. Note that the number of contours and dimensions of the vertex are different between two contours. To match the two contours with different properties, we proposed and used a modified version of the iterative closest point (ICP) algorithm that minimizes the following cost function:

$$\begin{aligned} S({\mathcal {X}},{\mathcal {Y}}) = \min _{\mathbf {R}, \mathbf {t}} \frac{1}{M} \sum _{m=1}^{M} \Vert \mathbf {P}\left( \mathbf {R}\mathbf {x}_m + \mathbf {t}\right) - \mathbf {y}_m^\prime \Vert ^2, \end{aligned}$$
(1)

where \(\Vert \cdot \Vert \) is the \(\ell _2\) vector norm; second, \(\mathbf {R} \in {\mathbb {R}}^{3\times 3}\) and \(\mathbf {t} \in {\mathbb {R}}^{3}\) are a rotation matrix and a translation vector, respectively; \(\mathbf {P} \in {\mathbb {R}}^{2 \times 3}\), which we introduce, is a projection matrix from the 3D space to a 2D plane, and finally, \(\mathbf {y}_m^\prime \) is the nearest \(\mathbf {y}\) to \(\mathbf {x}_m\) projected on a 2D plane, and found by exhaustive search as

$$\begin{aligned} \mathbf {y}_m^\prime = \arg \min _{\mathbf {y}_n}~\Vert \mathbf {P}\left( \mathbf {R}\mathbf {x}_m+\mathbf {t}\right) - \mathbf {y}_n\Vert ^2. \end{aligned}$$
(2)

To solve the above equations and minimize the cost, we use the following solution algorithm and adjustment:

  • Equation (1) includes the multiplication of variables \(\mathbf {P}\) and \(\mathbf {R}\), thus the equation is non-convex, and each of them cannot be solved directly at the same time. As a standard method to solve this type of problem, we alternatively solve each of them using the least squares method, that is, first solve \(\mathbf {R}\) and \(\mathbf {t}\) by fixing \(\mathbf {P}\), and then solve \(\mathbf {P}\) by fixing \(\mathbf {R}\) and \(\mathbf {t}\). In addition, the rotation matrix \(\mathbf {R}\) with orthogonality is obtained via singular value decomposition (SVD) for \(\mathbf {x}_m\) and \(\mathbf {y}_m^\prime \), similar to the standard ICP algorithm.

  • In Eq. (2), the nearest vertex \(\mathbf {y}_m^\prime \) is found by an exhaustive search within a short processing time because the number of vertices is small. The contour vertices \(\{\mathbf {x}_m\}_{m=1}^{M}\) are gradually transformed using the updated \(\mathbf {P}\), \(\mathbf {R}\), and \(\mathbf {t}\), and a better set of \(\{\mathbf {y}_m^\prime \}_{m=1}^{M}\) are obtained.

  • We aligned the dimensions by raising the 2D vector \(\mathbf {y}_m^\prime \) to a 3D vector \(\left[ \mathbf {y}^{\prime \top }_m, 0\right] \) by filling in the zeros for the z-axis to compute the zero-mean-covariance matrix. We assume that the contours before the collapse can be approximated to the 2D plane by using a virtual camera to capture the image from the front.

  • To ensure robustness to noise and to consider the shape of the contours, we avoid selecting the corresponding contours according to the difference between the normal vector of contours or large distance. The asymmetric ICP is iterated until convergence of the \(\ell _2\) norm or for a predefined number of times. Similar to the standard ICP, the initial position and pose are important for matching. We set the initial position as the center point of the contours, and we set the initial pose as four different angles \(0^\circ \) (original pose), \(90^\circ \), \(180^\circ \), and \(270^\circ \) degrees, and then selected \(\mathbf {R}\) and \(\mathbf {t}\) of the best fit.

Figure 8 shows an example of the result of matching with the asymmetric ICP algorithm. The blue line is the 2D contour before the collapse, and the red line is the 3D contour after the collapse. The image of the post-collapse stone was taken from another angle and differed from the pre-collapse shape. The 3D contour has a different aspect ratio at its initial position. As it rotates in 3D, the contour becomes closer to the 2D contour before the collapse.

Fig. 8
figure 8

Example of matching the 2D contour before the collapse (blue line) with the 3D contour after the collapse (red line) (Color figure online)

4 Ishigaki Retrieval

In this section, we explain the approach we followed to obtain a correspondence between the pre-collapse and post-collapse stones. We regard this jigsaw-puzzle-like problem using stone pieces as an assignment problem in operations research. This problem can be formulated and solved using a similarity matrix and combinatorial optimization.

4.1 Similarity Matrix

Before formulating the assignment problem, we introduce the similarity matrix. Let i and \({\mathcal {X}}_i\) be the index of a pre-collapse stone and the contour, and let j and \({\mathcal {Y}}_j\) be those of a post-collapse stone. We denote the number of stones as I \((= \#\{{\mathcal {X}}_i\})\) and J \((= \#\{{\mathcal {Y}}_j\})\), where \(\#\) denotes the number of elements in a set.

The matching cost of the two contours is given by \(S({\mathcal {X}}_i,{\mathcal {Y}}_j)\), as shown in Eq. (3). The similarity matrix \({\mathbf {C}} \in {\mathbb {R}}^{I \times J}\) is a matrix that stocks the cost at the i-th row and j-th column as

$$\begin{aligned} C_{ij} = S({\mathcal {X}}_i,{\mathcal {Y}}_j). \end{aligned}$$
(3)

Note that the number of post-collapse stones is greater than that of pre-collapse stones, \(I \le J\), because there were also other stones, e.g., stones behind those on the surface of the ishigaki, and stones that originated from the neighboring ishigakis, and they were mixed together. We refer to these stones as “noisy stones.”

Example of Assignment

Figure 9 shows an example of a similarity matrix of size \(8 \times 10\). The eight pre-collapse stones were assigned to eight of the post-collapse stones with a one-to-one correspondence. Incidentally, the true correspondence of the stones is already known, and the vertical order is sorted such that the true correspondence is diagonally located. Subfigures (a), (b), and (c) show the results of different assignment methods: (a) a simple selection method, i.e., in each row, the column with the minimum value is selected. However, multiple assignments were performed for the same stone; (b) a method to restrict the assignment of stones that have already been assigned. Multiple assignments did not occur, but some mismatches still remain; (c) optimal assignments with the global minimum cost to provide the correct assignments. Calculation of the optimal assignments is described in Sect. 4.2. The algorithm details and evaluations of (a), (b), and (c) are described in Appendix A.

Fig. 9
figure 9

Example of a similarity matrix between the pre-collapse and post-collapse stones. We assign a pre-collapse stone to a post-collapse stone according to the matrix. a Simple assignment based on the minimum similarity selection leading to multiple assignments; b avoiding multiple assignments, which leads to mismatch; and c combinatorial optimization assigning the correct correspondence

4.2 Problem Setting and Formulation

We would like to find a set of one-to-one correspondences between the I \((= \#\{{\mathcal {X}}_i\})\) pre-collapse stones and the J \((= \#\{{\mathcal {Y}}_j\})\) post-collapse stones, such that the total matching cost is minimized. The total cost and the constraints can be represented as the following standard assignment problem by using the similarity matrix \(\mathbf {C} \in {\mathbb {R}}^{I \times J}\) and a binary indicator matrix \(\mathbf {Z} \in \{0,1\}^{I \times J}\), indicating whether the correspondence between i and j is \(Z_{ij} = 1\) or not \(Z_{ij} = 0\):

$$\begin{aligned} \begin{aligned} \min _{\mathbf {Z}} ~&\sum _{i,j}^{I,J} C_{ij} Z_{ij}, \\ \text {s.t.} ~~&Z_{ij} \in \{0,1\}, \\&\forall i, ~ \sum _{j=1}^{J} Z_{ij} = 1, \\&\forall j, ~ \sum _{i=1}^{I} Z_{ij} \le 1, \end{aligned} \end{aligned}$$
(4)

where the first term is the data fidelity that represents the total cost to be minimized; the second term is the binary constraint that \(Z_{ij}\) can take only 0 or 1, and the third and fourth terms are constraints for the one-to-one assignment; the third term represents, in the horizontal direction, the total number of stones is 1, and the fourth term represents, in the vertical direction, the total number of stones is 1 or 0, where 0 is for noisy stones.

To solve Eq. 4, general solvers for combinatorial optimization can be used. We employ the Hungarian algorithm Kuhn and Yaw (1955), which is an algorithm that was specifically developed to solve assignment problems, and can solve the problem in polynomial time \({\mathcal {O}}(N^3)\). In our case, I and J were roughly within 100–500, because approximately 100–500 fallen stones originating from a single ishigaki collapsed. Even when \(I = J = 500\), the computational time is less than 1 min.

The results are presented in Fig. 9c, and this assignment method yields better results than the other methods.

4.3 Usage of Additional Information and Multi-stage Assignment for Improving the Accuracy

Although the one-to-one assignment algorithm showed significant effectiveness, we found a problem that occurs occasionally, i.e., when one stone is assigned incorrectly, the other stones will also be assigned incorrectly in a chain reaction because the algorithm attempts one-to-one assignment even if the stone has no correct correspondence.

To address this problem, we propose and practically use a multistage matching method. In this method, “easy-to-match” stones are first matched and confirmed, and then, the remaining stones are matched again, that is, stones that seem to be “difficult-to-match” are initially excluded from the matching process, and we describe the types of difficult-to-match stones in section “Stones of Difficult-to-Match”.

Incidentally, this system is a human-in-the-loop system that includes human visual inspection. As the number of stages increases, the matching performance increases, but the human workload also increases, i.e., there is a trade-off. In practice, we used two-stage matching because the test data showed that increasing the number of stages beyond two did not significantly improve the accuracy.

4.3.1 Stones of Difficult-to-Match

We define the stones of difficult-to-match as the following three types, and consider removing, i.e., “cleaning” these stones from the list of candidates in the first stage of matching:

  • (1) Too Far from a Target Ishigaki (Pos) Fallen stones tend to fall straight down, and the falling locations are not that far from the ishigaki in which they belong. The details are described in Sect. 4.4. We can automatically distinguish the stones by using the position information of the pre/post-collapse stones, which had already been measured by GPS.

  • (2) With Large Contour Error (Err) Occasionally, there are stones of different sizes but similar shapes. If the contour shapes and the scales of the two pre-collapse and post-collapse stones are considerably different from each other, the pair is incorrect. This threshold was defined as 20 mm.

  • (3) Too Small Size (Area) Stones with too small an area are difficult-to-match owing to the limitation of measurement accuracy. This threshold was defined as 0.1 \(\mathrm{m^2}\).

Fig. 10
figure 10

a The upper row of colors represents pre-collapse stones in the pre-collapse ishigaki before the collapse, and the lower colored circles are fallen post-collapse stones. Each stone is represented by a group of differently colored circles. b Example of collapse prediction. Nearly half of the stones were excluded from the search candidates (Color figure online)

Fig. 11
figure 11

GUI system that was developed. a Overall view of the ishigaki. Click a target stone to show the candidate stones presented by the system and select a candidate stone. b The system runs on a tablet PC and can be used outdoors. c Examples of candidates: (upper left) image transformed such that it has the same orientation as the pre-collapse stone; (upper right) error between two contours; (bottom left) original image; (bottom right) GPS position of the pre/post-collapse stone

4.4 Stone Fall Prediction

We describe method (1), introduced in section “Stones of Difficult-to-Match”, which uses the pre/post-collapse positions. We have already developed a method for predicting the location of a stone after falling relative to the actual location of the post-collapse stone Take et al. (2019, 2020). Figure 10 shows a plot of the positions of pre/post-collapse stones, and we know the correct correspondence as for this example. In (a), the upper set of colors represents the pre-collapse stones in the ishigaki before the collapse, and the lower circles the fallen post-collapse stones. The position of each of the stones is colored differently. It can be seen that most of the stones landed in a position directly below their original position in ishigaki.

The prediction method uses a support vector machine (SVM) for binary classification, which classifies whether each position of the fallen stone becomes a candidate when a pre-collapse stone is selected. In other words, the method distinguishes the distance between the location of each fallen stone and whether the expected landing position of the selected stone is near or far (see Take et al. (2019, 2020) for more details). Figure 10b shows the result of SVM classification by selecting a pre-collapse stone, and by distinguishing whether the fallen post-collapse stones are candidates (blue dots) or not (red dots).

Consequently, we determined that half of the fallen stones could be removed from the candidate list. The fall prediction therefore enables us to simplify the assignment problem.

4.5 GUI System

We developed a graphical user interface (GUI) system for ishigaki retrieval. The core matching engine, including ICP and 3D construction by stereo matching, was developed in C++, and the GUI was developed in Java and Electron. The GUI runs on a multi-platform desktop PC and tablet PC and provides the user with an interactive function for selecting stone candidates.

Figure 11 shows a screenshot of the GUI system. The data set of ishigakis is managed in files for each position and the faces of each stone. In (a), the image on the left is the drawing of the entire ishigaki of Iidamaru, indexed H268. A stone colored blue indicates that the assignment has been done. By clicking a stone in the figure, candidate stones are presented, as shown in (c). In (a), the image on the right consists of three parts: (Left) a photograph and drawing of a pre-collapse stone; (Right) a photograph of the post-collapse stone. The 3D model can also be displayed, and several candidates are presented for the user to choose; (Bottom) the GPS positions of the pre/post-collapse stones are displayed. The orange and blue dots correspond to the before and after positions. Incidentally, in this case, the vertical positions of the orange and blue dots are close, indicating that the stone fell straight down. In this way, the user can determine the assignment of each stone, taking into account the photographs and positions of the pre/post-collapse stones.

Fig. 12
figure 12

a Two ishigakis of Iidamaru, b an ishigaki of Akazu-no-mon, c Orthophotographs generated by 3D construction from photographs of an ishigaki before the collapse

5 Experimental Results

We present the results obtained using the proposed system and actual stones, i.e., the results obtained from the actual restoration of Kumamoto Castle.

To prepare the truth before and after correspondences, i.e., ground truth data, which were used in the evaluation, we first provided the results using the proposed system to Kumamoto City, and revised the correspondence table based on the opinions of masons and experts. Using the ground truth data, we evaluated and analyzed the proposed method.

Table 1 Data set for evaluation
Fig. 13
figure 13

Matching results for two ishigakis, H268 and H269, of Iidamaru. Stones colored in blue were successfully placed by the proposed system. These results were confirmed by Kumamoto City and experts, and it was found that 90% of the stones, which were able to shoot using the proposed system, were correctly identified (Color figure online)

5.1 Data Set

We used three ishigakis for evaluation, that is, the two Iidamaru indexed H268 and H269 (see Fig. 12a) and that of Akazu-no-mon indexed H149 (see Fig. 12b). Table 1 summarizes the number of fallen stones and photos in the dataset. The number of fallen stones of each ishigaki was 313, 164, and 216 for a total of 693. In addition, the number of stones that needed to be retrieved, moved, and photographed after the collapse was 250, 120, and 105, for a total of 475 images. All the images were captured using a compact stereo camera. The position information was recorded using GPS before the stones were moved to the storage spaces.

Note that not all the fallen stones were photographed. Many stones fell and broke into fragments, some stones fell into the moat filled with water, and some stones were mixed with stones of another ishigaki, and they could not be recovered.

Figure 12c shows the 3D images, i.e., the textured 3D models, of H268 and H269 that were constructed and ortho-rectified from the photograph of ishigaki taken before the collapse. These images were used as a database of pre-collapse stones for matching.

5.2 Evaluation Procedure

The evaluation method is as follows: First, using our GUI system, the user maps the pre-collapse and post-collapse stones. On the basis of these results, we created a correspondence table and submitted it to Kumamoto City. Kumamoto City will create a final correspondence table based on the initial correspondence table and information from the GUI system, as well as the observations of stones in the field and the opinions of the masonry experts. Then, we evaluated the proposed method by comparing the results of the first user with the final correspondence table.

For Iidamaru’s H268 and H269, we were able to complete this final correspondence table. Although the final confirmation of Akazu-no-mon has not yet been completed by the experts of Kumamoto City, the confirmation of the proposed system is available.

Fig. 14
figure 14

Examples of certain pre/post-collapse stones were correctly mapped using this system. The top four rows are the Iidamaru stones, and the bottom row is the result of Akazu-no-mon. All of these stones were confirmed by experts to have been correctly mapped

5.3 Results

5.3.1 Case1: Iidamaru

Figure 13 shows the matching results of H268 and H269 of Iidamaru. The stones colored deep blue and light blue are those for which the correspondence candidate was found, respectively, in the first stage and the 2nd stage or a later stage. These results were confirmed by Kumamoto City and experts and found to be correct. Using this system for H268, we were able to correctly identify 229 out of 250 photos, with an identification rate of 91% (=223/250). However, 313 stones fell from the original ishigaki, and the identification rate using this total number was 73% (=229/313). Similarly, for H269, 108 out of 120 photos were correctly identified, and the identification rate was 90% (=108/120), while the identification rate for the total number of fallen stones was 65%.

Consequently, we were able to correctly identify approximately 90% of the stones that could be photographed using the proposed system. Stones that could not be photographed were too small, lost, or damaged.

Fig. 15
figure 15

Examples of human-specific errors in the system. The stone in the central column was chosen by a human, and although it appears correct at first glance, the stone in the column on the right is the correct answer

The remaining 10% stones, which were difficult-to-match, are marked in Fig. 13, i.e., stones that are too small are marked with a red \(\times \), stones of which the front-side face was wrongly determined by humans, and their photographed image is different from the actual front-side image, are marked with a green \(\bigstar \), and stones that were lost after the collapse and could not be photographed are marked with a purple \(\times \). The data that are necessary to match these stones correctly in the future are discussed in section “Discussion”.

Discussion of Iidamaru

Figure 14 shows an example of stones that were correctly mapped by our system. The apparent color, size, and orientation of the pre-collapse stones differ from those of the post-collapse stones, thus it is difficult and nerve-wracking to manually check whether they are the same. For example, for ishigaki H268, 250 photographs taken before the collapse need to be compared with 250 photographs taken after the collapse, and the total number of comparisons is \(250\times 250=62,500\). H268 is the standard number of stones for ishigaki, but it would take several months to complete the verification by manual inspection. On the other hand, the verification using our system was completed in approximately one hour by a single user.

Important Results

The important results obtained by the system are as follows: The center column of Fig. 15 is the result of a photo-matching process by Kumamoto City staff before the system was developed. This may appear to be correct at first glance, but the assignments were incorrect. On the other hand, the result shown in the column on the right is that obtained by our system, and the assignments are correct. These stones are clearly very similar in shape, thus manual inspection would lead to incorrect matching.

Furthermore, the original photographs are different in orientation, pose, and scale from those taken after the collapse, which is one of the reasons for this difficulty. In contrast, our system uses 3D rotation to show the stone in the same orientation and scale as before the collapse, which simplifies the matching process.

In addition to these two examples, our system shows that human errors occur occasionally. Approximately 10% of the stones assigned by humans were found to be incorrect.

5.3.2 Case2: Akazu-No-Mon

Although the stones of Iidamaru were photographed after they were moved to the stone storage space, each stone of Akazu-no-mon could be measured individually in a special photographing place against a blue background before they were moved.

This allowed us to photograph the stones accurately from the front without being obstructed by weeds growing in the stone storage space.

The results of collation using our system are shown in Fig. 16. Here, 64 stones were identified from 105 images of post-collapse stones. The identification rate was 60% (64/105). The matching performance was lower than that of Iidamaru because Akazu-no-mon may contain stones from neighboring ishigakis. The proposed system was able to identify 64 stones within approximately 30 min, which is sufficiently effective.

Fig. 16
figure 16

Matching results for Akazu-no-mon. We were able to correctly identify 64 stones within approximately 30 min

5.4 Analysis of Iidamaru

Staff from Kumamoto City and experts revised the correspondence table for Iidamaru. Therefore, using this result, the proposed method can be quantitatively evaluated using the evaluation criteria described in the next section.

Fig. 17
figure 17

Comparison of the performance of each method described in section “Compared Methods”: a ranking evaluation and b tolerance to noise. In both cases, Opt was superior to Min. Furthermore, the results can be improved by using two-stage matching

Table 2 Summary of results obtained with each method

5.4.1 Evaluation Criteria

We used the following two criteria for our ishigaki retrieval system.

  • (I) Top-5 Accuracy This provides the ratio that the five extracted candidates of an assignment include the correct correspondence.

  • (II) Ranking Evaluation This provides the effectiveness of the recommended stone based on mAPFootnote 1 (mean average precision), defined as:

    $$\begin{aligned} \text {mAP} = \frac{1}{N}\sum _{i=1}^N \frac{1}{\text {rank}_i}, \end{aligned}$$
    (5)

    where \(\text {rank}_i\) is the first rank order in which the correct correspondence of the i-th pre-collapse stone is found. A higher mAP indicates that the correct stone correspondence exists in higher ranking.

5.4.2 Compared Methods

We examine the effect of the one-to-one assignment algorithm and the effect of the two-stage matching method, which first identifies only easy-to-match stones, i.e., difficult-to-match stones are eliminated, after which we identify stones including those that are difficult-to-match. In the two-stage matching method, we performed several different collations to determine which types of stones could be effectively eliminated. The following six methods were compared: These methods are roughly divided into one-stage and two-stage.

  • (i) Min is the simplest assignment method, which was described as a toy example in section “Similarity Matrix”, and Fig. 9a and ranks the candidates in order of decreasing error between contours. The same stones may be assigned to several different stones.

  • (ii) Opt solves the assignment problem in Eq. (4) using the Hungarian algorithm, whose toy example is shown in Fig. 9c.

The other four methods are based on the Hungarian algorithm and two-stage matching method, but they differ in terms of the criteria for eliminating the difficult-to-match stones in the first stage of the two-stage matching process.

  • (iii) Opt+Pos uses SVM to distinguish that the fallen stones originally belonged to the target ishigaki, as described in Sect. 4.4 and removes the other outlier stones from the candidates.

  • (iv) Opt+Pos+Err eliminates stones that have a large error between the contours of the two pre-collapse and post-collapse stones.

  • (v) Opt+Pos+Area eliminates stones that have small areas compared to that of the pre-collapse stone.

  • (vi) All eliminates stones by using (iii), (iv), and (v) above.

5.4.3 Analysis of Results

Figure 17a shows the result of the comparison of each method for H268 and H269 of Iidamaru. The vertical axis shows the number of correctly answered stones, and the horizontal axis shows the rank of top-k stones presented by each of the aforementioned methods. The graph shows that Opt, which does the one-to-one assignment, is superior to Min. In addition, using two-stage matching, i.e., Pos, Pos+Err, Pos+Area, and All, we were able to obtain a higher number of correct answers.

Figure 17b shows the result of the noise robustness evaluation. The horizontal axis shows the number of noisy stones, and the vertical axis shows the Top-1 accuracy. The performance of all the methods tends to decrease as the number of noisy stones increases. This result also shows that Opt basically outperforms Min, and the performance is further improved when two-stage matching is used.

Table 2 summarizes the results of the respective methods. A higher mAP value represents that the more correct matching results are included in the top ranks. In terms of the two-stage methods, Pos is highly effective, thus the positional relationship before and after the collapse plays an important role. For example, in H268, the Top-1 accuracy using Opt independently was 56%, but by using it in combination with Pos, a Top-1 accuracy of 73.6% and an improvement of 17% were achieved. By using Err and/or Area in addition to Pos, a slightly greater improvement was obtained. However, when all the methods, i.e., All, were used, the performance did not improve as expected. This seems to be the influence of over-elimination of the candidate stones, and the number of correct answers was reduced in the first stage of matching.

The results of the above analysis are summarized as follows:

  • More correct stones can be found by solving the assignment problem algorithmically and performing one-to-one assignment rather than one-to-multi assignment.

  • A two-stage assignment method is effective, in which the assignment is determined by easy-to-match stones. In particular, positional relationships before and after the collapse are very useful. In addition, a large contour error is useful.

Table 3 presents the conclusive results of the number of stones we were able to identify and the percentage of all the images taken. In this case, 401 stones in 475 photographs were correctly identified, an identification rate of 84.4%.

Table 3 Summary of matching results

5.5 Discussion

The effectiveness of our system has been confirmed by conducting demonstration experiments and analyses. In particular, the system enabled users to perform the matching process in a very short time compared to conventional manual matching. It is noteworthy that the system can assign stones that cannot be assigned by humans. On the other hand, in certain cases, stones the system was unable to identify could be assigned by humans and other assignments neither the system nor humans were able to make.

Finally, we asked actual on-site masons to use this system and interviewed them about their impressions of the system. The results were positive, e.g., they said, “we wish we could have used this system a little earlier because it is very easy to use and makes collation much easier.” In particular, the functions to rotate the stone in 3D, match its posture on the GUI, and visualize the pre/post-collapse positional relationship, were well received.

5.6 Limitation

5.6.1 Cause of Failure

In the case of Iidamaru, 10% of the stones could not be identified correctly. Many of the stones the system could not find were small. This is because of the limited number and resolution of photographs of pre-collapse ishigakis, and prevented us from creating a pre-collapse database with sufficient accuracy. The other stones are mainly broken and lost, and the details are as follows.

  • Broken The ishigaki of Iidamaru is high, thus the stones fragment when they fall. Some of the stones could be restored using adhesives, but certain stones could not be restored owing to their limited strength. When a stone is not restorable, a replica is created and used.

  • Lost: Small stones were lost because they mixed with stones behind the surface stones.

The required data and information are as follows:

  • Data for Small Stones We need to capture a high-resolution image of the pre-collapse stones. Because the size of the stone that failed to match in this case was approximately 1/3 of the size of the stone that was successfully matched, we need to increase the image resolution by at least three times.

  • Data for Broken Stones We need to gather the broken fragments. The broken face (cross section) is perceptually distinguishable by color and texture. We measure the 3D shape of the face, and if the faces of the two pieces are matched, we can merge and restore them. Zhang et al. (2015) proposed a system designed to restore a single object from partial 3D fragments by combining them in the manner of a tiling puzzle, which would be useful for repairing broken stones.

5.6.2 Automatic and Manual Steps

Our system requires manual interaction. We summarize the automatic and manual steps, respectively.

Automatic Steps

  • 3D shape construction of the stones from stereo camera and SfM.

  • Computation of the similarity matrix from shape matching by ICP and fall prediction.

  • 2D contour extraction (but its final confirmation is manual).

  • Ranking of candidate stones from the similarity matrix by solving the assignment problem.

  • Frontalization and color matching between the pre-collapse and post-collapse stones.

Manual Steps

  • Selection of correspondence from candidates.

  • Capturing each stone by a camera.

  • Identifying the ishigaki surface.

  • Scale adjustment of the pre-collapse stones.

  • Measurement of the GPS position of the stones.

In the future, it is necessary to reduce the manual steps.

6 Conclusion

This project led to the development of a system that assists in the identification of ishigaki and fallen stones after the Kumamoto earthquake. Specifically, the system uses image matching to compare the photographs of the pre/post-collapse stones. The technical contributions of this study are as follows:

  1. 1.

    We developed a data construction method based on the precise 3D model/surface from the images to address the problem related to the orientation and scale of the stone.

  2. 2.

    A method that entails matching the pre-collapse and post-collapse stones was formulated and solved as a jigsaw-puzzle-like combinatorial optimization problem.

  3. 3.

    Two-stage matching was employed to improve the matching performance.

  4. 4.

    We created a user-friendly GUI system and had it used by actual on-site masons in Kumamoto City.

The developed system was used to match the ishigaki in Iidamaru. As a result, we succeeded in identifying 337 stones and approximately 90% of the 370 photographs. These results are anticipated to be useful as a blueprint for actual restoration work. We were also able to identify approximately 60% of the stones in Akazu-no-mon’s ishigaki, where restoration work has not yet begun. This required approximately 30 minutes of work. This system has been transferred to Kumamoto City and is being used to identify many other ishigaki cases.

The dataset of stones used in this study is available at https://github.com/goukoutaki/ishigaki.