A Holistic Workflow for Semi-automated Object Extraction from Large-Scale Historical Maps

The extraction of objects from large-scale historical maps has been examined in several studies. With the aim to research urban changes over time, semi-automated and transferable holistic approaches remain to be investigated. We apply a combination of object-based image analysis and vectorization methods on three different historical maps. By further matching and georeferencing an appropriate current geodataset, we provide a concept for analyzing and comparing those valuable sources from the past. With minor adjustments, our end-to-end workflow was transferable to other large-scale maps. The findings revealed that the extraction and spatial assignment of objects, such as buildings or roads, enable the comparison of maps from different times and form a basis for further historical analysis. Performing an affine transformation between the datasets, an absolute offset of no more than 72 m was achieved. The outcomes of this paper, therefore, facilitate the daily work of urban researchers or historians. However, it should be emphasized that specific knowledge is required for the presented subjective methodology.


Introduction
Historical maps are valuable sources when investigating spatial changes over time (Herold 2018). As an essential tool for communicating geographic objects and their locations, they are often the only source of information for the understanding of spatio-temporal change (Sun et al. 2021; Kim et al. 2014). With large-scale maps (approx. > 1:20,000), especially "city maps", we are able to study urban morphology (Meinel et al. (2009), as cited in Muhs et al. 2016). Frequently, geographical, political, environmental, and other urbanization processes can be backtraced solely by means of historical maps.
For the analysis of the urban landscape of the past, it is essential to make the information from large-scale historical maps accessible. Single map objects may provide insights into former names of roads and buildings or their evolution over time. But generally, physical scans (bitmaps) of historical maps are not machine-readable. Manual attempts to acquire information from historical maps are not uncommon but error-prone, time-intensive, and non-transferable (Xydas et al. 2022; Chiang et al. 2020; Gobbi et al. 2019). There is a need for (semi-)automated approaches to solve these problems.
In this study, we provide a holistic workflow to not only extract objects from large-scale historical maps, but also to derive benefits from the entirety of geometric, relational, and semantic information. Moreover, our semi-automated approach demonstrates how a spatial assignment between historical and current maps may be enabled and therefore provides a basis for further comparison processes between these.
An established strategy used to semi-automatically extract objects from historical maps while minimizing the reader's subjective influence starts with image segmentation, which follows the principles of human perception: objects within an image are differentiated due to graphical variations (e.g., in light intensity, texture, or spatial context), artifacts, and deviations. Visually homogeneous image areas form so-called segments. By combining object segmentation and classification, the concept of geographic object-based image analysis (GEOBIA) is able to reproduce physically existing objects, like buildings or roads, from raster maps (Herold 2018; Hussain et al. 2013; Hay and Castilla 2008; Neubert 2005). However, authors agree that "there is no single extraction method that can be effectively applied to all different historical maps" (Sun et al. 2021). This is a complex task and only a few studies have offered suggestions for further processing and the applicability of their results.
Most research in this field aims at extracting and vectorizing geometries from historical maps to make them analyzable, but frequently comes with several limitations and preconditions. Many studies focus on the extraction of a single feature type such as streets (Chiang and Knoblock 2013; Chiang and Knoblock 2012), river bodies (Gede et al. 2020), or different land use classes (Zatelli et al. 2019) like forest areas (Ostafin et al. 2017; Herrault et al. 2013; Leyk et al. 2006) or wetlands (Jiao et al. 2020). Others assume homogeneously colored map regions (Chiang et al. 2011; Leyk and Boesch 2010; Ablameyko et al. 2002), which is rarely true for historical maps. Less complex ("binary") maps containing homogeneously black objects or contours on white backgrounds were investigated by Xydas et al. (2022), Heitzler and Hurni (2020), Le Riche (2020), Iosifescu et al. (2016), Muhs et al. (2016), and Kim et al. (2014). But differentiating objects solely based on color differences is insufficient, especially for widespread monochrome historical maps or due to ancient paper texture, noise, or dirt on the hand-drawn maps (Jiao et al. 2020; Peller 2018; Muhs et al. 2016; Arteaga 2013; Leyk and Boesch 2010). Labels often remain unconsidered in the context of object recognition from historical maps as they commonly suffer from overlaps or gray-scale values similar to textures or contours of other map elements (Peller 2018). Other authors presume an existing coordinate system (Le Riche 2020; Gobbi et al. 2019; Iosifescu et al. 2016) or a huge stock of training data, which is needed for machine learning approaches (Xydas et al. 2022; Heitzler and Hurni 2020; Jiao et al. 2020; Gobbi et al. 2019; Zatelli et al. 2019; Uhl et al. 2017). Moreover, most studies have focused on small-scale rather than large-scale maps (Gede et al. 2020; Heitzler and Hurni 2020; Gobbi et al. 2019; Zatelli et al. 2019; Loran et al. 2018; Uhl et al. 2017; Muhs et al. 2016; Herrault et al. 2013).
As existing research generally focuses on separate processes involved in object extraction from historical maps, our study suggests a holistic approach composed of extracting, vectorizing, and linking objects. We demonstrate the benefits of eliminating and assigning labels for this whole process and present possible applications of the resulting geometries. Only by considering these techniques as a whole are we able to answer location-related questions on the evolution of geographic features and make historical maps "accessible to geospatial tools and, thus, for spatiotemporal analysis of landscape patterns and their changes" (Uhl et al. 2017). New qualitative and quantitative analyses as well as comparisons to other historical or current geodata become possible by searching through and processing information derived from historical maps (Chiang 2017; Iosifescu et al. 2016). For long-term backtracing of individual buildings, for instance, shape-based comparisons across different maps are useful (Le Riche 2020; Laycock et al. 2011).
In this work, we present a semi-automatic solution to make large-scale historical maps usable for spatial analysis while minimizing time-intensive and laborious manual user intervention. Based on our previous findings on the needs of users of historical maps (Schlegel 2019) as well as on the identification and extraction of map labels (Schlegel 2021), we demonstrate the general feasibility of a comprehensive workflow composed of (1) eliminating labels, (2) extracting geometries, (3) vectorizing and refining those, and (4) matching and spatially assigning the extracted map objects with current ones. Potential future applications, which are shown in the further course, may involve semantic information from labels to annotate corresponding map features or an adjustment of a map's visual appearance. Prospectively, new databases can be set up and comparative studies between different datasets become possible.

Elimination of Labels
Labels are valuable components in historical maps holding important metadata. However, text within a map is typically seen as a disturbing factor when extracting geometries. Misinterpretations in the context of segmentation may easily arise due to overlaps, direct adjacencies, or similar color values to map elements and structures such as lines or textures (Bhowmik et al. 2018; Chiang 2017). Monochrome maps, in particular, have a reduced number of parameters to differentiate between text and other elements. An initial elimination of text or labels from historical maps can therefore be seen as a major advantage for further object extraction processes (Gede et al. 2020). Previous attempts identified labels with the help of text recognition (subsequent to object recognition and vectorization) or by shape recognition algorithms (Iosifescu et al. 2016). Chrysovalantis and Nikolaos (2020) used binarized maps to separate text from other objects (see also Bhowmik et al. (2018)). By eliminating small pixel groups, they were able to remove letters. A GRASS GIS add-on developed by Gobbi et al. (2019) and Zatelli et al. (2019) replaces relevant pixel values by means of low-pass filters within old cadaster maps. However, pixels must already be defined as "text" in advance. Telea (2004) and Bertalmío et al. (2001) suggest different image inpainting techniques, which are often applied for image restoration. Missing or damaged image regions are filled to create an image without giving the viewer a hint of changes. In our testing, these approaches caused an unsatisfactory blurring of the input image.

Object-Based Image Analysis
Many methodologies for (semi-)automated object extraction from historical maps have been demonstrated in recent years but have proven insufficient for various reasons. For instance, common histogram thresholding or color space clustering (Herrault et al. 2013) ignores any spatial context, whereas artificial neural networks require a disproportionate amount of training data.
Chrysovalantis and Nikolaos (2020) used GIS functionalities to convert a historical multicolor map into a binary image and then to extract and vectorize geometries of buildings. However, textured or corrupt polygons could not be handled and labels were eliminated only partially. A similar approach was conducted by Iosifescu et al. (2016). By combining GIS operations with Python libraries, Gede et al. (2020) segmented and vectorized geometries of rivers as a function of their color whereas Le Riche (2020) extracted buildings from historical maps based on colors and textures. Zatelli et al. (2019) and Gobbi et al. (2019) used GIS and R to segment and classify features from historical land use maps by regarding their colors, sizes, and shapes. Additional machine learning techniques were applied by Gobbi et al. (2019).
Deep learning approaches via convolutional neural networks (CNNs) "have recently received considerable attention in object recognition, classification, and detection tasks" (Uhl et al. 2017), including for historical maps (Jiao et al. 2020; Xydas et al. 2022). However, they suffer from major drawbacks. Results from CNNs strongly depend on the quality and generally low quantity of available training data. Often, these data stocks are created manually and solely on the basis of the input bitmap itself, which is time-consuming and impedes applicability to other maps. Originating from the field of remote sensing, geographic object-based image analysis may also be applied to scans of maps (Hay and Castilla 2008). In the broad field of cartography, only a few authors use OBIA approaches to create new geodata. Whereas Dornik et al. (2016) reproduced soil maps from climate and vegetation maps, Kerle and de Leeuw (2009) extracted point-based population data from paper maps to estimate long-term population growth. Edler et al. (2014) applied OBIA to extract and quantify the presence of roads, buildings, and land use classes and thereby to evaluate the complexity of topographic maps.
In contrast to pixel-wise approaches, OBIA regards not only spectral information, but also, e.g., the shape, size, or neighborly relations of objects, and is, therefore, much closer to human perception. Hence, OBIA is often suggested for object extraction from historical maps with the aim to make them machine-interpretable (Blaschke et al. 2014). Many studies in the field of OBIA focus on colored maps of smaller scales, presuppose a preceding georeferencing (Chrysovalantis and Nikolaos 2020; Gede et al. 2020; Iosifescu et al. 2016) or well-defined shapes of objects (Chrysovalantis and Nikolaos 2020; Gobbi et al. 2019; Heitzler and Hurni 2020), or disregard intersections between map features.

Vectorization and Vector Enhancement
As vector data can be better processed and analyzed than raster data, a majority of the mentioned authors proceed with a vectorization of extracted map objects. Brown (2002) and Arteaga (2013) use specific software tools to, respectively, vectorize the outlines of geologic structures and buildings from historical maps. Vectorization tools are also provided within ArcGIS, GRASS GIS (Gede et al. 2020), and the GDAL library (Jiao et al. 2020).
To purge vectorized objects, further simplification processes may follow. Multiple software and tools, including eCognition, QGIS, ArcGIS (Godfrey and Eveleth 2015), SAGA GIS (Gede et al. 2020), R (Arteaga 2013), and Python libraries, implement pre-built functions to smooth or simplify lines or polygons and to remove outliers, spikes, and other artifacts.

Object Matching
For the direct comparison of vector objects from different maps from various times, distance and similarity measures may be promising (Xavier et al. 2016). Matching geometries between different inputs is frequently performed on the basis of shape or spatial similarities (Tang et al. 2008) or identical attribute values (Frank and Ester 2006). However, semantic similarity approaches are not feasible as scanned historical maps usually hold no ancillary information. Even if names of roads or buildings were available (e.g., by a preceding text recognition), they would need to be assigned to their corresponding geometries.
When analyzing geometric relations, such as overlapping areas or distance measures (e.g., Euclidean or Hausdorff distance), only relative distances between objects are considered. This technique is useless when comparing not yet georeferenced datasets (Xavier et al. 2016). Region-based shape descriptors (e.g., area, convex hull, Moment or Grid descriptor, see Ahmad et al. (2014)) regard all pixels within a shape and may therefore be promising for a comparison between identical real-world objects from different inputs. But due to uncertainties, they are rather considered complementary matching approaches. Additional similarity measures are necessary (Xavier et al. 2016). By regarding the spatial relationship between objects, Stefanidis et al. (2002) quantified their distances and relative positions. Samal et al. (2004) and Kim et al. (2010) consulted third objects to create an overall geographic context. Also, Sun et al. (2021) regarded spatial relationships by linking identical real-world objects from different historical maps. However, their knowledge graph approach presupposes the existence and assignment of labels to their corresponding geometries.

Data
For a proof of concept of our suggested methodologies, a large-scale (~ 1:11,000) historical map from the middle of the nineteenth century was chosen, which has already been the object of research within related studies (Schlegel 2019, 2021). The original non-georeferenced and undistorted version of the map scan was cropped to a smaller extent (~ 1000 × 800 m in reality) to reduce the runtime of all processes. No map projection is known. The map subset in Fig. 1 shows the city center of Hamburg with blocks of buildings, roads, and water areas. Apart from subsequently colorized water areas, the map is drawn in black and white. Many data suppliers provide their raster scans with a resolution of 300 ppi, which is considered adequate for object extraction purposes (Pearson et al. 2013). Lower pixel densities induce blurring and pixelation, whereas higher values tend to highlight interfering artifacts from, e.g., folds in paper, discolorations, or smudges (Peller 2018). We continued to work with the TIFF format (without compression) as it is lossless concerning the image's original pixel values (Gede et al. 2020).
To demonstrate the transferability of the workflow, two more large-scale historical maps covering the same spatial area were used in the further course (see Fig. 9a, b). They all differ in their visual appearance and complexity in terms of contrasts, textures, or the existence of labels and gridlines.
For comparing the described data to a current counterpart, official vector datasets including recent polygonal buildings (Landesbetrieb Geoinformation und Vermessung 2022) and line-type roads (Behörde für Verkehr und Mobilitätswende (BVM) 2020) were used.

Preparation for the Elimination of Labels
As similar color values and overlaps between labels and other map objects impede a clear discriminability, an initial elimination of labels designating real-world objects considerably facilitates object recognition processes. We suggest making use of the output from previous label detection attempts (see Schlegel (2021)): vector bounding boxes comprising text image areas, which can be seen in Fig. 1. An exemplary text image area is shown in Fig. 2a.
With the aim of eliminating its content from the map, it was initially cropped by means of its original bounding box (see Fig. 2b) and rotated to the horizontal by its angle of alignment (Fig. 2c), which is calculated by the text detection tool Strabo (Li et al. 2018). However, these text image areas include not only characters but also edges of buildings, which is a side effect of Strabo (see upper margin in Fig. 2c). This is counterproductive within the subsequent step of building segmentation: as these image areas were supposed to be entirely eliminated from the map, building edges would become distorted. To retain these important edges, all pixels within a bounding box were differentiated into text and parts of buildings. A user-defined thresholding helped to generate a binary mask consisting of dark "foreground" and bright "background" pixels (see Fig. 2d). A further "foreground" differentiation was needed to separate building edges from text pixels. However, similar color values, overlaps, and smooth transitions between text and buildings were challenging. For reclassifying former "foreground" into either "text" or "building edge" pixels, multiple thresholds and conditions had to be applied (Fig. 2e). As labels designating roads most commonly run parallel to nearby building edges, this step was performed row-wise. As Fig. 2f indicates, all pixels representing "text" and "background" were combined and vectorized. The resulting polygonal bounding box was turned back by its initial rotation angle (see Fig. 2g) and then used within the following object extraction steps.
Fig. 2 Steps for separating building edges from labels shown with an exemplary dataset: a input map with bounding box containing text image area, which then was b cropped, c aligned horizontally, and d converted into a binary as well as e a three-class mask. The f resulting bounding box excluding building edges was g turned back to its original orientation
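The thresholding and the row-wise reclassification of a horizontally aligned text image area can be illustrated with a minimal sketch. The paper does not state the actual thresholds and conditions, so the rule used here (a row whose foreground share reaches `edge_fraction` is treated as a building edge, otherwise as text) as well as the names `binarize`, `classify_rows`, and `edge_fraction` are assumptions made purely for illustration.

```python
def binarize(gray, threshold):
    """Binary mask: 1 for dark 'foreground' pixels, 0 for bright 'background'."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def classify_rows(mask, edge_fraction=0.8):
    """Relabel foreground row by row into text (1) or building edge (2).

    Assumed rule (not from the paper): in a horizontally aligned crop, a
    building edge fills most of a row, whereas characters leave gaps.
    """
    classes = []
    for row in mask:
        fill = sum(row) / len(row)
        label = 2 if fill >= edge_fraction else 1
        classes.append([label if pixel else 0 for pixel in row])
    return classes


# Tiny gray-value crop (0 = black, 255 = white), values invented for the example.
gray = [
    [30, 40, 35, 20, 25, 30],        # continuous dark row: building edge
    [250, 240, 245, 250, 240, 250],  # bright row: background
    [40, 250, 30, 250, 250, 35],     # isolated dark pixels: characters
    [250, 250, 250, 250, 250, 250],
]
three_class = classify_rows(binarize(gray, threshold=128))
```

In such a three-class mask, only the pixels labeled 0 or 1 would be merged into the area to be eliminated, so that rows labeled 2 remain available for building segmentation.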

Object Detection and Recognition
To detect homogeneous image regions and extract objects such as buildings or water areas from large-scale historical maps, we used object-based image analysis. In contrast to pixel-based approaches (e.g., Maximum Likelihood, Clustering, or Thresholding), which only regard spectral differences between pixels, OBIA generates image objects also based on common textures, shapes, context, etc. and is, therefore, more suitable for historical maps with limited spectral information and heterogeneous appearances (Blaschke et al. 2014; Hussain et al. 2013).
As none of the many free and open source packages available for semi-automated feature extraction produces comparable results, we made use of the proprietary software eCognition Developer 10.2 to generate GIS compatible data from a historical map via OBIA (Kaur and Kaur 2014). eCognition converts user-defined rule sets-built-up from functions, filters, statistics, etc. for image segmentation and classification-into machine-readable code. These concatenations of algorithms can be easily transferred to other images (Trimble Inc. 2022).
As Fig. 3a indicates, a first rough differentiation between dark (foreground) map features (e.g., buildings and labels) and the map's bright background (water areas, roads, and places) was enabled by thresholding the input TIFF. The content of the labels' bounding boxes, as shown in Fig. 2f, was simply classified as "background" and could therefore be eliminated (see Fig. 3b). The detection of further map objects is therefore significantly facilitated on the one hand and building edges remain unaltered on the other hand.
To extract contours of buildings, an edge detector was applied to the image. The building texture's repeating pattern could be detected by means of a gray-level co-occurrence matrix, which measures the vertical invariance of adjacent pixel pairs, and analyzed by texture descriptors (Chaves 2021; Trimble Inc. 2021). Regarding the original map in Fig. 1, public buildings (e.g., the townhall or churches) have a significantly darker texture and could, therefore, clearly be differentiated from other buildings based on their gray values. Water areas were identified by thresholding the RGB blue channel as well as applying supplementary texture descriptors to avoid false positives.
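To make the idea of a gray-level co-occurrence matrix more tangible, the following sketch counts vertically adjacent gray-level pairs and derives the common contrast descriptor from them. It is a plain-Python illustration and not the eCognition implementation; the function names and the two toy textures are assumptions.

```python
def glcm(image, levels, dx=0, dy=1):
    """Count how often gray level i has gray level j at offset (dx, dy).

    The default offset (0, 1) pairs every pixel with the one directly below.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: mean squared gray-level difference over all counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# Vertical stripes: each pixel equals the pixel below it.
stripes = [[0, 1, 0, 1]] * 4
# Horizontal bands: the gray level flips from one row to the next.
bands = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]

stripes_contrast = contrast(glcm(stripes, levels=2))
bands_contrast = contrast(glcm(bands, levels=2))
```

A texture that does not change in the vertical direction yields a contrast of zero for the vertical offset, whereas a texture that alternates between rows yields a high value, which is the kind of difference a rule set can threshold on.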

Vectorization
Generally, OBIA results in raster files containing individual image objects, subdivided into predefined single classes. For further processing and analysis purposes, a vectorization of this data is necessary. Based on experiences of Iosifescu et al. (2016) and Arteaga (2013), we applied GDAL's polygonize function to perform a raster-to-vector conversion for each map class. Several functions to simplify and smooth the vectorized map features, to close inlying minor gaps, and to eliminate small isolated polygons were compiled within an end-to-end Python script. This way, interfering artifacts (e.g., islands, protrusions, or spikes) stemming from an imprecise segmentation or undetected labels could be handled. The resulting polygons representing (public) buildings and water areas are shown in Fig. 4 and can be processed within future analysis operations.
Fig. 3 Foreground objects separated from the map's background a before and b after eliminating labels
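Two of the vector refinement steps, simplifying outlines and discarding small isolated polygons, can be sketched in plain Python. The paper does not name the algorithms used in its script, so the Douglas-Peucker simplification and the area filter below, together with all function names, tolerances, and coordinates, are assumptions chosen for illustration. The raster-to-vector conversion itself is not shown.

```python
import math


def polygon_area(ring):
    """Shoelace area of a ring given as [(x, y), ...] without a repeated end point."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    distances = [point_segment_distance(p, points[0], points[-1]) for p in points[1:-1]]
    index = max(range(len(distances)), key=distances.__getitem__) + 1
    if distances[index - 1] <= tolerance:
        return [points[0], points[-1]]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


def drop_small(polygons, min_area):
    """Keep only polygons whose area reaches min_area."""
    return [ring for ring in polygons if polygon_area(ring) >= min_area]


# A slightly wobbly outline: the vertex (5, 0.1) is a spike-like artifact.
outline = [(0, 0), (5, 0.1), (10, 0), (10, 10)]
smoothed = simplify(outline, tolerance=0.5)

building = [(0, 0), (10, 0), (10, 10), (0, 10)]
speck = [(0, 0), (1, 0), (1, 1), (0, 1)]
kept = drop_small([building, speck], min_area=4.0)
```

Both thresholds are map-dependent; values that remove artifacts on one scan may remove small but genuine buildings on another.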

Linking Historical and Current Datasets
Compared to previous studies dealing with object extraction from historical maps, we go one step further and present an exemplary way of how qualitative and quantitative evaluations of long-term changes within a cityscape may be practically enabled. We therefore spatially assigned a more recent vector dataset to the historical counterpart as shown in Fig. 7. Our aim was to automate this coarse georeferencing process as far as possible. Due to changing names of roads and buildings over time, the lack of in-depth information, or simply imprecise scales, distances, and directions within historical maps, we used the previously extracted geometries for georeferencing purposes (Rumsey and Williams 2002). As can be seen from Fig. 5, churches and other municipal buildings persist over time and, beyond that, do not substantially change their basic shape and geographic location. Therefore, their object shapes could be matched and used for the definition of control points in the further course of georeferencing (Skopyk 2021; Havlicek and Cajthaml 2014).

Shape Matching
To define matching georeferencing control points between the historical and current dataset, identical real-world objects are to be identified. We, therefore, measured the shape similarity between the extracted public buildings shown in Fig. 5 (Sun et al. 2021; MacEachren 1985). A matching based on spatial or semantic (attribute-based) similarities was impractical due to the lack of a coordinate system as well as further information concerning the historical map.
As Fig. 5 indicates, a side-by-side comparison between geometries of public buildings extracted from the historical map on the one hand and the official vector dataset containing current buildings on the other hand was performed. We implemented a matching of their shapes based on their Intersection over Union (IoU). After adjusting the aspect ratios of corresponding counterparts via rectangular bounding boxes ("envelopes" (Esri 2022)), their respective deviations could be quantified via IoU. As can be seen from Fig. 6, a building geometry and its envelope together form a binary mask consisting of the values 1 (building geometry) and 0 (envelope). A final superimposition of these masks helped to determine their overlapping area (intersection) proportionally to their common area (union) (see Fig. 6). All "building" pixels with a value of 1 were considered for the IoU calculation, which was conducted with the help of Python's numpy library. Table 1 summarizes the IoU results for all detected public buildings that were subsequently used for georeferencing purposes.
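The IoU of two binary masks is the number of pixels that are 1 in both masks divided by the number of pixels that are 1 in at least one of them. The sketch below illustrates this with nested lists so that it runs without dependencies, whereas the study itself relied on numpy. The nearest-neighbour `resample` helper, which brings two envelopes to a common grid, is an assumed stand-in for the aspect ratio adjustment and is not taken from the paper.

```python
def resample(mask, rows, cols):
    """Nearest-neighbour rescaling of a binary mask to rows x cols pixels."""
    src_rows, src_cols = len(mask), len(mask[0])
    return [
        [mask[r * src_rows // rows][c * src_cols // cols] for c in range(cols)]
        for r in range(rows)
    ]


def iou(mask_a, mask_b):
    """Intersection over Union of two equally sized binary masks (1 = building)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += a & b
            union += a | b
    return intersection / union if union else 0.0


# Invented masks: the current footprint has an extension the historical one lacks.
historical = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
current = [[1, 1, 1], [1, 1, 1], [0, 0, 0]]
score = iou(historical, current)  # 4 shared pixels out of 6 occupied pixels

# A coarse mask rescaled to a finer grid keeps its shape.
coarse = [[1, 0], [0, 1]]
fine = resample(coarse, 4, 4)
```

Computing this score for every pair of historical and current public buildings yields a similarity table in which the highest value per row indicates the most plausible counterpart.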

Method Overview
The centroids of those geometries with the closest matches (see highlighted cells in Table 1) were defined as control points for a semi-automated, rough georeferencing between the historical and current dataset. To preserve the objects' shapes and to keep spatial deformations to a minimum within the historical data, an affine transformation of all current buildings and roads was conducted (see Table 1 as well as Fig. 7). In our test case, only three control points with sufficient pointing accuracy could be found; such a small number is quite typical for historical maps. However, if available, a larger quantity of control points is advisable to benefit from over-determination for the transformation process. Figure 7 shows that a georeferencing between historical and current geodata gives the chance to directly compare their contents and, thus, to evaluate changes within an area over time (Iosifescu et al. 2016).
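An affine transformation in the plane has six parameters, so three non-collinear control point pairs determine it exactly: x' = a·x + b·y + c and y' = d·x + e·y + f. The following sketch solves the two resulting 3 × 3 systems with Cramer's rule. The function names and the coordinates are invented for illustration, and with more than three control points a least-squares fit would be used instead.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:col] + [value] + row[col + 1:] for row, value in zip(matrix, rhs)]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(source, target):
    """Parameters (a, b, c, d, e, f) mapping three source points onto three target points."""
    matrix = [[x, y, 1.0] for x, y in source]
    a, b, c = solve3(matrix, [x for x, _ in target])
    d, e, f = solve3(matrix, [y for _, y in target])
    return a, b, c, d, e, f


def apply_affine(params, point):
    """Transform a single (x, y) point."""
    a, b, c, d, e, f = params
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical centroids: current dataset (source) and historical map (target).
current_centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
historical_centroids = [(10.0, 20.0), (12.0, 20.0), (10.0, 23.0)]
params = fit_affine(current_centroids, historical_centroids)
moved = apply_affine(params, (1.0, 1.0))
```

Applying `apply_affine` to every vertex of the current buildings and roads moves them into the pixel space of the historical scan while keeping parallel lines parallel.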

Error Estimation
For a minimum of transformation bias, it is generally recommended to evenly spread georeferencing control points throughout the input (Clark and MacFadyen 2020). We, therefore, evaluated affine transformation results using alternative control points. In Table 2, the resulting offsets between the two datasets are quantified for three different cases. Case a) represents the initial georeferencing with centroids of three public buildings used as control points, as illustrated in Fig. 7. In case b), the upper right centroid (St.-Petri-Kirche) was replaced by the one of another public building located rather at the edge of the input (St.-Jacobi-Kirche, see right margin in Fig. 7), whereas in case c), distinct crossroads close to the map's edges provided an optimum distribution of control points. Due to missing control points in the lower left image area, cases a) and b) resulted in greater deviations compared to c) (see Table 2). However, a visual inspection revealed only minor differences between the three approaches. In view of our objective, which was to roughly locate current map features and to enable a visual comparison of these with their historical counterparts, all three approaches delivered satisfactory results.

Geographic Context
To further assess the quality of the chosen control points, their geographic context was regarded. Based on the model from Samal et al. (2004), a contextual similarity between real-world matching map features was exemplarily computed for case a). Figure 8 shows an example of how a proximity graph, connecting the centroids of four building geometries with that of a known geometry from 5.1, was built for each dataset. The offset between the two overlaying datasets could then be expressed by the length and angle of displacement between corresponding centroids (see Table 3). The largest deviations of up to 1.8 cm (72 m in reality) and 3.4 cm (~ 34 m) on average between historical and georeferenced current dataset were deemed reasonable in view of our application case.
Table 3 Absolute offsets between historical and georeferenced current data, expressed by the displacement vectors' length and angle shown in Fig. 8
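The length and angle of a displacement vector between two corresponding centroids follow directly from their coordinate differences. The sketch below is only an illustration: the angle convention (counter-clockwise from the positive x axis, in degrees) and the sample coordinates are assumptions, as the paper does not specify them.

```python
import math


def displacement(historical, current):
    """Length and angle (degrees, 0-360) of the vector from historical to current centroid."""
    dx = current[0] - historical[0]
    dy = current[1] - historical[1]
    length = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return length, angle


def summarize(pairs):
    """Maximum and mean displacement length over all centroid pairs."""
    lengths = [displacement(h, c)[0] for h, c in pairs]
    return max(lengths), sum(lengths) / len(lengths)


# Invented centroid pairs (historical, georeferenced current) in map units.
pairs = [
    ((0.0, 0.0), (3.0, 4.0)),
    ((10.0, 10.0), (10.0, 8.0)),
]
first_length, first_angle = displacement(*pairs[0])
second_length, second_angle = displacement(*pairs[1])
largest, average = summarize(pairs)
```

Multiplying the lengths by the map's scale denominator converts them from map units into real-world distances.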

Applicability of the Object Extraction Workflow
The following sections demonstrate in an exemplary way the transferability of the object extraction workflow described in 4.2 by means of two more historical map subsets illustrating the same spatial extent of the city of Hamburg (hereinafter named as "map A" and "map B", see Fig. 9a and b, respectively). Minor changes had to be conducted to achieve optimum results for the two different maps.

Map A
Rough building structures could be identified when applying the OBIA workflow to the label-less map A, illustrated in Fig. 9a. However, surface-filling geometries were not detected so that single processing steps had to be modified and added. For instance, to detect the conspicuous hatching of building geometries, a simple line detection algorithm was implemented. Water areas could be identified based on their outstanding hatching pattern consisting of isolated dashes. Small gaps were filled and object contours were closed with the help of morphological closing, which avoids expanding the segmented objects (Gede et al. 2020). The resulting classified image objects can be seen in Fig. 10a.
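Morphological closing is a dilation followed by an erosion with the same structuring element: the dilation bridges small gaps, and the erosion shrinks the objects back to their former extent. The following plain-Python sketch uses a 3 × 3 neighbourhood on a binary mask; it is an illustration of the operation and not the rule set used in eCognition, and the function names and the sample mask are assumptions.

```python
def _neighbourhood(mask, r, c):
    """Values of the 3x3 window around (r, c), clipped at the image border."""
    rows, cols = len(mask), len(mask[0])
    return [
        mask[rr][cc]
        for rr in range(max(0, r - 1), min(rows, r + 2))
        for cc in range(max(0, c - 1), min(cols, c + 2))
    ]


def dilate(mask):
    """A pixel becomes 1 if any pixel in its window is 1."""
    return [
        [max(_neighbourhood(mask, r, c)) for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


def erode(mask):
    """A pixel stays 1 only if every pixel in its window is 1."""
    return [
        [min(_neighbourhood(mask, r, c)) for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


def close(mask):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(mask))


# A contour line with a one-pixel gap at column 4.
contour = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
closed = close(contour)
```

In this example the gap is filled while the line keeps its original start and end columns, which is the property that makes closing preferable to a plain dilation for repairing contours.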

Map B
Due to its monotonous appearance, a straightforward applicability of the workflow described in Sect. 4.2 was not feasible for map B. The dark contours of building objects were extracted by means of thresholding so that their enclosed textures could simply be classified as buildings as well. Also, water areas could be identified by regarding their distinctive texture. Labels were differentiated and classified based on their neighborly relations to buildings and the maps' background (roads and places), respectively. As can be seen from Fig. 10b, these relations were not unambiguous in each case. The map objects' quality highly depends on the map's complexity. With a greater degree of complexity, OBIA results became less satisfying. Apart from visual overload, further challenges may impede a segmentation of historical maps:
• stains, folds, and tears in the maps' original material,
• detailed map objects and symbols (e.g., roofs of buildings, trees, or blades of grass),
• heterogeneous or absent textures, or
• overlaps between labels and other map features.

Potential Future Applications
With the geometries resulting from the workflow described in chapters 4 and 5, valuable analysis and comparison processes concerning urban morphological developments become possible. According to a preceding user study (Schlegel 2019), comparisons between historical and current maps mainly relate to buildings and roads as well as general transformations in the urban structure. Figure 11 shows two potential use cases: Users may select a historical building while, in the background, an intersection algorithm finds appropriate current buildings and outputs related information such as their names or areas (see Fig. 11a). Alternatively, current road names might be queried. By selecting a historical road section, the current road name may be returned from a database using the intersection area between the bounding box of the former and the corresponding line feature of the latter (see Fig. 11b). In both cases, related information is returned from a corresponding database based on the intersection area.
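The building lookup can be approximated by comparing bounding boxes: the current feature whose box shares the largest area with the selected historical feature is returned together with its attributes. This is a simplified sketch that uses envelopes instead of the actual geometries and covers the polygon case only; the function names, attribute keys, and coordinates are hypothetical.

```python
def overlap_area(box_a, box_b):
    """Area shared by two boxes given as (xmin, ymin, xmax, ymax)."""
    width = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    height = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    return max(0.0, width) * max(0.0, height)


def best_match(selected_box, current_features):
    """Attributes of the current feature with the largest overlap, or None."""
    best, best_area = None, 0.0
    for attributes, box in current_features:
        area = overlap_area(selected_box, box)
        if area > best_area:
            best, best_area = attributes, area
    return best


# Hypothetical current buildings: (attributes, bounding box).
current_buildings = [
    ({"name": "Building A", "area_m2": 144.0}, (8.0, 8.0, 20.0, 20.0)),
    ({"name": "Building B", "area_m2": 49.0}, (2.0, 2.0, 9.0, 9.0)),
]

selected_historical = (0.0, 0.0, 10.0, 10.0)
match = best_match(selected_historical, current_buildings)
no_match = best_match((100.0, 100.0, 110.0, 110.0), current_buildings)
```

For production use, a geometry library with true polygon and line intersection would replace the envelope test, since bounding boxes overestimate the overlap of rotated or irregular footprints.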

Discussion
For a considerable enhancement of object extraction processes from historical maps, a preceding elimination of labels is advantageous. Based on the results of a text detection tool used in the course of previous research (Schlegel 2021) and assuming that labels run parallel to roads and edges of buildings, we were able to eliminate straight text.
Because the gray values of labels often do not differ significantly from those of adjacent map objects, such as contours or textures of buildings, labels complicate the extraction of geometries from a historical map. In the present work, text was separated from features of similar color by thresholding techniques so that a more precise object extraction became feasible. This procedure is independent of any preceding map enhancements or georeferencing and is especially efficient for monochrome historical maps with a heterogeneous background. In contrast to other studies, our suggested approach neither mistakenly removes other map features nor substitutes original pixel values. Nevertheless, future research should optimize the preceding text detection so that all map labels are considered for elimination.
A main purpose of this study was to pave the way for a straightforward comparison of large-scale historical maps with recent counterparts. Vectorized and georeferenced map features can subsequently be analyzed and searched. With the help of enhanced object-based image analysis as well as subsequent vector refinement and linking processes, we address this issue within a semi-automated workflow. By applying OBIA approaches, not only spectral but also textural, shape-dependent, and contextual characteristics of map objects are considered for their identification. Available techniques from image and vector processing contributed to an adequate quality of the extracted features and to making a large-scale historical map analyzable and comparable. This is indispensable for investigating urban transformations over time. On the downside, specific software, knowledge, and, in some instances, subjective and individual solutions are required, especially within the object extraction domain. Consequently, a fully automated workflow is not realizable.
A critical look at the results shows that they depend strongly on a map's complexity and on the quality of the underlying bitmap. All processing steps applied to a bitmap are affected by its color depth, format, and resolution (Gede et al. 2020). Further improvements of the suggested methodology may involve additional maps, e.g., showing other cities, as well as additional algorithms.
By roughly georeferencing large-scale historical maps to current maps and overlaying them, a direct comparison of their individual objects is facilitated. We suggest defining georeferencing control points based on the shape and context similarities of map features such as public buildings. It is assumed that these buildings were already classified as such during the preceding object extraction. To measure the similarity of objects between a historical and a current dataset, two methodologies are presented. It was found that corresponding geometries of churches have a high similarity because their shapes usually remain unchanged over centuries. When comparing these for matching purposes, both maps need to have a similar scale and degree of generalization. However, shape similarities are not always unambiguous. Distortions induced by adjusting the geometries' aspect ratios may lead to biased results. Further similarity measures, such as the geographic context, are therefore necessary. Considering the geographic context of objects proved beneficial for a quality check of the control points defined for the final georeferencing. The georeferencing outcomes depend on the subjective choice of reference points. Finally, georeferencing current maps with historical maps does not necessarily improve their accuracy. Depending on the particular application, it must be noted that shapes and lines, distances, or proportions may be distorted. Regarding our objective of comparing historical map content, the presented rough georeferencing proved to be satisfactory for potential future applications.
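To make the role of the control points concrete, the following minimal sketch estimates an affine transformation from matched control points by least squares and reports the remaining offset at each point. It illustrates the general technique only; the parametrization, the solver, and the function names `fit_affine`, `apply_affine`, and `control_point_offsets` are assumptions and do not reproduce the tooling used in this study.

```python
import math


def solve_linear(a, b):
    """Solve the square system a*x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control points are degenerate (e.g., collinear)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_affine(src, dst):
    """Least-squares fit of x' = a*x + b*y + c and y' = d*x + e*y + f.

    src: control points in the historical map, dst: their current counterparts.
    Returns [[a, b, c], [d, e, f]].
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("at least three matched control points are required")
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations: (A^T A) p = A^T b, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(solve_linear(ata, atb))
    return params


def apply_affine(params, point):
    """Transform a single (x, y) point with the fitted parameters."""
    x, y = point
    return tuple(p[0] * x + p[1] * y + p[2] for p in params)


def control_point_offsets(params, src, dst):
    """Euclidean distance between each transformed source point and its target."""
    offsets = []
    for s, d in zip(src, dst):
        tx, ty = apply_affine(params, s)
        offsets.append(math.hypot(tx - d[0], ty - d[1]))
    return offsets


# Hypothetical control points: the target is scaled by 2 and shifted by (10, 20).
historical_pts = [(0, 0), (1, 0), (0, 1), (1, 1)]
current_pts = [(10, 20), (12, 20), (10, 22), (12, 22)]
affine = fit_affine(historical_pts, current_pts)
offsets = control_point_offsets(affine, historical_pts, current_pts)
```

With more than three control points, the offsets generally do not vanish; inspecting them is one way to spot a poorly chosen reference point before the final georeferencing.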

Conclusion and Outlook
A major purpose of this study was to demonstrate the feasibility of a holistic workflow. Within an end-to-end solution, semi-automated approaches to extract and vectorize features from a large-scale, mainly monochrome, historical map were developed and applied for the purposes of knowledge provision, analyzability (e.g., in GIS), and comparability. A concluding georeferencing enabled a straightforward comparison with current counterparts and, subsequently, an understanding of changes within a cityscape over time. Eliminating map labels beforehand and separating them from other objects contributed significantly to improving the extraction results.
It was shown that a rough georeferencing is sufficient for the juxtaposition of historical and current map objects. With the presented methodologies, an appropriate spatial allocation without distorting map objects was achieved. The present findings confirm that each map has an individual need for adaptation, but only minor adjustments are required to apply the suggested approaches to maps with a similar degree of complexity.
Overall, our results make a major contribution to extracting valuable information from large-scale historical maps by combining approaches for text detection (see Schlegel 2021), the elimination of labels, OBIA, raster-to-vector conversions, and an approximate spatial referencing based on similarity measures. We thereby provide a starting point for gaining new insights from large-scale historical maps. It should be emphasized that this research serves as a demonstration of a feasible holistic workflow that paves the way for the analyzability of large-scale historical maps as well as for their comparison with current counterparts. This was implemented by means of an initial example case. In terms of future research, it would be useful to extend the current findings by examining additional maps. Further considerations should also include the practicability of comparison analyses as illustrated in chapter 7.
Funding
Open Access funding enabled and organized by Projekt DEAL. No funds, grants, or other support was received.
Code Availability
All source code and exemplary data sets are openly available for reproducibility at https://github.com/IngaSchl/Object-Extraction.

Conflict of Interest
The authors report there are no competing interests to declare.
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.