Introduction

Encroachment of invasive alien species on protected areas of high nature conservation value has been observed all over the globe (de Poorter 2007; Foxcroft et al. 2013). Invasive alien species occur even in the most inaccessible regions of the world, e.g. isolated ocean islands (Walsh et al. 2008), high mountains (Foxcroft et al. 2013) or polar regions (Rose and Hermanutz 2004). Despite protection regimes, regions of high nature conservation value are continuously exposed to various anthropogenic factors that disturb the ecological balance and enable the dispersal of alien elements (McKinney 2002; Pauchard and Alaback 2004; Foxcroft et al. 2013). Wetlands, which play a key role in biodiversity conservation (Howard 1999), are a particularly important group of ecosystems exposed to invasion of alien species (Dajdok and Pawlaczyk 2009). That is why the Ramsar Convention, in its successive resolutions, addresses the issue of the threats posed by invasive species to wetlands (Howard 1999, COP VIII). Vascular plants are the main group of organisms of alien origin that spread in protected areas and habitats. In Europe, the contribution of alien species to the flora of some protected areas is up to 40% (Foxcroft et al. 2013). Due to such a large contribution as well as the adverse impact of invasive species on native ecosystems, many authors emphasize the necessity of early detection of such species (e.g. Lawrence et al. 2006; Foxcroft et al. 2013; Singh et al. 2015). In Europe, the need to develop and implement a comprehensive strategy of protection against invasive species is recognised by the key international documents dedicated to nature conservation, e.g. the Convention on Biological Diversity (CBD 1992), the Bern Convention on the Conservation of European Wildlife and Natural Habitats (Berne Convention 1979) as well as the EU directives (Habitats Directive 1992) and the most recent initiative on invasive alien species: the EU Regulation 1143/2014, which focuses specifically on the prevention and early detection of alien species. In Central Europe, Spiraea tomentosa is one of the alien species that poses a threat to wetland communities (Dajdok et al. 2011, Wiatrowska et al. 2018). The species is particularly expansive on bogs that have been degraded as a result of land drainage. Few invasive species occur in this type of habitat due to its specific ecological status (Wiatrowska et al. 2018), thus Spiraea tomentosa is a suitable candidate to check whether it is possible to map a species using remote sensing in such ecosystems.

Conventional mapping of invasive species involves on-the-ground fieldwork consisting mainly in the determination of the location of plants with a GPS receiver or visual interpretation of Red, Green, Blue (RGB) images. Such approaches are usually time-consuming and based on expert knowledge, while the size of surveyed areas is limited. Remote sensing offers the possibility of faster detection of invasive species over large areas and provides more reliable data. It also enables frequent updates of existing maps due to the possibility of repeated surveys (Lawrence et al. 2006).

Satellite (Kobayashi et al. 2014) or airborne (Cole et al. 2014) imagery is used for remote identification of vegetation. Multispectral satellite imagery is mainly used to detect homogeneous plant communities (Lawrence et al. 2006). The limitation in this case is the low spectral resolution, which hinders precise identification. Therefore, hyperspectral (HS) imagery is increasingly used for invasive plant species mapping, where electromagnetic radiation reflected from the Earth’s surface is recorded in hundreds of narrow bands that form continuous spectral bands (Vane and Goetz 1993). Multispectral satellite data also have a lower spatial resolution compared to airborne data and may not be sufficient to identify individual species of herbaceous plants, especially in the early stages of invasion when individual patches occupy a small area (Müllerová et al. 2013). Acquisition of airborne HS data ensures a sufficiently high spatial and spectral resolution to identify individual patches even over a small area. Airborne HS imagery was used to identify, among others, invasive shrubs along Missouri highways (Bentivegna et al. 2012) or noxious weeds in grasslands (Mirik et al. 2013).

Airborne Laser Scanning (ALS) data, and in particular a fusion of HS data with ALS products (Asner et al. 2007), are used for species identification as they offer the possibility to analyse vegetation structure. The fusion of HS imagery and ALS products was applied for, among others, the detection of invasive tree species in Hawaiian rainforests (Asner et al. 2008) or shrubs in urban environments in British Columbia, Canada (Chance et al. 2016), or grasses (Marcinkowska-Ochtyra et al. 2018). Due to national and regional ALS campaigns, the coverage of readily available ALS data worldwide is still growing. However, for national purposes in Poland, ALS data are obtained during the leaf-off season. Acquisition of ALS data at the peak of the growing season may increase the usefulness of ALS products in plant classification.

An important aspect of the remote sensing approach to the identification of invasive species is the appropriate selection of classifiers that use the potential of the input datasets, are tailored to the object of the study and are optimal in terms of computational requirements and the duration of the classification process. If the analysed species forms compact patches over an area larger than an image pixel, pixel methods such as random forest (RF, Breiman 2001) or support vector machines (SVM, Vapnik and Lerner 1963) can be used. Otherwise, it is reasonable to use sub-pixel methods, e.g. spectral angle mapper (SAM, Kruse et al. 1993) or mixture tuned matched filtering (MTMF, Boardman 1998), which provide information on the proportion of species identified in each pixel of the image. Moreover, the majority of sub-pixel classification methods require the use of unprocessed HS data, while pixel classifiers offer the possibility of using many different sets of raster data.

The review of the reports on the identification of invasive species suggests that the use of the RF algorithm has usually performed better in terms of statistical accuracy (Lawrence et al. 2006; Singh et al. 2015) and classification time (Sluiter and Pebesma 2010) compared to other classifiers. One example is the study on four invasive species, namely Verbascum thapsus, Urtica dioica, Cirsium vulgare and Marrubium vulgare in the Lava Beds National Monument in California, where high classification accuracies (Kappa coefficient for RF = 0.809) were obtained for RF methods compared to classification trees, logistic regression and LDA (Linear Discriminant Analysis) methods (Kappa coefficient from 0.195 to 0.716; Cutler 2007). Similarly, the highest classification accuracies for 15 different species of Mediterranean shrubs in southern France were obtained using the RF classifier (overall accuracy OA = 49.5%) compared to traditional classifiers, i.e. quadratic discriminant analysis (QDA, OA < 39.3%) and k-nearest neighbours (k-nn, OA < 28.8%; Sluiter and Pebesma, 2010). In addition, a significantly shorter time of the classification process, compared to the SVM and ANN (artificial neural networks) methods, was emphasised.

The objective of the study

The objective of this study was to develop a remote sensing method for mapping invasive Spiraea tomentosa L. In addition, an attempt was made to determine the optimal phenological period for classifying this species. The effect of HS and ALS data fusion on the accuracy of target species classification was also investigated by performing classification on different raster datasets. The potential for applying the developed method was tested by checking at what percentage cover it is possible to identify Spiraea tomentosa using the proposed classification methods.

Study area and target species

The study area of 1845.2 ha (51°22′ N, 15°19′ E) is located within the range of the Lower Silesian forests, Poland (Fig. 1), and lies within a Natura 2000 site (code PLB020005 Bory Dolnośląskie) with an area of 172,093.4 ha. The site comprises one of the largest compact forest complexes in Poland. Infertile, sandy pine forest habitats dominate here. Phytocoenoses of mixed coniferous forests and fragments of deciduous (beech and oak-hornbeam) forests occur in more fertile habitats. The area features extensive heaths and birch woods as well as transition mires in terrain depressions (partly drained) and small patches of alder swamp woods.

Fig. 1

Location of the study area in Poland and the distribution of on-ground reference polygons against the background of the HySpex image (27 August 2016) in the composition of natural colours (RGB). The marked areas a, b and c are discussed in detail in the paper

Spiraea tomentosa (steeplebush) is a shrub originating from North America and naturalised in several European countries (Verloove 2006; Dajdok et al. 2011; Wiatrowska et al. 2018). It is a small shrub (on average about 1–1.5 m high) from the rose family (Rosaceae). It produces numerous underground stolons and usually unbranched, upright (not creeping) shoots. Pyramid-shaped pink inflorescences are formed in mid-July at the tops of the current year's shoots. Fruits are small follicles that may remain attached to plants through the winter (November–February) (Kujawa-Pawlaczyk 2009).

Spiraea tomentosa is one of the invasive alien species that pose a local threat to wetlands in central Europe (Dajdok et al. 2011, Wiatrowska et al. 2018). The high invasiveness of the species is attributed to i.a. the fact that it reproduces quickly and generatively, and at the same time shows the ability to reproduce vegetatively. Each year, a mature specimen of the shrub can produce about one million small seeds with germination capacity estimated at 93% (Wiatrowska and Danielewicz 2016). In addition, the shoots of steeplebush take root after contact with the soil surface, resulting in the formation of so-called layers, which promotes rapid compaction and growth of existing populations of this species (Wiatrowska et al. 2018).

The species penetrates peat bog plant communities, moist coniferous forests, meadows and pastures within woodland glades and pond shores. It is particularly invasive on dry peat bogs, where it can overgrow almost all natural or semi-natural vegetation, eliminate most peat bog species and change the community structure (Dajdok et al. 2011). Among habitats protected under the Natura 2000 Habitats Directive, peat bogs are particularly threatened—including degraded raised bogs still capable of natural and stimulated regeneration (code 7120), active peat bogs—raised with peat-forming vegetation (code 7110), in particular lowland bogs (code 7110-1), lowland transition mires and quaking bogs (mostly with Scheuchzerio-Caricetea vegetation; code 7140-1) and depressions on peat substrates of the Rhynchosporion alliance (code 7150; Wiatrowska et al. 2018).

Few invasive species occur in this type of habitat due to its specific ecological conditions, such as a high groundwater level. According to recent monitoring data acquired in Poland for Natura 2000 habitats (Perzanowska 2017), transition mires are slightly invaded by Alnus rugosa, Erechtites hieracifolia, Solidago gigantea, and Bidens frondosa. Among the invasive species recorded in these habitats, Spiraea tomentosa is listed as the most serious threat, with increasing cover and number of sites since the previous monitoring in 2010. In such non-forest habitats, the steeplebush was mostly observed on the banks of drainage ditches. The extensive system of such man-made environmental structures in the Lower Silesian forests was found to be the most important factor favouring the expansion of S. tomentosa (Wiatrowska and Danielewicz 2016). Therefore, the invasion of the discussed species in the study area is primarily determined by a large proportion of wet habitats connected by a dense network of drainage ditches.

Due to the considerable threat posed by Spiraea tomentosa to valuable ecosystems (e.g. elimination of most peat bog species and changes in the community structure), attempts are being made in Poland to identify (Breigin 2014) and control (Pawlaczyk and Karaśkiewicz 2009) the species at the early stages of its invasion.

The species is also problematic for forest management due to its flammability. Old, dry shoots are often the ultimate cause of forest fires, including in adjacent stands. Moreover, the species is a problem for both tree regeneration and afforestation of former agricultural land (Wiatrowska and Danielewicz 2016). In the study area, Spiraea tomentosa covers mostly dry transition mires.

Materials and methods

The first stage of the study was to acquire HS data, an ALS point cloud and on-ground reference polygons (Fig. 2). On-ground and airborne data were obtained simultaneously (over 7 days for each campaign). Remote sensing data were corrected and pre-processed (for details of this process, see Sławik et al. 2019). Then, many derivative products were generated, e.g. vegetation indices, minimum noise fraction transformation (MNF, Boardman and Kruse 1994) bands, several ALS products regarding habitat conditions (System for Automated Geoscientific Analyses, SAGA products), laser pulse reflection intensities and geometrical relations of vegetation points (Boise Center Aerospace Laboratory, BCAL products). Detailed information about the data used can be found in the subsections below. RF classifications were performed on the combined remote sensing data using a randomly selected 50% of the reference polygons. The other 50% of the reference polygons were used to verify the classification results.

Fig. 2

Workflow used in the study

Acquisition and preparation of on-ground reference data

Spiraea tomentosa was examined twice during the growing season of 2016 at various stages of the life cycle (Table 1). The on-ground reference data were acquired simultaneously with the acquisition of airborne data.

Table 1 Spiraea tomentosa development during the summer and autumn season

On-ground reference polygons were established both for the Spiraea tomentosa sites (116 polygons) and for non-forest vegetation types (300 polygons). Each reference polygon was a circle with a radius of 1–2 m. The standard radius was 2 m; it was reduced, to as little as 1 m, only when the narrow width of a patch made it impossible to establish larger, internally homogeneous plots. Geolocation of the reference polygons was determined using a GPS Mobile Mapper 120 with real-time differentially corrected Global Positioning System (DGPS) and a measurement accuracy ranging from 0.2 to 1 m. The reference polygons were established in the same places during the first (P1) and second (P2) field period. In addition to geolocation, relevant characteristics related to the species and/or the surroundings were noted for each reference polygon during both periods. For Spiraea tomentosa, the percentage cover in a given polygon, divided into dead and living shoots, was noted, and the dominant growth stage was determined as well. In addition, the percentage cover of co-dominant species was recorded.

During the visual analysis of the aerial images, it was noticed that some of the on-ground reference polygons for Spiraea tomentosa were in shadow. The first classification test, based on the 25 MNF bands (P2) and a dataset including all 116 Spiraea tomentosa polygons (50% for training and 50% for validation), produced a highly overestimated result. The overestimation of the target species was visible in the statistical accuracy assessment (UA = 32.76%, PA = 100%, F1 = 49.36%) and in the visual assessment of the classification image, especially in shaded places. For this reason, it was decided to remove the Spiraea tomentosa polygons that were in shade and to create a background subclass for shadows. After the 26 shaded Spiraea tomentosa polygons were removed from the reference set, the overestimation decreased (UA = 82.35%, PA = 83.56%, F1 = 82.96%). The method presented in this article therefore assumes that the species should not be detected in shady places, to avoid overestimation.

Finally, in both periods, 90 reference polygons with varying density (from 30 to 100%) of Spiraea tomentosa were used. Due to the development of Spiraea tomentosa and co-dominant species during the growing season, reference polygons in different periods were characterised by a varied relative abundance of the species. In the summer period (P1), the average percentage cover of the target species in reference polygons was 73%, including 40 polygons with more than 70% cover. In the autumn period (P2), however, the average percentage cover was 65%, including 30 polygons with over 70% cover. A total of 500 background polygons were established in the surveyed area in both periods. The background class included 300 polygons for non-forest types of vegetation found in the study area, including plant communities that may be invaded by Spiraea tomentosa (e.g. wet meadows, dry transition mires, heathland), 80 polygons for forests and 40 polygons for each of the other classes (e.g. soils, water, shadows). A shadow class was added using photointerpretation techniques to mask shaded areas. The proportion between the Spiraea tomentosa polygons and the background classes corresponded to their proportions in the landscape.

Remote sensing data

HS images

HS data were obtained with two sensors, HySpex VNIR-1800 and SWIR-384, in two flight periods (P1 and P2; Table 1). The mean flight altitude was 730 m above ground level. The swath width was 384 m with a planned overlap of 30%. The native pixel size at acquisition was 1 m for the SWIR sensor and 0.5 m for the VNIR sensor. The August dataset was acquired from 9:30 to 10:30 local time (sun altitude 33°–40°) under clear sky conditions, and the September dataset was acquired from 11:00 to 12:00 local time (sun altitude 33°–36°), also under clear sky conditions.

Each time, the data were acquired along thirteen flight lines. The acquired data were pre-processed: first, they were converted to a radiance unit with HySpex RAD software. Parametric geocoding and orthorectification were performed using the digital surface model reconstructed from laser scanning (acquired during the same flight) in PARGE software. Then, the combined data from both sensors were subjected to atmospheric compensation in ATCOR4 software. Finally, all acquired scan lines were mosaicked into one image. This resulted in a 430-band image in the spectral range of 416.18–2396.44 nm and spatial resolution of 1 m.

The minimum noise fraction transformation was performed on the corrected HS images to remove noise and compress the most useful information. Based on the graph of eigenvalues and visual analysis, 25 MNF bands for both periods were selected for further processing. HS vegetation indices (IND) were calculated to acquire additional spectral information on the content of water, pigments or photosynthetic activity of vegetation (Table 2).

Table 2 Remote sensing indices used in the Spiraea tomentosa classification

ALS data

During the same flight, ALS data were also acquired using a Riegl LiteMapper 6800i laser scanner. The scanning angle was 60 degrees and the footprint size was 0.22 m. The expected separation distance between subsequent echoes from the same pulse was 0.4 m. The scanning wavelength of the sensor was 1550 nm. The full waveform was decomposed into a point cloud with a density of 7 pts/m2 using RiAnalyze software. Point cloud classification was performed using TerraSolid software. On the basis of the classified point cloud, the Digital Terrain Model (DTM) was calculated by interpolating points in the ground class using moving planes, and a Canopy Height Model (CHM) was generated. ALS products with a spatial resolution of 1 m were generated from the point clouds using the System for Automated Geoscientific Analyses, SAGA (Olaya and Conrad, 2009), and BCAL Lidar (light detection and ranging) Tools from the ENVI software (BCAL Lidar Tools, 2016; Table 3). The ALS products used in the work provide information on the morphometric, wetness and lighting features of the area and on the vertical structure of vegetation. These terrain metrics are expected to be relevant to the mapping of Spiraea tomentosa because the topography of the area is diverse and the species usually occurs in wet places.
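The CHM step can be illustrated with a minimal sketch. It assumes that the CHM is the per-cell difference between a digital surface model (DSM) and the DTM, clipped at zero, which is a common definition; the text does not state how the CHM was actually computed, and the function name and the toy 1 m grids below are hypothetical.

```python
def canopy_height_model(dsm, dtm):
    """Per-cell height above ground: DSM minus DTM, clipped at zero.

    Both inputs are row-major grids (lists of lists) of elevations in metres.
    Cells holding None in either grid are treated as no-data.
    This is an assumed, simplified definition, not the authors' workflow.
    """
    if len(dsm) != len(dtm):
        raise ValueError("DSM and DTM must have the same number of rows")
    chm = []
    for surface_row, terrain_row in zip(dsm, dtm):
        if len(surface_row) != len(terrain_row):
            raise ValueError("DSM and DTM rows must have the same length")
        out_row = []
        for surface, terrain in zip(surface_row, terrain_row):
            if surface is None or terrain is None:
                out_row.append(None)  # propagate no-data
            else:
                # Small negative differences are interpolation noise.
                out_row.append(max(0.0, surface - terrain))
        chm.append(out_row)
    return chm


# Toy 2 x 2 grids (elevations in metres above sea level, invented values).
dsm = [[151.2, 151.5], [150.9, 150.4]]
dtm = [[150.0, 150.1], [150.2, 150.5]]
chm = canopy_height_model(dsm, dtm)
```

For a shrub that is on average about 1–1.5 m high, heights of this order in the CHM are what one would hope to see over dense patches, although the study found that ALS products added little to the classification.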

Table 3 ALS variables tested in the classification process

Classification process

The supervised RF classifier (Breiman, 2001) from the EnMAP-Box 2.2.1 (Environmental Mapping and Analysis Program, Van der Linden et al. 2015) software was selected to identify Spiraea tomentosa in the study area. The algorithm is effective with diverse classes, such as vegetation, especially when compared to the sub-pixel classifiers. No specific RF parameter tuning was performed and most parameters were left at the commonly used defaults. Each model was trained with 100 trees, using the Gini criterion to determine splits and considering the square root of the number of features at each split. Two classes were included in the classifications: Spiraea tomentosa and the background, which included other plants, water, forest, soils and shadows. As a result, in each period (P1, P2) there were 90 polygons (which cover about 0.001% of the area) for Spiraea tomentosa and 500 polygons (which cover about 0.025% of the area) for the background. The polygons for the target species and for the background were each randomly divided into training (50%) and validation (50%) polygons. Stratified random sampling was used for the Spiraea tomentosa class to ensure comparable representation of polygons with different percentage cover in the training and validation sets. We tested five raster datasets for both periods to see which ones improve the accuracy of species identification (Table 4).
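The stratified 50/50 division of the Spiraea tomentosa polygons can be sketched as follows. This is an illustration only: the 10-point cover classes, the field names `id` and `cover`, the fixed random seed and the toy polygon list are all assumptions, since the text does not describe how the strata were defined.

```python
import random
from collections import defaultdict


def stratified_split(polygons, stratum_of, train_fraction=0.5, seed=0):
    """Randomly split polygons into training and validation sets per stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for polygon in polygons:
        strata[stratum_of(polygon)].append(polygon)

    train, validation = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)
        # Each stratum contributes roughly the same share to both sets.
        n_train = round(len(members) * train_fraction)
        train.extend(members[:n_train])
        validation.extend(members[n_train:])
    return train, validation


def cover_class(polygon):
    """Hypothetical strata: percentage cover binned in steps of 10."""
    return (polygon["cover"] // 10) * 10


# 90 toy polygons with cover between 30 and 100% (invented values).
polygons = [{"id": i, "cover": 30 + (i * 7) % 71} for i in range(90)]
train, validation = stratified_split(polygons, cover_class)
```

Splitting within each cover class, rather than over the whole set at once, keeps sparse and dense patches represented on both sides of the split, which is the stated purpose of the stratification.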

Table 4 Tested raster datasets

RF classifications were performed using training polygons and the above raster datasets. Then, the accuracy was assessed based on the classification result and validation polygons. The following statistical parameters were analysed: overall accuracy (OA), Kappa coefficient, producer’s accuracy (PA), user’s accuracy (UA) and F1 accuracy for each class.
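For reference, these measures can be derived from a two-class confusion matrix as in the sketch below. It uses the standard textbook definitions (Cohen's Kappa from the marginal totals, F1 as the harmonic mean of UA and PA); the text does not give the formulas used by the software, so this is an assumption, and the pixel counts are invented.

```python
def harmonic_mean(a, b):
    """Harmonic mean of two positive numbers (used here for F1)."""
    return 2 * a * b / (a + b)


def accuracy_metrics(tp, fp, fn, tn):
    """Accuracy measures for the target class from a 2 x 2 confusion matrix.

    tp: target pixels classified as target
    fp: background pixels classified as target (overestimation)
    fn: target pixels classified as background (underestimation)
    tn: background pixels classified as background
    """
    total = tp + fp + fn + tn
    oa = (tp + tn) / total
    # Agreement expected by chance, from row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    ua = tp / (tp + fp)  # user's accuracy: 1 minus commission error
    pa = tp / (tp + fn)  # producer's accuracy: 1 minus omission error
    return {"OA": oa, "Kappa": kappa, "UA": ua, "PA": pa,
            "F1": harmonic_mean(ua, pa)}


# Invented counts, for illustration only.
metrics = accuracy_metrics(tp=80, fp=20, fn=10, tn=890)

# Consistency check against values reported for the shadow test (in %).
f1_check = harmonic_mean(82.35, 83.56)
```

Under this definition, the harmonic mean of the reported UA = 82.35% and PA = 83.56% is about 82.95%, which agrees with the reported F1 = 82.96% to within rounding.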

In addition, the validation polygons for Spiraea tomentosa were projected onto the classification output image and it was determined whether they were fully correctly classified (the species was detected within the entire polygon), partially correctly classified (the species was detected within the portion of the polygon) or wrongly classified (the entire polygon was classified as the background). The results were compiled in the form of a bar chart taking into account the percentage cover of the species, which allowed us to analyse the impact of species cover on the correctness of classification.
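The polygon-level check described above can be sketched as follows. The class label `"spiraea"`, the dictionary layout and the toy polygons are hypothetical; the only rule taken from the text is the three-way outcome (entire polygon detected, part of it detected, or none of it).

```python
from collections import Counter


def polygon_status(pixel_labels, target="spiraea"):
    """Classify a validation polygon from the classified pixels inside it."""
    if not pixel_labels:
        raise ValueError("polygon contains no pixels")
    hits = sum(1 for label in pixel_labels if label == target)
    if hits == len(pixel_labels):
        return "fully correct"
    if hits > 0:
        return "partially correct"
    return "wrong"


def tally_by_cover(polygons, target="spiraea"):
    """Count polygons per (percentage cover, status) pair for a bar chart."""
    counts = Counter()
    for polygon in polygons:
        status = polygon_status(polygon["pixels"], target)
        counts[(polygon["cover"], status)] += 1
    return counts


# Toy polygons: a circle of 2 m radius holds roughly 12 one-metre pixels.
validation_polygons = [
    {"cover": 100, "pixels": ["spiraea"] * 12},
    {"cover": 100, "pixels": ["spiraea"] * 11 + ["background"]},
    {"cover": 40, "pixels": ["background"] * 12},
    {"cover": 40, "pixels": ["spiraea"] * 3 + ["background"] * 9},
]
summary = tally_by_cover(validation_polygons)
```

Grouping the three outcomes by cover value gives the counts needed for a stacked bar per cover class.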

Finally, the botanical expert assessment was performed. This assessment consisted in an arbitrary verification of the correctness of the classification by botanists performing measurements in places where reference polygons were not established, and the absence or presence of Spiraea tomentosa was noted during the on-ground measurements.

Results

Results of Spiraea tomentosa classification

The accuracy of the Spiraea tomentosa classification calculated based on the validation dataset is presented in Table 5. Of the five raster datasets tested, the highest accuracies for both measurement dates (P1: Kappa = 76.53%, F1 for the target species = 77.25%; P2: Kappa = 82.35%, F1 for the target species = 82.96%) were obtained for the set consisting of only the 25 MNF transformation bands (Scenario 1; Table 5). This means that adequately processed (noise-reduced and compressed) information from the HySpex hyperspectral images with a spectral range of 0.4–2.5 μm is sufficient to correctly identify Spiraea tomentosa. On the other hand, the raster datasets consisting of ALS products alone (Scenario 3) enabled the identification of S. tomentosa with low accuracy (Kappa = 38.20%) and were characterised by a significant overestimation: the user’s accuracy for both periods was only about 30%.

Table 5 Statistical accuracy for P1 and P2 results. Statistics for both periods are based on a validation dataset containing 45 reference polygons for Spiraea tomentosa and 250 background polygons (150 for other plants, 40 for forests, and 20 for each of the other subclasses, e.g. soils, water and shadows)

The applied IND, which refer to spectral characteristics of vegetation related to, among other things, the content of pigments in leaves, also did not differentiate Spiraea tomentosa from the remaining vegetation background clearly enough to increase the accuracy of identification (Table 5, Scenario 2). Adding further layers, derived from both ALS and IND, to the input set reduced the classification accuracy by up to about 10% in the case of P1 and about 25% in the case of P2. Both the vegetation indices and the laser scanning products caused a significant overestimation of Spiraea tomentosa: UA, after adding ALS and IND to the MNF bands (Scenario 5), was reduced by 18% for P1 and by over 40% for P2. Higher accuracies of S. tomentosa identification were obtained in autumn (P2: Kappa ranged from 41.01 to 82.35%, F1 for S. tomentosa ranged from 43.96 to 82.96%) compared to summer (P1: Kappa ranged from 38.20 to 76.53%, F1 for S. tomentosa from 41.13 to 77.25%). In the summer period, the producer’s accuracy for S. tomentosa was much lower (PA from 68.18 to 71.09%), i.e. the species was much more underestimated than in the autumn period (PA from 78.79 to 84.33%), depending on the raster dataset. On the other hand, the overestimation of S. tomentosa for the best set of rasters was insignificant and ranged from about 10% for the summer period (UA for P1 = 89.11%) to about 17% for the autumn period (UA for P2 = 82.35%), and increased up to approximately 70% after adding additional products.

Distribution of Spiraea tomentosa based on the classification results

The analysis of the classification correctness was also based on the number of correctly classified reference polygons of the species divided into classes according to the percentage cover of a given species and by visual analysis of the classification output map.

The analysis of the classification correctness on the validation set (Fig. 3) shows the relationship between the percentage cover of S. tomentosa in the polygons and the correctness of the classification. The most effective classification is observed in polygons with Spiraea tomentosa cover of 90–100%. Polygons with a lower coverage were mostly correctly detected, but a slight underestimation was observed. The most underestimated polygons (classified as background instead of Spiraea tomentosa) occurred in P1 and had a coverage of 40%.

Fig. 3

Stacked bar chart: the number of validation polygons in P1 and P2 with different Spiraea tomentosa coverage. The bars represent the sum of Spiraea tomentosa validation polygons for each period and species coverage class. The number of correctly, partially or incorrectly classified Spiraea tomentosa polygons can be compared by different shades of gray

The botanical analysis of the results distinguished three basic sites and types of Spiraea tomentosa occurrence:

(1) Large compact patches of Spiraea tomentosa are observed on dry transition mires where no mowing or cutting is implemented. The species occurs there in the form of dense bushes, usually about 1–1.5 m high. The distribution on the classification maps well reflects the cover of S. tomentosa patches (Fig. 4)—in these places Spiraea tomentosa reaches the highest cover of above 90% and the result of classification is highly accurate and similar in both periods.

Fig. 4

Comparison of the results of Spiraea tomentosa identification for selected areas in period P1 and P2

(2) The species abundance is also observed along occasionally mown drainage ditches, where the species forms well-classifiable compact zones of bushes with a height of about 1–1.5 m (Fig. 4). Also in this case the results for both periods are similar.

(3) Wet meadows mown once a year are the third habitat where the species is found. Spiraea tomentosa has much lower cover-abundance values there (30–50%) due to the land-use type. The plants do not have a shrub habit there but usually form single shoots up to 0.8 m high. The results of the classification for Period 2 are correct (Fig. 4), while for Period 1 they are underestimated. This can also be seen in Fig. 3, where better results for polygons with cover-abundance of 30–70% were obtained in autumn, when remote-sensing parameters more easily discriminate Spiraea tomentosa from the background.

Discussion

Single data vs data fusion

The classifications performed according to five scenarios and the obtained accuracy results indicate that the optimal raster dataset for the identification of the target species consists of MNF bands only (Table 5). The study has not confirmed the usefulness of ALS products for the identification of Spiraea tomentosa. This could be caused by the selection of ALS products and the lack of radiometric calibration of the point cloud. In general, a set of easy-to-generate products that could be produced with standard remote sensing software or free software was selected. A more complete set of ALS-derived products could perform better, but the extraction of such products was beyond the scope of this article. The use of ALS data with a higher point density could probably also produce better results of Spiraea tomentosa classification. An additional complication could have been the inclusion of topographic indices (MRRTF, MRVBF, TWI, TI) in the range of products, which probably had no information value in the case of an invasive species with a wide spectrum of tolerance to habitat conditions. A similar conclusion was reached by a team of researchers studying another expansive species, Calamagrostis epigejos (Marcinkowska-Ochtyra et al. 2018). The accuracies of Spiraea tomentosa identification obtained with the use of ALS products alone were much lower (Kappa about 40%) compared to those obtained with the use of HS data alone, and, at the same time, much lower than those obtained by other authors. For example, Zlinszky et al. (2012) presented mapping of Phragmites australis in the vicinity of Lake Balaton in Hungary, where high accuracy (Kappa = 0.80) was obtained using the ALS data and the Decision Tree method. Similarly, Singh et al. (2015) mapped an invasive shrub, Ligustrum sinense, in Charlotte, North Carolina using ALS data and an RF classifier and achieved overall accuracies between 81 and 89%.

The obtained results indicate that using only the HS data yields better results in the classification of Spiraea tomentosa than merging the HS and ALS data (data fusion), both in summer and autumn (Table 5). The lack of an accuracy gain when the ALS data were integrated with the HS data results from the fact that the ALS data, though acquired simultaneously with the HS data, had too little information value to identify the species under study. The effectiveness of data fusion proved to be limited due to the low potential of the ALS data on their own, which is in line with the conclusions reached during the research on the identification of Natura 2000 habitats using HS and ALS data (Sławik et al. 2019). The incorporation of selected vegetation indices into the classification did not produce any positive effect either, regardless of the date of data acquisition.

Influence of data acquisition time on classification results

Statistical analysis of the output images (Table 5) and visual assessment of the spatial distribution of Spiraea tomentosa suggest that the identification results in both measurement periods may be considered correct and acceptable. Thus, in the case of the studied species, the most important variable affecting the classification results was the set of raster data used in the classification rather than the date of data acquisition (Table 5). In addition, this indicates high repeatability and stability of the results regardless of the data acquisition time, which enables the use of the presented methods to monitor invasive species. According to the botanical opinion and statistical accuracy assessment (Table 5), slightly better results of classification were obtained in period P2, where Spiraea tomentosa showed over 6 percentage points lower underestimation error (Table 3).

Differences in the results of classification were particularly visible in areas with smaller cover-abundance of plants, i.e. underestimation in P1 and correct mapping in P2 (Fig. 4). This applies, for example, to wet meadows where Spiraea tomentosa has lower cover-abundance values (30–50%) due to the land-use type (mown once a year). As mentioned earlier, in such places S. tomentosa does not grow in the shape of shrubs, but forms single shoots up to 0.8 m high that are much less distinct from the surrounding vegetation, which caused the species to be underestimated in the classification for period P1.

The higher accuracies of Spiraea tomentosa identification obtained for the autumn period (P2) are probably related to a change in spectral background characteristics, which makes S. tomentosa distinguishable from the surrounding plants. Thus, it has been confirmed that scheduling a survey flight for certain species for the autumn period significantly increases the probability of correct species identification (Niphadkar and Nagendra 2016; Marcinkowska-Ochtyra et al. 2018). Another reason for the better results of Spiraea tomentosa mapping at the end of the growing season may be the fact that herbaceous species have higher cover-abundance and higher levels of green biomass in the summer season, so that the lignified shoots of Spiraea tomentosa are visible only when some of the herbaceous plants wither and die. The effect of green herbaceous plant biomass on the detection of shrub species was also discussed in studies related to Lonicera maackii (Resasco et al. 2007). Moreover, the results of Spiraea tomentosa mapping could have been better if the remote sensing data had been acquired during the flowering peak of the plants. The spectral discrimination of the species from the background might then have been significantly higher, which would have been reflected in markedly better results of the analyses. The flowering period is very often considered the best for identifying species that bloom profusely and massively at the same time (e.g. Hunt et al. 2007, Somodi et al. 2012), such as Spiraea tomentosa.

Applications and limitations

The statistical (Table 5, Fig. 3) and visual map assessment (Fig. 4) indicates that Spiraea tomentosa mapping by remote sensing methods is correct and has potential for practical applications. The statistical accuracies for Spiraea tomentosa reported in this paper are similar to or higher than those reported by other authors who detected invasive plant species using the RF algorithm. For example, during the identification of invasive species in Madison County, Montana, based on hyperspectral CHRIS (Compact High Resolution Imaging Spectrometer) data and the RF algorithm, an overall accuracy of 86% was obtained for Euphorbia esula L. and 84% for Centaurea maculosa Lam. (Lawrence et al. 2006). Similarly high accuracies were obtained for the classification of other plant species from the airborne level using only hyperspectral data (Kopeć et al. 2019) or ALS and hyperspectral data fusion (Marcinkowska-Ochtyra et al. 2018).

The high accuracy of Spiraea tomentosa identification (OA > 98%, Kappa about 80%) may result from low site complexity (spectral variability, species and landscape diversity; Andrew and Ustin 2008) and the specific structure of this species. Due to the characteristic growth type (numerous underground runners that produce dense, upright, unbranched shoots), Spiraea tomentosa forms large and compact single-species patches that distinguish the shrub from the surrounding herbaceous plants. The success of the Spiraea tomentosa mapping in this study was also due to the use of the Random Forest algorithm and hyperspectral data with high resolutions and high information potential. The high spectral and radiometric resolution of HS imagery helped to discriminate S. tomentosa from spectrally similar plants, while the high spatial resolution of airborne data was necessary to detect small or narrow patches of this species, e.g. along ditches (Fig. 4).
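The gap between a very high overall accuracy and a lower Kappa is typical when the target class covers only a small part of the validation set. The following minimal sketch illustrates this with a two-class confusion matrix. The counts are hypothetical and are not taken from Table 5; the function names are illustrative only.

```python
def overall_accuracy(matrix):
    """Share of validation samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's kappa: agreement beyond what the row and column totals give by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts, NOT taken from the study.
# Rows: reference class, columns: mapped class; order: [target species, background].
example = [
    [40, 10],
    [8, 942],
]

oa = overall_accuracy(example)               # 0.982
kappa = cohen_kappa(example)                 # about 0.81
omission = example[0][1] / sum(example[0])   # underestimation of the target: 0.2
```

In this illustrative case the background class dominates the validation set, so chance agreement is already about 0.91 and Kappa remains well below the overall accuracy even though only 18 of 1000 samples are misclassified.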

Pixel-based classification methods, e.g. RF, are best suited for the identification of plant species patches that are larger than 1 pixel of the imagery used and have a high percentage cover (Bradley 2014). High species density affects the purity of pixels and facilitates species identification. This is confirmed by the results of the present study, which show that the number of correctly classified polygons for Spiraea tomentosa increases with increasing percentage cover. The best classification results for S. tomentosa were obtained for polygons with 90% and higher coverage (Fig. 4); however, the results for polygons with lower coverage were also correct. Polygons with a coverage of less than 70% were underestimated, especially in summer. Such results indicate the limitation of the remote sensing method in the identification of sites with low density of a target species, i.e. in the early stages of invasion, when it does not yet dominate over the native vegetation (Ulm et al. 2017). The highest underestimation of the target species is visible in the meadows, where the species reaches on average the smallest coverage and size at the same time. A similar relationship was reported by Skowronek et al. (2017), who identified the invasive bryophyte Campylopus introflexus in the APEX (Airborne Prism EXperiment) hyperspectral imagery. All polygons of this species with a percentage cover of over 66% were classified correctly, but the algorithm made more mistakes when the percentage cover of C. introflexus was smaller. Poorer identification of polygons of plant species with a small percentage cover results from the pixel approach, where each pixel of the image is assigned to the class that constitutes its majority.
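The effect of the pixel approach on sparse patches can be illustrated with an idealised sketch. It assumes that the cover fractions within each pixel are known and that the hard label always follows the dominant class; a real classifier works on spectra and is not this clean. The class names, fractions and function names are hypothetical.

```python
def hard_label(fractions):
    """Return the class with the largest cover fraction inside one pixel."""
    return max(fractions, key=fractions.get)


def mapped_share(pixels, target):
    """Fraction of pixels whose hard label is the target class."""
    hits = sum(1 for fractions in pixels if hard_label(fractions) == target)
    return hits / len(pixels)


# Hypothetical polygons of ten pixels each; cover fractions are illustrative only.
dense_patch = [{"spiraea": 0.9, "herbs": 0.1}] * 10
sparse_patch = [{"spiraea": 0.4, "herbs": 0.6}] * 10

dense_share = mapped_share(dense_patch, "spiraea")    # 1.0
sparse_share = mapped_share(sparse_patch, "spiraea")  # 0.0
```

Under these assumptions, a polygon with 40% cover in every pixel is mapped as containing none of the target species, which is the extreme form of the underestimation described above.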

Research conducted on four other species (Kopeć et al. 2019) indicates that validation polygons with a low proportion of the target species (below 50% cover) were classified correctly in only 9 to 40% of cases. Detection of a species with a small percentage cover is possible using fuzzy methods, which assign a probability of membership for each class to each pixel (Zlinszky and Kania 2016). These methods were not tested in this study, because Spiraea tomentosa did not occur often enough in the study area with a percentage cover of less than 30%. However, the probability of species occurrence determined by fuzzy algorithms also increases with the species percentage cover (Williams and Hunt 2002).
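One simple way to obtain a soft, per-pixel membership from an ensemble classifier is to use the share of votes cast for each class. The sketch below is only an illustration of that idea. It is not the method of Zlinszky and Kania (2016), it was not applied in this study, and the vote counts and the 0.3 threshold are hypothetical.

```python
from collections import Counter


def class_membership(votes):
    """Turn per-tree class votes for one pixel into membership shares that sum to 1."""
    counts = Counter(votes)
    return {label: count / len(votes) for label, count in counts.items()}


def flag_possible_occurrence(membership, target, threshold):
    """True when the target membership reaches the (hypothetical) threshold."""
    return membership.get(target, 0.0) >= threshold


# Hypothetical votes of a 100-tree ensemble for a single mixed pixel.
votes = ["herbs"] * 65 + ["spiraea"] * 35

membership = class_membership(votes)
hard = max(membership, key=membership.get)   # hard label: "herbs"
flagged = flag_possible_occurrence(membership, "spiraea", threshold=0.3)  # True
```

The hard label discards the pixel, whereas the soft membership keeps it as a candidate location of the target species that could be checked in the field.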

Based on the results reported in this paper, it may be assumed that the use of HS data and the RF classification method is a good approach for the identification of the invasive plant Spiraea tomentosa, especially in large and difficult-to-access study areas. Fieldwork is then needed to collect on-ground reference data for training and validation, but it can be limited to a smaller area.

The results of the remote sensing mapping indicate that the method described has potential for identifying the threat associated with the expansion of Spiraea tomentosa in peatlands and other valuable wetland areas. This method can therefore be used in the management of protected areas, especially those difficult to penetrate in the field due to their size and inaccessibility. In addition, based on the example of this species and previous results obtained for other taxa (e.g. Resasco et al. 2007, Niphadkar and Nagendra 2016), it can be concluded that it is possible to monitor invasive species that penetrate non-forest peat ecosystems and whose morphological structure is similar to that of Spiraea tomentosa. This applies especially to species with a similar habit, i.e. shrubs or small trees. In many European countries, invasive species represent an increasing threat to peat bogs as the bogs dry out (Tomassen et al. 2004).

For example, a survey by Fernandez et al. (2012) found invasive species on 35 out of 48 raised bogs surveyed in the Irish midlands. The most common invasive species recorded were Pinus contorta, Rhododendron ponticum and Sarracenia purpurea. The first two of these species, i.e. Pinus contorta and Rhododendron ponticum, have a habit similar to that of Spiraea tomentosa, and Rhododendron ponticum is the main species that has caused problems for peatland conservation in Ireland (Malon and O’Connell 2009). In Central Europe, Cornus sericea poses a similar threat to peatlands and can be found, for example, in peat bogs in the Biebrza National Park (Brzosko et al. 2016).

Conclusions

The remote sensing study described in this paper has shown that it is possible to precisely map Spiraea tomentosa based on HS data only. It has been proven that the fusion of HS data with basic ALS products does not effectively improve the classification results. It has also been shown that it is possible to identify the species using the RF algorithm for cover-abundance of more than 30%. Similar accuracy results were obtained for summer (P1) and autumn (P2) data. Nevertheless, taking into account all the applied accuracy measures, the data acquired during the autumn period (P2) were significantly better compared to those acquired during the summer period (P1). The results indicate that the correct classification of this species is possible even outside the flowering period. Also, the time of airborne data acquisition proved an important factor affecting the possibility of mapping the species with lower cover-abundance (30–50%). Out of the two analysed seasons (summer: fruiting period; autumn: discolouration period), better results were obtained for polygons with cover-abundance between 30 and 70% in autumn, when remote-sensing parameters more easily discriminate Spiraea tomentosa from the background. With over 70% cover-abundance of Spiraea tomentosa, the results of the classification were correct in both periods. The results prove that the method has great potential for application and that, due to the high information capacity of HS data and a well-scheduled survey flight, it is possible to map this species even in the early stages of invasion.