1 Introduction

Widespread challenges related to sustainable forestry and forest management, conservation and restoration of habitats, biodiversity maintenance, land management and monitoring, and climate change have generated a global need for classification of natural habitats and ecosystems (Vaz et al. 2015; Flanagan et al. 2015; Buffa and Villani 2012; Dengler et al. 2009; Song et al. 2005). Ecological or habitat classification aims primarily at creating a “common language” (Flanagan et al. 2015; Barbati et al. 2014; Mc Roberts et al. 2011; Shumchenia and King 2010) that can be used to describe, assess, monitor, and manage ecological communities.

A classification is needed to describe the diversity of forest ecosystems, to analyze their change over long time series and across the full range of spatial and geographic scales, and to provide knowledge of reference conditions and current states of ecosystems in support of decisions about conservation and resource management (Faber-Langendoen et al. 2014). Vegetation classification systems are based on a set of criteria, including physiognomy (growth forms, structure) and floristics (compositional similarity and characteristic species combinations), in conjunction with ecological characteristics, including site factors, disturbance, bioclimate, and biogeography (Faber-Langendoen et al. 2014; Barbati et al. 2014; Chytrý 2012; EEA 2006).

More specifically, a “forest type classification scheme” is aimed at stratifying a large forest area (e.g., stocked forest land) into smaller and more homogeneous units to facilitate the analysis, interpretation, and reporting of forest data (Barbati et al. 2014; Mc Roberts et al. 2011).

In this regard, the increasing availability of large forest plot datasets all over the world (Ewald 2003) could be the basis for a well-founded supra-regional vegetation classification (Dengler et al. 2009). Several authors have proposed different approaches that allow the consistent classification of such data (Bruelheide 2000; Chytrý et al. 2002; Dengler et al. 2009; Knollovà et al. 2005; Illyés et al. 2007). The vegetation classification over large geographical areas and whole continents is nowadays a realistic aim (Dengler et al. 2009; Rodwell et al. 2002; Peet et al. 2001).

The revision of the European Nature Information System (EUNIS) terrestrial habitat classification, based on georeferenced vegetation plot data, has recently led to the identification of 37 distinct forest habitat types (EUNIS level 3), considering stocked forest land only (EEA 2015a, 2015b). The EUNIS classification provides a pan-European reference for documenting, monitoring, and assessing the quality of habitats at the European level (http://eunis.eea.europa.eu/). On the other hand, the EUNIS forest types are difficult to use in monitoring or inventories as reference classification due to, e.g., a high number of classes and different spatial coverage.

The European Forest Types (EFT) classification system has proven to be operational and easier to use to facilitate understanding, interpretation, and communication of data on indicators describing the status and trends of forests, and forest management in Europe. The EFT were proposed (EEA 2006; Barbati et al. 2007) as a reference scheme for reporting on sustainable forest management (SFM) indicators, including those related to the assessment of biodiversity, according to ecologically homogeneous strata at pan-European level (EEA 2006; Forest Europe 2011; Mc Roberts et al. 2011). A recent revision of the EFT classification allows classifying stocked forest land of the pan-European region into 14 categories and 78 types (Barbati et al. 2016; Caudullo et al. 2016). The 14 categories represent groups of ecologically distinct forest communities dominated by specific assemblages of tree species, either native to Europe (categories 1–13) or introduced/alien (category 14). The exact definition of boundaries between classes is a common problem in nomenclature schemes adopted in vegetation classification (Mucina 1997). In the EFT scheme, a set of classification rules is provided with the nomenclature, in order to allow the user to assign forest plots to categories and types in the field (Barbati et al. 2006). The set of classification rules is based on (i) the main factors (biogeographical, climatic, elevation, edaphic conditions) that determine the latitudinal/altitudinal zonation of European forest vegetation or, otherwise, the appearance of azonal communities (hydrological regimes); and (ii) the identification of the dominant tree species (a single species or a group of species), based on basal area/crown cover information.

The 78 types are designed to further describe the variety of forest communities covered by each category, and the breaking points between different types are defined in terms of species composition. The European Atlas of Forest Tree Species recently documented, by means of a tree species matrix, the list of dominant or secondary tree species that characterize each class of the EFT nomenclature (Pividori et al. 2016).

We refer to Barbati et al. (2014) for a detailed description of the main characteristics of each category and type; nevertheless, key ecological and compositional features of the 14 categories are briefly summarized in Annex 1.

EFT were extensively used in several research projects, for example, for documenting the presence of mixed-species stands in European forests (Bravo-Oviedo et al. 2014), quantifying fuel flammability of forest vegetation (Corona et al. 2014), and defining post-fire management measures in forest of burned areas (Barbati et al. 2010).

In the State of European Forests 2011 (SOEF 2011) (FOREST EUROPE 2011), EFT were operationally implemented in a pilot project to report on some of the SFM quantitative indicators: forest area and its variation, growing stock volume, and deadwood (Barbati et al. 2011). Twenty-eight countries reported the indicators stratified according to the 14 EFT categories. These countries represent approximately 83% of the EU forest area (excluding the Russian Federation). The EFT made it possible to report the pan-European SFM indicators according to ecologically sound units that frame indicator values. However, EFT-based reporting has not been implemented in the latest State of European Forests (SOEF 2015; FOREST EUROPE 2015). The SOEF 2015, in fact, applies a very simplified forest type classification, and only for reporting on forest area (predominantly coniferous, predominantly broadleaved, and mixed forests) and growing stock (broadleaves and conifers). As a consequence, the aggregation of indicator values into broadly defined species groups reduces the information conveyed, especially in countries whose forest area is characterized by high variability in ecological conditions and level of naturalness. For instance, this can lead some countries to aggregate SFM indicators of non-native coniferous plantations (e.g., Sitka spruce, Douglas fir) with those of native conifer stands (e.g., Scots pine, fir, spruce). Likewise, in countries with a wide bioclimatic variability (e.g., Italy, France, and Spain), the indicators of coniferous forests of the Alpine biogeographical region are averaged with those of the Mediterranean ones. In addition, the three forest type classes adopted by SOEF 2015 (FOREST EUROPE 2015) are not helpful in monitoring differential phenomena of gains and losses in forest area, e.g., those related to climate change. That is, a stable value of broadleaved-dominated forest area may actually hide gains and losses in the forest area covered by drought-sensitive and drought-tolerant forest types (e.g., a decline and substitution of beech forest by thermophilous deciduous species).

Although the motive for reporting SOEF indicators by these broadly defined forest types is not stated in any official document, it is reasonable to assume that the need to reduce the reporting burden for national correspondents was a key factor. In fact, experts responsible for compiling country-level information for the SOEF process complained, in personal communications, about the lack of clear quantitative criteria to unambiguously assign sampling units (e.g., National Forest Inventory plots) to the EFT classes. For this reason, it appears particularly useful to develop a straightforward method for the objective classification of European forests according to the EFT system of nomenclature.

Classification of a forest area according to a given forest type scheme is usually accomplished by the field crew during a field survey (Westfall 2009). However, this step is subjective and relies on the personal experience of the personnel involved (Hédel 2007; Westfall 2009). Consequently, when large monitoring programs are carried out by different field crews, the information collected in the field can be inconsistent, because different crews may assign the same forest area to different forest types (Westfall 2009).

Remote sensing techniques and Geographic Information Systems (GIS) support the production and availability of digital maps of environmental variables (Ioannis et al. 2006) such as biogeographical regions (EEA 2016), climate data (Hijmans et al. 2005), global bioclimatic classification systems (Rivas-Martínez et al. 2004), orography, geology, soils (Panagos et al. 2011), water bodies (EEA 2013a), wetlands (EEA 2013b), and tree species distribution (San-Miguel-Ayanz et al. 2016). This information opens up new possibilities to develop automated vegetation classification models (Millington et al. 2002), to map the presence and abundance of a given species (Duveneck et al. 2015), community presence (Ackers et al. 2015; Zimmermann and Kienast 1999), community structure and composition (Ohman and Gregory 2002; Adamo et al. 2015), or forest types (Beard et al. 2013).

Formalized classification systems that attempt to derive vegetation maps at the continental scale have been evolving in recent years (Jiménez-Alfaro et al. 2014), since they are crucial for supporting nature conservation actions and forest planning in the European Union (Douda et al. 2016).

Several methods have been applied to test the automated classification of vegetation and forest habitats on the basis of data mining or artificial intelligence approaches in a GIS environment (Adamo et al. 2015; Openshaw and Openshaw 1997). Automated classification methods most commonly consist of induction rules, decision trees, random forests, rule-based expert systems (RBES) (Pérez-Ortiz et al. 2016; Capelo et al. 2007; Cook et al. 1996; Andrew 1996), artificial neural networks (Wang et al. 2005), genetic algorithms (Chen and Yao 2008), and support vector machines (Torbick et al. 2007). Among these, RBES, also known as knowledge-based systems (Andrew 1996; Capelo et al. 2007; Pérez-Ortiz et al. 2016), are one of the most widely applied approaches for vegetation classification, since they solve problems through rules derived from human expert knowledge (Grunwald 2009; Hayes-Roth 1985).

An expert system can be defined as “a hierarchy of rules, or a decision tree that describes the conditions under which a set of low level constituent information (user-defined variables, raster imagery, vector coverages, spatial models, external program and simple scalars) gets abstracted into a set of high level information classes” (Bingyuan et al. 2014). Conventional RBES uses human expert knowledge to solve real-world problems that normally would require human intelligence.

Several scientists demonstrated the potential of RBES applied to spatial data to improve the accuracy of landscape and ecological classifications (Gao et al. 2004). RBES offers a way to codify information and use it to make predictions (Czajkowski et al. 2007) based on basic axioms of logical set theory usually consisting of if-then rules (Robinove 1986). Further, it offers important advantages in terms of simplicity and computational efficiency (Robinove 1986).

The aim of this study was to develop and test a spatially explicit algorithm that automatically classifies a given forest habitat according to the EFT nomenclature (Barbati et al. 2014). Two different non-parametric classification approaches were tested: (i) a rule-based expert system (RBES) classification algorithm and (ii) a Random Forests (RF) classification tree.

The two systems were tested to classify by EFT categories 2010 level I plots from the Forest Focus/International Cooperative program on assessment and monitoring of air pollution effects on forest (ICP Forests), based on data collected in the field in the framework of the BioSoil-Biodiversity project (JRC 2011). After having analyzed the results achieved, the automated classifier that performed better (the RBES) was then tested for a pixel-based classification of a raster pan-European distribution map of beech-dominated forests into the two corresponding EFT categories (6—beech forest and 7—mountainous beech forest).

2 Materials and methods

2.1 Study area

The two tested classification algorithms, RBES and RF, have been developed to classify stocked forest land of the pan-European region according to the EFT classification. The available sample data to test the classifiers cover 17 EU countries where ICP Forests Level I data on forest structural variables have been collected in the framework of the BioSoil Forest Biodiversity Demonstration project (JRC 2011): Austria, Belgium, Czech Republic, Denmark, Estonia, France, Finland, Germany, Ireland, Italy, Latvia, Poland, Slovak Republic, Slovenia, Spain, Sweden, and UK. These countries on the basis of the SOEF 2015 (with the exclusion of the Russian Federation) represent 52% of the European area and 62% of the European forest area (Forest Europe 2015).

2.2 Data

2.2.1 ICP level I biosoil biodiversity plots

The two systems were tested on a set of 2010 Level I plots of the Forest Focus/International Cooperative Programme on Assessment and Monitoring of Air Pollution Effects on Forests (ICP Forests), surveyed in the field between 2005 and 2008 in the framework of the BioSoil-Biodiversity project. As input data, we used the basal area per tree species and the best available wall-to-wall pan-European digital maps of several environmental variables.

Raw data were obtained and pre-processed in the framework of the official activities of the UNECE ICP Forests Network (Working Group on Forest Biodiversity 2007). The BioSoil plot is a circle with an area of 2000 m² (radius of 25.24 m) (Puletti et al. 2017). The database includes information on a number of structural and compositional variables (Bastrup-Birk et al. 2007; Hiederer and Durrant 2010). For testing the RBES algorithm, we used the following data: geographic position of the plots (x and y coordinates), diameter at breast height (1.30 m) (DBH), and species of each callipered tree. A total of 105 forest tree species were found in the field in the 2010 plots.

In the BioSoil project, each plot in the field was assigned to an EFT category according to the 14 categories documented in EEA (2006). For the purposes of this study, the information collected in the field is considered error free and is used as reference truth to quantify the accuracy of the classification algorithms.

2.2.2 Current beech forest distribution map

The current beech forest distribution map was created by the JRC at 1-km resolution in the framework of the European Atlas of Forest Tree Species (San-Miguel-Ayanz et al. 2016). The map was used to test the automated classification at pixel level. The map relies on statistical interpolation of field observations and remotely sensed information, performed with the Constrained Spatial Multi-Frequency Analysis (C-SMFA). Further details are provided by de Rigo et al. (2016). To test the classification system, we used the pixels where the relative probability of presence of beech is ≥ 50% (i.e., pixels of the map with a value ≥ 0.5) in order to derive a map for beech-dominated forest areas.
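The thresholding step described above can be sketched as follows. This is a minimal illustration in Python, not the authors' code: the grid is a plain nested list standing in for the 1-km raster, the function name `beech_dominated_mask` is hypothetical, and the use of `None` for NoData pixels is an assumption.

```python
def beech_dominated_mask(prob_grid, threshold=0.5):
    """Return a boolean grid that is True where the relative probability
    of presence of beech is at least `threshold`.

    prob_grid: list of rows, each a list of floats in [0, 1] or None
    (None is an assumed NoData marker, not a convention from the source).
    """
    return [
        [(p is not None and p >= threshold) for p in row]
        for row in prob_grid
    ]


# Toy 2 x 3 grid of relative probabilities of presence
grid = [
    [0.10, 0.50, 0.82],
    [None, 0.49, 0.67],
]
mask = beech_dominated_mask(grid)
```

Only the pixels flagged `True` would then be passed on to the classifier as beech-dominated forest.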

2.2.3 Digital maps

Six digital thematic maps were used in this study as input geodatasets for the automated classifiers.

The first is the map of biogeographic regions of Europe, prepared at a scale of 1:10,000,000 by the EEA (EEA 2015a, 2015b), which contains the official delineations of nine regions. The map was developed for the Habitats Directive (92/43/EEC), including the design of the Natura 2000 and EMERALD networks related to the implementation of the Convention on the Conservation of European Wildlife and Natural Habitats (Bern Convention) (Council of Europe 1979). The second is the Bioclimatic Map of Europe (Rivas-Martínez et al. 2004), at a 1:16,000,000 scale, which identifies the thermoclimatic belts of Europe in 5 regions, 9 subregions, 34 provinces, and 88 subprovinces. The third is the Natural Vegetation map of Europe, at a scale of 1:10,000,000 (Bohn et al. 1994), which was used to map the potential range of distribution of oligotrophic soils where acidophilous oakwoods can occur (class F1, see rule 5 described below).

From the Copernicus Land Monitoring Services project at the European Environment Agency (EEA) (http://land.copernicus.eu/pan-european/high-resolution-layers), we used two high-resolution layers, both available as raster boolean maps having a geometric resolution of 20 m: (a) wetlands and (b) permanent waterbodies. Finally, we used the pan-European Digital Elevation Model (DEM) with an original spatial resolution of 1 arc sec available at EEA (2010).

2.2.4 Non-native tree species database

We used the list of tree species identified as alien at country level by the project DAISIE—Delivering Alien Invasive Species Inventories for Europe (http://www.europe-aliens.org/). The 105 tree species surveyed in ICP plots were accordingly reclassified as non-native (if included in the DAISIE list) or native (if not included in the DAISIE list) in the 17 investigated countries.

2.2.5 Methodology

For each plot, we calculated the percent basal area of each tree species surveyed out of the total basal area of the plot and, on the basis of the geographic location of the center of the plots, we extracted the information from the six digital maps.
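As an illustration of the basal-area step, the sketch below computes per-species shares of plot basal area from callipered trees. It is written in Python for illustration only (the study itself used R); the function names are hypothetical, and DBH is assumed to be recorded in centimetres, which the text does not state.

```python
import math
from collections import defaultdict


def basal_area_m2(dbh_cm):
    """Stem cross-sectional area at breast height (m2) from DBH in cm
    (unit is an assumption of this sketch)."""
    return math.pi * (dbh_cm / 200.0) ** 2


def percent_basal_area(trees):
    """trees: iterable of (species, dbh_cm) pairs for one plot.
    Returns {species: percent of total plot basal area}."""
    totals = defaultdict(float)
    for species, dbh_cm in trees:
        totals[species] += basal_area_m2(dbh_cm)
    plot_total = sum(totals.values())
    if plot_total == 0:
        return {}
    return {sp: 100.0 * ba / plot_total for sp, ba in totals.items()}


# Toy plot: two beech trees and one spruce
plot_trees = [
    ("Fagus sylvatica", 40.0),
    ("Fagus sylvatica", 30.0),
    ("Picea abies", 30.0),
]
shares = percent_basal_area(plot_trees)
```

Because basal area scales with the square of the diameter, the two beech trees account for 2500/3400 of the plot basal area in this toy example.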

2.2.6 Rule-based expert system EFT algorithm

According to Barbati et al. (2014), a species is considered dominant when its basal area per plot represented at least 50% of the total plot basal area. Species that represented less than 5% of the total plot basal area were not considered. The RBES algorithm is a system of conditional (if-then) rules which recursively partitions the data set into smaller subdivisions on the basis of a set of rules defined at each branch of the decision tree.
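The two thresholds can be expressed compactly. The following Python sketch is an assumption-laden illustration rather than the published implementation: the function name is hypothetical, and it handles only the single-species case, not dominance by a group of species.

```python
def dominant_species(shares, dominance=50.0, minimum=5.0):
    """shares: {species: percent of plot basal area}.

    Species below `minimum` percent are discarded; a species is returned
    as dominant only if it reaches at least `dominance` percent.
    Returns (dominant_species_or_None, considered_shares).
    """
    considered = {sp: p for sp, p in shares.items() if p >= minimum}
    if not considered:
        return None, considered
    top = max(considered, key=considered.get)
    if considered[top] >= dominance:
        return top, considered
    return None, considered


dom, kept = dominant_species(
    {"Fagus sylvatica": 73.5, "Picea abies": 23.0, "Sorbus aucuparia": 3.5}
)
mixed, _ = dominant_species({"Quercus robur": 40.0, "Betula pendula": 35.0,
                             "Pinus sylvestris": 25.0})
```

In the first toy plot beech is dominant and rowan is ignored; in the second no single species reaches the 50% threshold, so no dominant species is returned.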

  1. Rule 1:

    we classified in forest category 14 (introduced and alien tree species forest) all the plots where the dominant species was not native at the country level, according to the DAISIE database.

  2. Rule 2:

    forest categories 11–12 correspond to azonal forest communities characterized by specific hydrological regimes. We assigned to category 12 (floodplain forests) the plots within a 1-km buffer created from the COPERNICUS permanent waterbodies high-resolution layer (EEA 2013a, 2013b), and in category 11 (mire and swamp forest), the plots falling within areas mapped by the COPERNICUS wetland high-resolution layer (EEA 2013a, 2013b).

  3. Rule 3:

    forest categories 1, 2, and 3 are defined on the basis of the biogeographical regions and species composition. The geographic boundaries of the Boreal and of the Alpine biogeographical regions, required for the classification of categories 1 (boreal forest) and 3 (Alpine coniferous forest) respectively are taken from the map of biogeographic regions of Europe (EEA 2015a, 2015b). The limit of the hemiboreal area where category 2 (nemoral coniferous and mixed broadleaved coniferous forest) is distributed is instead delineated from the unit “temperate hemiboreal bioclimate” of the Bioclimatic Map of Europe (Rivas-Martínez et al. 2004).

  4. Rule 4:

    forest categories 6 and 7 are beech-dominated communities with a different altitudinal zonation. In Mediterranean and Alpine biogeographic regions, the plots with altitude less than 650 m a.s.l. were assigned to category 6, the others to category 7; in Continental, Atlantic, Pannonian, Hemiboreal, and Boreal regions, the altitude threshold for category 6 was set to 350 m.

  5. Rule 5:

    plots were assigned to category 4 (acidophilous oak and oak-birch forest) on the basis of species composition and spatial congruence with the geographical distribution of acidophilous oakwoods (F1 class) delineated in the Natural Vegetation map of Europe (Bohn et al. 1994).

  6. Rule 6:

    to classify the remaining categories 5 (mesophytic deciduous forest), 8 (thermophilous deciduous forest), 9 (broadleaved evergreen forest), and 13 (non-riverine alder, birch, or aspen forest), we used the information on the tree species composition only.

The information available in the matrix of European Forest Types’ dominant tree species published in the European Atlas of Forest Tree Species (Pividori et al. 2016) was used for identifying dominant trees to classify the plots. The system was implemented in R, using the packages raster (Hijmans 2015), shape (Soetaert 2014), and spatial.tools (Greenberg 2014). The overall decision tree used in the RBES is available as Annex 2 to this contribution.
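To make the if-then structure concrete, the sketch below encodes only the rules whose thresholds are fully stated in the text (rules 1, 2, and 4). It is a Python illustration, not the authors' R code. The order in which rules are evaluated, the precedence of the waterbody buffer over the wetland layer, the dictionary keys, and the restriction of "beech" to Fagus sylvatica are all assumptions of this sketch; rules 3, 5, and 6 depend on the species matrix and are deliberately left out.

```python
BEECH = "Fagus sylvatica"  # assumed stand-in for beech-dominated plots
THRESHOLD_650 = {"Mediterranean", "Alpine"}
THRESHOLD_350 = {"Continental", "Atlantic", "Pannonian", "Hemiboreal", "Boreal"}


def classify_plot(plot, alien_by_country):
    """Return an EFT category (int) or None if the sketched rules do not apply.

    plot: dict with hypothetical keys 'dominant_species', 'country',
    'near_waterbody', 'in_wetland', 'bioregion', 'elevation_m'.
    alien_by_country: {country: set of non-native species}, standing in
    for a DAISIE-style lookup.
    """
    dominant = plot.get("dominant_species")

    # Rule 1: dominant species not native at country level -> category 14
    if dominant in alien_by_country.get(plot.get("country"), set()):
        return 14

    # Rule 2: azonal communities from the hydrological layers
    if plot.get("near_waterbody"):   # within the 1-km waterbody buffer
        return 12
    if plot.get("in_wetland"):       # inside the wetland layer
        return 11

    # Rule 4: beech-dominated plots split by an elevation threshold
    if dominant == BEECH:
        region = plot.get("bioregion")
        if region in THRESHOLD_650:
            limit = 650
        elif region in THRESHOLD_350:
            limit = 350
        else:
            return None
        return 6 if plot["elevation_m"] < limit else 7

    return None  # rules 3, 5, and 6 are not sketched here


aliens = {"Ireland": {"Picea sitchensis"}}  # illustrative entry only
sitka = classify_plot({"dominant_species": "Picea sitchensis",
                       "country": "Ireland"}, aliens)
low_beech = classify_plot({"dominant_species": BEECH, "country": "Italy",
                           "bioregion": "Alpine", "elevation_m": 600}, aliens)
high_beech = classify_plot({"dominant_species": BEECH, "country": "Germany",
                            "bioregion": "Continental", "elevation_m": 600},
                           aliens)
```

The same 600 m elevation yields category 6 in the Alpine region and category 7 in the Continental region, which is the behaviour rule 4 describes.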

2.2.7 Random Forest classification tree

The RF classification tree is a non-parametric “ensemble learning” algorithm based on a machine learning approach (Breiman 2001). Classification trees create a set of rules for the binary partitioning of the input dataset into regions (called nodes) that are as homogeneous as possible on the basis of the available predictor variables (Breiman et al. 1984). RF is an enhancement of traditional decision trees because it consists of a large number of trees, and the final classification is determined by the majority of the results obtained by the individual trees. In this study, we used RF as implemented in the randomForest package for R, with 500 trees (Liaw and Wiener 2002). The code makes it possible to assess the total accuracy of the classification with a bootstrapping approach and the importance of each specific predictor through the Mean Decrease Gini coefficient (Liaw and Wiener 2002).
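Two ingredients of this description, the majority vote across trees and the Gini impurity decrease produced by a single split, can be illustrated with a few lines of Python. This is a didactic sketch with hypothetical function names, not the randomForest package; the Mean Decrease Gini of a predictor aggregates such decreases over all splits that use the predictor and over all trees.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Final class = the label predicted by most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease obtained by splitting `parent` into two children."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted


votes = [6, 6, 7, 6, 7]            # toy predictions from five trees
winner = majority_vote(votes)

parent = [6, 6, 7, 7]              # toy node with two EFT categories
decrease = gini_decrease(parent, [6, 6], [7, 7])
```

A split that separates the two categories perfectly removes all of the parent node's impurity, which is why predictors that produce such splits accumulate a high Mean Decrease Gini.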

2.2.8 Accuracy assessment

The accuracy of the two classification approaches was evaluated against the EFT classification at category level assigned in the field. To do so, we first created a confusion matrix comparing the 2010 plots in terms of field-based and algorithm attributions to EFT categories. On the basis of the confusion matrix, we calculated the overall accuracy as the percent of plots, where the two classifications agree out of a total of 2010 plots and the kappa index of agreement (KIA) (Cohen 1960); for each category, we also calculated omission (EO) and commission (EC) errors and the user and producer accuracy.
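The accuracy measures listed above can all be derived from the paired field and algorithm labels. The Python sketch below is a minimal illustration with hypothetical names and toy data; it is not the code used in the study.

```python
from collections import Counter


def accuracy_report(reference, predicted):
    """Overall accuracy, Cohen's kappa, and per-class accuracies/errors.

    reference: labels assigned in the field; predicted: algorithm labels.
    """
    n = len(reference)
    pairs = Counter(zip(reference, predicted))   # confusion matrix cells
    ref_tot = Counter(reference)
    pred_tot = Counter(predicted)
    classes = sorted(set(reference) | set(predicted))

    observed = sum(pairs[(c, c)] for c in classes) / n
    expected = sum(ref_tot[c] * pred_tot[c] for c in classes) / n ** 2
    kappa = (observed - expected) / (1 - expected) if expected < 1 else None

    per_class = {}
    for c in classes:
        hits = pairs[(c, c)]
        producer = hits / ref_tot[c] if ref_tot[c] else None
        user = hits / pred_tot[c] if pred_tot[c] else None
        per_class[c] = {
            "producer": producer,
            "user": user,
            "omission": None if producer is None else 1 - producer,
            "commission": None if user is None else 1 - user,
        }
    return {"overall": observed, "kappa": kappa, "per_class": per_class}


# Toy example with two categories and eight plots
field = [6, 6, 6, 7, 7, 7, 7, 6]
algo = [6, 6, 7, 7, 7, 7, 6, 6]
report = accuracy_report(field, algo)
```

In this toy case six of eight plots agree (overall accuracy 0.75), while chance agreement is 0.5, so kappa is 0.5; this shows how kappa can be markedly lower than overall accuracy.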

We followed the same approach to evaluate the accuracy of the beech map produced for the two EFT categories (6 and 7). In this latter case, we compared the result of the classification for the pixels corresponding to the position of the 2010 ICP plots.

Data availability

The R code of the algorithm is available from the corresponding author upon request.

3 Results

3.1 ICP classification

The overall accuracy of the EFT classifications was high, 0.94 and 0.92, and the KIA was 0.64 and 0.60 for RBES and RF, respectively (Tables 1 and 2, Fig. 1).

Table 1 The confusion matrix is calculated by comparing the EFT category assigned to forest plots in the field with the EFT category returned by the RBES EFT algorithm. Overall accuracy = 0.94, KIA = 0.64 (see Annex 1 for category description). In italics, the number of plots where the EFT classification obtained by RF classification is congruent with the one attributed in the field by the crew.
Table 2 The confusion matrix is calculated by comparing the EFT category assigned to forest plots in the field with the EFT category returned by the RF classification tree. Overall accuracy = 0.92, KIA = 0.60 (see Annex 1 for category description)
Fig. 1 ICP plots classified into the categories of the European Forest Types in the 17 considered countries by the RBES algorithm (see Annex 1 for the description of EFT categories)

The errors of omission of the RBES algorithm ranged between 0.09 (category 10) and 0.219 (category 11), while the errors of commission ranged between 0 (category 9) and 0.34 (category 14) (Table 1).

The errors of omission of RF classification tree ranged between 0 (category 10, 11, and 12) and 0.138 (category 6), while the error of commission ranged between 0 (category 9) and 4.0 (category 11) (Table 2).

When only the species composition was used in the RF algorithm, the overall accuracy dropped to 0.62, indicating that information from the geospatial dataset is critically relevant for the EFT classification. The ranking of these predictors according to the Mean Decrease Gini (MDG) indicator was the following: latitude (MDG = 253.73); bioregions (MDG = 212.51); elevation (MDG = 115.42); longitude (MDG = 97.13); soils (MDG = 14.00); proximity to river (MDG = 9.10); and presence of wetland (MDG = 1.34).

Per-class user’s accuracy (UA) values higher than 0.85 were reached in most EFT categories; the lowest UA value was 0.78 for category 11. Producer’s accuracies are considerably higher in RBES than in RF (ranging between 0.2 and 0.4) for some EFT categories (10 to 13). Per-class producer’s accuracy ranges from good (~ 0.7 for category 4) to perfect (category 9) in RBES, and from very low (~ 0.2 for categories 11 and 12) to perfect (category 9) in RF.

3.2 Beech map classification

Since the RBES algorithm produced slightly better results, we used it to classify the current distribution of beech forests into EFT categories 6 (beech forest) and 7 (mountainous beech forest) (Fig. 2).

Fig. 2 Beech map classified in two EFT categories: 6 (beech forest) and 7 (mountainous beech forest)

The overall accuracy of the map, evaluated on the basis of the ICP Forests plots classified in the field, was equal to 0.93 (Table 3). Figure 1 shows the EFT classification of the ICP Forests plots obtained by the RBES.

Table 3 The confusion matrix is calculated by comparing the EFT category of the plot assigned in the field with the EFT category returned by the RBES algorithm

4 Discussion

Both RBES and RF classification algorithms produced very satisfactory results in the automated classification of the European Forest Types categories. The RBES, compared to the RF approach, is preferable for two main reasons: (1) because RBES is a totally transparent approach that can be easily replicated with different input data from field surveys or thematic maps; (2) because the RBES in our test demonstrated a better performance, in terms of producer accuracy.

The two classification approaches produced similar results for categories 1, 2, 3, 4, and 9. RF produced better results for categories 4, 5, 6, 8, and 14 while the RBES performed better for categories 7, 10, 11, 12, and 13. Analyzing in detail the source of errors in the misclassification of categories 4, 5, 6, and 8, we found that in 63% of these plots the classification in the field was affected by some inconsistency between dominant species, as quantified by basal area data, and the EFT category identified in the field. Two possible explanations can be hypothesized. First, clear rules linking tree species composition to forest categories, as presented in Pividori et al. (2016), were not available at the time when the BioSoil project was carried out. Thus, in these cases, the inconsistency can be attributed to the subjective evaluation of the field crew. Secondly, if it happens that the field plot is located in a small stand, surrounded by a wider forest type with a different composition, the plot is attributed to the largely represented forest category dominated by forest trees other than those observed in the plot. In this case, the source of error derives from an ambiguity between field survey and forest classification scales.

The RBES classification errors in category 14 are instead to be attributed to a change in the definition of this specific category, occurring as a consequence of the EFT modifications introduced by Barbati et al. (2014). More specifically, plantations of native species were included in category 14 during the BioSoil project, while the revised EFT (Barbati et al. 2014) includes in this category forest stands dominated by non-native species.

The rest of the errors introduced by the RBES algorithm are most probably due to the limited spatial resolution of the thematic maps used, which cannot capture the small-scale variability of forest habitats that was instead detected during the field survey. This is especially true for category 12. The 1-km buffer around waterbodies is in fact most probably too wide, but it was the minimum possible value consistent with the pixel size of the analysis, which is imposed by the resolution of the tree species map (1 km × 1 km) currently available at European level (San-Miguel-Ayanz et al. 2016).

When the RBES system is applied to stratify into EFT other forest type-based maps, as in the case of the beech map provided by the JRC (de Rigo et al. 2016), we found an excellent accuracy (OA 0.93) when comparing pixel-based classification with the ground survey carried out in the ICP level I plots. In this case, classification errors are mainly due to the coarse resolution of the beech map if compared to the field survey unit dimension.

Our findings ultimately demonstrate that the RBES system can be easily applied to objectively classify forest inventory plots by EFT. The information needed on stand variables is in fact routinely collected in European forest inventories: (i) basal area (derived from callipered DBH), (ii) tree species, and (iii) geographic location of the plot. Based on plot position, information on other environmental variables can be automatically extracted from thematic maps.

Such an automated RBES system could be routinely applied in the classification of all European forest inventory plots which is the operation needed to obtain European forest statistics based on EFT categories, overcoming the limited informative power of the current SOEF 2015 report which is instead based on the adoption of only three broad species groups (conifers, broadleaves, mixed forests) (FOREST EUROPE 2015).

5 Conclusions

The present study resulted in a spatially explicit RBES algorithm that is able to automatically classify a stocked forest area according to the EFT system of nomenclature at the pan-European scale. This system is a decision tree based on two main information sources: (1) the basal area per tree species and (2) the geographic position, which is used to query several pan-European digital thematic maps. In order to better understand the potential of the RBES approach, we carried out a comparison with the well-known data mining RF algorithm.

The comparison was based on the data collected in 2010 ICP Forests Level I plots surveyed in the framework of the BioSoil Forest Biodiversity Demonstration project. The results obtained with the RBES approach in terms of overall accuracy (94%) and KIA (0.64) were very satisfactory and better than those obtained with RF. All the variables we used in the classifications are relevant. On the basis of the MDG analysis carried out in the RF approach, we found that without the use of geographic variables (those variables coming from maps and not from the field work), the overall accuracy of the classification decreases by 30%. The maps currently available at pan-European level have limited resolution for the discrimination of local site environmental variability; it is thus highly desirable that high-resolution layers for bioclimatic variables also become available in the future.

The RBES system was able to classify all the plots in a way that was consistent, to a large extent, with the field-based classification performed by the crews. The automated classification could be easily applied to the data collected in the NFIs in Europe, for example, to stratify NFI datasets into EFT in a harmonized way and to process the associated forest type-based SFM indicators to be reported in the framework of Forest Europe reporting (SOEF).

The RBES system, available in a complete form as the classification diagram in Annex 2 to this contribution, can potentially be applied across different spatial scales. As we demonstrated for the beech forests, the algorithm can be used to derive EFT maps, at least when digital raster maps with the percent presence of the forest species (in terms of basal area or forest cover) are available. Moreover, multipurpose large-scale inventories would benefit from the integration of the European forest types, in terms of increased capability to assess mechanisms that influence changes of the forest systems and the provision of forest ecosystems (Corona 2016).

The rule-based expert system presented here can significantly contribute to the large-scale deployment of the EFT classification in forest monitoring initiatives in Europe. We also hope that more accurate and detailed pan-European digital maps of forest and environmental variables will become available in the future to improve the quality of the classification and, therefore, the prospects for operational use of this system.