1 Introduction

This chapter sets out the aims of this book and explains the methods and approaches applied in its production. It also aspires to be a guide, offering readers instructions as to how best to use the book. We therefore strongly encourage all readers to read this chapter carefully, so as to gain a clearer understanding of all the different aspects analysed in this book. This chapter also provides essential information for those wishing to do the practical exercises in this book.

We begin by presenting the aims of the book and we offer a few tips explaining how each group of users can make best use of this book according to their particular requirements. Then, we provide information about the software and the data required to carry out the practical exercises presented in Parts II and III (Sect. 5). In the last section, we offer a detailed explanation of the review of LUC datasets carried out in Chap. “Land Use Cover Datasets: A Review” and Part IV (Sect. 6).

The book is the fruit of two research projects. The first seeks to provide a clearer understanding of the uncertainties associated with Land Use Cover maps and with the results of Land Use Cover Change modelling exercises (INCERTIMAPS Project: Suitability and uncertainty of land use and land cover maps for the analysis and modelling of territorial dynamics). The second promotes Open Access software for teaching spatial science (PE117519: Herramientas para la Enseñanza de la Geomática con programas de Código Abierto). Complete information about these projects can be found in the Acknowledgements section.

2 What is the Main Aim of This Book?

The aim of this book is to provide an up-to-date state of the art on Land Use Cover (LUC) datasets and validation tools. The book summarizes the available information and makes it accessible to any interested user, including some of the latest developments in the field.

The book was conceived as a practical tool to inform readers about currently available LUC datasets at global and supra-national scales and to help them understand more about the validation of LUC data and LUCC modelling exercises, so enabling them to validate their own data and models. To this end, the book combines brief theoretical explanations with practical information and exercises.

Part I of the book briefly covers the theoretical foundations of LUC mapping, LUCC modelling and the analysis and assessment of their associated uncertainties. Parts II and III were conceived as practical guides to enable any reader to use any of the tools and data. Part II covers the visualization of LUC data and the production of reference datasets to validate LUC maps. Part III describes the use of common validation tools and the interpretation of their results. All the practical exercises are accompanied by an explanation of the basic theory behind them, so as to enable users to understand the analyses and the principles on which the techniques are based. Finally, Part IV of the book characterizes the most relevant available LUC data. It also provides all the necessary information as to how to download and use the datasets.

As the book aims to reach the widest possible audience, the theory is briefly explained in simple, understandable terms. Practical exercises are implemented in QGIS, an open-source Geographical Information System, which can be downloaded for free.

3 Who is the Book Aimed At?

The book is aimed at anyone interested in Land Use Cover (LUC) mapping, Land Use Cover Change modelling and Land Use Cover Change analysis. It aims to be accessible and useful to all kinds of users, regardless of their level of expertise, although some background in the field is recommended to make full use of it. In particular, a basic knowledge of spatial analysis and GIS is required to understand much of the information provided.

The book will be particularly useful for researchers working in the fields of LUC mapping and LUCC modelling and especially for those interested in validation methods and the available sources of LUC data. Those interested in the application of open-source software in LUC may also find this book very useful, as it is the only book working with open-source software that focuses on these topics from a holistic perspective. For the QGIS community, the book provides the relevant information and tools to enable users to take full advantage of the software and expand the fields in which it can be effectively applied.

4 How to Use This Book?

The book can be used in different ways, depending on the type of user and their particular background and interests. With this in mind, it has been conceived as a flexible tool that can be used for a wide variety of purposes.

Beginners in this field are referred to Chap. “Land Use Cover Mapping, Modelling and Validation. A Background”, as are other users interested in gaining an overall picture of LUC mapping, LUCC modelling and the essential concepts required for uncertainty and validation analyses. This short, yet comprehensive chapter sets out the basic theoretical principles on which the rest of the book is based and is therefore recommended reading for all users.

For LUC data visualization and creation, readers are referred to Part II of this book. It provides an overview of the different options available for symbolizing LUC data and LUC change in GIS. It also addresses some of the problems associated with the spatial visualization of LUC information. This part of the book also includes a tutorial on the creation of a set of sample points for LUC data validation with QGIS.

Users interested in the validation of LUC datasets and Land Use Cover Change (LUCC) modelling exercises should refer to Chap. “Validation of Land Use Cover Maps: A Guideline”. This provides guidelines for validating different LUC products: single LUC maps, LUC map series, and outputs from LUCC modelling exercises. The different tools and methods referred to in these guidelines are then described in detail and applied in practice in the example exercises in Part III of the book.

Users interested in doing the example QGIS exercises appearing in this book should refer to Sect. 5 of this chapter, which presents all the data and the cases studied in this book. It also offers essential information about the particular version of QGIS that we use and about how to integrate R software into QGIS, a necessary step when carrying out some of the exercises set out in the book.

Those interested in LUC data sources should refer to Chap. “Land Use Cover Datasets: A Review”, which offers an introduction to LUC mapping at global and supra-national scales, including a review of the different datasets available. Part IV of the book offers in-depth descriptions of most of the datasets that are available for download, detailing their specific characteristics and how they can be accessed. The methodology followed in the review of the datasets is described in Sect. 6 of this chapter.

5 QGIS Exercises: Software, Study Areas and Data

5.1 GIS Software

Of all the Geographical Information Systems (GIS) currently available, in this book we use QGIS, a well-known, open-source GIS software that is widely used and recognized. It provides a unified interface to many other relevant open-source GIS software programmes, such as SAGA, GDAL, GRASS or LasTools (Menke et al. 2016). It also allows integration with R, a powerful open-source software for statistical analysis.

We opted for QGIS 3.10.13 “A Coruña” for the practical exercises included in this book, because it was the newest long-term release of QGIS available when we began writing the book.

Users could try other versions of QGIS when doing the exercises included in this book. However, they should bear in mind that the exercises have been created and tested using the version indicated above and that certain issues and errors may arise when using any other version of QGIS. Versions prior to QGIS 3 are strongly discouraged, as important changes were made in the software between versions 2 and 3 and many features of QGIS 3 do not work in earlier versions of the software.

The latest version of QGIS is available at the QGIS website (www.qgis.org). Users who require a specific version of this software should visit: https://qgis.org/downloads/. Full documentation relating to the software can also be found at the official website: https://www.qgis.org/en/docs/index.html, where inexperienced QGIS users will find a brief introduction to the software interface and the main tools.

Several user manuals are also available to help beginners make the most of the software. These include the books published by Packt (Graser et al. 2017; Cutts and Graser 2018) and the series of manuals coordinated by Baghdadi et al. (2018a, b, c, d), which contain both generic and thematic GIS exercises.

5.2 QGIS Plugins

QGIS works with plugins written in the C++ and Python programming languages. These plugins are an easy way to expand the capabilities of the software, which is why many of the features of the software are currently implemented through these plugins.

There are two types of plugins: core and external plugins (QGIS Project 2020). The core plugins are maintained by the QGIS Development Team and automatically form part of the distributed software. The external plugins are developed by a community of users and are available at the QGIS Python Plugins Repository (https://plugins.qgis.org/plugins/).

The external plugins may be up-to-date or outdated and are usually available for specific QGIS versions. The official plugin repository includes information about all these questions. External plugins that are still in the early stages of development and have not been widely used are marked by QGIS as experimental plugins and are not directly available through the software.

Several QGIS plugins are used in the exercises presented in this book (Table 1). In all cases, we used the most up-to-date versions of these plugins as of when we began writing. Some of the plugins may have been updated since then, which could lead to certain differences in the interface and the results. This is something that readers should be aware of when using the plugins.

Table 1 QGIS plugins employed in the practical exercises of the book

The Semi-Automatic Classification Plugin is one of the most important QGIS plugins and is used in many of the exercises in this book. It was developed and updated by Luca Congedo (2016) and provides a comprehensive interface and set of tools for classifying remote sensing imagery. This includes many tools for validating image classifications, which are also used in this book. For more information on the plugin and how to use it, users should refer to the plugin manual (Congedo 2016) and official website (see Table 1).

LecoS (Landscape ecology Statistics) is a plugin developed by Jung (2016) to calculate the spatial metrics usually employed in the field of landscape ecology. Although other methods can be implemented in QGIS to calculate these metrics, the LecoS plugin is the best-known QGIS tool for this purpose. All the relevant information about the plugin is available at the official website (see Table 1).

The R Processing Provider allows the R software capabilities to be integrated into QGIS. Full documentation on the plugin is available at the official website (see Table 1). Users can also find extra information on the plugin and the way the R language can be integrated into QGIS in the official QGIS documentation (footnote 1). To find out more about how to integrate R into QGIS, users should consult Sect. 5.3 of this chapter.

QuickMapServices is a widely used QGIS plugin that allows users to import many different kinds of web map services (XYZ tiles, TMS, WMS, WMTS, ESRI ArcGIS Services) into the QGIS interface. More information on the plugin is available at the official website (see Table 1) and in the manual recommended by the plugin’s authors, which is in Russian (footnote 2).

The Google Earth Engine Data Catalog plugin provides direct access in QGIS to the data catalog that forms part of the Google Earth Engine platform. Users will need a Google account to make use of this plugin. Little documentation is available about the plugin. Users who need more information are referred to its official website (see Table 1).

We also use MapAccurAssess, a plugin specifically developed for the exercises of this book by Domínguez Vera (2021). Although not yet available in the official QGIS plugin repository, it can be downloaded from the official repository of information accompanying this book (see Table 1). The plugin provides a tool for assessing the accuracy of classified Land Use Cover images, taking into account the recommendations made by Olofsson et al. (2013). For more information about the plugin, users are referred to the plugin manual, in Spanish (Domínguez Vera 2021), which is also available in the official repository for this book.
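To give a feel for the kind of calculation recommended by Olofsson et al. (2013), the following minimal sketch estimates area-weighted accuracy measures from a confusion matrix of sample counts. It is not the code of MapAccurAssess: the function name, the argument names and the example figures are hypothetical, and the sketch assumes a stratified random sample in which the map classes are the strata. Confidence intervals are omitted.

```python
def area_weighted_accuracy(counts, map_areas):
    """Area-weighted accuracy estimates from sample counts.

    counts[i][j]: number of samples mapped as class i whose reference class is j.
    map_areas[i]: area mapped as class i (any unit; hypothetical input).
    Assumes stratified random sampling with the map classes as strata.
    """
    k = len(counts)
    total_area = sum(map_areas)
    weights = [area / total_area for area in map_areas]

    # Estimated area proportion of each cell: p_ij = W_i * n_ij / n_i+
    props = [
        [weights[i] * counts[i][j] / sum(counts[i]) for j in range(k)]
        for i in range(k)
    ]

    column_totals = [sum(props[i][j] for i in range(k)) for j in range(k)]
    return {
        "overall": sum(props[i][i] for i in range(k)),
        "users": [props[i][i] / sum(props[i]) for i in range(k)],
        "producers": [props[j][j] / column_totals[j] for j in range(k)],
        # Area of each reference class, corrected for classification error
        "adjusted_areas": [total_area * p for p in column_totals],
    }


# Hypothetical two-class map: 800 ha of class A and 200 ha of class B,
# with 50 validation samples drawn in each mapped class.
result = area_weighted_accuracy([[45, 5], [10, 40]], [800, 200])
print(result)
```

In this invented example the mapped area of class A is 800 ha, whereas the error-adjusted estimate is lower, because a share of the samples mapped as class A belong to class B in the reference data and vice versa.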

To install any of these plugins in QGIS, access the “Manage and install plugins…” tool in the plugins menu to find the plugin you require. Once selected, click on the “Install Plugin” option (Fig. 1). In the “Settings” tab of the tool, users can also make experimental and deprecated plugins available in QGIS. To install MapAccurAssess, use the “Install from ZIP” tab, select the downloaded file and then click “Install Plugin” (Fig. 2).

Fig. 1 QGIS plugins. Standard plugin installation workflow

Fig. 2 QGIS plugins. Plugin installation from a ZIP file
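Before using the “Install from ZIP” option, it can be useful to check that the downloaded file looks like a plugin package. The sketch below assumes the usual layout of a QGIS plugin ZIP, namely a single top-level folder containing a metadata.txt file. The function name and the example file name are hypothetical, and the check says nothing about whether the plugin will actually load.

```python
import zipfile


def plugin_folders(zip_path):
    """Return the top-level folders of a ZIP that contain a metadata.txt file."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()

    folders = set()
    for name in names:
        parts = name.split("/")
        # Keep only entries of the form "<folder>/metadata.txt"
        if len(parts) == 2 and parts[1] == "metadata.txt":
            folders.add(parts[0])
    return sorted(folders)


# Usage (hypothetical file name):
# print(plugin_folders("downloaded_plugin.zip"))
```

An empty list suggests that the file is not a plugin package, or that the plugin folder is nested more deeply than expected.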

5.3 Integrating R into QGIS

Some of the exercises presented in this book use R, a free, open-source statistical software. QGIS enables the R environment to be integrated into the software, making it easier for any QGIS user to take full advantage of the tools available through R.

QGIS does not provide all the validation tools and methods reviewed in this book. We have therefore had to implement some of them in QGIS through the R processing environment. Users wishing to find out more about R and its integration into QGIS, with practical exercises about how to use both software packages in combination, should consult the manual by Islam (2018).

To integrate R into QGIS, users must begin by downloading the R software. R and its associated packages can be downloaded from the Comprehensive R Archive Network (CRAN), where users must select the mirror closest to their location at https://cran.r-project.org/mirrors.html.

Once downloaded and installed, users must also install a series of packages in R to execute the different tools and methods included in the book (Table 2). This step cannot be carried out through the QGIS interface. Users must open R and manually install the different packages. To do this, select Packages > Install Package(s)… from the menu (Fig. 3). In the window that opens, select the mirror from which to download the packages (Fig. 4). Finally, select the package to be installed (Fig. 5). Installation of the package may take a little while to complete. Installation is complete when the R console allows the user to write new code (Fig. 6).

Table 2 List of R packages required to use the R scripts provided in this book

Fig. 3 Integrating R in QGIS. Installing the required packages in R: first step

Fig. 4 Integrating R in QGIS. Installing the required packages in R: second step (mirror selection)

Fig. 5 Integrating R in QGIS. Installing the required packages in R: third step (package(s) selection)

Fig. 6 Integrating R in QGIS. Installing the required packages in R: end of the workflow

Fig. 7 Integrating R in QGIS. R configuration in QGIS

Table 2 lists the packages required to do the different exercises appearing in this book. In the table, next to each package name, we offer a link to the website with all the information about the package: description, download link, reference manual, etc.
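As an alternative to the menu-based workflow, the packages can also be installed by typing a command in the R console. The following sketch builds such a command from a list of package names. The function name is hypothetical, the package names shown are placeholders rather than the contents of Table 2, and the repository URL is only one possible CRAN mirror.

```python
def r_install_command(packages, repo="https://cloud.r-project.org"):
    """Build an R command that installs only the packages not yet present."""
    quoted = ", ".join(f'"{name}"' for name in packages)
    return (
        f"pkgs <- c({quoted}); "
        "todo <- setdiff(pkgs, rownames(installed.packages())); "
        f'if (length(todo) > 0) install.packages(todo, repos = "{repo}")'
    )


# Placeholder names: replace them with the packages listed in Table 2.
print(r_install_command(["pkgA", "pkgB"]))
```

The printed line can be pasted into the R console. Because it skips packages that are already installed, it can be run again safely after adding new names to the list.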

After installing R and the required packages, we need to install the QGIS plugin that allows us to integrate the two software packages. This is the “Processing R provider” plugin. Instructions to this end can be found in Sect. 5.2 of this chapter. After installing the plugin, users must download the scripts we have developed to integrate the R tools and capabilities into QGIS. These scripts are listed in Table 3 and are available at https://doi.org/10.5281/zenodo.5418985 in the official repository for this book.

Table 3 List of the R scripts developed for use in this book

Once downloaded, the script files must be copied into the R scripts folder of QGIS. The path to this folder can be found in the “Options” menu of QGIS. To access it, go to Settings > Options… and then select the “Processing” submenu (Fig. 7). In the “Providers” tab, there is a specific entry for “R”. After opening it, a list appears including the “R scripts folder” path, which indicates where users must save the scripts that come with the book.
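The copying step can also be scripted. The sketch below copies every R script from a download folder into the R scripts folder. It assumes that the scripts use the .rsx extension commonly used by the Processing R Provider; the function name and both folder paths are hypothetical and must be replaced by the paths on the user's own system.

```python
import shutil
from pathlib import Path


def copy_r_scripts(download_dir, scripts_dir):
    """Copy every .rsx file from download_dir into scripts_dir.

    Returns the names of the copied files, sorted alphabetically.
    """
    source = Path(download_dir)
    target = Path(scripts_dir)
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for script in sorted(source.glob("*.rsx")):
        shutil.copy2(script, target / script.name)
        copied.append(script.name)
    return copied


# Usage (hypothetical paths; the second one is the "R scripts folder"
# shown in the QGIS Processing options):
# copy_r_scripts("Downloads/book_scripts", "path/to/rscripts")
```

After copying, the scripts should appear in the QGIS Processing Toolbox, possibly after restarting QGIS.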

5.4 Study Areas

The exercises provided in this book are applied to three specific study areas: the Ariège Valley (France), the Asturias Central Area (Spain) and the Marqués de Comillas municipality (Mexico). We now offer a brief introduction to these study areas, so as to give readers the contextual information they require for a clearer understanding of the results of the exercises.

Fig. 8 Location map of the Asturias Central Area

Fig. 9 Map showing the location of the Ariège Valley

Fig. 10 Location of Marqués de Comillas

5.4.1 The Asturias Central Area (Spain)

The Asturias Central Area is a rural-industrial-urban area located in the heart of Asturias, in Northern Spain (Fig. 8). It hosts around 80% of the Asturian population and most of its economic activity (Rodríguez Gutiérrez et al. 2009). It is made up of a polycentric set of cities of different sizes that play a complementary socioeconomic role. The cities are surrounded by a network of villages and plenty of rural space, where a traditional rural economy and lifestyle is mixed with peri-urban dynamics (Rodríguez Gutiérrez et al. 2013).

The cities at the top of the urban hierarchy are Oviedo, Gijón and Avilés, which concentrate most of the urban LUC dynamics in recent decades (Gobierno del Principado de Asturias 2016). The area within the triangle formed by the three cities has also been the subject of important LUC dynamics, with the emergence of new industrial and residential developments, attracted by the accessibility that the area’s extensive transport network provides (Méndez García and Ortega Montequín 2013). The south of the Asturias Central Area is dominated by small industrial cities, mainly Mieres and Langreo, located in long, narrow valleys where there is almost no new space for development (Prada Trigo 2011). These were formerly mining/industrial towns which are now in decline.

5.4.2 Ariège Valley (France)

The Ariège Valley area consists of the central part of the valley formed by the River Ariège, which is situated in the department of the same name about 70 km south of Toulouse (Fig. 9). It covers an area of 1113 km² and has a population of about 80,000 inhabitants. The Ariège Valley is a rural area with agriculture in the northern part and wooded land in the south, approaching the Pyrenees. The largest town is Pamiers, in the centre of the valley, with about 15,000 inhabitants, while the departmental capital, Foix, has a population of 9700. Saverdun, in the north of the valley, has 4900 inhabitants.

In the past, the Ariège Valley was a centre for industrial and mining activities while today it is mainly rural. Tourism is increasingly common. The most notable LUC dynamics are reforestation and the increase in built-up areas, which are mainly concentrated along the river.

5.4.3 Marqués de Comillas (Mexico)

Marqués de Comillas is a physiographical region of the Lacandon rainforest in Chiapas, Mexico (Fig. 10). Bounded by two rivers, the Usumacinta and the Lacantun, it comprises approximately 15% (2032 km²) of the Lacandon region. The climate is hot and humid, with an average annual temperature of 24.3 °C and average annual precipitation of 2960 mm, most of which falls from May to December (García-Amaro 2004).

A colonization programme by the Mexican Government in the 1970s encouraged the establishment of farming communities in forest-covered areas, promoting agriculture, agroforestry (cacao) and cattle ranching, which is currently the most important business activity. Over the last 40 years, Marqués de Comillas has suffered a dramatic loss in forest cover; in the mid-1980s, forests occupied 83% of the region, while today, this has fallen to just 29%, less than half of which are well-preserved forests. The landscapes are now made up above all of mosaics of agricultural lands, cattle pastures and human settlements.

5.5 Data

All the data used in the example exercises provided in this book can be found online and downloaded at https://doi.org/10.5281/zenodo.5418318 in the official repository for this book. This data consists of LUC maps for the three different study areas (Ariège Valley, Asturias Central Area and the Marqués de Comillas municipality) and the data from LUC modelling exercises for the first two. The data for Ariège Valley comes from the work carried out by Nabila Bounoua and Jéromine Le Campion, students of the Master in Geomatics SIGMA at the University of Toulouse Jean Jaurès.

Detailed information on the LUCC modelling exercises developed for the Asturias Central Area and Ariège Valley can be found in studies by García-Álvarez et al. (2019) and Bounoua and Le Campion (2019). The LUC maps for these two areas were obtained from two different datasets: CORINE Land Cover and SIOSE. The LUC map for the Marqués de Comillas municipality was obtained through the classification of satellite imagery.

We will now briefly describe the LUC datasets and maps that form part of the database for each study area. At the end of this section, there is a table with all the files used in this book.

CORINE Land Cover (CLC) is a pan-European dataset of LUC information available for five different dates from 1990 to 2018. It provides detailed, coherent LUC information for most of the countries in Europe. It is usually produced by photointerpretation in vector format at a scale of 1:100,000, with a Minimum Mapping Unit (MMU) of 5–25 ha and a Minimum Mapping Width (MMW) of 100 m. Detailed information about this dataset can be found in Chap. “General Land Use Cover datasets for Europe” of this book.

A simpler version of CLC is used in the Ariège Valley (Fig. 11) and the Asturias Central Area (Fig. 12) case studies. In the latter, CLC is available in both vector and raster format. Although CLC is officially distributed in raster format at a spatial resolution of 100 m, the CLC rasters for the study areas in this book are provided at a different spatial resolution: 50 m for Asturias and 15 m for Ariège. These rasters were obtained after rasterizing the CLC vector layers.

Fig. 11 Land Use Cover map (CORINE Land Cover) Ariège Valley

Fig. 12 Land Use Cover maps (CORINE Land Cover, SIOSE) Asturias Central Area
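The relationship between the Minimum Mapping Unit of CLC and the cell sizes mentioned above can be checked with a short calculation. The sketch below converts an area in hectares into the number of square raster cells needed to cover it. The function name is hypothetical; the figures are the MMU and resolutions quoted in this section.

```python
def cells_per_area(area_ha, cell_size_m):
    """Number of square cells of side cell_size_m that cover area_ha hectares."""
    square_metres = area_ha * 10_000  # 1 ha = 10,000 m²
    return square_metres / cell_size_m ** 2


# A 25 ha polygon (the upper CLC MMU) at the three resolutions discussed:
for size in (100, 50, 15):
    print(size, "m cells:", round(cells_per_area(25, size), 1))
```

A polygon of 25 ha is covered by 25 cells at 100 m, by 100 cells at 50 m and by roughly 1111 cells at 15 m. Rasterizing at a finer resolution therefore preserves the outlines of the vector polygons more closely, but it does not add thematic or spatial detail beyond that of the original vector layer.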

SIOSE (Sistema de Información sobre Ocupación del Suelo de España) is a Spanish dataset in vector format that provides very detailed LUC information. It was obtained by photointerpretation of aerial imagery at 1:25,000, with an MMU of 0.5–2 ha and an MMW of 10 m. It follows an object-oriented data model, in which each polygon is described by a code detailing its LUC composition rather than being assigned to a single LUC category.

Some of the maps in the Asturias Central Area case study were obtained after simplification of the SIOSE database. The maps were obtained after the classification of each SIOSE polygon into a single category and after the rasterization at 50 m of the original vector dataset (Fig. 12). More information on how this operation was performed can be found in García-Álvarez (2018). Extra information about the characteristics of SIOSE can be found in Valcárcel et al. (2008) and García-Álvarez and Camacho Olmedo (2017).
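To illustrate what classifying a polygon described by its composition into a single category involves, the following sketch applies a simple dominant-cover rule. This is only an illustration of the idea: it is not the procedure followed in García-Álvarez (2018), and the function name, cover labels and percentages are hypothetical rather than actual SIOSE codes.

```python
def dominant_cover(composition):
    """Return the cover with the largest share in a polygon.

    composition maps cover labels to their percentage of the polygon.
    In the event of a tie, the cover listed first is returned.
    """
    if not composition:
        raise ValueError("composition must contain at least one cover")
    return max(composition, key=composition.get)


# Hypothetical polygon described by its composition instead of one category
polygon = {"pasture": 60, "scattered buildings": 30, "trees": 10}
print(dominant_cover(polygon))
```

A rule of this kind is easy to apply but discards the information on the secondary covers, which is one reason why the simplification of an object-oriented dataset needs to be documented.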

The Marqués de Comillas LUC map (Fig. 13) is part of a database on Land Cover and Land Cover/Land Use Changes in the State of Chiapas in Mexico. The original database covers 7.5 million ha, of which the Marqués de Comillas map covers a small section of approximately 200,000 ha. The maps were computed via a supervised classification of 2019 Sentinel-2 imagery. They were subsequently photo-interpreted to correct errors from the supervised stage as well as to include information on agricultural land uses. The map contains eight thematic categories describing levels of forest conservation and other land uses; the approximate scale is 1:40,000, with an MMU of one ha. More information can be found at the following link: https://bosqueschiapasdemo.ecosur.ourecosystem.com/.

Fig. 13 Land Use Cover Map Marqués de Comillas

In the following tables, we list the files from the different datasets and LUC modelling exercises described above that have been used in different exercises in this book. More datasets are available online, including extra LUC maps and model drivers not considered in the exercises in this book.

The tables include information about the name of the file available for download and the descriptive name used to refer to these files in the book. For each dataset, we also provide the projection of the dataset and the file describing the legend of the maps. A document listing all these characteristics for the layers only available online is also provided when downloading the data.

Ariège Valley (Val d’Ariège)

Projection: WGS84/UTM 31N (EPSG: 32631)

Associated files: BD_Val_Ariege (Word document file): explanation and legend

Asturias Central Area

Projection: WGS84/UTM 30N (EPSG: 32630)

Associated files: Legend Asturias maps (spreadsheet)

Marqués de Comillas

Projection: WGS84/UTM 15N (EPSG: 32615)

Associated files: Marques_LUC_datasets (Word document file): dataset description and legend
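The three EPSG codes listed above follow a regular pattern: for WGS84/UTM zones in the northern hemisphere, the code is 32600 plus the zone number. The sketch below derives the zone from a longitude and then the code. The function names are hypothetical, the longitudes used are rough approximations for each study area, and the standard six-degree rule ignores the special zones defined around Norway and Svalbard.

```python
import math


def utm_zone(longitude_deg):
    """UTM zone number (1-60) for a longitude in decimal degrees."""
    if not -180 <= longitude_deg <= 180:
        raise ValueError("longitude must be between -180 and 180 degrees")
    zone = int(math.floor((longitude_deg + 180) / 6)) + 1
    return min(zone, 60)  # a longitude of exactly 180 belongs to zone 60


def wgs84_utm_north_epsg(zone):
    """EPSG code of the WGS84/UTM projection for a northern-hemisphere zone."""
    if not 1 <= zone <= 60:
        raise ValueError("UTM zones range from 1 to 60")
    return 32600 + zone


# Approximate longitudes (assumptions) for the three study areas
for name, lon in [("Ariège Valley", 1.6),
                  ("Asturias Central Area", -5.8),
                  ("Marqués de Comillas", -90.7)]:
    print(name, wgs84_utm_north_epsg(utm_zone(lon)))
```

Checking the code in this way can help detect a layer that has been assigned the wrong projection when it is loaded into QGIS.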

6 Review of Land Use Cover Datasets

Chapter “Land Use Cover Datasets: A Review” and Part IV of the book contain a review of the Land Use Cover datasets available at global and supra-national scales. Due to the limited extent and scope of this book, we did not review national and regional LUC datasets, which are far too numerous for our purposes.

The datasets we reviewed are classified into two groups, depending on the information they provide. The first group is made up of the datasets that provide information about the different land uses or covers without focusing on any one of them in particular, i.e. general LUC datasets. The second group consists of the LUC datasets that map a specific land use or cover in detail (e.g. vegetation, croplands, built-up areas…). These are referred to as thematic LUC datasets. Some datasets are difficult to assign to one of the two groups, as they map a wide range of LUC categories while also providing specific detail on just one of them. The authors decided which group to assign them to on a case-by-case basis.

The datasets were also classified according to their extent, differentiating between global and supra-national LUC datasets. The first group of datasets maps land uses or covers all over the Earth, while the second maps them for a specific area covering more than one country. The maps in the second group may cover a whole continent or focus on just a few countries.

When making the review, we consulted the most relevant web portals and repositories of LUC data (Table 4). A few selected papers, reports and other relevant documents reviewing or comparing LUC datasets were also consulted (Manakos and Braun 2014; Mora et al. 2014; Grekousis et al. 2015; Tsendbazar et al. 2015; Diogo and Koomen 2016; Klotz et al. 2016; Pérez-Hoyos et al. 2017; Fritz et al. 2019).

Table 4 List of repositories and web portals distributing LUC information at global and supra-national scales

Very old or outdated maps, which were produced according to traditional cartographic methods, are not included in this review. Nor are other old maps that combine LUC information with other data about climate or biogeographic variables, such as the maps produced by Matthews (1983) and Olson et al. (1983). Traditional maps obtained through photointerpretation of aerial imagery and field survey, which offer information about certain specific land covers such as vegetation and agricultural areas, are not included in the review either. Although they may be interesting sources for historical LUC change analysis, they are usually only available for national or smaller areas and normally have not been digitized.

There are plenty of other spatial datasets that provide important information for studying specific land covers. For vegetation covers, maps of live biomass are a good example (Kindermann et al. 2008; Thurner et al. 2014). These datasets were not included in our review because they are not specific sources of LUC information focusing exclusively on land cover. However, there is an enormous amount of data like this that may be useful for the study and characterization of LUC. This data comes in many different forms and from a range of different sources.

Part IV of the book characterizes in detail all the reviewed LUC datasets that are currently available for download and may be relevant for a wide community of users. Datasets produced at very coarse scales or which are already very outdated are not described in Part IV, as they are of limited utility for most members of the LUC community. LUC datasets currently unavailable for download are not characterized in Part IV either. We tried to obtain, either online or by contacting the authors, all the global or supra-national datasets to which we found references. Some of them, however, are no longer available. These datasets have not been reviewed.

The LUC datasets described in Part IV were characterized according to the following elements: information about the project or context within which they were produced; information about their method of production; description of the data available for download; and practical information for using the dataset in an effective way. For each dataset we also provide all the technical references in which it is described as well as other references of interest in which it is used or analysed. A table summarizing the main characteristics of the dataset (extent, temporal availability, spatial resolution, updates, accuracy…) is also provided.