
Supervised and semi-supervised classifiers for the detection of flood-prone areas


Abstract

Supervised and semi-supervised machine-learning techniques are applied and compared for the recognition of flood hazard. The learning goal is to distinguish between flood-exposed and marginal-risk areas. Kernel-based binary classifiers using six quantitative morphological features, derived from data stored in digital elevation models, are trained to model the relationship between morphology and flood hazard. According to the experimental outcomes, such classifiers are appropriate tools for an initial low-cost detection of flood-exposed areas, to be possibly refined in successive steps by more time-consuming and costly investigations by experts. The use of these automatic classification techniques is valuable, e.g., in insurance applications, where one is interested in estimating the flood hazard of areas for which limited labeled information is available. The proposed machine-learning techniques are applied to the basin of the Italian Tanaro River. The experimental results show that for this case study, semi-supervised methods outperform supervised ones when only a few labeled examples are used (the same number in both cases), together with a much larger number of unlabeled ones.


References

  • Bates PD, Marks KJ, Horritt MS (2003) Optimal use of high resolution topographic data in flood inundation models. Hydrol Process 17:537–557

  • Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7:2399–2434

  • Bishop CM (2006) Pattern recognition and machine learning. Springer, New York

  • Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge

  • Degiorgis M, Gnecco G, Gorni S, Roth G, Sanguineti M, Taramasso AC (2012) Classifiers for the detection of flood-prone areas using remote sensed elevation data. J Hydrol 470–471:302–315

  • Degiorgis M, Gnecco G, Gorni S, Roth G, Sanguineti M, Taramasso AC (2013) Flood hazard assessment via threshold binary classifiers: the case study of the Tanaro basin. Irrigation Drainage 62:1–10

  • Do Carmo MP (1976) Differential geometry of curves and surfaces. Prentice-Hall, Englewood Cliffs

  • Dodov BA, Foufoula-Georgiou E (2006) Floodplain morphometry extraction from a high-resolution digital elevation model: a simple algorithm for regional analysis studies. IEEE Geosci Remote Sens Lett 3:410–413

  • Gallant JC, Dowling TI (2003) A multiresolution index of valley bottom flatness for mapping depositional areas. Water Resour Res 39:1347–1360

  • Giannoni F, Roth G, Rudari R (2005) A procedure for drainage network identification from geomorphology and its application to the prediction of the hydrologic response. Adv Water Resour 28:567–581

  • Guzzetti F, Stark CP, Salvati P (2005) Evaluation of flood and landslide risk to the population of Italy. Environ Manag 36:15–36

  • Hjerdt KN, McDonnell JJ, Seibert J, Rodhe A (2004) A new topographic index to quantify downslope controls on local drainage. Water Resour Res 40. doi:10.1029/2004WR003130

  • Horritt MS, Bates PD (2002) Evaluation of 1D and 2D numerical models for predicting river flood inundation. J Hydrol 268:87–99

  • Hunter NM, Bates PD, Horritt MS, Wilson MD (2007) Simple spatially-distributed models for predicting flood inundation: a review. Geomorphology 90:208–225

  • von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17:395–416

  • Manfreda S, Di Leo M, Sole A (2011) Detection of flood-prone areas using digital elevation models. J Hydrol Eng 16(10):781–790. doi:10.1061/(ASCE)HE.1943-5584.0000367

  • Melacci S, Belkin M (2012) Laplacian support vector machines trained in the primal. J Mach Learn Res 12:1149–1184

  • Nardi F, Vivoni ER, Grimaldi S (2006) Investigating a floodplain scaling relation using a hydrogeomorphic delineation method. Water Resour Res 42(9). doi:10.1029/2005WR004155

  • Nardi F, Grimaldi S, Santini M, Petroselli A, Ubertini L (2008) Hydrogeomorphic properties of simulated drainage patterns using digital elevation models: the flat area issue. Hydrol Sci J 53:1176–1193

  • Noman NS, Nelson EJ, Zundel AK (2001) Review of automated floodplain delineation from digital terrain models. J Water Resour Plan Manag 127(6):394–402

  • Santini M, Grimaldi S, Nardi F, Petroselli A, Rulli MC (2009) Preprocessing algorithms and landslide modelling on remotely sensed DEMs. Geomorphology 113:110–125

  • Theodoridis S, Koutroumbas K (2009) Pattern recognition, 4th edn. Academic Press, New York

  • Zhu X, Goldberg AB (2009) Introduction to semi-supervised learning. Morgan & Claypool Publishers, San Rafael


Acknowledgements

Marcello Sanguineti is a member of the Gruppo Nazionale per l’Analisi Matematica, la Probabilità e le loro Applicazioni (GNAMPA) of the Istituto Nazionale di Alta Matematica (INdAM).

Author information

Corresponding author

Correspondence to Marcello Sanguineti.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Communicated by V. Loia.

Appendices

Appendix 1: Support vector machines

Let a set made of a finite number l of labeled training data \(\{(x_{i}, y_{i}),\ i = 1,\ldots ,l\}\) be given, with \(x_{i}\in {\mathbb {R}}^{m}\) and \(y_{i}\in \{-1,+1\}\). Here, with a slight change of notation with respect to the previous sections, the label \(-1\) (instead of the label 0) is used to denote the “negative” class, while \(+1\) is the “positive” class label. Given a regularization parameter \(\gamma _{A}>0\) and a suitable function space \(H_\mathcal{K} \), more precisely, a reproducing kernel Hilbert space (Cristianini and Shawe-Taylor 2000), the (binary) support vector machine (SVM) training problem consists in searching for a classifier \(f^{*}\) that solves the following optimization problem: find

$$\begin{aligned} \text{ min }_{f \in H_\mathcal{K} } \left( {\frac{1}{l} \sum \limits _{i=1}^l ( {1-y_i f(x_i )})_+ +\gamma _A ||f||_{H_\mathcal{K} }^2 }\right) . \end{aligned}$$
(1)

By \(||\cdot ||_{H_\mathcal{K} }^2 \) we denote the square of the norm in the reproducing kernel Hilbert space \(H_\mathcal{K} \), and \(({1-y_i f(x_i )})_+\) is the so-called hinge-loss function, which is defined as

$$\begin{aligned} ( {1-y_i f(x_i )})_+ :=\max ( {0,1-y_i f(x_i )}). \end{aligned}$$
(2)

The term \(\frac{1}{l}\mathop \sum \nolimits _{i=1}^l ( {1-y_i f(x_i )})_+\) in (1) penalizes the classification error on the training set, whereas the term \(\gamma _A ||f||_{H_\mathcal{K} }^2 \) in (1) enforces a small norm of the optimal solution \(f^{*}\) in the reproducing kernel Hilbert space \(H_\mathcal{K} \) (i.e., typically, high smoothness for \(f^{*}\)). Given a (possibly unseen) data point \(x \in {\mathbb {R}}^{m}\), the optimal classifier \(f^{*}\) assigns to x the label +1 if \(f^*( x)\ge 0,\) otherwise it assigns to x the label \(-1\).
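For concreteness, the hinge loss (2) and the labeling rule just described can be written in a few lines of Python. This is our own minimal sketch, not code from the original paper; all names are illustrative.

```python
import numpy as np

def hinge_loss(y, f_x):
    """Hinge loss (1 - y*f(x))_+ for labels y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * f_x)

def predict_label(f_x):
    """Assign +1 when f*(x) >= 0, otherwise -1, as in the decision rule above."""
    return np.where(f_x >= 0.0, 1, -1)

# Toy check: a correctly classified point with margin >= 1 incurs zero loss.
print(hinge_loss(np.array([1, -1, 1]), np.array([1.5, 0.3, -0.2])))  # [0.  1.3 1.2]
print(predict_label(np.array([1.5, 0.3, -0.2])))                     # [ 1  1 -1]
```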

The optimization problem (1) can be rewritten in the following way: find

$$\begin{aligned} \text{ min }_{f \in H_\mathcal{K} ,\ \xi _i \in {\mathbb {R}} } \left( {\frac{1}{l}\mathop \sum \limits _{i=1}^l \xi _i + \gamma _A ||f||_{H_\mathcal{K} }^2 }\right) \end{aligned}$$
(3)
$$\begin{aligned} \text{ subject } \text{ to } \left\{ {{\begin{array}{ll} y_i f(x_i )\ge 1-\xi _i , &{} \quad \mathrm{for} \ i=1,\ldots ,l, \\ \xi _i \ge 0, &{}\quad \mathrm{for} \ i=1,\ldots ,l. \\ \end{array} }} \right. \end{aligned}$$

We denote by \(\mathcal{K}:{\mathbb {R}}^m\times {\mathbb {R}}^m\rightarrow {\mathbb {R}} \) the (uniquely determined) kernel function associated with the reproducing kernel Hilbert space \(H_\mathcal{K}\) (Cristianini and Shawe-Taylor 2000). The optimal solution \(f^{*}\) of the optimization problem (3) is provided by the representer theorem (Cristianini and Shawe-Taylor 2000) in the following form:

$$\begin{aligned} f^*( x)= \mathop \sum \limits _{i=1}^l \alpha _i^*\mathcal{K}( {x,x_i }), \end{aligned}$$
(4)

where the optimal coefficients \(\alpha _i^*\in {\mathbb {R}}\). Therefore, solving the optimization problem (3) reduces to determining the finite-dimensional coefficients \(\alpha _{i}\) that minimize its objective, when the function f is constrained to have the form (4). For a reproducing kernel Hilbert space \(H_\mathcal{K} \), the kernel \(\mathcal{K}\) often has a simple expression. This is the case, e.g., of the linear kernel

$$\begin{aligned} \mathcal{K}( {x,y}):= \langle x,y \rangle _{{\mathbb {R}}^m} , \end{aligned}$$
(5)

and of the Gaussian kernel

$$\begin{aligned} \mathcal{K}( {x,y}):=\text{ exp }\left\{ {-\frac{||x-y||_{{\mathbb {R}}^m}^2 }{2\sigma ^2}} \right\} , \end{aligned}$$
(6)

where \(\sigma > 0\) is a fixed width parameter. It often happens that only a small fraction of the l coefficients \(\alpha _{i}^{*}\) is different from 0; the input data points \(x_{i}\) associated with nonzero \(\alpha _{i}^{*}\) are called support vectors.
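As an illustration (not taken from the paper), the kernels (5)–(6) and the evaluation of a classifier in the representer form (4) can be sketched as follows; the coefficients `alpha` are assumed to be the output of some SVM solver, and only the support vectors need to be kept.

```python
import numpy as np

def linear_kernel(x, y):
    """Linear kernel (5): standard inner product in R^m."""
    return np.dot(x, y)

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel (6) with width parameter sigma > 0."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def evaluate_classifier(x, support_vectors, alpha, kernel):
    """Representer form (4): f*(x) = sum_i alpha_i^* K(x, x_i).
    Only the support vectors (the points with nonzero alpha_i^*) matter."""
    return sum(a * kernel(x, x_i) for a, x_i in zip(alpha, support_vectors))
```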

In practice, a binary SVM classifier can be interpreted as a binary linear classifier in a (possibly infinite-dimensional) auxiliary feature space associated with the reproducing kernel Hilbert space \(H_\mathcal{K} \). The mapping between the original feature space \({\mathbb {R}}^{m}\) and the auxiliary feature space is typically nonlinear. A binary SVM classifier therefore often allows one to separate, through a nonlinear decision boundary, data points that are not linearly separable in the original feature space.
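In practice, one would typically rely on an off-the-shelf solver rather than implementing the optimization in (3) directly. Assuming scikit-learn is available, a Gaussian-kernel SVM of the kind described above could be trained roughly as in the following sketch; here `gamma` corresponds to \(1/(2\sigma ^2)\) in (6), and `C` is inversely related to the regularization parameter \(\gamma _A\) (the exact correspondence depends on the solver's convention).

```python
import numpy as np
from sklearn.svm import SVC

# Toy data: two noisy clusters, labels in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

# Gaussian (RBF) kernel SVM; gamma plays the role of 1/(2*sigma^2) in Eq. (6).
clf = SVC(kernel="rbf", gamma=0.5, C=1.0)
clf.fit(X, y)

print(clf.predict([[0.0, 0.0], [3.0, 3.0]]))   # expected: [-1  1]
print(len(clf.support_))                        # number of support vectors
```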

Appendix 2: Manifold regularization

Manifold regularization is a class of semi-supervised learning techniques, described in Belkin et al. (2006), whose goal is to exploit the information contained in the training objects to determine the underlying geometry of the input data and thereby improve the overall classification performance. Manifold regularization aims at capturing such a geometric structure and exploiting it to build a classifier with better classification performance than a fully supervised one, in situations where only a few labeled data are provided but a much larger set of unlabeled data is available. Indeed, label information is not needed to determine the underlying geometry of the input data; hence, both labeled and unlabeled data can be used to this aim.

A main assumption of manifold regularization is that the input data points are drawn from a probability distribution whose support resides on a Riemannian manifold embedded in the original feature space. A two-dimensional manifold can be thought of as a surface embedded in a higher-dimensional Euclidean space (Do Carmo 1976). The surface of the Earth, for instance, is approximately a two-dimensional manifold embedded in a three-dimensional space. Similar remarks hold for higher-dimensional manifolds. A Riemannian manifold is one on which the “intrinsic distance” between any two points can be defined as their geodesic distance, i.e., the length of the shortest path on the manifold connecting them. Manifold regularization assumes that similar labels are expected to be assigned to points \(x_{i}\) and \(x_{j}\) that are close with respect to the intrinsic distance on the Riemannian manifold they lie on. So, determining an approximation of the Riemannian manifold is expected to help the classification process.

In practice, an approximation of the Riemannian manifold can be obtained by using both the labeled and unlabeled input data, exploiting them to build an undirected graph G = (V, E) which provides a discrete model of the manifold itself and which is associated with a symmetric matrix W of suitable nonnegative weights. We recall that a graph is a representation of a set of objects where some of them are connected by links (“edges”); the graph is called undirected if no orientation on such links is defined, while it is weighted if one is given a measure of the strength of the links between pairs of objects (“vertices”). In the present work, the vertices in the set V correspond to the input data points (the feature vectors), while the links between different pairs of input data points form the edge set E. In the context of manifold regularization, one considers weighted undirected graphs; assigning a weight to an edge means defining a measure of similarity between the associated vertices (input data points). Once a similarity measure has been chosen, the larger the similarity between two input data points \(x_{i}\) and \(x_{j}\), the stronger their connection in the graph. Due to the basic assumption of manifold regularization reported above, the higher the weight \(W_{ij}=W_{ji}\) between \(x_{i}\) and \(x_{j}\), the higher the probability that they belong to the same class.

Determining a suitable similarity measure between every pair of input data points (hence, a suitable weight matrix W) is a challenging task, and several methods have been proposed in the literature to deal with this issue. Indeed, this measure is fundamental to build the graph that models the manifold the data lie on. Usually, two choices for the weights \(W_{ij}\) are considered: binary weights or Gaussian weights. In the first case, the weight between two points is set to 1 if they are sufficiently close in the original feature space, otherwise it is set to 0. In the second case, the weight between two points \(x_{i}\) and \(x_{j}\) that are connected according to the first method is set to \(W_{ij} := e^{-\vert \vert x_i -x_j \vert \vert ^2/4t^2}\), where \(t > 0\) is a suitable width parameter; all the other weights are set equal to 0. In this work, we have decided to use only the second method to define the weights of the edges. Note that the first choice can be considered a limiting case of the second, since it is recovered from it for \( t\rightarrow +\infty \).
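The two weight choices just described can be summarized in a short sketch (our illustration, with hypothetical names); the binary choice is recovered from the Gaussian one as \(t \rightarrow +\infty \).

```python
import numpy as np

def edge_weight(x_i, x_j, t=1.0, binary=False):
    """Weight of an existing edge between x_i and x_j.
    Gaussian choice: exp(-||x_i - x_j||^2 / (4 t^2)); binary choice: 1.
    (Pairs not connected by an edge keep weight 0 in the graph construction.)"""
    if binary:
        return 1.0
    return np.exp(-np.sum((x_i - x_j) ** 2) / (4.0 * t ** 2))
```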

Another way to build the graph approximating the Riemannian manifold is described in von Luxburg (2007). One of its main features is the application of a k-nearest-neighbor procedure to determine the edges of the graph, where k is a user-defined parameter. In a first stage, each input data point is connected to its k nearest input data points, using the Euclidean metric in the original feature space to define the set of k nearest neighbors. In general, this procedure leads to the definition of a directed graph; in order to obtain an undirected one (which is needed by manifold regularization), two possible methods are usually implemented:

  • the first method consists in connecting two input data points \(x_{i}\) and \(x_{j }\) if and only if either \(x_{i }\) is among the k-nearest neighbors of \(x_{j }\) or, vice versa, \(x_{j }\) is among the k-nearest neighbors of \(x_{i}\);

  • the second method, which leads to a less connected (i.e., sparser) graph, creates a link between the two nodes \(x_{i }\) and \(x_{j}\) if and only if \(x_{i }\) is among the k nearest neighbors of \(x_{j}\) and, vice versa, \(x_{j }\) is among the k nearest neighbors of \(x_{i}\). In this work, we have applied this second method, because we prefer to deal with less connected graphs.

Once the topology (i.e., the edge set E) of the graph has been fixed through one of the two k-nearest-neighbor procedures described above, the weight matrix W can be defined by assigning to the edges so determined either a binary weight or a Gaussian weight.
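The mutual k-nearest-neighbor construction and the weight assignment can be combined as in the following sketch (again an illustration under our own naming conventions, not the authors' code):

```python
import numpy as np

def mutual_knn_weights(X, k=5, t=1.0, binary=True):
    """Symmetric weight matrix W of the mutual k-nearest-neighbor graph:
    an edge (i, j) exists iff i is among the k nearest neighbors of j AND
    j is among the k nearest neighbors of i (the sparser of the two options)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances in the original feature space.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)                # exclude self-neighbors
    knn = np.argsort(d2, axis=1)[:, :k]         # indices of the k nearest neighbors
    is_nn = np.zeros((n, n), dtype=bool)
    for i in range(n):
        is_nn[i, knn[i]] = True
    mutual = is_nn & is_nn.T                    # mutual k-NN condition
    if binary:
        return mutual.astype(float)
    return np.where(mutual, np.exp(-d2 / (4.0 * t ** 2)), 0.0)
```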

Figure 5 shows an example of construction of the graph. Note that in the figure, we consider only two features for each point in order to visualize the graph easily.

Fig. 5

Representation of the Riemannian manifold approximation by means of a graph. All the objects (negative, positive, unlabeled) were used to generate the graph. The links between the different points represent the edges of the graph that approximates the manifold the data lie on. The links were generated according to the second method detailed in the text, i.e., a link between the two nodes \(x_{i}\) and \(x_{j}\) has been created if and only if \(x_{i}\) is among the k-nearest neighbors of \(x_{j}\) and, vice versa, \(x_{j}\) is among the k-nearest neighbors of \(x_{i}\) (the value \(k= 5\) has been chosen). Finally, binary weights have been used for the links

Appendix 3: Laplacian support vector machines

As in “Appendix 1”, we assume that a set made of a finite number l of labeled training data \(\{(x_{i}, y_{i}),\ i = 1 ,\ldots ,l\}\), with \(x_{i}\in {\mathbb {R}}^{m}\) and \(y_{i} \in \{-1,+1\}\), is available. We also assume the presence of a second set made of a finite number u of unlabeled training data \(\{x_{j},\ j = l+1,\ldots ,l+u\}\), with \(x_{j}\in {\mathbb {R}}^{m}\). As in “Appendix 1”, \(H_\mathcal{K} \) denotes a reproducing kernel Hilbert space, whereas \( \gamma _A >0\) is a regularization parameter. We also assume that a second regularization parameter \({\gamma }_I >0\) is given. With these premises, the (binary) Laplacian support vector machine (LapSVM) (Belkin et al. 2006) extends the SVM formulation described in “Appendix 1” by solving the following optimization problem (which is inspired by the principle of manifold regularization, see “Appendix 2”): find

$$\begin{aligned} \text{ min }_{f \in H_\mathcal{K} } \left( \frac{1}{l}\mathop \sum \limits _{i=1}^l ( {1-y_i f(x_i )})_+ + \gamma _A ||f||_{H_\mathcal{K} }^2 +\frac{\gamma _I }{(u+l)^2}{\varvec{f}}^{T} {\varvec{L}}{\varvec{f}}\right) , \end{aligned}$$
(7)

where \(\varvec{f}:=[f( {x_1 }),\ldots ,f(x_{l+u} )]^T\) and L is the graph Laplacian matrix defined as \(L:=D-W\). Here, W denotes a suitable \((l + u)\times (l + u)\) symmetric matrix of weights (see “Appendix 2” for one of its possible constructions); its generic element \(W_{ij} = W_{ji}\) is the weight of the edge between the ith and the jth input data points. D is a diagonal matrix whose diagonal elements are defined as \(D_{ii} :=\mathop \sum \nolimits _{j=1}^{l+u} W_{ij} \). As in “Appendix 1”, the goal of the term \(\frac{1}{l}\mathop \sum \nolimits _{i=1}^l ( {1-y_i f(x_i )})_+ \) in (7) is to penalize the classification error on the training set, whereas the term \(\gamma _A ||f||_{H_\mathcal{K} }^2 \) in (7) enforces a small norm of the optimal solution \(f^{*}\) in the reproducing kernel Hilbert space \(H_\mathcal{K} \) (i.e., typically, high smoothness for \(f^{*}\)). Finally, the term \(\frac{\gamma _I}{(u+l)^2}\varvec{f}^{T} \varvec{L}\varvec{f}\) enforces smoothness of the optimal solution \(f^{*}\) also with respect to the graph approximation of the Riemannian manifold.
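To make the intrinsic penalty term concrete, the graph Laplacian and the quadratic form \(\varvec{f}^{T}\varvec{L}\varvec{f}\) in (7) can be computed as in the following sketch (our own naming; W may be built, e.g., as in “Appendix 2”):

```python
import numpy as np

def graph_laplacian(W):
    """Graph Laplacian L = D - W, where D is the diagonal degree matrix
    with entries D_ii = sum_j W_ij."""
    D = np.diag(W.sum(axis=1))
    return D - W

def intrinsic_penalty(f_vals, W, gamma_I):
    """Manifold-regularization term gamma_I / (l+u)^2 * f^T L f in (7),
    where f_vals = [f(x_1), ..., f(x_{l+u})] as a NumPy array."""
    L = graph_laplacian(W)
    n = len(f_vals)
    return gamma_I / n ** 2 * f_vals @ L @ f_vals
```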

The expression of the optimal solution \(f^{*}\) of problem (7) follows again from another form of the representer theorem and is given by

$$\begin{aligned} f^*(x)= \sum \limits _{i=1}^{l+u} \alpha _i^*\mathcal{K}( {x,x_i }), \end{aligned}$$
(8)

for suitable optimal coefficients \(\alpha _i^*\in {\mathbb {R}}\). Again, solving the optimization problem (7) reduces to determining the finite-dimensional coefficients \(\alpha _{i}\) that minimize its objective, when the function f is constrained to have the form (8).
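The only structural difference from the supervised representer form (4) is that the kernel expansion (8) now runs over all l+u training points, labeled and unlabeled alike; a minimal sketch (our illustration) of the corresponding evaluation:

```python
def evaluate_lapsvm(x, X_train, alpha, kernel):
    """Representer form (8): the expansion runs over ALL l+u training points
    (labeled and unlabeled), unlike the purely supervised case (4)."""
    return sum(a * kernel(x, x_i) for a, x_i in zip(alpha, X_train))
```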


Cite this article

Gnecco, G., Morisi, R., Roth, G. et al. Supervised and semi-supervised classifiers for the detection of flood-prone areas. Soft Comput 21, 3673–3685 (2017). https://doi.org/10.1007/s00500-015-1983-z
