Cluster analysis is used to identify common patterns of environmental stress among deltas and to define a delta environmental typology. Several clustering methods have been described in the literature; here we contrast results using the K-Means (KM) and Affinity Propagation (AP) methods (MacQueen 1967; Frey and Dueck 2007). In the KM method, each delta, characterized by the 10 basin, coastal, and offshore indicators, is described by a vector in $\mathbb{R}^{10}$ and assigned to the nearest of k cluster means according to the minimum Euclidean distance. Each cluster mean is then moved to the centroid of its assigned data points. This process of reassigning data points to the nearest mean and readjusting each cluster mean to the centroid is repeated until the algorithm converges. While straightforward, the algorithm suffers from several drawbacks. Convergence to a local minimum is guaranteed; however, this may not be the global optimum. The final clustering is sensitive to the initial points chosen for the cluster means. Finally, the number of clusters, k, must be explicitly specified. Initialization and convergence issues can be addressed using cluster ensembles. To provide guidance in the choice of k, we also identify clusters using the AP method. This is a message-passing algorithm in which data points are chosen as “exemplars,” or cluster centers, based on their similarity to other data points. Points that are close to one another in $\mathbb{R}^{10}$ are considered similar. In an iterative process, candidate exemplars are chosen to maximize the total similarity between data points and their assigned exemplars. A key strength of the AP algorithm is that the number of clusters does not need to be specified, but rather is an output of the clustering process.
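To make the message-passing idea concrete, the following is a minimal sketch of Affinity Propagation in plain Python, following the responsibility and availability updates of Frey and Dueck (2007). It is illustrative only and is not the implementation used in this study. The function name `affinity_propagation`, the toy points, the damping factor, the iteration count, and the use of the median pairwise similarity as the exemplar preference are all assumptions made for the example.

```python
from statistics import median


def affinity_propagation(points, damping=0.5, n_iter=200):
    """Return (exemplar indices, label per point); each label is an exemplar index."""
    n = len(points)
    # Similarity: negative squared Euclidean distance, so nearby points are similar.
    S = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    # Assumed preference (self-similarity): median of the pairwise similarities.
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        S[k][k] = pref
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(n_iter):
        # Responsibility: how well suited k is as exemplar for i, relative to rivals.
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # Availability: accumulated evidence from other points that k is an exemplar.
        for k in range(n):
            support = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(support) - support[k]  # positive support from points other than k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - support[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # no exemplar emerged within n_iter iterations
        return [], [-1] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda e: S[i][e])
              for i in range(n)]
    return exemplars, labels


# Hypothetical 2-D toy data; the deltas in the text are vectors of 10 indicators.
toy_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
              (5.0, 5.2), (5.3, 5.0), (5.1, 4.8), (4.9, 5.1)]
exemplars, labels = affinity_propagation(toy_points)
print("exemplars:", exemplars)
print("labels:", labels)
```

The number of clusters is simply `len(exemplars)`; it is not passed in, although it does depend on the preference value chosen for the diagonal of the similarity matrix.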
AP clustering identifies eight distinct clusters. We compare these clusters with a set of KM clusterings with values of k ranging from 6 to 9. Each KM clustering is the best of 1000 random initializations of cluster centers, where the best clustering is the one with the minimum mean distance between samples and their cluster centers. While the numeric labels used to identify a given cluster within an AP or KM clustering are arbitrary, we have numbered the labels so that similar delta clusters share a label across the AP and KM clusterings. The delta membership for each clustering scheme is presented in Table 2.
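The K-Means assignment and update loop, together with the selection of the best clustering from many random initializations, can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions, not the code used to produce the results: the function names `kmeans` and `best_of_restarts`, the toy data, the fixed random seed, and the choice to initialize centers by sampling data points are all hypothetical. The example uses 50 restarts on toy data, whereas the text uses 1000.

```python
import math
import random


def kmeans(points, k, rng, max_iter=100):
    """One K-Means run; returns (labels, centers, mean sample-to-center distance)."""
    centers = rng.sample(points, k)  # assumed initialization: k distinct data points
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to the nearest center (Euclidean distance).
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # converged: no point changed cluster
            break
        labels = new_labels
        # Update step: move each center to the centroid of its assigned points.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    score = sum(math.dist(p, centers[l]) for p, l in zip(points, labels)) / len(points)
    return labels, centers, score


def best_of_restarts(points, k, n_init=1000, seed=0):
    """Keep the run with the minimum mean distance between samples and centers."""
    rng = random.Random(seed)
    return min((kmeans(points, k, rng) for _ in range(n_init)), key=lambda run: run[2])


# Hypothetical 2-D toy data with two well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
labels, centers, score = best_of_restarts(toy_points, k=2, n_init=50)
print(labels, centers, round(score, 4))
```

Because each run converges only to a local minimum, repeating the run from different starting centers and keeping the lowest-scoring result reduces, but does not eliminate, sensitivity to initialization.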
Identification of the optimal clustering is a non-trivial task, and a substantial literature (e.g., Rousseeuw 1987; Halkidi et al. 2001; Meilă 2007) has been devoted to the development of metrics for assessing the quality or consistency of a given clustering. We compute silhouette scores, a measure of cluster density, as an estimate of the quality of a given clustering. For each sample, the silhouette score is defined as the ratio:
$$\frac{b - a}{\max(a, b)} \tag{1}$$
where a is the mean distance from the sample to all other samples in its own cluster, and b is the mean distance from the sample to all samples in the next closest cluster. This ratio approaches 1 for “perfect” clusters where all samples within a cluster are identical. Values near zero suggest that the sample fits only marginally better in its assigned cluster than in the next closest cluster, while negative scores indicate that the sample is closer to a neighboring cluster than to its own cluster. This metric assumes that small distance between samples is the defining characteristic of a cluster, the same assumption that the K-Means and Affinity Propagation algorithms use to define clusters. Average silhouette scores across all samples in a given clustering are shown in Table 3. Mean scores are similar across all clusterings, at approximately 0.20, with slightly higher scores for KM8 and KM9. These positive, but low, scores suggest that the boundaries between some clusters are weakly defined.
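As an illustration of Eq. (1), the per-sample silhouette score can be computed as in the short plain-Python sketch below. The function name `silhouette_scores`, the one-dimensional toy data, and the convention of scoring a singleton cluster as 0 are assumptions made for the example; at least two clusters are required.

```python
import math


def silhouette_scores(points, labels):
    """Per-sample silhouette (b - a) / max(a, b) using Euclidean distance."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for a single-member cluster
            continue
        # a: mean distance to the other samples in the same cluster
        # (the sample's zero distance to itself is in the sum but not the count).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the samples of the next closest cluster.
        b = min(sum(math.dist(p, q) for q in members) / len(members)
                for c, members in clusters.items() if c != l)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return scores


# Hypothetical 1-D toy data: two tight, well-separated pairs.
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_scores(toy_points, [0, 0, 1, 1])  # sensible clustering
bad = silhouette_scores(toy_points, [0, 0, 0, 1])   # third sample misassigned
print("mean silhouette, sensible labels:", sum(good) / len(good))
print("score of misassigned sample:", bad[2])
```

With the sensible labels every sample scores close to 1, while the misassigned sample receives a negative score because it lies closer to the neighboring cluster than to its own.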
Table 3 Mean silhouette scores for each clustering method
We focus here on results from KM8 and use this clustering to define the global delta environmental typology. KM8 has the same number of clusters as the Affinity Propagation clustering, with slightly better silhouette scores than the other clusterings. For several clusters, however, there is little to no difference in delta membership across algorithms. The sensitivity of each cluster to the choice of clustering algorithm is examined below. The geographic distribution of the clusters, or environmental types, is mapped in Fig. 5a, with the distribution of indicator scores within each type shown in Fig. 5b. Characteristic “fingerprints” for each type are shown in Fig. 6.
Type 1 deltas (light blue) are characterized by very low anthropogenic stress in the local delta domain, and to a lesser degree in the upstream watershed as well. The deltas in this type are the Amazon, Amur, Burdekin, Lena, Mackenzie, and Yukon deltas. The four remote, high-latitude delta systems in the study are all in this type. All have very low delta and basin population densities. Indeed, the only moderate environmental indicator values among Type 1 deltas are elevated basin reservoir trapping values in the Amur, Burdekin, Lena, and Mackenzie deltas, and moderate sea-level rise trends in the Amazon, Burdekin, Lena, and Mackenzie deltas.
Type 2 deltas (dark blue) are moderately stressed across most of the environmental indicators. These systems have mid-range population densities, both in the delta and the upstream basin. This type comprises the Dnieper, Grijalva, Indus, Mississippi, Niger, Sao Francisco, Tana, Volta, and Yellow deltas. These deltas are geographically dispersed, have moderate to high wetland conversion in the upstream watershed and coastal delta areas, and have a moderate to high volume of artificial reservoirs on the upstream river network. We note that additional environmental challenges can be highly important in specific deltas, such as soil salinization in the Indus Delta (Aslam and Prathapar 2006); this methodology, however, uses only an approximate estimate of the current delta environmental state to develop clusters.
Type 3 deltas (light green) have low populations in both spatial domains, but high wetland disconnectivity in the upstream basin, suggesting heavy conversion of wetland to agricultural use. Deltas in this type include the Congo, Fly, Mahakam, and Orinoco deltas. Located in the tropics, these deltas have low upstream reservoir trapping, and do not rely on groundwater extraction. Both of these characteristics reflect high freshwater input, as the reservoir volume indicator is normalized by river discharge. A consistent supply of freshwater obviates the need for unsustainable groundwater extraction to support agricultural or municipal use. The Fly, Mahakam, and Orinoco are exposed to very high local sea-level rise rates. Hydrocarbon extraction activities occur in the Congo, Mahakam, and Orinoco deltas as well.
Type 4 deltas (dark green) are low to moderate population deltas, several located in the subtropics, and tend to be reliant on water engineering efforts. The deltas in this type are the Colorado, Ebro, Moulouya, Parana, Rio Grande, Senegal, and Shatt-el-Arab. Located primarily in arid regions, the upstream river networks feeding these deltas are heavily dammed, with a high mean water residence time in artificial reservoirs. Modeling results suggest that the Colorado, Rio Grande, Senegal, and Shatt-el-Arab all rely on unsustainable groundwater extraction from the delta. Local sea-level rise trends tend to be low for these deltas.
Type 5 deltas (pink) are very densely populated and urbanized systems. This delta type includes the Han, Hong, Pearl, Rhine, and Yangtze. These are some of the most densely populated deltas in the world, and also have densely populated upstream basins. The populations are highly urban, with high impervious surface area. With the exception of the Rhine, oil and gas extraction is not common in these deltas. Reservoir trapping in the upstream basin is low to moderate, and groundwater extraction is not a major factor for deltas in this type.
Type 6 deltas (red) are moderately to very highly populated, both in the delta and upstream watershed. Deltas comprising this type are the Brahmani, Ganges–Brahmaputra, Irrawaddy, Mekong, Po, and Tone. This type is similar to Type 5, though the upstream watershed has greater wetland disconnectivity, suggesting more agricultural conversion of wetlands. Additionally, this type is differentiated from Type 5 by substantially greater exposure to local sea-level rise. Hydrocarbon extraction occurs in four deltas in this type: the Brahmani, Ganges–Brahmaputra, Irrawaddy, and Po.
Type 7 deltas (light orange) are highly populated and similar to Type 5, but with a greater reliance on upstream dams and local groundwater extraction for water management. This type includes the Chao Phraya, Godavari, Krishna, Nile, and Sebou deltas. Unlike Types 5 and 6, none of the deltas in this type have substantial hydrocarbon extraction activities, per the 2000 USGS World Energy Assessment. These deltas tend to be urbanized, with high impervious surface areas, similar to Type 5 and to a lesser extent Type 6.
Type 8 deltas (dark orange) tend to be moderately populated, both in the upstream basin and the coastal delta. This type is similar to Type 2, with somewhat greater population density and greater local sea-level rise trends. Deltas in this type are the Danube, Limpopo, Magdalena, Mahanadi, Rhone, and Vistula. Relative to Type 2, deltas in Type 8 are also characterized by a lower likelihood of oil and gas extraction activity. Furthermore, deltas in this type have relatively low wetland disconnectivity scores in the upstream basin, suggesting more intact wetland systems in the contributing watershed.