Information theory as a consistent framework for quantification and classification of landscape patterns

Quantitative grouping of similar landscape patterns is an important part of landscape ecology because of the relationship between a pattern and the underlying ecological process. One of the priorities in landscape ecology is the development of a theoretically consistent framework for quantifying, ordering, and classifying landscape patterns. Our objective is to demonstrate that information theory, as applied to a bivariate random variable, provides a consistent framework for quantifying, ordering, and classifying landscape patterns. After presenting information theory in the context of landscapes, information-theoretical metrics were calculated for an exemplar set of landscapes embodying all feasible configurations of land cover patterns. Sequencing and 2D parametrization of patterns in this set were performed to demonstrate the feasibility of information theory for the analysis of landscape patterns. Universal classification of landscapes into pattern configuration types was achieved by transforming landscapes into a 2D space of weakly correlated information-theoretical metrics. An ordering of landscapes by any single metric cannot produce a sequence of continuously changing patterns. In real-life patterns, diversity induces complexity: increasingly diverse patterns are increasingly complex. Information theory provides a consistent, theory-based framework for the analysis of landscape patterns. Information-theoretical parametrization of landscapes offers a method for their classification.

far used simulated landscapes which lack the character and diversity of form found in real-life landscapes. Demonstrations of orderings on real-life landscapes (Wang and Zhao, 2018; Gao et al., 2017) used too few landscapes to make a judgment.

The above overview of different approaches to quantification, ordering, and classification of landscape patterns reveals a lack of consistent methodology. Different aspects of pattern analysis were addressed using different approaches, and those approaches, with the exception of the Boltzmann entropy, were not rooted in any theory. The principal objective of this paper is to demonstrate that Information Theory (IT) (Shannon, 1948), as applied to a bivariate random variable representing a landscape, constitutes a consistent, theory-based quantitative methodology addressing all aspects of pattern analysis.

x is a class of the focus cell and y is a class of an adjacent cell. Using adjacent cells is the simplest way to take into account spatial relations when

rule (4-connectivity) and we distinguish between frequencies of (c_i, c_j) pairs and frequencies of (c_j, c_i) pairs. Using other definitions of adjacency and/or unordered pairs is also possible (Riitters et al., 1996).

Probabilities of (x, y) are given by a joint probability p(x = c_i, y = c_j), the probability of the focus cell having a class c_i and an adjacent cell having a class c_j. We calculate the values of p(x = c_i, y = c_j) by dividing the co-occurrence matrix by the total number of pairs in the pattern. The informational content of the bivariate random variable (x, y) is given by the IT concept of joint entropy, which is computable directly from p(x, y),

$$H(x, y) = -\sum_{i}\sum_{j} p(x = c_i, y = c_j) \log_2 p(x = c_i, y = c_j) \qquad (1)$$

(see the H(x, y) ordering of the evaluation set of landscapes in Fig. 1).
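The co-occurrence counting and joint entropy described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the integer class encoding (0, 1, 2, ...), and the small example grid are all assumptions made for the sketch. Each 4-connected adjacency is counted once in each direction, so (c_i, c_j) and (c_j, c_i) pairs are tallied separately.

```python
import numpy as np

def cooccurrence_matrix(landscape, n_classes):
    """Count ordered (focus, neighbor) class pairs under 4-connectivity.

    `landscape` is a 2D array of integer class labels in [0, n_classes).
    """
    a = np.asarray(landscape)
    counts = np.zeros((n_classes, n_classes), dtype=np.int64)
    # Horizontally and vertically adjacent cell pairs.
    left, right = a[:, :-1].ravel(), a[:, 1:].ravel()
    up, down = a[:-1, :].ravel(), a[1:, :].ravel()
    # Each adjacency contributes two ordered pairs, one per direction.
    for focus, neighbor in ((left, right), (right, left), (up, down), (down, up)):
        np.add.at(counts, (focus, neighbor), 1)
    return counts

def joint_entropy(counts):
    """Joint entropy H(x, y) in bits from a co-occurrence count matrix."""
    p = counts / counts.sum()   # joint probabilities p(x = c_i, y = c_j)
    nz = p[p > 0]               # 0 * log2(0) is treated as 0
    return float(-(nz * np.log2(nz)).sum())

# Hypothetical 3 x 4 landscape with three land cover classes.
landscape = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [2, 2, 1, 1]])
counts = cooccurrence_matrix(landscape, n_classes=3)
h_xy = joint_entropy(counts)
```

Because both directions of every adjacency are counted, the resulting matrix is symmetric under this particular adjacency rule; other rules would need a different pair-extraction step.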

Next, we consider subsets of cell pairs such that a class of the focus cell

The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY 4.0 International license. bioRxiv preprint doi: https://doi.org/10.1101/383281

landscape with the highest configurational complexity is not the same as the landscape with the highest overall complexity because, even though it has a more intricate geometry, it has fewer categories.

$$H(y) = -\sum_{j} p(y = c_j) \log_2 p(y = c_j)$$

The value of H(y) is the number of bits needed on average to specify a class of cell. H(y) is a metric of the compositional complexity of a pattern, which is also frequently referred to as pattern diversity (see the H(y) ordering of the evaluation set of landscapes in Fig. 1).
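As an illustration of H(y) as "bits needed on average to specify a class of cell", the following sketch computes it from a joint probability matrix by summing over the focus-cell classes. The function name and the example probabilities are hypothetical; only the entropy formula itself comes from the text.

```python
import numpy as np

def marginal_entropy_y(p_xy):
    """H(y) in bits, where p_xy[i, j] = p(x = c_i, y = c_j)."""
    # Marginalize over the focus-cell class x to get p(y = c_j).
    p_y = np.asarray(p_xy, dtype=float).sum(axis=0)
    nz = p_y[p_y > 0]           # 0 * log2(0) is treated as 0
    return float(-(nz * np.log2(nz)).sum())

# Hypothetical joint probabilities for a two-class pattern in which
# both classes are equally abundant.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
h_y = marginal_entropy_y(p_xy)
```

With two equally abundant classes, one bit is needed on average to specify the class of a cell, regardless of how the classes are arranged; the arrangement affects only the off-diagonal structure of the matrix, not this marginal.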

We could also focus on variable x (a class of the focus cell) and calculate

The IT chain rule formula (see, for example, Cover and Thomas (2012))

This formula shows that the informal statement that landscape patterns are characterized by both their composition and their configuration, which collectively define landscape structure, often found in landscape ecology papers, is not only a verbal description but has a quantitative justification.
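For reference, the textbook chain rule for joint entropy can be written in the notation of this section as follows. This is the standard statement (as found in Cover and Thomas), offered as a sketch; the exact form and equation number used by the authors are not recoverable from this excerpt. It follows from factoring the joint probability into a marginal and a conditional term.

```latex
\begin{align}
p(x = c_i, y = c_j) &= p(x = c_i)\, p(y = c_j \mid x = c_i), \\
H(x, y) &= -\sum_{i}\sum_{j} p(x = c_i, y = c_j) \log_2 p(x = c_i, y = c_j) \\
        &= -\sum_{i} p(x = c_i) \log_2 p(x = c_i)
           -\sum_{i}\sum_{j} p(x = c_i, y = c_j) \log_2 p(y = c_j \mid x = c_i) \\
        &= H(x) + H(y \mid x).
\end{align}
```

Read this way, the overall complexity H(x, y) splits into a term that depends only on class abundances and a term that depends on how classes are arranged relative to one another, which is presumably the quantitative counterpart of the composition and configuration statement above.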

One of the most useful concepts of IT is the mutual information, I(y, x),

For these landscapes, we computed a set of 17 configurational landscape metrics (see Table 1 in Nowosad and Stepinski (2018)

I(y, x), and U. We also calculated the value of Boltzmann entropy, S, using the formula given in Table 6 of Cushman (2018)

Table 1 suggest using H(y) and U as the two parameters to utilize in a 2D parametrization of landscape patterns because they are the least correlated of all information-theoretical metrics. Fig. 2A

By analyzing the HYU diagram (possibly referring to Fig. 2D for an un-

(O'Neill et al., 1988; Li and Reynolds, 1993). From

that the contagion index, which is considered to be a measure of clumpiness, is not really a good indicator of this property. Although a deficiency of the contagion index as a measure of landscape clumpiness has been previously pointed out by Li and Reynolds (1993), Riitters et al. (1996), and He et al. (2000), here we demonstrate it clearly on real-life landscapes.
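The 2D parametrization by H(y) and U mentioned above can be sketched as a function that maps a joint probability matrix to a point in that plane. Two assumptions are made here because this excerpt does not define them: U is taken to be the relative mutual information I(y, x) / H(y), and U is set to 0 for a single-class pattern where H(y) = 0. The function names and example matrices are hypothetical.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a probability array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p[p > 0]               # 0 * log2(0) is treated as 0
    return float(-(nz * np.log2(nz)).sum())

def hyu_coordinates(p_xy):
    """Return (H(y), U) for a joint matrix p_xy[i, j] = p(x = c_i, y = c_j)."""
    p_xy = np.asarray(p_xy, dtype=float)
    h_x = entropy_bits(p_xy.sum(axis=1))   # marginal over neighbor classes
    h_y = entropy_bits(p_xy.sum(axis=0))   # marginal over focus classes
    h_xy = entropy_bits(p_xy)              # joint entropy
    mutual_info = h_x + h_y - h_xy         # I(y, x)
    # ASSUMPTION: U = I(y, x) / H(y), with U = 0 when H(y) = 0.
    u = mutual_info / h_y if h_y > 0 else 0.0
    return h_y, u

# Hypothetical two-class patterns with equal class abundances:
# every neighbor shares the focus class, versus neighbors independent of it.
clumped = hyu_coordinates([[0.5, 0.0], [0.0, 0.5]])
mixed = hyu_coordinates([[0.25, 0.25], [0.25, 0.25]])
```

Both examples have the same H(y) but differ in U, which is the kind of separation a 2D parametrization relies on: one axis responds to class abundances and the other to how strongly a cell's class predicts its neighbor's class.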

The third finding, landscape diversity induces landscape compositional

The term (1 − α) is an "expected" value of U, consistent with the observed

of themes is needed. This is a straightforward task, which, however, is beyond the scope of this paper. To facilitate classification of landscape configurations