Mt Glorious revisited: secondary succession in subtropical rainforest
In this paper, I re-examine the subtropical rainforest succession previously studied by Williams, Lance, Webb, Tracey and Dale (1969) (WLWTD), using a clustering procedure based on the Minimum Message Length (MML) principle of induction. This principle permits the optimal number of clusters to be estimated automatically. Optimality is defined here as a trade-off between quality of fit and complexity of model, both measured in message length units.
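The trade-off between fit and complexity can be illustrated with a minimal sketch. It is not the clustering program used in the study: it assumes one-dimensional data, hard assignment of items to Gaussian clusters, a hypothetical measurement precision `EPS`, and a crude (p/2) ln n charge per model, which stands in for the full MML parameter-coding term. The data values and candidate clusterings are invented for illustration.

```python
import math

EPS = 0.1  # assumed measurement precision of the data (hypothetical value)


def message_length(data, labels, eps=EPS):
    """Approximate two-part message length, in nats, of a hard clustering."""
    n = len(data)
    clusters = {}
    for x, lab in zip(data, labels):
        clusters.setdefault(lab, []).append(x)
    k = len(clusters)

    # Part 1: cost of stating the model. Each cluster has a mean and a
    # variance, plus k - 1 free proportions. The (p / 2) * ln(n) charge is a
    # simplifying assumption, not the exact MML formula.
    n_params = 3 * k - 1
    model_cost = 0.5 * n_params * math.log(n)

    # Part 2: cost of stating the data given the model.
    data_cost = 0.0
    for members in clusters.values():
        m = len(members)
        mean = sum(members) / m
        ss = sum((x - mean) ** 2 for x in members)
        var = max(ss / m, eps ** 2)  # no cluster is tighter than the precision
        data_cost += -m * math.log(m / n)  # which cluster each item is in
        data_cost += 0.5 * m * math.log(2 * math.pi * var) + ss / (2 * var)
        data_cost += -m * math.log(eps)  # turn densities into probabilities
    return model_cost + data_cost


# Invented data: three tight groups of six values each.
base = [1.0, 1.3, 1.1, 0.8, 1.2, 0.9]
data = base + [x + 5 for x in base] + [x + 10 for x in base]

# Hand-specified candidate clusterings of increasing complexity.
candidates = {
    "one cluster": [0] * 18,
    "three clusters": [i // 6 for i in range(18)],
    "six clusters": [2 * (i // 6) + (1 if base[i % 6] > 1.05 else 0)
                     for i in range(18)],
}

lengths = {name: message_length(data, labs) for name, labs in candidates.items()}
best = min(lengths, key=lengths.get)

for name, length in lengths.items():
    print(f"{name}: {length:.1f} nats")
print("shortest message:", best)
```

With these values the single cluster fits poorly, the six-cluster model fits well but costs too much to state, and the three-cluster model gives the shortest total message.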
Because fit and complexity share a common unit of measurement, we can assess the numerical effectiveness of the procedures adopted in the previous study and compare the results obtained from density data with those from presence/absence data and from numeric values treated independently of presence/absence effects. The results also bear on the “principle of explicability”, which posits that users seek interpretable results even if they are less efficient in purely numerical terms.
The optimal density result identified 8 clusters, which were in turn clustered into 3 higher-level groupings. The pattern of 2 temporal stages followed by spatial segregation is clear, with extra detail concerning aberrant stands and temporal dependency in the third spatial stage also apparent. Of the analyses examined, this one was the most effective at recovering structure in the data.
Imposing the WLWTD analysis on density data was markedly suboptimal, and even the number of clusters recognised (7) was strictly incorrect. However, by subjective interpretation WLWTD selected a number of clusters very close to that of the optimal density solution. For this reason, insight into the processes operating was not overly compromised. The optimal density result cleans up a few corners and adds more detail, but the main outlines are sufficiently clear in the subjectively assessed presence data.
The results from the optimal presence/absence analysis were understandable and effective, though considerably less detailed than those obtained using the density data or those from WLWTD’s original analyses. Indeed, the 3 clusters established using the presence data reflect the higher-level structure recognisable in the density result. The analysis using numeric data with 0 values set to missing showed little of interest.
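The three data treatments compared above can be made concrete with a small sketch. It assumes a table with stands as rows and species as columns; the density values and the helper names `to_presence` and `zeros_to_missing` are invented for illustration and are not taken from the study.

```python
# Hypothetical density table: rows are stands, columns are species.
density = [
    [12, 0, 3],
    [0, 0, 7],
    [5, 2, 0],
]


def to_presence(table):
    """Presence/absence coding: 1 where a species occurs, 0 elsewhere."""
    return [[1 if v > 0 else 0 for v in row] for row in table]


def zeros_to_missing(table):
    """Keep positive densities and treat zeros as missing (None), so the
    numeric values carry no presence/absence information."""
    return [[v if v > 0 else None for v in row] for row in table]


presence = to_presence(density)
numeric_only = zeros_to_missing(density)

print(presence)
print(numeric_only)
```

The density table carries both kinds of information at once; the two derived tables separate whether a species occurs from how abundant it is where it does occur.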
Invoking Kodratoff’s principle of explicability, which argues that interpretability should dominate efficiency, was unnecessary, since the efficient analyses were directly interpretable. The introduction of domain knowledge during the subjective interpretation in the original analysis was apparently sufficient to counter any losses due to the inefficiency of the clustering method. Given more effective clustering methods and the density data, such subjective intervention becomes unnecessary.
Keywords: Clustering, Explicability, Minimum Message Length, Prediction, Rainforest, Succession
- Austin, M. P. 1970. An applied ecological example of mixed data classification. In: R. S. Anderssen and M. R. Osborne (eds.), Data Representation, Univ. Queensland Press, Brisbane. pp. 113–117.
- Barsalou, L. W. 1995. Deriving categories to achieve goals. In: A. Ram and D. B. Leake (eds.), Goal Directed Learning, MIT Press, Cambridge, MA. pp. 121–176.
- Bunge, M. 1969. Metaphysics, epistemology and methodology of levels. In: L. L. Whyte, A. G. Wilson and D. Wilson (eds.), Hierarchic Structures, American Elsevier, New York. pp. 17–28.
- Dale, M. B. 1976. Hierarchy and level: prolegomena to a cladistic classification. Tech. Memo. 1, CSIRO Division of Tropical Crops and Pastures, St. Lucia, Brisbane.
- Dale, M. B. 1999. The dynamics of diversity: mixed strategy systems. Coenoses 13: 105–113.
- Dale, M. B. and P. Hogeweg. 1998. The dynamics of diversity: a cellular automaton approach. Coenoses 13: 3–15.
- Dale, M. B. and D. Walker. 1970. Information analysis of pollen diagrams. Pollen et Spores 2: 21–37.
- Diday, E. 1988. The symbolic approach in clustering and related methods of data analysis: the basic choices. In: H. H. Bock (ed.), Classification and Related Methods of Data Analysis, North Holland, Amsterdam. pp. 673–683.
- Edwards, R. T. and D. Dowe. 1998. Single factor analysis in MML mixture modelling. Lecture Notes in Artificial Intelligence 1394, Springer. pp. 96–109.
- Hilderman, R. J. and H. J. Hamilton. 1999. Heuristics for ranking the interestingness of discovered knowledge. Proc. 3rd Pacific-Asia Conf. Knowledge Discovery PKDD’99, Beijing, Springer Verlag, Berlin. pp. 204–209.
- Kodratoff, Y. 1986. Leçons d’apprentissage symbolique. Cepadues ed., Toulouse.
- Legendre, P. and E. Gallagher. 2000. Ecologically meaningful transformations for ordination biplots of species data. Ecology (submitted).
- Pazzani, M. J. and D. Kibler. 1992. The utility of knowledge in inductive learning. Machine Learning 9: 57–94.
- Wallace, C. S. 1995. Multiple factor analysis by MML estimation. Tech. Rep. 95/218, Dept. Computer Science, Monash University, Australia.
- Watanabe, S. 1969. Knowing and Guessing. J. Wiley, New York.
- Williams, W. T. and M. B. Dale. 1962. Partitioned correlation matrices for heterogenous quantitative data. Nature 196: 502.
- Williams, W. T., G. N. Lance, L. J. Webb, J. G. Tracey and M. B. Dale. 1969. Studies in the numerical analysis of complex rain-forest communities. III. The analysis of successional change. J. Ecol. 57: 513–535.