Compression and knowledge discovery in ecology

Dale, M. B.

doi:10.1556/ComEc.14.2013.2.10

Open access
Published: 30 December 2013

Volume 14, pages 196–207 (2013)
Community Ecology

M. B. Dale¹

Abstract

Knowledge discovery is the non-trivial process of identifying valid, novel, interesting, potentially useful and ultimately understandable patterns in data. It encompasses a wide range of techniques, from data cleaning to finding manifolds and separating mixtures. Starting in the early 1950s, ecologists contributed greatly to the development of these methods and applied them to a large number of problems. However, underlying the methodology are some fundamental questions bearing on the choice and function of these methods. In addition, other fields, from sociology to quantum mechanics, have developed alternatives or solutions to various problems. In this paper, I look at some of the general questions underlying these processes. I then briefly examine aspects of three areas (manifolds, clustering and networks), specifically with a view to choosing between them using the concept of compression. Finally, I briefly consider some future possibilities that remain to be explored. These provide methods of possibly improving the results of clustering analysis in vegetation studies.

Author information

Authors and Affiliations

Griffith School of Environment, Griffith University, Nathan, Australia
M. B. Dale

Corresponding author

Correspondence to M. B. Dale.

Rights and permissions

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Dale, M.B. Compression and knowledge discovery in ecology. Community Ecology 14, 196–207 (2013). https://doi.org/10.1556/ComEc.14.2013.2.10

Received: 09 March 2013
Revised: 06 May 2013
Accepted: 15 July 2013
Published: 30 December 2013
Issue Date: December 2013
DOI: https://doi.org/10.1556/ComEc.14.2013.2.10
