Abstract
Knowledge discovery is the non-trivial process of identifying valid, novel, interesting, potentially useful and ultimately understandable patterns in data. It encompasses a wide range of techniques, from data cleaning to finding manifolds and separating mixtures. Starting in the early 1950s, ecologists contributed greatly to the development of these methods and applied them to a large number of problems. However, underlying the methodology are some fundamental questions bearing on the choice and function of these methods. In addition, other fields, from sociology to quantum mechanics, have developed alternatives or solutions to various problems. In this paper, I look at some of the general questions underlying these processes. I then briefly examine aspects of three areas (manifolds, clustering and networks), specifically with regard to choosing between them using the concept of compression. Finally, I briefly examine some possibilities that remain to be explored, which may offer ways of improving the results of clustering analysis in vegetation studies.
Rights and permissions
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Dale, M.B. Compression and knowledge discovery in ecology. Community Ecology 14, 196–207 (2013). https://doi.org/10.1556/ComEc.14.2013.2.10
DOI: https://doi.org/10.1556/ComEc.14.2013.2.10