Abstract
In this paper we exploit the property of diatoms as bioindicators to identify which physical-chemical parameters are present in a sample, using the machine learning algorithm CN2. Physical-chemical parameters such as conductivity, saturated oxygen, pH, organic chemical parameters and metals are important in environmental monitoring. These parameters influence the entire lake food web, disturbing the patterns of organisms and the interactions between them, including those of the diatom community. Diatom communities are strongly indicative of certain processes, such as eutrophication, and of the presence or absence of certain physical-chemical parameters, which means that they can be used as bioindicators of water quality. The CN2 algorithm produces rules in IF-THEN form, which is suitable for organizing knowledge from diatom abundance data. In the literature, the ecological preferences of diatoms are organized in the same manner. The experimental setup is built to satisfy not only the properties of the algorithm but also the ecological knowledge of the diatom community. We used several modifications of the algorithm and compared the compactness and coverage of the induced rules. For regression problems, we compared the correlation coefficient, root mean square error (RMSE) and relative root mean square error (RRMSE), or the rule quality, to determine which experiment proved to be the most accurate and the most general. Several of the rules are presented in this paper together with their evaluation performance.
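The three regression measures named above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code. The function names are hypothetical, and the RRMSE definition used here (RMSE divided by the RMSE of a default model that always predicts the mean of the observed values) is an assumption, since the abstract does not define it.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def rrmse(y_true, y_pred):
    """RMSE relative to a default model that always predicts the mean.

    Assumed definition: values below 1 mean the model beats the default.
    """
    mean = sum(y_true) / len(y_true)
    baseline = rmse(y_true, [mean] * len(y_true))
    return rmse(y_true, y_pred) / baseline


def correlation(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    mt = sum(y_true) / len(y_true)
    mp = sum(y_pred) / len(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Illustrative numbers only, not measurements from the paper.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]

error = rmse(observed, predicted)
relative_error = rrmse(observed, predicted)
r = correlation(observed, predicted)
```

In this toy case the predictions are off by a constant, so the correlation is perfect while the RMSE and RRMSE are not zero, which is one reason to report the measures together.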
Based on modifications of the CN2 algorithm parameters, we were able to extract knowledge from the data which later proved to be valid or, in some cases, is novel for many newly discovered diatoms. In the future we plan to investigate more modifications of the CN2 algorithm, to implement multi-target rule induction, and to compare these results with the single-target ones.
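To make the IF-THEN form of the extracted knowledge concrete, the sketch below represents a single rule and computes its coverage and accuracy on a handful of samples. It is not the CN2 search procedure itself and not the authors' implementation. The class, the attribute names (taxon_A, taxon_B), the thresholds and the sample values are all hypothetical and chosen only for illustration.

```python
import operator
from dataclasses import dataclass

# Comparison operators allowed in a rule condition.
OPS = {">": operator.gt, "<=": operator.le}


@dataclass
class Rule:
    """IF all conditions hold THEN predict `prediction` for `target`."""
    conditions: list  # tuples of (attribute, op, threshold)
    target: str
    prediction: str

    def covers(self, sample):
        """True if the sample satisfies every condition of the rule."""
        return all(OPS[op](sample[attr], thr) for attr, op, thr in self.conditions)


def coverage(rule, samples):
    """Fraction of samples whose attributes satisfy the rule body."""
    return sum(rule.covers(s) for s in samples) / len(samples)


def rule_accuracy(rule, samples):
    """Fraction of covered samples for which the prediction is correct."""
    covered = [s for s in samples if rule.covers(s)]
    if not covered:
        return 0.0
    return sum(s[rule.target] == rule.prediction for s in covered) / len(covered)


# Hypothetical abundance data; not taken from the paper.
samples = [
    {"taxon_A": 12.0, "taxon_B": 0.5, "conductivity": "high"},
    {"taxon_A": 3.0, "taxon_B": 4.0, "conductivity": "low"},
    {"taxon_A": 9.0, "taxon_B": 1.0, "conductivity": "high"},
    {"taxon_A": 8.0, "taxon_B": 6.0, "conductivity": "low"},
]

# IF taxon_A > 5 AND taxon_B <= 2 THEN conductivity = high
rule = Rule([("taxon_A", ">", 5.0), ("taxon_B", "<=", 2.0)], "conductivity", "high")

rule_coverage = coverage(rule, samples)
rule_acc = rule_accuracy(rule, samples)
```

A rule induction algorithm searches over many such candidate rules and trades coverage against accuracy. The two functions here only score one given rule.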
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Naumoski, A., Mitreski, K. (2010). Rule Induction of Physical-Chemical Water Property from Diatoms Community. In: Davcev, D., Gómez, J.M. (eds) ICT Innovations 2009. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10781-8_22
DOI: https://doi.org/10.1007/978-3-642-10781-8_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10780-1
Online ISBN: 978-3-642-10781-8
eBook Packages: Engineering, Engineering (R0)