Predicting Shellfish Farm Closures with Class Balancing Methods

  • Claire D’Este
  • Ashfaqur Rahman
  • Alison Turnbull
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7691)


Real-time environmental monitoring can provide vital situational awareness for effective management of natural resources. Effective operation of shellfish farms depends on environmental conditions. In this paper we propose a supervised learning approach to predicting farm closures. This is a binary classification problem in which farm closure is a function of environmental variables. A difficulty with this approach is that farm closure events occur infrequently, which leads to a class imbalance problem. Straightforward learning techniques tend to favour the majority class, in this case continually predicting no event. We present a new ensemble class balancing algorithm based on random undersampling to resolve this problem. Experimental results show that the class balancing ensemble performs better than individual classifiers and other state-of-the-art ensemble classifiers. We also use feature ranking algorithms to gain an understanding of which environmental variables are most relevant to shellfish farm closure.
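
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: an ensemble in which each member is trained on all minority (closure) examples plus an equally sized random sample of majority (no closure) examples, with the members' scores averaged at prediction time. Everything here is an assumption made for illustration: the function and class names, the nearest-centroid base learner (a stand-in, not the classifiers used in the paper), the label convention (1 = closure) and the synthetic two-feature data.

```python
import random
from statistics import mean


def balanced_subsets(X, y, n_models, rng):
    """Yield one class-balanced training subset per ensemble member.

    Assumes label 1 is the minority class and is no more frequent than label 0.
    """
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    for _ in range(n_models):
        # Random undersampling: draw as many majority rows as there are minority rows.
        sampled = rng.sample(majority, len(minority))
        idx = minority + sampled
        yield [X[i] for i in idx], [y[i] for i in idx]


class CentroidClassifier:
    """Toy base learner (hypothetical stand-in): scores by distance to class centroids."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def score(self, x):
        """Return a value in [0, 1]; higher means closer to the closure centroid."""
        d0 = sum((a - b) ** 2 for a, b in zip(x, self.centroids[0])) ** 0.5
        d1 = sum((a - b) ** 2 for a, b in zip(x, self.centroids[1])) ** 0.5
        if d0 + d1 == 0:
            return 0.5
        return d0 / (d0 + d1)


def fit_ensemble(X, y, n_models=10, seed=0):
    """Train one base learner per balanced subset."""
    rng = random.Random(seed)
    return [CentroidClassifier().fit(Xs, ys)
            for Xs, ys in balanced_subsets(X, y, n_models, rng)]


def predict(ensemble, x, threshold=0.5):
    """Average the members' scores and threshold the result."""
    avg = mean(member.score(x) for member in ensemble)
    return int(avg >= threshold)


def make_toy_data(seed=0, n_open=200, n_closed=10):
    """Synthetic, imbalanced two-feature data; not the shellfish farm data."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_open)]
    X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(n_closed)]
    y = [0] * n_open + [1] * n_closed
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    ensemble = fit_ensemble(X, y, n_models=10)
    print(predict(ensemble, [3.0, 3.0]), predict(ensemble, [0.0, 0.0]))
```

Because every member sees the full minority class but a different slice of the majority class, no single member is dominated by the no-closure examples, while the ensemble as a whole still draws on much of the majority data.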


Bayesian Network, Minority Class, Feature Ranking, Class Imbalance Problem, Average Vote
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Claire D’Este (1)
  • Ashfaqur Rahman (1)
  • Alison Turnbull (2)

  1. Intelligent Sensing and Systems Laboratory and Food Future Flagship, CSIRO, Castray Esplanade, Hobart, Australia
  2. Department of Health and Human Services, Hobart, Australia
