Similarity Weighted Ensembles for Relocating Models of Rare Events

  • Claire D’Este
  • Ashfaqur Rahman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7872)


Spatially distributed regions may be subject to different influences on the underlying physical processes, which makes it inappropriate to relocate learned models directly. We may also be aiming to detect rare events for which we have examples in some regions but not in others. A novel method is presented for combining classifiers trained on regions with known sensor data in order to predict rare events in new regions, specifically the closure of shellfish farms. The proposed similarity weighted ensemble method demonstrates an average 10-fold improvement in accuracy over one-class classification and a 3-fold improvement over rules hand-crafted by an expert.
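The abstract does not say how similarity between regions is measured or how the per-region classifiers are combined, so the following is only a minimal sketch of one way a similarity weighted ensemble could work. Everything in it is an assumption made for illustration: the region descriptor vectors, the `1 / (1 + distance)` similarity, the weighted-average combination, the `rainfall_mm` feature, and the toy threshold classifiers are all hypothetical and are not taken from the paper.

```python
import math


def similarity(a, b):
    """Hypothetical similarity: 1 / (1 + Euclidean distance) between region descriptors."""
    return 1.0 / (1.0 + math.dist(a, b))


def similarity_weighted_predict(models, target_descriptor, x):
    """Similarity-weighted average of per-region closure probabilities.

    models: list of (region_descriptor, classifier) pairs, where
    classifier(x) returns a probability in [0, 1] that the farm closes.
    target_descriptor: descriptor of the new region with no closure examples.
    """
    weights = [similarity(desc, target_descriptor) for desc, _ in models]
    total = sum(weights)
    weighted = sum(w * clf(x) for w, (_, clf) in zip(weights, models))
    return weighted / total


# Toy stand-ins for classifiers trained on two source regions (assumed).
# The thresholds and the rainfall_mm feature are illustrative only.
region_a = ((0.0, 0.0), lambda x: 1.0 if x["rainfall_mm"] > 20 else 0.0)
region_b = ((3.0, 4.0), lambda x: 1.0 if x["rainfall_mm"] > 50 else 0.0)

# The new region has the same descriptor as region A, so A's vote dominates.
score = similarity_weighted_predict(
    [region_a, region_b], (0.0, 0.0), {"rainfall_mm": 30}
)
print(round(score, 3))
```

In this toy setup the more similar region receives weight 1 and the more distant one weight 1/6, so a reading that only the nearby region's model flags still yields a high closure score. A real system would need a similarity measure grounded in the physical characteristics of the regions.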


Keywords (added by machine, not by the authors): Matthews Correlation, Faecal Bacterium, Fold Improvement, National Weather Service, Practical Salinity Unit





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Claire D’Este (1)
  • Ashfaqur Rahman (1)
  1. Intelligent Sensing and Systems Laboratory, CSIRO, Castray Esplanade, Hobart, Australia
