Bayesian Multiple Imputation Approaches for One-Class Classification

  • Shehroz S. Khan
  • Jesse Hoey
  • Daniel Lizotte
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7310)

Abstract

One-class classifiers build classification models in the absence of negative examples, which makes it harder to estimate the class boundary. The predictive accuracy of one-class classifiers can be further degraded by the presence of missing data in the positive class. In this paper, we propose two approaches based on Bayesian Multiple Imputation (BMI) for imputing missing data in the one-class classification framework: Averaged BMI and Ensemble BMI. We test and compare our approaches against the common methods of mean imputation and Expectation Maximization on several datasets. Our preliminary experiments suggest that as the missingness in the data increases, the proposed imputation approaches can perform better on some datasets.
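The abstract only names the two proposed approaches, so the following is a minimal sketch of one plausible reading of "Averaged BMI" (pool the multiply imputed datasets into one and train a single one-class classifier) versus "Ensemble BMI" (train one one-class classifier per imputed dataset and combine their votes). The imputer (scikit-learn's IterativeImputer with sample_posterior=True as a stand-in for BMI), the one-class SVM, the pooling rule, and the majority vote are all illustrative assumptions, not the authors' implementation.

```python
# Sketch only: assumed interpretations of Averaged BMI and Ensemble BMI.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Positive-class training data with ~20% of values missing at random.
X = rng.normal(size=(200, 5))
X_miss = np.where(rng.random(X.shape) < 0.2, np.nan, X)

M = 5  # number of multiple imputations

# Draw M completed datasets; sample_posterior=True yields stochastic
# imputations (a Bayesian-flavoured stand-in for BMI) rather than one
# deterministic point estimate.
imputations = [
    IterativeImputer(sample_posterior=True, random_state=m).fit_transform(X_miss)
    for m in range(M)
]

# Averaged BMI (assumed): pool the M completed datasets element-wise
# into a single dataset, then train one one-class classifier on it.
X_avg = np.mean(imputations, axis=0)
avg_model = OneClassSVM(nu=0.1, gamma="scale").fit(X_avg)

# Ensemble BMI (assumed): train one one-class classifier per completed
# dataset and combine their decisions by majority vote.
ensemble = [OneClassSVM(nu=0.1, gamma="scale").fit(Xi) for Xi in imputations]

def ensemble_predict(X_test):
    votes = np.stack([m.predict(X_test) for m in ensemble])  # entries in {-1, +1}
    return np.where(votes.sum(axis=0) >= 0, 1, -1)

X_test = rng.normal(size=(20, 5))
print("Averaged BMI:", avg_model.predict(X_test))
print("Ensemble BMI:", ensemble_predict(X_test))
```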

Keywords

True Positive Rate · Positive Data · Positive Class · True Negative Rate · Imputation Technique



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Shehroz S. Khan¹
  • Jesse Hoey¹
  • Daniel Lizotte¹
  1. David R. Cheriton School of Computer Science, University of Waterloo, Canada
