Improving iForest with Relative Mass

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8444)

Abstract

iForest uses a collection of isolation trees to detect anomalies. While it is effective in detecting global anomalies, it fails to detect local anomalies in data sets having multiple clusters of normal instances, because the local anomalies are masked by normal clusters of similar density and so become less susceptible to isolation. In this paper, we propose a simple but effective solution to overcome this limitation: replacing the global ranking measure based on path length with a local ranking measure based on relative mass, which takes the local data distribution into consideration. We demonstrate the utility of relative mass by improving the task-specific performance of iForest in anomaly detection and information retrieval tasks.
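
The abstract does not state the scoring formula, so the following is only a minimal sketch of the contrast it describes. It assumes that the relative mass of a point in one tree is the mass of the parent of its leaf divided by the mass of the leaf, normalised by the subsample size psi, and that scores are averaged over trees. The names `Node`, `build_forest`, `path_length_score` and `relative_mass_score`, the height limit, and the default parameters are illustrative assumptions, not the authors' implementation.

```python
import math
import random


class Node:
    """One node of an isolation tree; `mass` counts the training points in it."""

    def __init__(self, mass):
        self.mass = mass
        self.attr = None   # index of the split attribute (None for a leaf)
        self.split = None  # split value on that attribute
        self.left = None
        self.right = None


def build_tree(points, rng, depth, max_depth):
    """Recursively partition `points` with random axis-parallel splits."""
    node = Node(len(points))
    if depth >= max_depth or len(points) <= 1:
        return node
    attr = rng.randrange(len(points[0]))
    lo = min(p[attr] for p in points)
    hi = max(p[attr] for p in points)
    if lo == hi:
        return node
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[attr] < split]
    right = [p for p in points if p[attr] >= split]
    if not left or not right:  # degenerate split: keep this node as a leaf
        return node
    node.attr, node.split = attr, split
    node.left = build_tree(left, rng, depth + 1, max_depth)
    node.right = build_tree(right, rng, depth + 1, max_depth)
    return node


def build_forest(data, n_trees=100, psi=256, seed=0):
    """Build `n_trees` trees, each on a random subsample of size psi."""
    rng = random.Random(seed)
    psi = min(psi, len(data))
    # Height limit is an assumption of this sketch.
    max_depth = math.ceil(math.log2(psi)) if psi > 1 else 0
    trees = [build_tree(rng.sample(data, psi), rng, 0, max_depth)
             for _ in range(n_trees)]
    return trees, psi


def descend(tree, x):
    """Return (leaf, parent_of_leaf, path_length) for query point x."""
    parent, node, depth = None, tree, 0
    while node.left is not None:
        parent = node
        node = node.left if x[node.attr] < node.split else node.right
        depth += 1
    return node, parent, depth


def path_length_score(trees, x):
    """Global measure: average path length (shorter suggests an anomaly)."""
    return sum(descend(t, x)[2] for t in trees) / len(trees)


def relative_mass_score(trees, x, psi):
    """Local measure (assumed form): mean of mass(parent) / (mass(leaf) * psi)."""
    total = 0.0
    for t in trees:
        leaf, parent, _ = descend(t, x)
        parent_mass = parent.mass if parent is not None else leaf.mass
        total += parent_mass / (leaf.mass * psi)
    return total / len(trees)
```

Under this assumed definition, a point whose leaf holds far fewer training instances than the leaf's parent receives a higher score, so the ranking depends on how sparse the point's immediate neighbourhood is relative to its surroundings rather than on how deep it sits in the tree.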

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Aryal, S., Ting, K.M., Wells, J.R., Washio, T. (2014). Improving iForest with Relative Mass. In: Tseng, V.S., Ho, T.B., Zhou, ZH., Chen, A.L.P., Kao, HY. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2014. Lecture Notes in Computer Science, vol 8444. Springer, Cham. https://doi.org/10.1007/978-3-319-06605-9_42

  • DOI: https://doi.org/10.1007/978-3-319-06605-9_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06604-2

  • Online ISBN: 978-3-319-06605-9

  • eBook Packages: Computer Science, Computer Science (R0)
