A fast and accurate explicit kernel map

Applied Intelligence

Abstract

Kernel functions are powerful tools that have been used successfully in many machine learning algorithms. Explicit kernel maps have emerged as an alternative to standard kernel functions in order to overcome the latter’s scalability issues. Random Fourier Features (RFF) is a popular explicit kernel map for approximating shift-invariant kernels. However, it requires a long run time to achieve good accuracy. Faster and more accurate variants have been proposed recently, but all of these methods remain approximations to a shift-invariant kernel. Instead of an approximation, we propose a fast, exact and explicit kernel map called Explicit Cosine Map (ECM). The advantage of this exact map shows up as performance improvements in kernel-based algorithms. Furthermore, its explicit nature enables it to be used in streaming applications. A second explicit kernel map, the Euler kernel map, is also proposed. The effectiveness of both kernel maps is evaluated on streaming Anomaly Detection (AD). The AD results indicate that the ECM-based algorithm achieves better AD accuracy than previous algorithms while being faster.
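For readers unfamiliar with the RFF baseline mentioned in the abstract, the following is a minimal sketch of the standard Random Fourier Features construction for a Gaussian kernel. It is not the ECM or Euler map proposed in the article, whose details are not given in this preview. The function names `rff_map` and `gaussian_kernel`, the kernel parameterisation `exp(-gamma * ||x - y||^2)`, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np

def rff_map(X, n_features, gamma, rng):
    """Map rows of X to random features whose inner products
    approximate the kernel exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies sampled from the kernel's spectral density, N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def gaussian_kernel(X, gamma):
    """Exact Gaussian kernel matrix, used here only as a reference."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

# Toy data (hypothetical): 50 points in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
gamma = 0.1

K_exact = gaussian_kernel(X, gamma)
Z = rff_map(X, n_features=5000, gamma=gamma, rng=rng)
K_approx = Z @ Z.T  # explicit map: kernel values are plain inner products

max_err = np.max(np.abs(K_exact - K_approx))
print(f"max absolute error with 5000 features: {max_err:.4f}")
```

The approximation error shrinks only as the number of random features grows, which is the accuracy versus run time trade-off the abstract attributes to RFF. Because each point is mapped independently of the others, a map of this kind can be applied to one sample at a time, which is what makes explicit maps usable in streaming settings.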

Acknowledgements

This publication is an outcome of the R&D work undertaken in the project under the Visvesvaraya PhD Scheme of the Ministry of Electronics and Information Technology, Government of India, being implemented by Digital India Corporation.

Author information

Corresponding author

Correspondence to Deena P. Francis.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Francis, D.P., Raimond, K. A fast and accurate explicit kernel map. Appl Intell 50, 647–662 (2020). https://doi.org/10.1007/s10489-019-01538-w

  • DOI: https://doi.org/10.1007/s10489-019-01538-w
