A fast and accurate explicit kernel map

Francis, Deena P.; Raimond, Kumudha

doi:10.1007/s10489-019-01538-w

Published: 05 August 2019

Applied Intelligence, Volume 50, pages 647–662 (2020)

Abstract

Kernel functions are powerful techniques that have been used successfully in many machine learning algorithms. Explicit kernel maps have emerged as an alternative to standard kernel functions in order to overcome the latter’s scalability issues. Random Fourier Features (RFF) is a popular explicit kernel map for approximating shift-invariant kernels. However, it requires a long run time to achieve good accuracy. Faster and more accurate variants of it have been proposed recently, but all of these methods are still approximations to a shift-invariant kernel. Instead of an approximation, we propose a fast, exact, and explicit kernel map called the Explicit Cosine Map (ECM). The advantage of this exact map shows up as performance improvements in kernel-based algorithms. Furthermore, its explicit nature enables it to be used in streaming applications. A second explicit kernel map, called the Euler kernel map, is also proposed. The effectiveness of both kernel maps is evaluated in streaming Anomaly Detection (AD). The AD results indicate that the ECM-based algorithm achieves better AD accuracy than previous algorithms while being faster.
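For readers unfamiliar with the RFF baseline that the abstract contrasts against, the following is a minimal sketch of the standard Random Fourier Features construction for the Gaussian kernel. It is not the proposed ECM or Euler kernel map, whose definitions are not given on this page. The function names (make_rff_map, gaussian_kernel, dot) and the parameters n_features, sigma, and seed are illustrative assumptions, and the code uses only the Python standard library.

```python
import math
import random


def make_rff_map(dim, n_features, sigma=1.0, seed=0):
    """Build an explicit random feature map z with z(x).z(y) ~ k(x, y).

    Illustrative sketch of standard RFF for the Gaussian kernel
    k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)); names are hypothetical.
    """
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density N(0, sigma^-2 I).
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(n_features)]
    # Phase offsets are drawn uniformly from [0, 2*pi].
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        # Each point is mapped on its own, so the map can be applied
        # to one record at a time as data arrives.
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian (shift-invariant) kernel, used here for comparison."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    z = make_rff_map(dim=3, n_features=2000, sigma=1.0, seed=0)
    x, y = [0.1, -0.4, 0.3], [0.5, 0.2, -0.1]
    print("RFF estimate:", dot(z(x), z(y)))
    print("exact kernel:", gaussian_kernel(x, y))
```

The estimate only converges to the exact kernel value as n_features grows, which illustrates the accuracy versus run time trade-off that the abstract attributes to RFF.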

Acknowledgements

This publication is an outcome of the R&D work undertaken in the project under the Visvesvaraya PhD Scheme of the Ministry of Electronics and Information Technology, Government of India, being implemented by Digital India Corporation.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Karunya Institute of Technology and Sciences, Coimbatore, India
Deena P. Francis & Kumudha Raimond

Corresponding author

Correspondence to Deena P. Francis.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Francis, D.P., Raimond, K. A fast and accurate explicit kernel map. Appl Intell 50, 647–662 (2020). https://doi.org/10.1007/s10489-019-01538-w

Published: 05 August 2019
Issue Date: March 2020
DOI: https://doi.org/10.1007/s10489-019-01538-w
