Tuning N-gram String Kernel SVMs via Meta Learning

  • Nuwan Gunasekara
  • Shaoning Pang
  • Nikola Kasabov
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6444)

Abstract

Although Support Vector Machines (SVMs) can identify patterns in high-dimensional kernel spaces, their performance is determined by two main factors: the SVM cost parameter and the kernel parameters. This paper identifies a mechanism for extracting meta-features from string datasets and derives an n-gram string kernel SVM optimization method. In this method, a meta-model is trained on string meta-features computed for each dataset in a string dataset pool, together with learning algorithm parameters and accuracy information, to predict the optimal parameter combination for a given string classification task. In the experiments, n-gram SVMs were optimized using the proposed algorithm on four string datasets: spam, Reuters-21578, Network Application Detection, and e-News Categorization. The experimental results show that the proposed algorithm produced parameter combinations that yield good string classification accuracy for n-gram SVMs on all four datasets.
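
To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a normalised n-gram (spectrum) kernel computed from substring counts, and it substitutes a simple nearest-dataset lookup for the meta-model, whose actual form the abstract does not specify. The function names `ngram_counts`, `ngram_kernel`, and `predict_parameters`, and the layout of `meta_records`, are hypothetical.

```python
from collections import Counter
from math import sqrt


def ngram_counts(text, n):
    """Count every contiguous substring of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_kernel(s, t, n):
    """Normalised n-gram kernel: cosine of the two n-gram count vectors.

    Assumption: the normalisation is chosen for illustration only.
    """
    cs, ct = ngram_counts(s, n), ngram_counts(t, n)
    dot = sum(count * ct[gram] for gram, count in cs.items())
    norm = sqrt(sum(c * c for c in cs.values()) * sum(c * c for c in ct.values()))
    return dot / norm if norm else 0.0


def predict_parameters(meta_records, query_features):
    """Pick (n, cost) from the pooled dataset closest to the query.

    meta_records is a hypothetical list of tuples:
        (meta_features, (n, cost), accuracy)
    This nearest-dataset lookup is a stand-in for a trained meta-model.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Find the meta-feature vector of the most similar pooled dataset.
    nearest = min(meta_records, key=lambda r: distance(r[0], query_features))[0]
    # Among the parameter settings tried on that dataset, keep the most accurate.
    candidates = [r for r in meta_records if r[0] == nearest]
    return max(candidates, key=lambda r: r[2])[1]
```

In practice the kernel values would fill a Gram matrix passed to an SVM that accepts precomputed kernels, with the predicted n and cost parameter used in place of an exhaustive parameter search.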

Keywords

Meta learning, n-gram String Kernels, SVM, Text Categorization, SVM Optimization

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Nuwan Gunasekara (1)
  • Shaoning Pang (1)
  • Nikola Kasabov (1)
  1. KEDRI, AUT University, Auckland, New Zealand
