Malicious URL detection via spherical classification

Neural Computing and Applications

Abstract

We introduce and test a binary classification method for detecting malicious URLs from information on both the URL syntax and the properties of its domain. The method is a supervised machine learning model: classification relies on a set of URLs (samples, in machine learning parlance) whose class membership is known in advance. The main novelty of our approach is the use of a spherical separation-based algorithm instead of SVM-type methods, which use hyperplanes as separation surfaces in the sample space. In particular, we adopt a simplified spherical separation model that runs in O(t log t) time, where t is the number of samples in the training set, and is therefore suitable for large-scale applications. We test our approach using different sets of features and report the results in terms of testing correctness according to the well-established tenfold cross-validation paradigm.
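The abstract does not spell out the simplified model, so the following is only a minimal sketch of one way a spherical separator can be fitted in O(t log t) time. It assumes a fixed center (here the barycenter of the class labeled +1) and picks the squared radius that minimizes the number of misclassified training samples by sorting the distances once and scanning them. The names `fit_sphere` and `predict`, the choice of center, and the plain error count are all illustrative assumptions, not the authors' formulation.

```python
def fit_sphere(samples, labels):
    """Fit a fixed-center sphere; labels are +1 (to enclose) or -1 (to leave out).

    Assumption: the center is the barycenter of the +1 samples, and the radius
    minimizes the raw count of training errors. Returns (center, r2, errors),
    where r2 is the squared radius.
    """
    inside = [x for x, y in zip(samples, labels) if y == 1]
    if not inside:
        raise ValueError("need at least one sample labeled +1")
    dim = len(inside[0])
    center = tuple(sum(x[i] for x in inside) / len(inside) for i in range(dim))

    # Squared distance of every sample from the center, sorted once: O(t log t).
    dist = sorted(
        (sum((a - c) ** 2 for a, c in zip(x, center)), y)
        for x, y in zip(samples, labels)
    )

    # Start from the empty sphere (encoded as a negative squared radius):
    # every +1 sample is misclassified, every -1 sample is correct.
    errors = len(inside)
    best_errors, best_r2 = errors, -1.0

    i = 0
    while i < len(dist):
        d = dist[i][0]
        # Grow the sphere past all samples at this distance.
        while i < len(dist) and dist[i][0] == d:
            errors += -1 if dist[i][1] == 1 else 1
            i += 1
        if errors < best_errors:
            # Place the boundary halfway to the next sample, if there is one.
            best_r2 = (d + dist[i][0]) / 2 if i < len(dist) else d
            best_errors = errors
    return center, best_r2, best_errors


def predict(x, center, r2):
    """Return +1 if x lies inside or on the sphere, -1 otherwise."""
    d2 = sum((a - c) ** 2 for a, c in zip(x, center))
    return 1 if d2 <= r2 else -1


# Toy data standing in for URL feature vectors (purely hypothetical values).
samples = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
labels = [1, 1, 1, -1, -1]
center, r2, errors = fit_sphere(samples, labels)
```

With the center fixed, the only free parameter is the radius, which is why a single sort followed by a linear scan is enough; the sort dominates the cost.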

Acknowledgments

This work has been partially supported by Italian M.I.U.R. Programma Operativo Nazionale (PON) 2007–2013, Project “Protezione dei servizi digitali e di pagamento elettronico,” PON03PE_00032_2.

Author information

Corresponding author

Correspondence to M. Gaudioso.

About this article

Cite this article

Astorino, A., Chiarello, A., Gaudioso, M. et al. Malicious URL detection via spherical classification. Neural Comput & Applic 28 (Suppl 1), 699–705 (2017). https://doi.org/10.1007/s00521-016-2374-9

  • DOI: https://doi.org/10.1007/s00521-016-2374-9
