Phishing Webpage Detection Using Weighted URL Tokens for Identity Keywords Retrieval

  • Conference paper
9th International Conference on Robotic, Vision, Signal Processing and Power Applications

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 398)

Abstract

Phishing is a form of online identity theft that has threatened Internet users for more than a decade. This paper proposes an anti-phishing technique based on a weighted URL tokens system, which extracts identity keywords from a query webpage. With the identity keywords as search terms, a search engine is invoked to pinpoint the target domain name, which is then used to determine the legitimacy of the query webpage. Experiments were conducted over 1000 datasets, achieving 99.20% true positives and 92.20% true negatives. The results suggest that the proposed system can detect phishing webpages effectively without relying on conventional language-dependent keyword extraction algorithms.
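
The pipeline the abstract outlines (URL tokens, identity keywords, search-engine lookup, domain comparison) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the weighting scheme, so tokens are weighted here simply by how often they occur in the page text. The function names, the `top_n` parameter, the example URLs, and the injected `search` callable are all hypothetical, and the last-two-labels domain extraction is a deliberate simplification.

```python
import re
from collections import Counter
from typing import Callable, List
from urllib.parse import urlparse


def url_tokens(url: str) -> List[str]:
    """Split the host and path of a URL into lowercase alphabetic tokens."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc} {parsed.path}".lower()
    return [t for t in re.split(r"[^a-z]+", raw) if len(t) > 2]


def identity_keywords(url: str, page_text: str, top_n: int = 2) -> List[str]:
    """Rank URL tokens by an ASSUMED weight: occurrences in the page text."""
    words = Counter(re.findall(r"[a-z]+", page_text.lower()))
    weights = {tok: words[tok] for tok in set(url_tokens(url))}
    ranked = sorted(weights, key=lambda t: (-weights[t], t))
    return [t for t in ranked if weights[t] > 0][:top_n]


def registered_domain(url: str) -> str:
    """Naive last-two-labels domain; a simplification for illustration only."""
    host = urlparse(url).netloc.lower().split(":")[0]
    return ".".join(host.split(".")[-2:])


def is_phishing(url: str, page_text: str,
                search: Callable[[List[str]], List[str]]) -> bool:
    """Flag the page if its domain is absent from the search results.

    `search` is a hypothetical stand-in for a search-engine query that
    takes keywords and returns result URLs.
    """
    keywords = identity_keywords(url, page_text)
    targets = {registered_domain(r) for r in search(keywords)}
    return registered_domain(url) not in targets


if __name__ == "__main__":
    # Stubbed search engine: always returns one fixed result URL.
    fake_search = lambda kws: ["https://www.paypal.com/"]
    text = "Welcome to PayPal. Log in to your PayPal account."
    print(is_phishing("http://paypal.secure-login.example.net/paypal/signin",
                      text, fake_search))
    print(is_phishing("https://www.paypal.com/signin", text, fake_search))
```

Passing the search engine in as a callable keeps the sketch runnable offline and makes the dependency on an external service explicit.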

Notes

  1. http://www.iana.org/domains/root/db/

  2. http://www.phishtank.com/

  3. http://www.alexa.com/

Acknowledgments

This project was funded by a research grant from UNIMAS and the Ministry of Education, Malaysia, under the Fundamental Research Grant Scheme 2/2013 [Grant No: FRGS/ICT07(01)/1057/2013(03)].

Author information

Corresponding author

Correspondence to Choon Lin Tan.

Copyright information

© 2017 Springer Science+Business Media Singapore

About this paper

Cite this paper

Tan, C.L., Chiew, K.L., Sze, S.N. (2017). Phishing Webpage Detection Using Weighted URL Tokens for Identity Keywords Retrieval. In: Ibrahim, H., Iqbal, S., Teoh, S., Mustaffa, M. (eds) 9th International Conference on Robotic, Vision, Signal Processing and Power Applications. Lecture Notes in Electrical Engineering, vol 398. Springer, Singapore. https://doi.org/10.1007/978-981-10-1721-6_15

  • DOI: https://doi.org/10.1007/978-981-10-1721-6_15

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-1719-3

  • Online ISBN: 978-981-10-1721-6

  • eBook Packages: Engineering, Engineering (R0)
