A Comparative Study of Machine Learning Techniques for Automatic Product Categorisation

  • Conference paper
Advances in Neural Networks - ISNN 2017 (ISNN 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10261)

Abstract

The digital age has given rise to e-commerce, which makes shopping easier and more flexible: consumers can enquire about product availability and receive an instant response, and can search for products using specific keywords. An easy and precise search capability, together with proper keyword-based product categorisation, therefore improves the overall shopping experience. This paper compares the performance of different machine learning techniques for product categorisation within our proposed framework. We measured the performance of each algorithm by the Area Under the Receiver Operating Characteristic Curve (AUROC) and applied Analysis of Variance (ANOVA) to the results to determine whether the differences between algorithms were statistically significant. Naïve Bayes was found to be the most effective algorithm in this investigation.
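The two evaluation steps named in the abstract, AUROC scoring and a one-way ANOVA across algorithms, can be illustrated with a minimal sketch. This is not the authors' code: the function names `auroc` and `one_way_anova_f` are hypothetical, the binary case is shown for simplicity (the abstract does not say how AUROC was computed for multiple categories), and the numbers in the usage lines are toy values rather than results from the paper.

```python
from itertools import product


def auroc(labels, scores):
    """AUROC for binary labels (1 = positive, 0 = negative).

    Computed as the fraction of positive/negative pairs in which the
    positive example receives the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA.

    `groups` is a list of lists, e.g. one list of per-fold AUROC values
    per algorithm. A large F suggests the group means differ.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Toy usage with made-up values (not data from the paper).
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
toy_f = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

In practice the F statistic would be compared against the F distribution with (k - 1, N - k) degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch dependency-free.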

Author information

Corresponding author

Correspondence to Kitsuchart Pasupa.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Chavaltada, C., Pasupa, K., Hardoon, D.R. (2017). A Comparative Study of Machine Learning Techniques for Automatic Product Categorisation. In: Cong, F., Leung, A., Wei, Q. (eds) Advances in Neural Networks - ISNN 2017. ISNN 2017. Lecture Notes in Computer Science, vol 10261. Springer, Cham. https://doi.org/10.1007/978-3-319-59072-1_2

  • DOI: https://doi.org/10.1007/978-3-319-59072-1_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59071-4

  • Online ISBN: 978-3-319-59072-1

  • eBook Packages: Computer Science, Computer Science (R0)
