A Comparative Study of Machine Learning Techniques for Automatic Product Categorisation

Chavaltada, Chanawee; Pasupa, Kitsuchart; Hardoon, David R.

doi:10.1007/978-3-319-59072-1_2

Chanawee Chavaltada¹⁶,
Kitsuchart Pasupa¹⁶ &
David R. Hardoon¹⁷

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10261))

Included in the following conference series:

International Symposium on Neural Networks

2714 Accesses
13 Citations

Abstract

The revolution of the digital age has resulted in e-commerce where consumers’ shopping is facilitated and flexible such as able to enquire about product availability and get instant response as well as able to search flexibly for products by using specific keywords, hence having an easy and precise search capability along with proper product categorisation through keywords that allow better overall shopping experience. This paper compared the performances of different machine learning techniques on product categorisation in our proposed framework. We measured the performance of each algorithm by an Area Under Receiver Operating Characteristic Curve (AUROC). Furthermore, we also applied Analysis of Variance (ANOVA) to our results to find out whether the differences were significant or not. Naïve Bayes was found to be the most effective algorithm in this investigation.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Ding, Y., Korotkiy, M., Omelayenko, B., Kartseva, V., Zykov, V., Klein, M., Schulten, E., Fensel, D.: GoldenBullet: automated classification of product data in e-commerce. In: Proceedings of the 5th International Conference on Business Information Systems (BIS 2002) (2002)
Google Scholar
Simon, P.: Too Big to Ignore: The Business Case for Big Data. Wiley, Hoboken (2013)
Google Scholar
Shankar, S., Lin, I.: Applying machine learning to product categorization. Technical report, Stanford University (2011)
Google Scholar
Kozareva, Z.: Everyone likes shopping! multi-class product categorization for e-commerce. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1329–1333 (2015)
Google Scholar
Zhang, H., Li, D.: Naïve bayes text classifier. In: Proceedings of the 2007 IEEE International Conference on Granular Computing (GRC 2007), p. 708 (2007)
Google Scholar
Tong, S., Koller, D.: Support vector machine active learning with applications to text classification. J. Mach. Learn. Res. 2, 45–66 (2001)
MATH Google Scholar
Wermter, S.: Neural network agents for learning semantic text classification. Inf. Retr. 3(2), 87–103 (2000)
Article Google Scholar
Wang, Z., Qian, X.: Text categorization based on LDA and SVM. In: 2008 International Conference on Computer Science and Software Engineering, vol. 1, pp. 674–677 (2008)
Google Scholar
Cheng, W., Hüllermeier, E.: Combining instance-based learning and logistic regression for multilabel classification. Mach. Learn. 76(2), 211–225 (2009)
Article Google Scholar
Bishop, C.: Pattern Recognition and Machine Learning, vol. 128, 1st edn. Springer, New York (2006). pp. 1–58, ISSN 1613-9011
MATH Google Scholar
Jurafsky, D., Martin, J.H.: Speech and language processing. Int. Ed. 710, 117–119 (2000)
Google Scholar
Lewis, D.D.: Naive (Bayes) at forty: the independence assumption in information retrieval. In: Nédellec, C., Rouveirol, C. (eds.) ECML 1998. LNCS, vol. 1398, pp. 4–15. Springer, Heidelberg (1998). doi:10.1007/BFb0026666
Chapter Google Scholar
Suykens, J.A.K., Vandewalle, J.: Least squares support vector machine classifiers. Neural Process. Lett. 9(3), 293–300 (1999)
Article MATH Google Scholar
Yuth, K.: Principle and using logistic regression analysis for research. RMUTSV Res. J. 4(1), 1–12 (2012)
Google Scholar
Ling, X.C., Huang, J., Zhang, H.: AUC: a statistically consistent and more discriminating measure than accuracy. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI 2003), vol. 3, pp. 519–524 (2003)
Google Scholar
Viaene, S., Derrig, R.A., Baesens, B., Dedene, G.: A comparison of state-of-the-art classification techniques for expert automobile insurance claim fraud detection. J. Risk Insur. 69(3), 373–421 (2002)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Faculty of Information Technology, King Mongkut’s Institute of Technology Ladkrabang, Bangkok, 10520, Thailand
Chanawee Chavaltada & Kitsuchart Pasupa
PriceTrolley Pte. Ltd., 573969, Singapore, Singapore
David R. Hardoon

Authors

Chanawee Chavaltada
View author publications
You can also search for this author in PubMed Google Scholar
Kitsuchart Pasupa
View author publications
You can also search for this author in PubMed Google Scholar
David R. Hardoon
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Kitsuchart Pasupa .

Editor information

Editors and Affiliations

Dalian University of Technology, Dalian, China
Fengyu Cong
City University of Hong Kong, Kowloon Tong, Hong Kong
Andrew Leung
Chinese Academy of Sciences, Beijing, China
Qinglai Wei

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Chavaltada, C., Pasupa, K., Hardoon, D.R. (2017). A Comparative Study of Machine Learning Techniques for Automatic Product Categorisation. In: Cong, F., Leung, A., Wei, Q. (eds) Advances in Neural Networks - ISNN 2017. ISNN 2017. Lecture Notes in Computer Science(), vol 10261. Springer, Cham. https://doi.org/10.1007/978-3-319-59072-1_2

Download citation

DOI: https://doi.org/10.1007/978-3-319-59072-1_2
Published: 31 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59071-4
Online ISBN: 978-3-319-59072-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics