Abstract
The products sold in today’s marketplaces are numerous and varied, and books are among them. Detailed information about a book, such as its title, author, and publisher, is often presented in unstructured form in the product title. To be useful for commercial applications such as catalogs, search functions, and recommendation systems, these attributes need to be extracted from the product title. In this study, we apply a Named-Entity Recognition model in a semi-supervised fashion to extract the attributes of e-commerce products in the book domain. We experiment with a number of feature types, i.e., lexical, position, word shape, and embedding features. We extract book attributes from nearly 30K product titles with an F1 measure of 65%.
Keywords
- Book
- Named-Entity Recognition
- Attribute extraction
- Product title
- E-commerce
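The abstract names four feature families (lexical, position, word shape, and embedding). As a rough illustration only, the sketch below shows one way per-token lexical, position, and word-shape features could be computed for a tokenized product title before being passed to a sequence labeler. The function names, the feature keys, and the sample title are hypothetical and are not taken from the paper; embedding features are omitted because they depend on a pretrained model.

```python
from typing import Dict, List, Union

Feature = Union[str, int, float, bool]


def word_shape(token: str) -> str:
    """Coarse shape: uppercase -> 'X', lowercase -> 'x', digit -> 'd'.

    Other characters (punctuation, symbols) are kept unchanged.
    """
    out = []
    for ch in token:
        if ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        elif ch.isdigit():
            out.append("d")
        else:
            out.append(ch)
    return "".join(out)


def token_features(tokens: List[str], i: int) -> Dict[str, Feature]:
    """Build an illustrative feature dict for the token at index i."""
    token = tokens[i]
    n = len(tokens)
    return {
        # Lexical features (assumed choices, not the paper's exact set)
        "lower": token.lower(),
        "prefix3": token[:3].lower(),
        "suffix3": token[-3:].lower(),
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_lower": tokens[i + 1].lower() if i < n - 1 else "<EOS>",
        # Position features
        "position": i,
        "rel_position": i / n,
        "is_first": i == 0,
        "is_last": i == n - 1,
        # Word-shape feature
        "shape": word_shape(token),
    }


def title_features(tokens: List[str]) -> List[Dict[str, Feature]]:
    """Feature dicts for every token in one product title."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Hypothetical product title, split on whitespace for simplicity.
sample_tokens = "Belajar Python Edisi 2 Budi Santoso".split()
sample_features = title_features(sample_tokens)
```

A list of per-token dictionaries like `sample_features` is the input shape that common conditional random field toolkits accept, which is why the sketch uses plain dictionaries rather than a fixed-length vector.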
Notes
- 1.
- 2.
- 3. This number is obtained after conducting empirical observation.
References
Dumont, B., Maggio, S., Sidi Said, G., Au, Q.-T.: Who wrote this book? A challenge for e-commerce. In: Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), pp. 121–125 (2019)
Ghani, R., et al.: Text mining for product attribute extraction. ACM SIGKDD Explor. Newsl. 8(1), 41–48 (2006)
Joshi, M., et al.: Distributed word representations improve NER for e-commerce. In: Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, pp. 160–167 (2015)
Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the Eighteenth International Conference on Machine Learning, ICML, vol. 1, pp. 282–289 (2001)
Landis, J.R., Koch, G.: The measurement of observer agreement for categorical data. Biometrics, 159–174 (1977)
More, A.: Attribute extraction from product titles in ecommerce. arXiv preprint arXiv:1608.04670 (2016)
Nadeau, D., Sekine, S.: A survey of named entity recognition and classification. J. Linguist. Investig. 30(1), 1–20 (2007)
Putthividhya, D., Hu, J.: Bootstrapped named entity recognition for product attribute extraction. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pp. 1557–1567 (2011)
Rif’at, M., Mahendra, R., Budi, I.: Towards product attributes extraction in Indonesian e-commerce platform. Computación y Sistemas 22(4) (2018)
Acknowledgements
This research was supported by a research grant from Universitas Indonesia, namely Publikasi Terindeks Internasional (PUTI) Prosiding 2020, no. NKB-854/UN2.RST/HKP.05.00/2020.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Putra, H.S., Priatmadji, F.S., Mahendra, R. (2020). Semi-supervised Named-Entity Recognition for Product Attribute Extraction in Book Domain. In: Ishita, E., Pang, N.L.S., Zhou, L. (eds) Digital Libraries at Times of Massive Societal Transition. ICADL 2020. Lecture Notes in Computer Science, vol 12504. Springer, Cham. https://doi.org/10.1007/978-3-030-64452-9_4
DOI: https://doi.org/10.1007/978-3-030-64452-9_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64451-2
Online ISBN: 978-3-030-64452-9
eBook Packages: Computer Science, Computer Science (R0)