Commodity Image Retrieval Based on Image and Text Data

Zhang, Hongjie; Xu, Jian; Sun, Huadong; Zhao, Zhijie

doi:10.1007/978-3-031-03918-8_10

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 113)

Included in the following conference series:

International Conference on Advanced Machine Learning Technologies and Applications

Abstract

With the continued growth and spread of the Internet, online shopping has gradually become the main way people consume. This paper studies image retrieval for goods on e-commerce websites, where the main purpose is to let users find the goods they want more conveniently and accurately among a massive amount of commodity information. Traditional image retrieval uses only the image or only the text of a product as the retrieval object. The resulting query results do not exploit the relevance and complementarity between text and image information, which forfeits an advantage specific to commodity retrieval. To solve this problem, the paper first designs an end-to-end supervised learning algorithm that projects heterogeneous data into a common metric space, where traditional indexing schemes can be applied to achieve efficient image retrieval. Second, a fusion method is proposed that weights each modality according to how well its input features capture semantics, giving the better one a higher weight. Finally, an objective function is proposed that correctly embeds the fused image and text features into their respective feature space, pulling the fused features of images and text of the same kind closer together and pushing dissimilar features apart. Experimental results show that the average accuracy of this method on the test set of commodity data is 70%, about 6% higher than that of content-based and text-based image retrieval methods, which demonstrates its effectiveness.
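The abstract names three ingredients (a common metric space, a quality-weighted fusion of image and text features, and an objective that pulls same-class fused features together while separating dissimilar ones) but does not give their formulas. The following is only a minimal sketch of how such pieces could fit together, not the authors' implementation. It assumes that both modalities are already projected to vectors of equal length, that the fusion weights come from a softmax over two per-modality quality scores, and that the objective is a standard margin-based contrastive loss. All function names, scores, and vectors are hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def fuse(image_vec, text_vec, image_score, text_score):
    """Assumed fusion rule: softmax over two quality scores gives the
    modality weights; the higher-scoring modality contributes more."""
    m = max(image_score, text_score)          # subtract max for stability
    wi = math.exp(image_score - m)
    wt = math.exp(text_score - m)
    total = wi + wt
    wi, wt = wi / total, wt / total
    fused = [wi * a + wt * b for a, b in zip(image_vec, text_vec)]
    return l2_normalize(fused)

def distance(u, v):
    """Euclidean distance in the common metric space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(u, v, same_class, margin=1.0):
    """Assumed objective: same-class pairs are penalised by squared
    distance, different-class pairs only while closer than the margin."""
    d = distance(u, v)
    if same_class:
        return d * d
    return max(0.0, margin - d) ** 2

def retrieve(query, gallery, k=1):
    """Brute-force nearest-neighbour search; an index could replace this."""
    ranked = sorted(range(len(gallery)), key=lambda i: distance(query, gallery[i]))
    return ranked[:k]

# Toy data: two gallery items with equal modality scores, and one query
# whose image features are assumed to be more reliable than its text.
gallery = [
    fuse([1.0, 0.0], [0.9, 0.1], 0.0, 0.0),
    fuse([0.0, 1.0], [0.1, 0.9], 0.0, 0.0),
]
query = fuse([0.8, 0.2], [1.0, 0.0], 2.0, 0.0)
print(retrieve(query, gallery))
```

Because every fused vector lives in one metric space, the brute-force search in `retrieve` could be swapped for any conventional index, which is the efficiency argument the abstract makes.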

Acknowledgment

This work is supported by the Natural Science Foundation of Heilongjiang Province (project LH2021F036).

Author information

Authors and Affiliations

Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, China
Hongjie Zhang, Jian Xu, Huadong Sun & Zhijie Zhao
School of Computer and Information Engineering, Harbin University of Commerce, Harbin, China
Hongjie Zhang, Jian Xu, Huadong Sun & Zhijie Zhao

Corresponding author

Correspondence to Zhijie Zhao.

Editor information

Editors and Affiliations

Faculty of Computer and AI, Cairo University, Giza, Egypt
Aboul Ella Hassanien
Port Said University, Port Fouad, Egypt
Rawya Y. Rizk
Department of Computer Science, VŠB-TUO, Ostrava-Poruba, Czech Republic
Václav Snášel
Faculty of Engineering, Port Said University, Port Fouad, Egypt
Rehab F. Abdel-Kader

About this paper

Cite this paper

Zhang, H., Xu, J., Sun, H., Zhao, Z. (2022). Commodity Image Retrieval Based on Image and Text Data. In: Hassanien, A.E., Rizk, R.Y., Snášel, V., Abdel-Kader, R.F. (eds) The 8th International Conference on Advanced Machine Learning and Technologies and Applications (AMLTA2022). AMLTA 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 113. Springer, Cham. https://doi.org/10.1007/978-3-031-03918-8_10

DOI: https://doi.org/10.1007/978-3-031-03918-8_10
Published: 17 April 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-03917-1
Online ISBN: 978-3-031-03918-8
eBook Packages: Intelligent Technologies and Robotics (R0)
