Entity Resolution with Hybrid Attention-Based Networks

Sun, Chenchen; Shen, Derong

doi:10.1007/978-3-030-73197-7_37

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12682)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Entity resolution (ER) is an important step in data preprocessing, and deep learning based ER is a growing research topic. Because record structure is hierarchical (token, attribute, record), we propose a hybrid attention-based network framework for ER that synthesizes information from different abstraction levels of the record hierarchy. Attention mechanisms are applied systematically to several aspects of ER: self-attention to capture internal dependencies, inter-attention for alignments, and multi-dimensional weight attention to discriminate importance. Attribute order is also taken into account in ER learning to obtain better similarity representations. Moreover, we tackle ER over low-quality data with hybrid soft token alignments. Extensive experiments on 4 datasets show that our approach surpasses existing ER approaches.
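As a rough illustration of what "inter-attention for alignments" and "soft token alignments" can mean in general, the sketch below softly aligns each token of one record to all tokens of another using dot-product attention. It is not the authors' model: the function names (`soft_align`, `softmax`, `dot`), the 2-dimensional toy embeddings, and the choice of plain dot-product scoring are all assumptions made for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def soft_align(left, right):
    """Softly align every token vector in `left` to the tokens in `right`.

    Returns (weights, aligned):
      weights[i][j] is the attention weight of left token i on right token j,
      aligned[i] is the weighted sum of right token vectors for left token i.
    Dot-product scoring is an assumption, not taken from the paper.
    """
    weights = [softmax([dot(l, r) for r in right]) for l in left]
    dim = len(right[0])
    aligned = [
        [sum(w * r[d] for w, r in zip(row, right)) for d in range(dim)]
        for row in weights
    ]
    return weights, aligned

# Hypothetical toy token embeddings for two records.
left = [[1.0, 0.0], [0.0, 1.0]]
right = [[0.0, 2.0], [2.0, 0.0], [1.0, 1.0]]

weights, aligned = soft_align(left, right)
for i, row in enumerate(weights):
    print(i, [round(w, 3) for w in row])
```

Because every left token receives a weighted mixture of right tokens rather than a single hard match, a misspelled or misplaced token can still contribute to the comparison, which is the usual motivation for soft alignment over low-quality data.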

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grants (62002262, 61672142, 61602103, 62072086, 62072084), and the National Key Research & Development Project under Grant (2018YFB1003404).

Author information

Authors and Affiliations

Key Laboratory of Computer Vision and System (Ministry of Education), Tianjin University of Technology, Tianjin, China
Chenchen Sun
School of Computer Science and Engineering, Northeastern University, Shenyang, China
Derong Shen

Editor information

Editors and Affiliations

Aalborg University, Aalborg, Denmark
Christian S. Jensen
Singapore Management University, Singapore, Singapore
Ee-Peng Lim
Academia Sinica, Taipei, Taiwan
De-Nian Yang
The Pennsylvania State University, University Park, PA, USA
Wang-Chien Lee
National Chiao Tung University, Hsinchu, Taiwan
Vincent S. Tseng
Athens University of Economics and Business, Athens, Greece
Vana Kalogeraki
National Cheng Kung University, Tainan City, Taiwan
Jen-Wei Huang
National Tsing Hua University, Hsinchu, Taiwan
Chih-Ya Shen

About this paper

Cite this paper

Sun, C., Shen, D. (2021). Entity Resolution with Hybrid Attention-Based Networks. In: Jensen, C.S., et al. Database Systems for Advanced Applications. DASFAA 2021. Lecture Notes in Computer Science, vol 12682. Springer, Cham. https://doi.org/10.1007/978-3-030-73197-7_37

DOI: https://doi.org/10.1007/978-3-030-73197-7_37
Published: 06 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73196-0
Online ISBN: 978-3-030-73197-7
eBook Packages: Computer Science, Computer Science (R0)
