Entity Resolution with Hybrid Attention-Based Networks

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2021)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 12682))

Abstract

Entity resolution (ER) is an important step in data preprocessing, and deep learning based ER is a growing research topic. Because record structure is hierarchical (token, attribute, record), we propose a hybrid attention-based network framework for entity resolution that synthesizes information from different abstraction levels of the record hierarchy. Attention mechanisms are applied systematically to several aspects of ER: self-attention for capturing internal dependencies, inter-attention for alignments, and multi-dimensional weight attention for importance discrimination. Attribute order is also taken into account in ER learning to obtain better similarity representations. Moreover, we tackle ER over low-quality data with hybrid soft token alignments. Extensive experiments on 4 datasets show that our approach surpasses existing ER approaches.
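
To make the idea of soft token alignment via inter-attention more concrete, the following is a minimal illustrative sketch, not the authors' model. It assumes plain dot-product scoring and a softmax over the tokens of the other record; the function names (`softmax`, `dot`, `soft_align`) and the toy 2-dimensional "embeddings" are hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def soft_align(left, right):
    """For each token vector in `left`, compute attention weights over the
    token vectors in `right` and the resulting weighted-average vector.

    Assumption: dot-product scores followed by softmax. The paper's actual
    scoring function is not given in the abstract.
    """
    dim = len(right[0])
    results = []
    for u in left:
        weights = softmax([dot(u, v) for v in right])
        aligned = [sum(w * v[i] for w, v in zip(weights, right))
                   for i in range(dim)]
        results.append((weights, aligned))
    return results

# Toy token vectors for two records (made-up values, not real embeddings).
left = [[1.0, 0.0], [0.0, 1.0]]
right = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]

alignments = soft_align(left, right)
for weights, aligned in alignments:
    print([round(w, 3) for w in weights], [round(x, 3) for x in aligned])
```

Because each token receives a distribution over all tokens of the other record rather than a single hard match, a misspelled or misplaced token can still contribute to the similarity signal, which is the intuition behind using soft alignments on low-quality data.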

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grants (62002262, 61672142, 61602103, 62072086, 62072084), and the National Key Research & Development Project under Grant (2018YFB1003404).

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Sun, C., Shen, D. (2021). Entity Resolution with Hybrid Attention-Based Networks. In: Jensen, C.S., et al. Database Systems for Advanced Applications. DASFAA 2021. Lecture Notes in Computer Science, vol 12682. Springer, Cham. https://doi.org/10.1007/978-3-030-73197-7_37

  • DOI: https://doi.org/10.1007/978-3-030-73197-7_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-73196-0

  • Online ISBN: 978-3-030-73197-7

  • eBook Packages: Computer Science, Computer Science (R0)
