
JedAI: The Force Behind Entity Resolution

  • George Papadakis
  • Leonidas Tsekouras
  • Emmanouil Thanos
  • George Giannakopoulos
  • Themis Palpanas
  • Manolis Koubarakis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10577)

Abstract

We present JedAI, a toolkit for Entity Resolution (ER) that can be used in three different ways: as an open-source Java library that implements numerous state-of-the-art, domain-independent methods, as a workbench that facilitates the evaluation of their relative performance, and as a desktop application that offers out-of-the-box ER solutions. JedAI bridges the gap between the database and the Semantic Web communities, offering solutions that are applicable to both relational and RDF data. It also features a modular architecture that facilitates its extension with more methods and with more comprehensive workflows.

Notes

Acknowledgments

This work has been supported by the project “Your Data Stories”, which is funded by the EU Horizon 2020 programme under grant agreement No. 645886. We would also like to thank Oktie Hassanzadeh for sharing with us the C implementation of the clustering algorithms examined in [3].

References

  1. Christophides, V., Efthymiou, V., Stefanidis, K.: Entity Resolution in the Web of Data. Morgan & Claypool, San Rafael (2015)
  2. Cohen, W., Ravikumar, P., Fienberg, S.: A comparison of string distance metrics for name-matching tasks. In: IIWeb, pp. 73–78 (2003)
  3. Hassanzadeh, O., Chiang, F., Miller, R., Lee, H.: Framework for evaluating clustering algorithms in duplicate detection. PVLDB 2(1), 1282–1293 (2009)
  4. Köpcke, H., Thor, A., Rahm, E.: Evaluation of entity resolution approaches on real-world match problems. PVLDB 3(1), 484–493 (2010)
  5. Nentwig, M., Hartung, M., Ngomo, A., Rahm, E.: A survey of current link discovery frameworks. Semant. Web 8(3), 419–436 (2017)
  6. Papadakis, G., Alexiou, G., Papastefanatos, G., Koutrika, G.: Schema-agnostic vs schema-based configurations for blocking methods on homogeneous data. PVLDB 9(4), 312–323 (2015)
  7. Papadakis, G., Svirsky, J., Gal, A., Palpanas, T.: Comparative analysis of approximate blocking techniques for entity resolution. PVLDB 9(9), 684–695 (2016)
  8. Schmachtenberg, M., Bizer, C., Paulheim, H.: Adoption of the linked data best practices in different topical domains. In: Mika, P., et al. (eds.) ISWC 2014. LNCS, vol. 8796, pp. 245–260. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11964-9_16

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • George Papadakis (1)
  • Leonidas Tsekouras (2)
  • Emmanouil Thanos (3)
  • George Giannakopoulos (2)
  • Themis Palpanas (4)
  • Manolis Koubarakis (1)
  1. University of Athens, Athens, Greece
  2. NCSR “Demokritos”, Athens, Greece
  3. University of Leuven, Leuven, Belgium
  4. Paris Descartes University, Paris, France
