Benjelloun, O., Garcia-Molina, H., Su, Q., Widom, J.: Swoosh: A generic approach to entity resolution. Stanford University technical report (March 2005)
Charikar, M.: Similarity estimation techniques from rounding algorithms. In: 34th Annual ACM Symposium on Theory of Computing, Montreal, Quebec, Canada (May 2002)
Christen, P., Churches, T., Zhu, J.X.: Probabilistic name and address cleaning and standardization. In: The Australasian Data Mining Workshop (November 2002)
Churches, T., Christen, P., Lim, K., Zhu, J.X.: Preparation of name and address data for record linkage using hidden Markov models. BMC Medical Informatics and Decision Making 2(9) (2002)
Cohen, W.W., Ravikumar, P., Fienberg, S.E.: A comparison of string metrics for matching names and addresses. In: International Joint Conference on Artificial Intelligence, Proceedings of the Workshop on Information Integration on the Web (August 2003)
Dalrymple, P.W., Younger, J.A.: From authority control to informed retrieval: Framing the expanded domain of subject access. College & Research Libraries 52, 139–149 (1991)
Elmagarmid, A., Ipeirotis, P., Verykios, V.: Duplicate record detection: A survey. IEEE Transactions on Knowledge and Data Engineering 19(1), 1–16 (2007)
Fayyad, U., Uthurusamy, R.: Evolving data mining into solutions for insights. Communications of the ACM 45(8), 28–31 (2002)
Gong, C., Huang, Y., Cheng, X., Bai, S.: Detecting near-duplicates in large-scale short text databases. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) PAKDD 2008. LNCS (LNAI), vol. 5012, pp. 877–883. Springer, Heidelberg (2008)
Gorman, M.: Authority control in the context of bibliographic control in the electronic environment. In: International Conference Authority Control: Definition and International Experiences, Florence, February 10-12 (2003)
Lakshman, A., Malik, P.: Cassandra: a decentralized structured storage system. SIGOPS Oper. Syst. Rev. 44, 35–40 (2010)
Manku, G.S., Jain, A., Das Sarma, A.: Detecting near-duplicates for web crawling. In: 16th International World Wide Web Conference, Banff, Alberta, Canada (May 2007)
Bennett, R., Hengel-Dittrich, C., O’Neill, E.T., Tillett, B.: VIAF (Virtual International Authority File): Linking the Deutsche Nationalbibliothek and Library of Congress name authority files. International Cataloguing and Bibliographic Control 36(1), 12–19 (2007)
Tejada, S., Knoblock, C., Minton, S.: Learning object identification rules for information integration. Information Systems 26(8), 607–633 (2001)
Tillett, B.B.: Authority control: State of the art and new perspectives. In: Authority Control International Conference, Florence, Italy (2003)
Wang, C., Wang, J., Lin, X., Wang, W., Wang, H., Li, H., Tian, W., Xu, J., Li, R.: MapDupReducer: Detecting near duplicates over massive datasets. In: Proceedings of the 2010 International Conference on Management of Data, SIGMOD 2010, pp. 1119–1122. ACM, New York (2010)
Weber, J.: LEAF: Linking and exploring authority files. In: International Conference Authority Control: Definition and International Experiences, Florence, February 10-12 (2003)
Winkler, W.E.: String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. In: Proceedings of the Section on Survey Research Methods, American Statistical Association, pp. 354–359 (1990)
Winkler, W.E.: Overview of record linkage and current research directions. Technical report, Research Report Series, RRS (2006)