Abstract
The identification problem is concerned with the question of whether two objects in an application refer to the same real-world entity. In this paper, the identification problem is investigated from a knowledge modelling point of view. We develop a framework for establishing knowledge-aware identity services by abstracting identity knowledge into an additional identity layer. The knowledge model in the identity service layer provides a capability for combining declarative formulae with concrete data and thus allows us to capture domain-specific identity knowledge at flexible levels of abstraction. By adding validation constraints to the identity service, we are also able to reason about inconsistencies in identity knowledge. In this way, the accuracy of identity knowledge can be improved over time, especially when utilising identity services provided by different communities in a service-oriented architecture. Our experimental study shows the effectiveness of the proposed knowledge modelling approach and the effects of domain-specific identity knowledge on data quality control.
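To make the idea of combining declarative formulae with concrete data more tangible, the following is a minimal sketch and not the formalism of the paper. It assumes a toy setting in which an identity rule is a Python predicate over two records, concrete data is a set of asserted "same" and "different" pairs, and a validation constraint forbids any pair from being both. All record fields, identifiers and function names (`same_email`, `derived_same`, `inconsistencies`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical author records; the field names are illustrative assumptions.
records = {
    "a1": {"name": "J. Smith", "email": "j.smith@example.org"},
    "a2": {"name": "Jane Smith", "email": "j.smith@example.org"},
    "a3": {"name": "J. Smith", "email": "js@example.net"},
}


def same_email(x, y):
    """Rule standing in for a declarative formula: equal e-mail implies identity."""
    return x["email"] == y["email"]


rules = [same_email]

# Concrete data: pairs asserted to be the same or different entities.
known_same = {frozenset({"a1", "a3"})}
known_different = {frozenset({"a1", "a2"})}


def derived_same(records, rules, known_same):
    """Pairs identified by some rule or by an asserted fact (no transitive closure)."""
    pairs = set(known_same)
    for p, q in combinations(sorted(records), 2):
        if any(rule(records[p], records[q]) for rule in rules):
            pairs.add(frozenset({p, q}))
    return pairs


def inconsistencies(records, rules, known_same, known_different):
    """Validation constraint: no pair may be both identical and different."""
    return derived_same(records, rules, known_same) & known_different


conflicts = inconsistencies(records, rules, known_same, known_different)
print(sorted(sorted(pair) for pair in conflicts))
```

In this sketch the rule identifies `a1` and `a2` through their shared e-mail address, which contradicts the asserted difference between them, so the pair is reported as a conflict. Such a report is the kind of signal that could prompt a revision of either the rule or the asserted data.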
Notes
Fudan University, China.
Massey University, New Zealand.
Additional information
The research reported in this article was partially supported by the programme “Regionale Wettbewerbsfähigkeit OÖ 2007–2013” by the European Fund for Regional Development (EFRE) as well as the state of Upper Austria.
About this article
Cite this article
Schewe, KD., Wang, Q. Knowledge-aware identity services. Knowl Inf Syst 36, 335–357 (2013). https://doi.org/10.1007/s10115-012-0533-6
DOI: https://doi.org/10.1007/s10115-012-0533-6