Disclosing false identity through hybrid link analysis

Artificial Intelligence and Law

Abstract

Combating the identity problem is crucial and urgent, as false identity has become a common denominator of many serious crimes, including mafia trafficking and terrorism. Without correct identification, it is very difficult for law enforcement authorities to intervene in, or even trace, terrorists’ activities. Amongst several identity attributes, personal names are commonly, and effortlessly, falsified or aliased by most criminals. Typical approaches to detecting the use of false identity rely on similarity measures over textual and other content-based characteristics, which are usually not applicable to highly deceptive, erroneous or unknown descriptions. This barrier can be overcome by analysing the link information that an individual displays in communication behaviours, financial interactions and social networks. In particular, this paper presents a novel link-based approach that improves on existing techniques by integrating multiple link properties into the process of similarity evaluation. The approach is utilised in a hybrid model that combines both text-based and link-based measures of the examined names to refine the justification of their similarity. It is experimentally evaluated against other link-based and text-based techniques over a terrorist-related dataset, with further generalisation to a similar problem occurring in publication databases. The empirical study demonstrates the great potential of this work towards developing an effective identity verification system.
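The general idea of a hybrid measure can be illustrated with a minimal sketch. It is not the method proposed in the paper, whose link properties and aggregation scheme are not given in the abstract. As stand-in assumptions, the text-based score is a character-level ratio from Python's standard `difflib`, the link-based score is the Jaccard overlap of the entities each name is linked to, and the two are blended by a hypothetical weight `alpha`. All names, entities and function names below are invented for illustration.

```python
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    """Text-based score in [0, 1]: case-insensitive character-level ratio (an assumed stand-in)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_similarity(a: str, b: str, links: dict) -> float:
    """Link-based score in [0, 1]: Jaccard overlap of linked entities (an assumed stand-in)."""
    neighbours_a = links.get(a, set()) - {b}
    neighbours_b = links.get(b, set()) - {a}
    union = neighbours_a | neighbours_b
    if not union:
        return 0.0  # no link evidence for either name
    return len(neighbours_a & neighbours_b) / len(union)


def hybrid_similarity(a: str, b: str, links: dict, alpha: float = 0.5) -> float:
    """Weighted blend of the two scores; alpha is a hypothetical tuning weight."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * text_similarity(a, b) + (1.0 - alpha) * link_similarity(a, b, links)


# Hypothetical link data: each name maps to the entities it co-occurs with
# (accounts, phone numbers, addresses).
links = {
    "John Smith": {"acct-1", "phone-7", "addr-3"},
    "Jon Smyth": {"acct-1", "phone-7", "addr-9"},
    "Alan Reed": {"acct-5"},
}

for candidate in ("Jon Smyth", "Alan Reed"):
    score = hybrid_similarity("John Smith", candidate, links)
    print(f"John Smith vs {candidate}: {score:.2f}")
```

In this toy setting, a pair of names that share linked entities scores higher than a pair with no shared links, even when the spellings differ. That is the kind of evidence a purely text-based comparison cannot use.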

Acknowledgments

This work is sponsored by UK EPSRC grant EP/D057086. The authors are grateful to the members of the project team for their contribution, whilst taking full responsibility for the views expressed in this paper. The authors would also like to thank the anonymous referees for their constructive comments which have helped considerably in revising this work.

Author information

Corresponding author

Correspondence to Qiang Shen.

About this article

Cite this article

Boongoen, T., Shen, Q. & Price, C. Disclosing false identity through hybrid link analysis. Artif Intell Law 18, 77–102 (2010). https://doi.org/10.1007/s10506-010-9085-9

  • DOI: https://doi.org/10.1007/s10506-010-9085-9
