1 Introduction

In the tapestry of a resilient society, the threads of law and order are woven with precision. Crime, an unwelcome but persistent facet of our modern existence, continues its ascent day by day. As the wheels of justice turn, new cases unfurl even as ongoing ones await their final chapter, creating an ever-swelling reservoir of pending matters. In this symphony of urgency, the pursuit of swift action becomes the anthem of criminal investigation. To navigate these intricate rhythms, our quest must embrace methodologies and modeling that offer the guiding notes to investigators as they tread the path of inquiry, harmonizing the pursuit of justice with the evolving cadence of our times.

Crime casts a profound shadow on society, leaving in its wake a trail of losses and disruptions. In the realm of criminology, the study of crime and criminal behavior takes on a technical precision under the vigilant eye of law enforcement. Here, the potent tools of data mining emerge as beacons of potential, offering the promise of substantial insights. With a sprawling landscape of criminal data sets, each intertwined in intricate relationships, criminology emerges as a fertile ground for the application of data mining techniques. At its core, crime analysis, a pivotal discipline within criminology, delves into the intricate tapestry of criminal occurrences, unveiling not only the crimes themselves but also the intricate connections that bind them to one another and to the individuals who commit them.

Uncertainty casts a haunting shadow over the intricate choreography of criminal investigations, adding layers of complexity to the pursuit of justice. In this delicate realm, where the quest for reliable evidence, steadfast eyewitness accounts, and unequivocal forensics can be akin to chasing phantoms, criminal investigators find themselves navigating a labyrinth of decisions. Each choice, swayed by the presence of uncertain and enigmatic information harvested from crime scenes, becomes a pivotal note in the symphony of solving crimes. It is within this tapestry of uncertainty that criminal investigations metamorphose into a captivating landscape of decision-making intricacies.

Uncertainty, an intrinsic companion in the realm of decision-making, refuses to be sidelined. From its intricate tapestry, a multitude of strategies have emerged, each a response to the ever-present specter of doubt. Among these, the illustrious Fuzzy Set Theory (FST) by the visionary Zadeh (1965) has assumed a position of eminence. Fuzzy Sets (FS) have unfurled new vistas in the art of decision-making, offering a nuanced and superior alternative to conventional approaches when confronting uncertainty’s enigma. Their prowess extends its reach, offering solace to myriad challenges, from the Multi-Criteria Decision Making (MCDM) Problem to the Multi-Attribute Decision Making (MADM) Problems. FST’s influence reverberates through the corridors of academia, breathing life into diverse domains. Further, many other works were developed based on the extensions of Fuzzy Sets (Akram and Zahid 2023; Akram et al. 2023; Habib et al. 2022; Fatima et al. 2023; Sarwar et al. 2023; Akram et al. 2023; Abbas et al. 2023; Ali and Al-kenani 2023; Ali et al. 2022; Ali and Naeem 2022a, b; Ali 2022).

While fuzzy sets have proven their mettle in resolving decision-making quandaries, they bear a limitation: their focus solely on membership degrees (MD), a constraint that tempers their applicability. However, the latter half of the preceding century witnessed a transformative evolution in the realm of Fuzzy Set Theories (FSTs). Atanassov etched a new paradigm with the birth of Intuitionistic Fuzzy Sets (IFS) (Atanassov 1983; Atanassov and Stoeva 1986; Atanassov 1999, 2012), breathing life into this innovative concept by introducing the concept of non-membership degrees (ND) alongside the existing membership degrees, with the stipulation that their cumulative sum remains below unity. The notion of hesitancy degree (HD) was also unveiled, further enriching the tapestry of IFS. In this realm, each element of an IFS is endowed with an MD, ND, and HD, a contrast to traditional fuzzy sets, where only MDs reign supreme. The groundwork laid by Atanassov resonated with Gau and Buehrer’s conception of vague sets Gau and Buehrer (1993), a sentiment that Bustince and Burillo (1996) echoed, recognizing the kinship between vague sets and IFSs. Zhang (2014) further explored the territory by introducing the Linguistic Intuitionistic Fuzzy Set (LIFS), a fusion of linguistic values and IFS, a singular expression that encapsulates both qualitative and quantitative insights. In consequence, IFS has emerged as a potent instrument in the realm of uncertainty, surpassing the confines of FST. Its versatility has found expression across multifarious domains, accompanied by the development of a cadre of information measures aimed at amplifying the utility of IFSs in diverse decision-making conundrums (Augustine 2021; Ejegwa and Onyeke 2021; Xuan Thao 2018; Thao et al. 2019; Garg et al. 2020; Garg and Kumar 2018).

The world of uncertainty finds its compass in the realm of similarity and distance measures, where the art of quantifying closeness and disparity among fuzzy sets unfolds. Like twin forces, these measures are entrusted with the noble task of unraveling the complexities of decision-making. Often, they dance in harmony, with Distance Measure becoming the twin sibling of Similarity Measure, closely entwined through the equation Distance Measure = 1 - Similarity Measure. Across the annals of literature, a rich tapestry of endeavors comes to life. Chen’s eloquent descriptions Chen (1997) and Hong and Kim’s insightful contributions Hong and Kim (1999) echo through time, shaping our understanding of IFS similarity measures. Ye, inspired by the profound cosine measure, crafted a symphony of IF likeness Ye (2011). Chu et al. (2020) broke new ground by establishing the fourth axiom of similarity measures and reshaping Ye’s cosine measure, while Luo et al. (2018) painted patterns with intuitive distances woven in IFSs. Ngan et al. (2018), recognizing the limitations of existing distance measures, offered fresh perspectives grounded in IFSs. Garg and Kumar (2018) embarked on a sophisticated exploration, unveiling the intricacies of set pair analysis theory-based similarity measures. Hwang et al. (2018) brought forth innovation with a novel similarity measure rooted in the Jaccard index, addressing the intricate domain of clustering. Jiang et al. (2019) embarked on a transformative journey, conceiving similarity measures for IFSs through the prism of transformed isosceles triangles. Dhivya and Sridevi (2019), guided by the midpoints of transformed triangular fuzzy numbers, crafted a novel measure of similarity between IFSs, illuminating the path for applications in pattern recognition and medical diagnostics. Chen and Deng (2020), inquisitive in their exploration of hesitation degree within distance measures, charted new territories. 
Garg and Rani (2021), refining the work of Jiang et al., unveiled yet another facet of IFS-based similarity measures. Mahanta and Panda’s Mahanta and Panda (2021) ingenuity led to the creation of an IFS-based distance measure, a key to unlocking solutions for the mask selection problem. Gohain et al. (2021, 2022a, b), in tandem, unveiled a symphony of measures anchored in IFSs, building upon the foundation laid by Ngan et al. (2018). Garg and Rani (2022), exploring the diverse concepts of centers, birthed original distance measures grounded in IFSs, each with its unique character. Such is the allure of IFS that a chorus of analysts has resounded, offering a symphony of alternative similarity and distance measures, each adding its own melody to this ever-evolving symphony (Ejegwa and Onyeke 2018; Ohlan 2016; Garg 2018; Dengfeng and Chuntian 2002).

In the intricate tapestry of justice, criminal law and criminology stand as weavers of wisdom, drawing threads from an eclectic array of disciplines that span sociology, economics, law, anthropology, medicine, psychology, and philosophy. Together, they embark on a quest to unravel the profound mysteries surrounding crime. Within the realm of criminology, specialized branches have blossomed, each embracing a diverse bouquet of insights from these fields to craft theories that illuminate the origins of criminal behavior. Eminent scholars like Beasley (2004), O’Brien (2014), and Keppel (2010) have undertaken profound journeys into the heart of darkness, meticulously documenting the chilling narratives of serial killers. As the world hurtles forward into the digital age, a technological renaissance unfolds, ushering in a revolution that applies cutting-edge tools to the domains of criminal justice, crime prevention, and law enforcement. Emerging from this cauldron of innovation are remarkable tools such as data mining and face recognition technologies, standing alongside a cadre of avant-garde methods for preventing and penalizing crime. Within this dynamic landscape, a symphony of research (Brayne 2020; Kotsoglou and Oswald 2020; Sreedevi et al. 2018; Gupta et al. 2022; Aziz et al. 2022) resonates, bearing witness to the transformative power of technology in the pursuit of justice.

In the intricate tapestry of criminal investigations, the art of crime linkage emerges as a guiding star, helping illuminate the path towards resolving cases. Through the lens of forensic science, a meticulous analysis of a cluster of crimes unfolds, revealing the intricate web that connects them to a common perpetrator. Yet, this pursuit is far from straightforward, as it often traverses treacherous terrain marked by the absence of reliable evidence, be it DNA, fingerprints, or other forensic clues. The specter of uncertainty looms large, demanding a nuanced investigative approach. Here, a shimmering beacon of hope emerges-a clustering method rooted in the fuzzy realm. It stands ready to navigate the uncertain waters, offering a means to link crimes even when evidence is veiled in ambiguity.

Within the realm of data organization, the art of clustering emerges as a symphony of arrangement, orchestrating a set of N objects into harmonious clusters or groups. In this intricate choreography, the objective is clear: to foster kinship among objects within each cluster, uniting them in similarity while distinguishing them from others. Herein lies the elegance of clustering algorithms steeped in fuzzy logic, a symphony of finesse that excels in the realm of decision-making. In contrast to their classical counterparts, which navigate the terrain of numerical information, these fuzzy set-based clustering algorithms (Bezdek et al. 1984; Wang et al. 2019, 2020a, b) have sparked a renaissance in the field of clustering. A crowning jewel in this treasure trove is the methodology crafted by Xu et al. (2008), tailored to address the unique challenges posed by clustering problems within the realm of Intuitionistic Fuzzy Sets (IFSs).

In the intricate tapestry of criminal investigations, psychological profiling stands as a meticulous endeavor, a methodical dance to unearth the enigmatic facets of a criminal’s mind. This artful process, as guided by the astute FBI profilers Douglas and Burgess (1986), delves deep into the labyrinthine corridors of crime scenes to unveil the elusive personality traits and behavioral quirks of unknown wrongdoers. Within its embrace, psychological profiling serves as a compass, helping investigators navigate the fog of uncertainty when solid evidence is scarce. While a handful of FBI agents have penned insights into their craft, the intricacies of profiling strategies remain largely concealed Muller (2000). Here, in the realm of vagueness, the fuzzy approaches emerge as silent orchestrators, lending clarity and depth to the art of profiling analysis, an indispensable tool in the pursuit of justice.

In the realm of fuzzy mathematics, a symphony of research has unfurled, harmonizing the intricate domains of linkage analysis, serial crime prediction, and crime prevention. Among these virtuoso researchers, Goala et al. (2019) have etched their mark, presenting a masterful composition in the form of an MCDM method rooted in Intuitionistic Fuzzy Sets, a discerning tool to distinguish serial crimes from their brethren. Goala (2019) also made notable advancements in the field of criminal investigation. In another study, Goala and Dutta (2018) cast a spotlight on urban landscapes, unveiling a ranking of vulnerable areas, their technique akin to the strokes of a skilled painter wielding generalized triangular fuzzy numbers. And in a resounding crescendo, Goala et al. (2022) unveiled a novel aggregation operator, born of the very essence of IFSs, a beacon illuminating the path to a decision support system tailored for the vigilance of smart cities. In this grand symphony, mathematics dances hand in hand with criminology, composing an ode to the power of fuzzy logic in unraveling the mysteries of crime.

1.1 Problem statement

The problem statement of this study revolves around addressing the limitations and drawbacks of existing similarity measures within the context of Intuitionistic Fuzzy Sets (IFS). Specifically, the study aims to develop and introduce novel similarity measures based on IFS that overcome the shortcomings of current metrics. Additionally, the research endeavors to apply these newly proposed measures to practical scenarios within the domain of criminal investigation, particularly in crime linkage and psychological profiling.

1.2 Gap in the existing research

Several drawbacks can be observed in the existing similarity measures, such as those of Chen (1997), Hong and Kim (1999), Ye (2011), Luo et al. (2018), Ngan et al. (2018), Garg and Kumar (2018), Hwang et al. (2018), Jiang et al. (2019), Dhivya and Sridevi (2019), Chen and Deng (2020), Garg and Rani (2021), Mahanta and Panda (2021), Gohain et al. (2021, 2022a, b) and Garg and Rani (2022).

Consider two profiles such that, in Profile 1, \(A_1=(0.3,0.3), A_2=(0.4,0.4)\), and in Profile 2, \(B_1=(0.3,0.4),B_2=(0.4,0.3)\). From the MD and ND perspectives alone, Profile 1’s IFSs appear more comparable to each other than Profile 2’s, as \(B_1=B_2^c\). Nevertheless, when the concept of hesitancy is taken into consideration, it is clear that the IFSs of Profile 2 are more comparable to each other than those of Profile 1, since HD(\(A_1\))=0.4, HD(\(A_2\))=0.2, HD(\(B_1\))=0.3 and HD(\(B_2\))=0.3. The similarity measures of Chen (1997) and Ye (2011) violate Property 2 of the similarity measure. Similarity measures such as those of Hong and Kim (1999) and Mahanta and Panda (2021) do not differentiate the IFSs, whereas the similarity measures of Ngan et al. (2018), Garg and Kumar (2018), Hwang et al. (2018), Jiang et al. (2019), Luo et al. (2018), Chen and Deng (2020), Garg and Rani (2021), Dhivya and Sridevi (2019), Gohain et al. (2021, 2022a, b) and Garg and Rani (2022) make the incorrect choice.

Again, consider Profile 3, with \(A_3=(0.3,0.7), A_4=(0.4,0.6)\), and Profile 4, with \(B_3=(0.3,0.7),B_4=(0.2,0.8)\). Similarity measures such as those of Chen (1997), Hong and Kim (1999), Ngan et al. (2018), Jiang et al. (2019), Mahanta and Panda (2021), Luo et al. (2018), Chen and Deng (2020), Garg and Rani (2021), Dhivya and Sridevi (2019), Gohain et al. (2022b) and Garg and Rani (2022) fail to distinguish between positive and negative differences.

Furthermore, when comparing Profiles 5 and 6, with \(A_5=(0.4,0.2), A_6=(0.5,0.3)\) and \(B_5=(0.4,0.2),B_6=(0.5,0.2)\), it is evident that Profile 6’s IFSs are more comparable to Profile 5’s, since the membership values of both profiles’ IFSs follow an exact pattern, with only the non-membership values differing from one another. Numerous approaches, such as those of Ye (2011), Garg and Kumar (2018), Jiang et al. (2019), Dhivya and Sridevi (2019), Garg and Rani (2021), Luo et al. (2018) and Garg and Rani (2022), fail to identify this fact. Further, Chen (1997) violates Property 2 of the similarity measure.
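As a quick illustrative check (a Python sketch, not part of the formal development), the hesitancy degrees and signed differences quoted in these counterexamples can be verified directly; `hd` is simply the hesitancy degree \(1-\text{MD}-\text{ND}\), rounded to suppress floating-point noise:

```python
# Hesitancy degree of an intuitionistic fuzzy value (MD, ND).
def hd(m, n):
    return round(1 - m - n, 10)

# Profiles 1 and 2: hesitancy tells the two profiles apart.
A1, A2 = (0.3, 0.3), (0.4, 0.4)
B1, B2 = (0.3, 0.4), (0.4, 0.3)
print(hd(*A1), hd(*A2))  # 0.4 0.2 -> unequal hesitancy within Profile 1
print(hd(*B1), hd(*B2))  # 0.3 0.3 -> equal hesitancy within Profile 2

# Profiles 3 and 4: the (MD, ND) differences have opposite signs ...
A3, A4 = (0.3, 0.7), (0.4, 0.6)
B3, B4 = (0.3, 0.7), (0.2, 0.8)
diff_p3 = (A4[0] - A3[0], A4[1] - A3[1])  # approx (+0.1, -0.1)
diff_p4 = (B4[0] - B3[0], B4[1] - B3[1])  # approx (-0.1, +0.1)

# ... so any measure built only on absolute differences scores both
# pairs identically and loses the sign information.
same_abs = all(round(abs(x), 10) == round(abs(y), 10)
               for x, y in zip(diff_p3, diff_p4))
print(same_abs)  # True
```

This makes the gap concrete: a measure that sees only \(|\Delta \text{MD}|\) and \(|\Delta \text{ND}|\) cannot separate Profile 3 from Profile 4.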

In the realm of similarity measures for Intuitionistic Fuzzy Sets (IFSs), the existing approaches, represented by notable works such as Chen (1997), Hong and Kim (1999), Ye (2011), and a host of others, exhibit notable drawbacks. These limitations cast a spotlight on crucial research gaps that merit exploration and innovation. They can be summarized as follows:

  1. Comprehensive Comparison: The failure of numerous approaches, including (Hong and Kim 1999; Mahanta and Panda 2021; Ngan et al. 2018; Hwang et al. 2018; Chen and Deng 2020; Gohain et al. 2021, 2022a, b; Ye 2011; Garg and Kumar 2018; Jiang et al. 2019; Dhivya and Sridevi 2019; Garg and Rani 2021; Luo et al. 2018; Garg and Rani 2022), to correctly differentiate the IFSs constitutes a substantial research gap, necessitating the development of innovative measures that can identify and leverage these patterns to deliver more accurate similarity assessments.

  2. Positive and Negative Differences: Another conspicuous gap emerges when assessing the ability of existing measures to differentiate between positive and negative differences in IFSs. Notably, measures such as those of Chen (1997), Hong and Kim (1999), Ngan et al. (2018), Jiang et al. (2019), Mahanta and Panda (2021), Luo et al. (2018), Chen and Deng (2020), Garg and Rani (2021), Dhivya and Sridevi (2019), Gohain et al. (2022b) and Garg and Rani (2022) fail in this regard. Bridging this gap calls for the creation of measures that can effectively discern and utilize both positive and negative differences, enhancing the precision of similarity assessments.

  3. Property Adherence: A persistent gap revolves around the adherence to essential properties of similarity measures. Notably, violations of Property 2, as seen in Chen (1997) and Ye (2011), underscore the need for measures that strictly adhere to these fundamental properties, ensuring the reliability and validity of similarity assessments.

Addressing these research gaps presents an opportunity for scholars in the field to pioneer new approaches and methodologies, ultimately advancing the accuracy and applicability of similarity measures for IFSs in diverse domains, including decision-making and crime prediction.

1.3 Motivation of the research

In any society, the occurrence of crime presents a formidable challenge to its social stability. Despite the existence of stringent laws, criminal activities persist, necessitating swift and effective justice for victims and the appropriate punishment for wrongdoers. Achieving this demands a highly proficient investigative team. Ideally, criminal investigations would be straightforward if concrete and reliable evidence were readily available. However, in many cases, this is far from reality, as the available evidence often possesses inherent uncertainties. Criminal investigators frequently find themselves making critical decisions based on uncertain and vague information collected from crime scenes. This pervasive uncertainty becomes a central issue in the realm of criminal investigations, transforming them into complex decision-making problems. Enhancing the investigation of crimes hinges upon two invaluable tools: crime linkage and psychological profiling.

Crime linkage involves the meticulous examination of a series of crimes to identify patterns and connections among those attributed to a common offender. On the other hand, psychological profiling entails constructing a behavioral and psychological profile of an unidentified perpetrator based on available information about their past crimes. Both techniques prove invaluable in cases lacking solid evidence, as they serve to reduce the pool of potential suspects. Given the inherently ambiguous nature of crime scene evidence, the application of fuzzy methods has emerged as a critical approach in this field. Fuzzy methods differ from traditional crisp value-based approaches by considering membership values, making them more adept at addressing problems intertwined with uncertainty. However, it became apparent that solely relying on membership degree (MD) was insufficient, leading to the introduction of Intuitionistic Fuzzy Sets (IFSs), which encompass both MD and non-membership degree (ND).

Similarity measures play a crucial role in addressing decision-making problems of criminal investigation, with numerous measures developed to assess the similarity between Intuitionistic Fuzzy Sets (IFSs). While some studies have compared these measures in specific contexts, there is a notable absence of a comprehensive evaluation of their performance across diverse applications, particularly in the realm of criminal investigation. Moreover, existing measures often rely on specific assumptions or conditions applicable only in certain situations, introducing limitations. For example, if we consider \(J=(0.3,0.7), K=(0.4,0.6)\), \(L=(0.3,0.7), M=(0.2,0.8)\), similarity measures such as those of Chen (1997), Hong and Kim (1999), Ngan et al. (2018), Jiang et al. (2019), Mahanta and Panda (2021), Luo et al. (2018), Chen and Deng (2020), Garg and Rani (2021), Dhivya and Sridevi (2019), Gohain et al. (2022b) and Garg and Rani (2022) fail to distinguish between positive and negative differences. Further drawbacks are discussed in the research gap and in more detail in Sects. 3.3 and 6.1. The discussed research gap addresses the need for novel IFS-based decision-making methods tailored for criminal investigation. Consequently, there is a call to develop and assess new measures that can overcome these limitations, delivering accurate and reliable outcomes in various applications, including crime linkage analysis and psychological profiling. Such endeavors aim to enhance the precision and efficiency of decision-making processes involving IFSs within the domain of criminal investigations. The identified challenges and gaps outlined above serve as the primary motivation driving our work.

1.4 Objective of the work

Based on the motivation discussed above, the study’s key objectives are as follows:

  1. Identify and understand the constraints and issues with existing IFS-based similarity measures.

  2. Design and introduce novel IFS-based similarity measures that surpass existing measures.

  3. Employ the novel similarity measure in the task of crime clustering, facilitating crime linkage, with a suitable case study showing its applicability.

  4. Leverage the novel similarity measure in the realm of psychological profiling, with a suitable case study, thereby enhancing the effectiveness of crime investigations.

1.5 Novelty of the research

The research introduces novelty in several significant aspects:

1. Development of Novel Similarity Measures:

This study’s primary contribution is the creation of innovative similarity measures grounded in Intuitionistic Fuzzy Sets (IFS). These measures surpass existing approaches by addressing the limitations and drawbacks of existing IFS-based similarity measures. By introducing novel measures, the research provides a fresh perspective on quantifying similarity in situations involving uncertainty and vagueness.

2. Practical Application in Criminal Investigation:

Moving beyond theoretical developments, the study applies the newly proposed similarity measure to practical scenarios within the field of criminal investigation, focusing on crime linkage and psychological profiling. Demonstrating the practical utility of the measures in aiding criminal investigators bridges the gap between theoretical advancements and real-world problem-solving.

3. Enhancing Decision-Making in Uncertainty:

In the realm of criminal investigation, decision-making is often challenging due to the presence of uncertain and vague information. The research contributes to the field by offering improved tools for decision support. The novel similarity measures help investigators make more accurate assessments and link crimes effectively, even when solid evidence is lacking. This has the potential to significantly enhance the efficiency and success rate of criminal investigations.

4. Interdisciplinary Approach:

The research adopts an interdisciplinary approach, drawing on concepts from various domains, including fuzzy mathematics, criminology, and data analysis. This comprehensive exploration allows for unique solutions that bridge the gap between these disciplines.

5. Addressing Research Gaps:

By identifying and addressing the limitations of existing similarity measures, the research fills a crucial research gap. It systematically discusses the shortcomings of existing measures, providing a foundation for the proposed improvements. This critical evaluation of the state of the art in IFS-based similarity measures adds depth and relevance to the study.

In summary, the novelty of this research lies in its development of novel similarity measures, their practical application in criminal investigation, and their potential to enhance decision-making in the face of uncertainty. This multidimensional approach contributes to both theoretical advancements and practical solutions within the context of Intuitionistic Fuzzy Sets.

1.6 Structure of the paper

The paper’s structure is outlined as follows:

Section 1, the Introduction, presents the fundamental concepts of the criminal investigative process, together with a comprehensive literature review covering prior research and an overview of similarity measures based on Intuitionistic Fuzzy Sets (IFS). The Introduction also contains the Problem Statement, the Gap in Existing Research, the Motivation, the Objective of the Work, the Novelty of the Research, and an outline of the paper’s structure. Section 2 provides the essential definitions of Fuzzy Sets (FS) and Intuitionistic Fuzzy Sets (IFS), along with their fundamental operations and the definition of a similarity measure. Section 3 first examines the existing similarity measures within the context of IFS; we then propose a novel generalized similarity measure grounded in IFS, establish some of its properties, and discuss its advantages by addressing the limitations of existing similarity measures. Finally, new types of similarity measures are proposed in that section. Section 4 introduces a modified methodology and algorithm designed to address clustering problems in the context of crime linkage; the flowchart and time complexity of the algorithm are also discussed, and a practical case study on crime linkage demonstrates the applicability of the method. Section 5 offers an in-depth analysis of how the introduced similarity measure contributes to psychological profiling within criminal investigations via a methodology and algorithm, together with the corresponding flowchart and time complexity. Section 6 comprises the discussion and comparative analysis of our study, including a comparative analysis of the proposed similarity measure, comparative analyses of similarity measures for crime linkage and psychological profiling, and a sensitivity analysis for different values of \(\lambda \). Ultimately, in Sect. 7, a well-suited conclusion is presented, where we summarize the key findings and contributions of the research and discuss the limitations, significance, and future scope of the study.

The experimental flow of the study is given in Fig. 1.

Fig. 1 The experimental flow of the study

2 Preliminaries

In this section, we delve into the foundational aspects of our study. We begin by elucidating the essential concepts and definitions that form the groundwork for our research. This includes an exploration of the relevant terminology and theoretical underpinnings necessary to understand the subsequent discussions.

Definition 2.1

(Fuzzy Set): Zadeh (1965) We consider \(\Delta =\{\delta _i: i=1,2,...,n\}\) as a universal set. Then a Fuzzy Set \(\mathbb {F}\) is defined by \(\mathbb {F}=\{\langle \delta _i,{\mathbb {M}_\mathbb {F}}(\delta _i)\rangle ;\delta _i \in \Delta \}\), where the function \({\mathbb {M}_{\mathbb {F}}}(\delta _i):\Delta \rightarrow [0,1]\) defines the membership degree.

Definition 2.2

(Intuitionistic Fuzzy Set): Atanassov (1983) We consider \(\Delta =\{\delta _i: i=1,2,...,n\}\) as a universal set. Then an Intuitionistic Fuzzy Set \(\mathbb {I}\) is defined by \(\mathbb {I}=\{\langle \delta _i,{\mathbb {M}_\mathbb {I}}(\delta _i),{\mathbb {N}_\mathbb {I}}(\delta _i)\rangle ;\delta _i \in \Delta \}\), where the functions \({\mathbb {M}_\mathbb {I}}(\delta _i):\Delta \rightarrow [0,1]\) and \({\mathbb {N}_\mathbb {I}}(\delta _i):\Delta \rightarrow [0,1]\) define the MD and ND, respectively, and for every \(\delta _i \in \Delta \), \(0\le {\mathbb {M}_\mathbb {I}}(\delta _i)+{\mathbb {N}_\mathbb {I}}(\delta _i)\le 1\).

The HD is defined by \({\mathbb {O}_{\mathbb {I}}}(\delta _i)=1- [{\mathbb {M}_\mathbb {I}}(\delta _i)+{\mathbb {N}_\mathbb {I}}(\delta _i)]\) and \({\mathbb {O}_{\mathbb {I}}}(\delta _i)\in [0,1]\) such that \({\mathbb {M}_\mathbb {I}}(\delta _i)+{\mathbb {N}_\mathbb {I}}(\delta _i)+{\mathbb {O}_{\mathbb {I}}}(\delta _i)=1\).
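As an illustrative sketch (the class name `IFV` is our own choice, not notation from the paper), the constraint of Definition 2.2 and the derived hesitancy degree can be encoded directly:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IFV:
    """An intuitionistic fuzzy value: membership (MD) and non-membership (ND)."""
    m: float  # MD in [0, 1]
    n: float  # ND in [0, 1]

    def __post_init__(self):
        # Definition 2.2 requires 0 <= MD, ND <= 1 and MD + ND <= 1
        # (a tiny tolerance absorbs floating-point round-off).
        if not (0 <= self.m <= 1 and 0 <= self.n <= 1
                and self.m + self.n <= 1 + 1e-12):
            raise ValueError("invalid intuitionistic fuzzy value")

    @property
    def h(self):
        # Hesitancy degree, so that MD + ND + HD = 1.
        return 1 - self.m - self.n

x = IFV(0.3, 0.4)
print(round(x.h, 10))  # 0.3
```

A value such as \((0.7, 0.5)\) is rejected at construction time, since its MD and ND sum to more than one.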

Definition 2.3

: Atanassov and Stoeva (1986) Let, \({\mathbb {I}_{\text {1}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) and \({\mathbb {I}_{\text {2}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) be two IFSs defined in \(\Delta \). Accordingly, the subsequent operations can be described as follows:

  i) \({\mathbb {I}_{\text {1}}} \subseteq {\mathbb {I}_{\text {2}}}\) iff \({\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i)\le {\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)\) and \({\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)\ge {\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)\)

  ii) \({\mathbb {I}_{\text {1}}}^c=\{\langle \delta _i,{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i), {\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\)

  iii) \({\mathbb {I}_{\text {1}}}\cup {\mathbb {I}_{\text {2}}}=\{\langle \delta _i,max({\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)),min({\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i))\rangle ;\delta _i \in \Delta \}\)

  iv) \({\mathbb {I}_{\text {1}}}\cap {\mathbb {I}_{\text {2}}}=\{\langle \delta _i,min({\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)),max({\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i))\rangle ;\delta _i \in \Delta \}\)

  v) \({\mathbb {I}_{\text {1}}}+ {\mathbb {I}_{\text {2}}}=\{\langle \delta _i,({\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)-{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)),({\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i))\rangle ;\delta _i \in \Delta \}\)

  vi) \({\mathbb {I}_{\text {1}}}\times {\mathbb {I}_{\text {2}}}=\{\langle \delta _i,({\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)),({\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)-{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i))\rangle ;\delta _i \in \Delta \}\)
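The operations of Definition 2.3 can be sketched elementwise on intuitionistic fuzzy values represented as (MD, ND) tuples. This is an illustrative sketch; the function names (`ifs_sum`, `ifs_product`, and so on) are our own choices, not notation from the paper.

```python
# Operations on intuitionistic fuzzy values (MD, ND), per Definition 2.3.

def complement(a):                       # ii)
    m, n = a
    return (n, m)

def union(a, b):                         # iii)
    return (max(a[0], b[0]), min(a[1], b[1]))

def intersection(a, b):                  # iv)
    return (min(a[0], b[0]), max(a[1], b[1]))

def ifs_sum(a, b):                       # v)  probabilistic sum of MDs, product of NDs
    return (a[0] + b[0] - a[0] * b[0], a[1] * b[1])

def ifs_product(a, b):                   # vi) product of MDs, probabilistic sum of NDs
    return (a[0] * b[0], a[1] + b[1] - a[1] * b[1])

a, b = (0.5, 0.3), (0.4, 0.2)
print(union(a, b))                                      # (0.5, 0.2)
print(intersection(a, b))                               # (0.4, 0.3)
print(tuple(round(v, 10) for v in ifs_sum(a, b)))       # (0.7, 0.06)
print(tuple(round(v, 10) for v in ifs_product(a, b)))   # (0.2, 0.44)
```

Note that `ifs_sum` and `ifs_product` always return valid intuitionistic fuzzy values: if each input satisfies MD + ND ≤ 1, so does the result.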

Definition 2.4

: Dengfeng and Chuntian (2002) We consider \(\Delta =\{\delta _i: i=1,2,...,n\}\) as a universal set and \({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}} \in \) IFS(\(\Delta \)). A similarity measure S between \({\mathbb {I}_{\text {1}}}\) and \({\mathbb {I}_{\text {2}}}\) is a function S: IFS \(\times \) IFS \(\rightarrow [0,1]\) that satisfies the following axioms:

  1.

    \(0\le S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 1\).

  2.

    \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\) iff \({\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\).

  3.

    \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {1}}})\).

  4.

    If \({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}} \in \) IFS\((\Delta )\) such that \({\mathbb {I}_{\text {1}}} \subseteq {\mathbb {I}_{\text {2}}} \subseteq {\mathbb {I}_{\text {3}}}\), then, \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\ge S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

and \(S({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}})\ge S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}}).\)

3 Similarity measure based on IFS

In this section, we first review the existing similarity measures. Then, based on IFS, we propose a novel generalized similarity measure and compare it with the various existing similarity measures. We also show how different similarity measures can be derived from the proposed one as special cases.

3.1 Existing IFS-based similarity measures

Table 1 lists the existing similarity measures.

Table 1 Existing Similarity Measures

3.2 Novel generalized similarity measure

As discussed earlier in the research gap, the existing distance measures exhibit shortcomings and failures in certain situations. In this section, we suggest a novel generalized similarity measure based on IFS. This new measure overcomes the shortcomings of existing approaches and enables a more refined analysis. The proposed novel generalized similarity measure is described as follows:

Definition 3.1

: Let us consider two vectors of length n, \(H=(h_1,h_2,....,h_n)\) and \(K=(k_1,k_2,....,k_n)\), where all the coordinates are non-negative real numbers. The generalized similarity measure is then defined as follows:

$$\begin{aligned} G(H,K)=\dfrac{\lambda H\cdot K}{||H||^2_2+||K||^2_2+(\lambda -2) H\cdot K}=\dfrac{\lambda \sum \limits _{j=1}^nh_jk_j}{\sum \limits _{j=1}^n(h_j)^2+\sum \limits _{j=1}^n(k_j)^2+(\lambda -2)\sum \limits _{j=1}^nh_jk_j} \end{aligned}$$
(1)

where \(\lambda (>0)\) is a generalized parameter, \(H\cdot K=\sum \limits _{j=1}^nh_jk_j\) is the inner product of the vectors H and K, and \(||H||_2=(\sum \limits _{j=1}^n(h_j)^2)^{1/2}\) and \(||K||_2=(\sum \limits _{j=1}^n(k_j)^2)^{1/2}\) are the Euclidean norms of H and K.
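To make Eq. (1) concrete, the vector form can be sketched in a few lines of Python. This is a minimal illustration: the function name `generalized_similarity` and the sample vectors are illustrative, not taken from the paper.

```python
def generalized_similarity(h, k, lam):
    # Generalized similarity G(H, K) of Definition 3.1 (a sketch).
    # h, k: sequences of non-negative coordinates of equal length n (not both zero).
    # lam:  the generalized parameter lambda > 0.
    if lam <= 0:
        raise ValueError("lambda must be positive")
    inner = sum(hj * kj for hj, kj in zip(h, k))                  # H . K
    norms = sum(hj * hj for hj in h) + sum(kj * kj for kj in k)   # ||H||^2 + ||K||^2
    return lam * inner / (norms + (lam - 2) * inner)

H, K = (1.0, 2.0, 3.0), (1.0, 2.0, 3.0)
print(generalized_similarity(H, K, 1.0))                    # identical vectors give 1.0
print(generalized_similarity((1.0, 0.0), (0.0, 1.0), 2.0))  # orthogonal vectors give 0.0
```

For \(\lambda =1\) and \(\lambda =2\) the ratio reduces to Jaccard- and Dice-type forms, respectively.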

Using the aforementioned definition as a foundation, we now propose a novel generalized similarity measure built on IFS, which is described as:

Definition 3.2

: We consider \(\Delta =\{\delta _i: i=1,2,...,n\}\) as a universal set. Let \({\mathbb {I}_{\text {1}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) and \({\mathbb {I}_{\text {2}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) be two IFSs defined on \(\Delta \). The generalized similarity measure of \({\mathbb {I}_{\text {1}}}\) and \({\mathbb {I}_{\text {2}}}\) is defined by

$$\begin{aligned} S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\frac{\lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]+(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]},\quad \lambda >0 \end{aligned}$$
(2)
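For a single element \(\delta _i\), Eq. (2) can be spot-checked numerically. The sketch below assumes \((m,n)\) pairs with \(m+n\le 1\) and derives the hesitancy degree as \(o=1-m-n\); the helper name `s_p` and the sample values are illustrative.

```python
def s_p(e1, e2, lam):
    # Proposed similarity S_P of Eq. (2) for one element delta_i (a sketch).
    # e1, e2: (membership, non-membership) pairs with m + n <= 1.
    (m1, n1), (m2, n2) = e1, e2
    o1, o2 = 1 - m1 - n1, 1 - m2 - n2          # hesitancy degrees
    inner = m1 * m2 + n1 * n2 + o1 * o2
    squares = m1**2 + m2**2 + n1**2 + n2**2 + o1**2 + o2**2
    return lam * inner / (squares + (lam - 2) * inner)

# Numerical spot-checks of the similarity-measure axioms for one lambda:
a, b, lam = (0.5, 0.3), (0.2, 0.6), 1.5
assert 0.0 <= s_p(a, b, lam) <= 1.0                  # boundedness
assert abs(s_p(a, b, lam) - s_p(b, a, lam)) < 1e-12  # symmetry
assert abs(s_p(a, a, lam) - 1.0) < 1e-12             # S_P(I, I) = 1
```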

It is now necessary to prove that the suggested similarity measure satisfies each of the four similarity measure axioms. Before this, we consider a lemma proposed by Chu et al. (2020), followed by some corollaries.

Lemma: If \(\dfrac{1}{2}\ge \dfrac{t_1}{T_1}\ge \dfrac{t_2}{T_2}\) and \(\dfrac{1}{2}\ge \dfrac{v_1}{V_1}\ge \dfrac{v_2}{V_2}\), then \(\dfrac{t_1+v_1}{T_1+V_1}\ge \dfrac{t_2+v_2}{T_2+V_2}\), where \(t_1,t_2,v_1,v_2,T_1,T_2,V_1,V_2\) are positive real numbers.

Corollary 1

: If \(\dfrac{1}{2}\ge \dfrac{t_1}{T_1}\ge \dfrac{t_2}{T_2}\) and \(\dfrac{1}{2}\ge \dfrac{v_1}{V_1}\ge \dfrac{v_2}{V_2}\), then \(\dfrac{1}{2}\ge \dfrac{t_1+v_1}{T_1+V_1}\ge \dfrac{t_2+v_2}{T_2+V_2}\) where \(t_1,t_2,v_1,v_2,T_1,T_2,V_1,V_2\) are positive real numbers.

Corollary 2

: If \(\dfrac{1}{2}\ge \dfrac{t_1}{T_1}\ge \dfrac{t_2}{T_2}\), \(\dfrac{1}{2}\ge \dfrac{v_1}{V_1}\ge \dfrac{v_2}{V_2}\) and \(\dfrac{1}{2}\ge \dfrac{z_1}{Z_1}\ge \dfrac{z_2}{Z_2}\), then \(\dfrac{t_1+v_1+z_1}{T_1+V_1+Z_1}\ge \dfrac{t_2+v_2+z_2}{T_2+V_2+Z_2}\) where \(t_1,t_2,v_1,v_2,z_1,z_2,T_1,T_2,V_1,V_2,Z_1,Z_2\) are positive real numbers.

Theorem: The similarity measure \(S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) satisfies all the axioms of a similarity measure.

Proof

 

  i)

    As \({\mathbb {M}_{\mathbb {I}_{\text {i}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {i}}}}(\delta _i),{\mathbb {O}_{\mathbb {I}_{\text {i}}}}(\delta _i) \in [0,1]\) for \(i=1,2\), clearly \(S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\ge 0\)

    Also, \([{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i)-{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)]^2+[{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)-{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)]^2+[{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i)-{\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]^2\ge 0\)

    $$\begin{aligned}{} & {} \implies [{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]\\{} & {} \quad \quad -2[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]\ge 0 \\{} & {} \implies [{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]\\{} & {} \quad \quad +(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]\\{} & {} \quad \ge \lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)],\lambda>0\\{} & {} \implies \dfrac{\lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]+(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}\le 1,\ \lambda >0 \end{aligned}$$

    Hence, \(0\le S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\le 1\)

  ii)

    If \(S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\), then \(\dfrac{\lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]+(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}=1\)

    $$\begin{aligned}{} & {} \implies [{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]\\{} & {} \quad \quad +(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]\\{} & {} \quad = \lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]\\{} & {} \implies [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i)-{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)]^2+[{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)-{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)]^2+[{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i)-{\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]^2= 0\\{} & {} \implies {\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i)={\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)={\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i)={\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)\\{} & {} \implies {\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}} \end{aligned}$$

    Conversely, if \({\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\), then \({\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i)={\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)={\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i)={\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)\)

    Then, \(S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\)

    Hence, \(S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\) iff \({\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\)

  iii)

    Clearly, \(S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S_P({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {1}}})\)

  iv)

    As, \({\mathbb {I}_{\text {1}}} \subseteq {\mathbb {I}_{\text {2}}} \subseteq {\mathbb {I}_{\text {3}}}\), then, \({\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i)\le {\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)\le {\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)\) and \({\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)\ge {\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)\ge {\mathbb {N}_{\mathbb {I}_{\text {3}}}}(\delta _i)\)

Then, \({\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)-{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)\ge 0, {\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)-{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)\ge 0, {\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i)\ge 0\)

Thus, \(({\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)-{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i))( {\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)-{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)) {\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i)\ge 0\)

$$\begin{aligned}{} & {} \implies {\mathbb {M}^{\text {3}}_{\mathbb {I}_{\text {1}}}}(\delta _i)({\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)-{\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i))+{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)({\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)-{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i))\ge 0\\{} & {} \implies {\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)({\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+ {\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {3}}}}(\delta _i))\ge {\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)({\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i))\\{} & {} \implies \lambda {\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)[({\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {3}}}}(\delta _i))+(\lambda -2){\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)]\\{} & {} \quad \ge \lambda {\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)[({\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i))+(\lambda -2){\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)] \\{} & {} \implies \dfrac{\lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]+(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)]} \ge \dfrac{\lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {3}}}}(\delta _i)]+(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)]} \end{aligned}$$
$$\begin{aligned} \text{Further, } \dfrac{1}{2}\ge & {} \dfrac{\lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]+(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}\nonumber \\\ge & {} \dfrac{\lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {3}}}}(\delta _i)]+(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)]} \end{aligned}$$
(3)
$$\begin{aligned} \text{Similarly, } \dfrac{1}{2}\ge & {} \dfrac{\lambda [{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{[{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]+(\lambda -2)[{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}\nonumber \\\ge & {} \dfrac{\lambda [{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {3}}}}(\delta _i)]}{[{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+ {\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {3}}}}(\delta _i)]+(\lambda -2)[{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {3}}}}(\delta _i)]} \end{aligned}$$
(4)
$$\begin{aligned} \text{And, } \dfrac{1}{2}\ge & {} \dfrac{\lambda [{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{[{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]+(\lambda -2)[{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}\nonumber \\\ge & {} \dfrac{\lambda [{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {3}}}}(\delta _i)]}{[{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+ {\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {3}}}}(\delta _i)]+(\lambda -2)[{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {3}}}}(\delta _i)]} \end{aligned}$$
(5)

Using inequalities (3), (4), (5) and Corollary 2, we have

$$\begin{aligned}{} & {} \frac{\lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]+(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}\nonumber \\{} & {} \quad \ge \frac{\lambda [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {3}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {3}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {3}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {3}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {3}}}}(\delta _i)]+(\lambda -2)[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {3}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {3}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {3}}}}(\delta _i)]} \end{aligned}$$
(6)

Thus, \(S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\ge S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\).

Similarly, \(S_P({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}})\ge S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\).

Hence, the similarity measure \(S_P({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) satisfies all the axioms of a similarity measure.

The proof of the aforementioned theorem establishes that the similarity formula provided in Definition 3.2 is a similarity measure. The suggested similarity measure has nonlinear properties, as seen in Fig. 2. \(\square \)

Fig. 2: Nonlinear characteristics of the proposed similarity measure for \(\lambda =1.5\)

Further, we can derive the following special cases of the proposed measure.

For \(\lambda =1\),

$$\begin{aligned} S_{P1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\frac{[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]-[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]} \end{aligned}$$
(7)

which is the Jaccard similarity.

For \(\lambda =2\),

$$\begin{aligned} S_{P2}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\frac{2[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]} \end{aligned}$$
(8)

which is the Dice similarity.

For \(\lambda \rightarrow \infty \),

$$\begin{aligned} S_{P\lambda }({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\lim _{\lambda \rightarrow \infty } \frac{[{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}{\dfrac{1}{\lambda }[{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {M}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {N}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {1}}}}(\delta _i)+{\mathbb {O}^{\text {2}}_{\mathbb {I}_{\text {2}}}}(\delta _i)]+\left( 1-\dfrac{2}{\lambda }\right) [{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)+{\mathbb {O}_{\mathbb {I}_{\text {1}}}}(\delta _i){\mathbb {O}_{\mathbb {I}_{\text {2}}}}(\delta _i)]}=1 \end{aligned}$$
(9)
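These special cases can be checked numerically with a small sketch. The helper below re-implements Eq. (2) for a single \((m,n)\) pair with derived hesitancy \(o=1-m-n\); the names and sample values are illustrative.

```python
def s_p(e1, e2, lam):
    # S_P of Eq. (2) for one (membership, non-membership) pair;
    # the hesitancy degree o = 1 - m - n is derived internally.
    (m1, n1), (m2, n2) = e1, e2
    o1, o2 = 1 - m1 - n1, 1 - m2 - n2
    inner = m1 * m2 + n1 * n2 + o1 * o2
    squares = m1**2 + m2**2 + n1**2 + n2**2 + o1**2 + o2**2
    return lam * inner / (squares + (lam - 2) * inner)

a, b = (0.6, 0.2), (0.3, 0.5)
jaccard = s_p(a, b, 1.0)    # Eq. (7): Jaccard similarity
dice = s_p(a, b, 2.0)       # Eq. (8): Dice similarity
large = s_p(a, b, 1e9)      # approaches 1 as lambda grows (Eq. 9)
print(jaccard, dice, large)
```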

Now, we state some basic propositions based on the proposed similarity measure, as follows:

Proposition 1

: If \({\mathbb {I}_{\text {1}}}=\{\alpha ,\beta \}\) and \({\mathbb {I}_{\text {2}}}=\{\beta ,\alpha \}\), then,

$$\begin{aligned} S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\frac{\lambda [\alpha ^2+\beta ^2+4\alpha \beta +1-2(\alpha +\beta )]}{2(\alpha -\beta )^2+\lambda [\alpha ^2+\beta ^2+4\alpha \beta +1-2(\alpha +\beta )]},\lambda >0 \end{aligned}$$
(10)

The diagrammatic representation is given in Fig. 3.

Fig. 3: Variation of \(S_P\) w.r.t. \(\alpha ,\beta \) for Proposition 1; left: 3-D, right: 2-D
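The closed form of Proposition 1 can be checked against a direct evaluation of Eq. (2); the helper `s_p` and the sample values below are illustrative.

```python
def s_p(e1, e2, lam):
    # S_P of Eq. (2) for one pair; hesitancy o = 1 - m - n is derived.
    (m1, n1), (m2, n2) = e1, e2
    o1, o2 = 1 - m1 - n1, 1 - m2 - n2
    inner = m1 * m2 + n1 * n2 + o1 * o2
    squares = m1**2 + m2**2 + n1**2 + n2**2 + o1**2 + o2**2
    return lam * inner / (squares + (lam - 2) * inner)

a, b, lam = 0.4, 0.3, 1.5
direct = s_p((a, b), (b, a), lam)            # Eq. (2) on I1 = {a, b}, I2 = {b, a}
bracket = a*a + b*b + 4*a*b + 1 - 2*(a + b)  # numerator bracket of Eq. (10)
closed = lam * bracket / (2 * (a - b)**2 + lam * bracket)
assert abs(direct - closed) < 1e-9           # Eq. (10) agrees with Eq. (2)
```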

Proposition 2

: If \({\mathbb {I}_{\text {1}}}=\{\alpha ,\alpha \}\), \({\mathbb {I}_{\text {2}}}=\{\beta ,1-\beta \}\) and \({\mathbb {I}_{\text {3}}}=\{1-\beta ,\beta \}\), then,

$$\begin{aligned} S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\frac{\lambda \alpha }{2\alpha ^2+\beta ^2+(1-\beta )^2+(1-2\alpha )^2+(\lambda -2) \alpha }=S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}}) \end{aligned}$$
(11)

The diagrammatic representation is given in Fig. 4.

Fig. 4: Variation of \(S_P\) w.r.t. \(\alpha ,\beta \) for Proposition 2; left: 3-D, right: 2-D

Proposition 3

: If \({\mathbb {I}_{\text {1}}}=\{\alpha ,\beta ,\gamma \}\), \({\mathbb {I}_{\text {2}}}=\{\beta ,\gamma ,\alpha \}\) and \({\mathbb {I}_{\text {3}}}=\{\gamma ,\alpha ,\beta \}\), then,

$$\begin{aligned} S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\frac{\lambda (\alpha \beta +\beta \gamma +\alpha \gamma )}{2\alpha ^2+2\beta ^2+2\gamma ^2+(\lambda -2)(\alpha \beta +\beta \gamma +\alpha \gamma )}=S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})=S_{P}({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}}) \end{aligned}$$
(12)

The diagrammatic representation is given in Fig. 5.

Fig. 5: Variation of \(S_P\) w.r.t. \(\alpha ,\beta \) for Proposition 3; left: 3-D, right: 2-D
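The cyclic symmetry stated in Proposition 3 is easy to confirm numerically; here the sets are taken as explicit \((m,n,o)\) triples, and the helper name and values are illustrative.

```python
def s_p3(u, v, lam=1.5):
    # S_P of Eq. (2) for explicit (m, n, o) triples.
    inner = sum(x * y for x, y in zip(u, v))
    squares = sum(x * x for x in u) + sum(y * y for y in v)
    return lam * inner / (squares + (lam - 2) * inner)

a, b, g = 0.5, 0.3, 0.2
I1, I2, I3 = (a, b, g), (b, g, a), (g, a, b)
# All three pairwise similarities coincide, as Eq. (12) states.
assert abs(s_p3(I1, I2) - s_p3(I1, I3)) < 1e-12
assert abs(s_p3(I1, I3) - s_p3(I2, I3)) < 1e-12
```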

Proposition 4

: If \({\mathbb {I}_{\text {1}}}=\{\alpha ,1-\alpha \}\), \({\mathbb {I}_{\text {2}}}=\{\beta ,1-\beta \}\), then,

$$\begin{aligned} S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\frac{\lambda [1+2\alpha \beta -\alpha -\beta ]}{\alpha ^2+\beta ^2+(1-\alpha )^2+(1-\beta )^2+(\lambda -2) [1+2\alpha \beta -\alpha -\beta ]} \end{aligned}$$
(13)

The diagrammatic representation is given in Fig. 6.

Fig. 6: Variation of \(S_P\) w.r.t. \(\alpha ,\beta \) for Proposition 4; left: 3-D, right: 2-D

Proposition 5

: If \({\mathbb {I}_{\text {1}}}=\{\alpha ,\beta \}\), \({\mathbb {I}_{\text {2}}}=\{1-\alpha ,1-\beta \}\), then,

$$\begin{aligned} S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\frac{\lambda [\alpha +\beta -\alpha ^2-\beta ^2]}{\alpha ^2+\beta ^2+(1-\alpha )^2+(1-\beta )^2+2(1-\alpha -\beta )^2+(\lambda -2)[\alpha +\beta -\alpha ^2-\beta ^2]} \end{aligned}$$
(14)

The diagrammatic representation is given in Fig. 7.

Fig. 7: Variation of \(S_P\) w.r.t. \(\alpha ,\beta \) for Proposition 5; left: 3-D, right: 2-D

Proposition 6

: If \({\mathbb {I}_{\text {1}}}=\{\alpha ,1-\alpha \}\) and \({\mathbb {I}_{\text {2}}}=\{0,0\}\), then, \(S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=0\)

Proposition 6 can be easily obtained from Proposition 2.

Proposition 7

: If \({\mathbb {I}_{\text {1}}}=\{\alpha _1,\beta _1,\gamma _1\}\) and \({\mathbb {I}_{\text {2}}}=\{\alpha _2,\beta _2,\gamma _2\}\), then, \(S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S_{P}({\mathbb {I}_{\text {1}}}^c,{\mathbb {I}_{\text {2}}}^c)\)

Proof

Given, \({\mathbb {I}_{\text {1}}}=\{\alpha _1,\beta _1,\gamma _1\}\) and \({\mathbb {I}_{\text {2}}}=\{\alpha _2,\beta _2,\gamma _2\}\), then, \({\mathbb {I}_{\text {1}}}^c=\{\beta _1,\alpha _1,\gamma _1\}\) and \({\mathbb {I}_{\text {2}}}^c=\{\beta _2,\alpha _2,\gamma _2\}\)

$$\begin{aligned}{} & {} S_{P}({\mathbb {I}_{\text {1}}}^c,{\mathbb {I}_{\text {2}}}^c)\\{} & {} \quad =\frac{\lambda [\beta _1\beta _2+\alpha _1\alpha _2+\gamma _1\gamma _2]}{\beta _1^2+\beta _2^2+\alpha _1^2+\alpha _2^2+\gamma _1^2+\gamma _2^2+(\lambda -2)[\beta _1\beta _2+\alpha _1\alpha _2+\gamma _1\gamma _2]}\\{} & {} \quad =\frac{\lambda [\alpha _1\alpha _2+\beta _1\beta _2+\gamma _1\gamma _2]}{\alpha _1^2+\alpha _2^2+\beta _1^2+\beta _2^2+\gamma _1^2+\gamma _2^2+(\lambda -2)[\alpha _1\alpha _2+\beta _1\beta _2+\gamma _1\gamma _2]}=S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \end{aligned}$$

\(\square \)

Proposition 8

: If \({\mathbb {I}_{\text {1}}}=\{\alpha _1,\beta _1,\gamma _1\}\) and \({\mathbb {I}_{\text {2}}}=\{\alpha _2,\beta _2,\gamma _2\}\), then, \(S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}^c)=S_{P}({\mathbb {I}_{\text {1}}}^c,{\mathbb {I}_{\text {2}}})\)

Proof

Given, \({\mathbb {I}_{\text {1}}}=\{\alpha _1,\beta _1,\gamma _1\}\) and \({\mathbb {I}_{\text {2}}}=\{\alpha _2,\beta _2,\gamma _2\}\), then, \({\mathbb {I}_{\text {1}}}^c=\{\beta _1,\alpha _1,\gamma _1\}\) and \({\mathbb {I}_{\text {2}}}^c=\{\beta _2,\alpha _2,\gamma _2\}\)

$$\begin{aligned} S_{P}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}^c)= & {} \frac{\lambda [\alpha _1\beta _2+\beta _1\alpha _2+\gamma _1\gamma _2]}{\alpha _1^2+\beta _2^2+\beta _1^2+\alpha _2^2+\gamma _1^2+\gamma _2^2+(\lambda -2)[\alpha _1\beta _2+\beta _1\alpha _2+\gamma _1\gamma _2]}\\= & {} \frac{\lambda [\beta _1\alpha _2+\alpha _1\beta _2+\gamma _1\gamma _2]}{\beta _1^2+\alpha _2^2+\alpha _1^2+\beta _2^2+\gamma _1^2+\gamma _2^2+(\lambda -2)[\beta _1\alpha _2+\alpha _1\beta _2+\gamma _1\gamma _2]}=S_{P}({\mathbb {I}_{\text {1}}}^c,{\mathbb {I}_{\text {2}}}) \end{aligned}$$

\(\square \)
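The complement invariances of Propositions 7 and 8 can likewise be spot-checked; the complement swaps the first two components (membership and non-membership), per the complement operation above. Names and sample values are illustrative.

```python
def s_p3(u, v, lam=1.5):
    # S_P of Eq. (2) for explicit (m, n, o) triples.
    inner = sum(x * y for x, y in zip(u, v))
    squares = sum(x * x for x in u) + sum(y * y for y in v)
    return lam * inner / (squares + (lam - 2) * inner)

def complement(t):
    m, n, o = t
    return (n, m, o)   # the complement swaps membership and non-membership

I1, I2 = (0.5, 0.3, 0.2), (0.6, 0.1, 0.3)
assert abs(s_p3(I1, I2) - s_p3(complement(I1), complement(I2))) < 1e-12  # Prop. 7
assert abs(s_p3(I1, complement(I2)) - s_p3(complement(I1), I2)) < 1e-12  # Prop. 8
```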

Proposition 9

: If \(\mathbb {I}=\{\alpha ,\beta ,\gamma \}\), then, \(S_{P}(\mathbb {I},\mathbb {I}^c)=0 \implies \mathbb {I}=(1,0)\) or \((0,1)\)

Proof

Given, \(\mathbb {I}=\{\alpha ,\beta ,\gamma \}\), then, \(\mathbb {I}^c=\{\beta ,\alpha ,\gamma \}\)

Now, \(S_{P}(\mathbb {I},\mathbb {I}^c)=\dfrac{\lambda [\alpha \beta +\beta \alpha +\gamma \gamma ]}{\alpha ^2+\beta ^2+\beta ^2+\alpha ^2+\gamma ^2+\gamma ^2+(\lambda -2)[\alpha \beta +\beta \alpha +\gamma \gamma ]}=0\)

$$\begin{aligned}{} & {} \implies \alpha \beta +\beta \alpha +\gamma \gamma =0\\{} & {} \implies \alpha \beta =0\hbox { and }\gamma =0 \quad (\hbox {since each term is non-negative}) \end{aligned}$$

Now, \(\gamma =0\implies \alpha +\beta =1\implies \alpha =1-\beta \)

Then, \(\implies \alpha \beta =0\) \(\implies \beta (1-\beta )=0\) \(\implies \beta =1\) or 0

When, \(\beta =1\), then, \(\alpha =0\) and when \(\beta =0\), then, \(\alpha =1\)

Hence, \(S_{P}(\mathbb {I},\mathbb {I}^c)=0 \implies \mathbb {I}\)=(1,0) or (0,1) \(\square \)

Proposition 10

: If \(\mathbb {I}=\{\alpha ,\beta ,\gamma \}\), then, \(S_{P}(\mathbb {I},\mathbb {I}^c)=1 \implies \alpha =\beta \)

Proof

Given, \(\mathbb {I}=\{\alpha ,\beta ,\gamma \}\), then, \(\mathbb {I}^c=\{\beta ,\alpha ,\gamma \}\)

Now, \(S_{P}(\mathbb {I},\mathbb {I}^c)=\dfrac{\lambda [\alpha \beta +\beta \alpha +\gamma \gamma ]}{\alpha ^2+\beta ^2+\beta ^2+\alpha ^2+\gamma ^2+\gamma ^2+(\lambda -2)[\alpha \beta +\beta \alpha +\gamma \gamma ]}=1\)

$$\begin{aligned}{} & {} \implies \alpha ^2+\beta ^2+\beta ^2+\alpha ^2+\gamma ^2+\gamma ^2-2[\alpha \beta +\beta \alpha +\gamma \gamma ]=0\\{} & {} \implies 2(\alpha -\beta )^2=0\\{} & {} \implies \alpha =\beta \end{aligned}$$

Hence, \(S_{P}(\mathbb {I},\mathbb {I}^c)=1 \implies \alpha =\beta \) \(\square \)
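The propositions above can be checked numerically for a single IFS element. The sketch below assumes the element-wise form of \(S_P\) used in the proofs, with an illustrative parameter value \(\lambda =1.5\); the function names are ours.

```python
# Numerical sanity check of Propositions 8-10 for the element-wise form of S_P.
# An element is a triple (alpha, beta, gamma) = (MD, ND, HD); the complement
# swaps alpha and beta. lam = 1.5 is an illustrative parameter choice.

def s_p(I1, I2, lam=1.5):
    a1, b1, g1 = I1
    a2, b2, g2 = I2
    inner = a1 * a2 + b1 * b2 + g1 * g2
    denom = (a1**2 + a2**2 + b1**2 + b2**2 + g1**2 + g2**2
             + (lam - 2) * inner)
    return lam * inner / denom

def complement(I):
    a, b, g = I
    return (b, a, g)

I1, I2 = (0.5, 0.3, 0.2), (0.4, 0.2, 0.4)

# Proposition 8: S_P(I1, I2^c) = S_P(I1^c, I2)
assert abs(s_p(I1, complement(I2)) - s_p(complement(I1), I2)) < 1e-12

# Proposition 9: S_P(I, I^c) = 0 for I = (1, 0) or (0, 1)
assert s_p((1, 0, 0), complement((1, 0, 0))) == 0
assert s_p((0, 1, 0), complement((0, 1, 0))) == 0

# Proposition 10: S_P(I, I^c) = 1 whenever alpha = beta
assert abs(s_p((0.3, 0.3, 0.4), complement((0.3, 0.3, 0.4))) - 1) < 1e-12
```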

3.3 Comparative analysis of similarity measure

In this section, we present several examples to demonstrate the advantages of our proposed measure. These examples serve as a preliminary study to showcase the effectiveness and applicability of our measure. By examining these examples, we can gain insights into the improved performance and capabilities of our measure compared to existing approaches.

Example 1

: Consider the problem of a criminal investigation. Let C = (0.4, 0.2) be the expectation set of the offender given by the investigator. The assessment of two suspects is given by the IFSs A = (0.5, 0.3) and B = (0.5, 0.2). The challenge is identifying the perpetrator based on the investigator’s judgment of them.

Table 2 Analysis of Example 1’s Similarity Measures

The assessment of suspect B is generally considered closer to the expectation set C than that of suspect A: A and B have exactly the same MD, so ND is the only component in which they differ, and the ND of B coincides with that of C whereas the ND of A does not. From this viewpoint it is logical and appropriate to claim that B is more comparable to C than A, and hence that B, rather than A, is the desired offender. Numerous methods fall short of revealing this fact.

It can be seen in Table 2 that similarity measures such as \(S_{Y},S_{GK},S_{JJ},S_{DS},S_{GR},S_{L},S_{GR1},S_{GR2},S_{GR3},S_{GR4}\) make a counter-intuitive selection of the candidate. Further, \(S_C\) violates property 2 of the similarity measure. By favoring B over A, the proposed similarity measure makes the logical choice.
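This selection can be reproduced directly. The sketch below uses the element-wise form of the proposed measure, completes each pair with its hesitancy degree \(1-\text {MD}-\text {ND}\), and takes an illustrative \(\lambda =1.5\); the helper names are ours.

```python
# Example 1: with the proposed measure, suspect B should be rated closer to
# the expectation set C than suspect A. lam = 1.5 is an illustrative choice.

def s_p(I1, I2, lam=1.5):
    a1, b1, g1 = I1
    a2, b2, g2 = I2
    inner = a1 * a2 + b1 * b2 + g1 * g2
    denom = (a1**2 + a2**2 + b1**2 + b2**2 + g1**2 + g2**2
             + (lam - 2) * inner)
    return lam * inner / denom

def ifs(md, nd):
    # complete an IFS pair with its hesitancy degree
    return (md, nd, 1 - md - nd)

C, A, B = ifs(0.4, 0.2), ifs(0.5, 0.3), ifs(0.5, 0.2)

# B agrees with C on ND, so it should come out strictly more similar than A.
assert s_p(C, B) > s_p(C, A)
```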

Example 2

: We consider another problem with IFSs E = (0.3, 0.3), F = (0.4, 0.4), G = (0.3, 0.4), H = (0.4, 0.3). At first glance, E and F appear similar in nature, whereas G and H are complements of each other, so general intuition suggests that the similarity value between E and F should be greater than that between G and H. However, once the hesitancy of an IFS is taken into account, it becomes abundantly evident that the similarity value between E and F is lower than that between G and H: \( {\mathbb {O}}_E=0.4, {\mathbb {O}}_F=0.2\), whereas \({\mathbb {O}}_G=0.3,{\mathbb {O}}_H=0.3\), i.e., the hesitancy parts of E and F are less similar than those of G and H. Therefore, it can be opined that the similarity value between E and F is less than that between G and H.

Table 3 Analysis of Example 2’s Similarity Measures

Table 3 illustrates that for similarity measures such as \(S_{HK},S_{MP}\), the similarity values between G and H are exactly the same as between E and F, whereas for \(S_{GK},S_{HW},S_{JJ},S_{Ng1},S_{Ng2},S_{DS},S_{GR},S_{L},S_{CD}^1,S_{CD}^3,S_{GR1},S_{GR2},S_{GR3},S_{GR4},S_{G1},\) \(S_{G2},S_{G3},S_{G4}\) the similarity values between G and H are less than between E and F. Further, \(S_C,S_Y\) violate property 2 of the similarity measure. In our proposed measure, however, the similarity value between G and H exceeds that between E and F, thus making the logical selection.
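The hesitancy-based intuition of this example can also be verified numerically with the element-wise form of the proposed measure; \(\lambda =1.5\) is an illustrative choice and the helper names are ours.

```python
# Example 2: once hesitancy is accounted for, the proposed measure should
# rate G and H as more similar than E and F. lam = 1.5 is illustrative.

def s_p(I1, I2, lam=1.5):
    a1, b1, g1 = I1
    a2, b2, g2 = I2
    inner = a1 * a2 + b1 * b2 + g1 * g2
    denom = (a1**2 + a2**2 + b1**2 + b2**2 + g1**2 + g2**2
             + (lam - 2) * inner)
    return lam * inner / denom

def ifs(md, nd):
    return (md, nd, 1 - md - nd)  # append the hesitancy degree

E, F = ifs(0.3, 0.3), ifs(0.4, 0.4)   # hesitancies 0.4 and 0.2
G, H = ifs(0.3, 0.4), ifs(0.4, 0.3)   # hesitancies 0.3 and 0.3

# hesitancy parts of G and H agree, so their similarity should be higher
assert s_p(G, H) > s_p(E, F)
```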

Example 3

: We consider the problem of arresting offenders based on the characteristics of the crime I = (0.82, 0.09, 0.09), set by the investigator in terms of MD, ND and HD. The evaluations of the offenders are presented by the IFSs J = (0.78, 0.12, 0.10), K = (0.76, 0.06, 0.18), L = (0.77, 0.09, 0.14).

We first consider how the offenders J and L were evaluated. It can be seen that

$$\begin{aligned} |({\mathbb {M}}_I-{\mathbb {M}}_J)|=0.04, |({\mathbb {M}}_I-{\mathbb {M}}_L)|=0.05\hbox { and }|({\mathbb {N}}_I-{\mathbb {N}}_J)|=0.03, |({\mathbb {N}}_I-{\mathbb {N}}_L)|=0.0 \end{aligned}$$

i.e. \(|({\mathbb {M}}_I-{\mathbb {M}}_J)|+|({\mathbb {N}}_I-{\mathbb {N}}_J)|=0.07>|({\mathbb {M}}_I-{\mathbb {M}}_L)|+|({\mathbb {N}}_I-{\mathbb {N}}_L)|=0.05\)

But, if we consider hesitancy, we have, \(|({\mathbb {O}}_I-{\mathbb {O}}_J)|=0.01, |({\mathbb {O}}_I-{\mathbb {O}}_L)|=0.05\)

i.e. \(|({\mathbb {M}}_I-{\mathbb {M}}_J)|+|({\mathbb {N}}_I-{\mathbb {N}}_J)|+|({\mathbb {O}}_I-{\mathbb {O}}_J)|=0.08<|({\mathbb {M}}_I-{\mathbb {M}}_L)|+|({\mathbb {N}}_I-{\mathbb {N}}_L)|+|({\mathbb {O}}_I-{\mathbb {O}}_L)|=0.1\)

Therefore, it is reasonable to conclude that I is more similar to J than to L. Consequently, J is a more likely offender than L.

Next, consider the assessments of the offenders K and L. We have \(|({\mathbb {M}}_I-{\mathbb {M}}_K)|=0.06, |({\mathbb {M}}_I-{\mathbb {M}}_L)|=0.05\); \(|({\mathbb {N}}_I-{\mathbb {N}}_K)|=0.03, |({\mathbb {N}}_I-{\mathbb {N}}_L)|=0.0\) and \(|({\mathbb {O}}_I-{\mathbb {O}}_K)|=0.09, |({\mathbb {O}}_I-{\mathbb {O}}_L)|=0.05\)

i.e. \(|({\mathbb {M}}_I-{\mathbb {M}}_K)|+|({\mathbb {N}}_I-{\mathbb {N}}_K)|+|({\mathbb {O}}_I-{\mathbb {O}}_K)|=0.18>|({\mathbb {M}}_I-{\mathbb {M}}_L)|+|({\mathbb {N}}_I-{\mathbb {N}}_L)|+|({\mathbb {O}}_I-{\mathbb {O}}_L)|=0.1\)

Therefore, it is reasonable to conclude that I is more similar to L than to K, so L is a more likely offender than K. Hence, the ordering of the offenders should be J > L > K.
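This ordering can be confirmed with the element-wise form of the proposed measure; \(\lambda =1.5\) is an illustrative parameter choice and the function name is ours.

```python
# Example 3: the proposed measure should rank the offenders J > L > K
# by similarity to the crime characteristics I. lam = 1.5 is illustrative.

def s_p(I1, I2, lam=1.5):
    a1, b1, g1 = I1
    a2, b2, g2 = I2
    inner = a1 * a2 + b1 * b2 + g1 * g2
    denom = (a1**2 + a2**2 + b1**2 + b2**2 + g1**2 + g2**2
             + (lam - 2) * inner)
    return lam * inner / denom

I = (0.82, 0.09, 0.09)
J = (0.78, 0.12, 0.10)
K = (0.76, 0.06, 0.18)
L = (0.77, 0.09, 0.14)

# ordering by similarity to I: J first, then L, then K
assert s_p(I, J) > s_p(I, L) > s_p(I, K)
```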

Table 4 Analysis of Example 3’s Similarity Measures

Table 4 illustrates that for similarity measures such as \(S_C,S_{HK},S_{GK},S_{HW},S_{JJ},S_{Ng1},S_{Ng2},S_{MP},S_{DS},S_{CD}^1,S_{CD}^2,S_{CD}^3,S_{GR1},S_{GR2},S_{GR3},\) \(S_{GR4}, S_{G1},S_{G2},S_{G3},S_{G4}\), the similarity values do not yield the logically correct result. Further, \(S_Y\) violates property 2 of the similarity measure. Our proposed measure, in contrast, gives the desired result.

From the above three examples, we can see the limitations of the existing measures. In Example 1, similarity measures such as \(S_{Y},S_{GK},S_{JJ},S_{DS},S_{GR},S_{L},S_{GR1},S_{GR2},S_{GR3},S_{GR4},S_C\) suffer from limitations. In Example 2, \(S_C, S_{HK}, S_Y, S_{MP},S_{NC},S_{GK},\) \(S_{HW},S_{JJ},S_{Ng1},S_{Ng2},S_{DS},S_{GR},S_{L},S_{CD}^1,S_{CD}^3,S_{GR1}, S_{GR2}, S_{GR3}, S_{GR4},S_{G1},S_{G2},\) \(S_{G3},S_{G4}\) suffer from limitations. In Example 3, \(S_C, S_Y, S_{MP},S_{GK},S_{HW},S_{JJ},\) \(S_{Ng1},S_{Ng2},S_{DS},S_{CD}^1,S_{CD}^2, S_{CD}^3,S_{GR1}, S_{GR2}, S_{GR3}, S_{GR4},S_{G1},S_{G2},S_{G3},S_{G4}\) suffer from limitations.

So, in Example 1, existing measures such as \(S_{HK},S_{HW}, S_{Ng1}, S_{Ng2}, S_{MP}, S_{CD}^1,S_{CD}^2, S_{CD}^3, S_{G1}, S_{G2}, S_{G3}, S_{G4}\) make the correct selection. In Example 2, only \(S_{CD}^2\) among the existing measures makes the correct selection. In Example 3, only \(S_{HK},S_{GR}, S_{L}\) do so among all the existing measures. In all three examples, our proposed measure makes the correct selection. Hence, the above analysis shows that our proposed measure outperforms the existing measures.

3.4 Propagation of similarity measure

In this section, an attempt has been made to propagate some similarity measures, i.e., to construct new similarity measures from an existing one. This gives a deeper understanding of the measures.

Propagation 1

: We consider \(\Delta =\{\delta _i: i=1,2,...,n\}\) as a universal set. Let, \({\mathbb {I}_{\text {1}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) and \({\mathbb {I}_{\text {2}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) be two IFSs defined in \(\Delta \). Then we can define a similarity measure between \({\mathbb {I}_{\text {1}}}\) and \({\mathbb {I}_{\text {2}}}\) which is given by

\(S^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2^n-(2^n-1)S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})} \), where \(S^0({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is a similarity measure.

Proof

Let, \(S^0({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) be a similarity measure.

Then, we need to prove, \(S^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2^n-(2^n-1)S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})} \) is a similarity measure.

We prove this by mathematical induction.

First we prove that, \(S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is also a similarity measure.

Since \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is a similarity measure:

1. \(0\le S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 1\).

2. \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\) iff \({\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\).

3. \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {1}}})\).

4. If \({\mathbb {I}_{\text {1}}} \subseteq {\mathbb {I}_{\text {2}}} \subseteq {\mathbb {I}_{\text {3}}}\), then, \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\ge S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\) and \(S({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}})\ge S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}}).\)

Now, 1) \(0\le S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 1 \implies 1\le 2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 2\)

So, \(0\le S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 1\)

2) Clearly, \( S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})= S^1({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {1}}}) \)

3) Let, \(S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\implies \dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=1\implies S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\implies {\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\)

Conversely, let, \({\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\), then, \(S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=1\)

4) For \({\mathbb {I}_{\text {1}}} \subseteq {\mathbb {I}_{\text {2}}} \subseteq {\mathbb {I}_{\text {3}}}\), \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\ge S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\implies 2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\le 2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

So, \(S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\ge \dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})}=S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

Similarly, \(S^1({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}})\ge S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

Hence, \(S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is a similarity measure. So it is true for n=1.

Further, it could be easily proven that \(S^2({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is also a similarity measure. So, it is true for n=2 also.

Assume the result holds for n = k, i.e., \(S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^{k-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^{k-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is a similarity measure.

Now, we are to prove, \(S^{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is also a similarity measure.

Now, 1) \(0\le S^{k}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 1 \implies 1\le 2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 2\)

So, \(0\le S^{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 1\)

2) Clearly, \( S^{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})= S^{k+1}({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {1}}}) \)

3) Let, \(S^{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\implies \dfrac{S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=1\implies S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\implies {\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\)

Conversely, let, \({\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\), then, \(S^{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=1\)

4) For \({\mathbb {I}_{\text {1}}} \subseteq {\mathbb {I}_{\text {2}}} \subseteq {\mathbb {I}_{\text {3}}}\), \(S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\ge S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\implies 2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\le 2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

So, \(S^{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\ge \dfrac{S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})}{2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})}=S^{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

Similarly, \(S^{k+1}({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}})\ge S^{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

So, \(S^{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is a similarity measure.

Hence, by mathematical induction, \(S^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is also a similarity measure.

Now, \(S^2({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=\dfrac{\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}}{2-\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}}=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{4-3S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2^2-(2^2-1)S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\)

Again, \(S^3({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^2({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^2({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=\dfrac{\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{4-3S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}}{2-\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{4-3S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}}=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{8-7S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2^3-(2^3-1)S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\)

And, \(S^4({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^3({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^3({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=\dfrac{\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{8-7S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}}{2-\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{8-7S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}}=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{16-15S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2^4-(2^4-1)S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\)

Hence, \(S^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2^n-(2^n-1)S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})} \) is a similarity measure. \(\square \)

Fig. 8
figure 8

Comparison of the similarity measure \(S^n(\lambda =1.5)\) for different values of n

In Fig. 8, we can see that as n increases, \(S^n\) accelerates rapidly toward 1 in the first graph, whereas it rapidly decelerates toward 0 in the second graph. The third graph shows only a very slight deviation as n increases. We use our proposed measure as S(A,B).
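The equivalence of the recursion and the closed form in Propagation 1, as well as the limiting behaviour seen in Fig. 8, can be checked for any base similarity value; the base value 0.8 below is an arbitrary example and the function names are ours.

```python
# Propagation 1: iterating s -> s / (2 - s) n times should agree with the
# closed form s / (2**n - (2**n - 1) * s).

def iterate(s, n):
    for _ in range(n):
        s = s / (2 - s)
    return s

def closed_form(s, n):
    return s / (2**n - (2**n - 1) * s)

for n in range(1, 8):
    assert abs(iterate(0.8, n) - closed_form(0.8, n)) < 1e-9

# Limiting behaviour: S^n stays at 1 when the base similarity is 1,
# and decays toward 0 as n grows when it is below 1.
assert iterate(1.0, 7) == 1.0
assert iterate(0.8, 7) < iterate(0.8, 1) < 0.8
```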

Propagation 2

: We consider \(\Delta =\{\delta _i: i=1,2,...,n\}\) as a universal set. Let \({\mathbb {I}_{\text {1}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) and \({\mathbb {I}_{\text {2}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) be two IFSs defined in \(\Delta \). If \(S^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is a similarity measure, then the convex combination of \(S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) and \(S^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is also a similarity measure, i.e.,

\(S_n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )S^{n}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is a similarity measure \((0\le \gamma \le 1)\).

Proof

Let, \(S^0({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) be a similarity measure.

Then, we need to prove, \(S_n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})},(0\le \gamma \le 1)\) is a similarity measure.

We prove this by mathematical induction.

First we prove that, \(S_1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})},(0\le \gamma \le 1)\) is also a similarity measure.

Since \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is a similarity measure:

1. \(0\le S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 1\).

2. \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\) iff \({\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\).

3. \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {1}}})\).

4. If \({\mathbb {I}_{\text {1}}} \subseteq {\mathbb {I}_{\text {2}}} \subseteq {\mathbb {I}_{\text {3}}}\), then, \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\ge S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\) and \(S({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}})\ge S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}}).\)

1) Clearly, \(0\le S_1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 1\)

2) Clearly, \( S_1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})= S_1({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {1}}}) \)

3) Let, \(S_1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\implies \gamma S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=1\)

\(\implies \gamma (S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}))^2-(\gamma +2)S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+2=0\)

\(\implies S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\) or \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{2}{\gamma }>1\), which is a contradiction.

\(\implies S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\implies {\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\)

Conversely, let, \({\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\), then, \(S_1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=1\)

4) For \({\mathbb {I}_{\text {1}}} \subseteq {\mathbb {I}_{\text {2}}} \subseteq {\mathbb {I}_{\text {3}}}\), \(S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\ge S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\implies 2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\le 2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

So,

$$\begin{aligned}{} & {} S_1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\\{} & {} \quad \ge \gamma S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})+(1-\gamma )\dfrac{S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})}{2-S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})}=S_1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}}) \end{aligned}$$

Similarly, \(S_1({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}})\ge S_1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

Hence, \(S_1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is a similarity measure. So it is true for n=1.

Further, it could be easily proven that \(S_2({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^1({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is also a similarity measure. So it is true for n=2 also.

Assume the result holds for n = k, i.e., \(S_k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S^{k-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S^{k-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^{k-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is a similarity measure.

Now, we are to prove, \(S_{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is also a similarity measure.

1) Clearly, \(0\le S_{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}) \le 1\)

2) Clearly, \( S_{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})= S_{k+1}({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {1}}}) \)

3) Let, \(S_{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\implies \gamma S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=1\)

\(\implies \gamma (S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}}))^2-(\gamma +2)S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+2=0\)

\(\implies S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\) or \(S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{2}{\gamma }>1\), which is a contradiction.

\(\implies S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=1\implies {\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\)

Conversely, let, \({\mathbb {I}_{\text {1}}}={\mathbb {I}_{\text {2}}}\), then, \(S_{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}=1\)

4) For \({\mathbb {I}_{\text {1}}} \subseteq {\mathbb {I}_{\text {2}}} \subseteq {\mathbb {I}_{\text {3}}}\), \(S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\ge S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\implies 2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\le 2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

So,

$$\begin{aligned}{} & {} S_{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\\{} & {} \quad \ge \gamma S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})+(1-\gamma )\dfrac{S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})}{2-S^k({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})}=S_{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}}) \end{aligned}$$

Similarly, \(S_{k+1}({\mathbb {I}_{\text {2}}},{\mathbb {I}_{\text {3}}})\ge S_{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {3}}})\)

Therefore, \(S_{k+1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is a similarity measure.

Hence, by mathematical induction, \(S_n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )S^{n}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{2-S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is a similarity measure. \(\square \)

For \(\gamma =0\), \(S_n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S^{n}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\), whereas for \(\gamma =1\), \(S_n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\).
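The convex combination of Propagation 2 always lies between \(S^{n-1}\) and \(S^n\) and preserves the boundary values 0 and 1. A minimal numeric sketch, where \(\gamma =0.4\) is an arbitrary choice and the function names are ours:

```python
# Propagation 2: S_n = gamma * S^{n-1} + (1 - gamma) * S^n, where
# S^n = S^{n-1} / (2 - S^{n-1}).

def step(s):
    return s / (2 - s)  # one application of Propagation 1

def convex(s_prev, gamma):
    return gamma * s_prev + (1 - gamma) * step(s_prev)

gamma = 0.4
for s_prev in (0.0, 0.25, 0.5, 0.9, 1.0):
    mix = convex(s_prev, gamma)
    # the combination is squeezed between S^n and S^{n-1}
    assert step(s_prev) <= mix <= s_prev
    assert 0.0 <= mix <= 1.0

# boundary cases: value 1 is kept at 1, value 0 is kept at 0
assert abs(convex(1.0, gamma) - 1.0) < 1e-12
assert convex(0.0, gamma) == 0.0
```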

Propagation 3

: We consider \(\Delta =\{\delta _i: i=1,2,...,n\}\) as a universal set. Let, \({\mathbb {I}_{\text {1}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) and \({\mathbb {I}_{\text {2}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) be two IFSs defined in \(\Delta \). Then we can define a similarity measure between \({\mathbb {I}_{\text {1}}}\) and \({\mathbb {I}_{\text {2}}}\) which is given by

\(\widetilde{S}^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{\widetilde{S}^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{1+\widetilde{S}^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})} \), where \(\widetilde{S}^0({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=S({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is a similarity measure.

Proof

The proof is similar to the Propagation 1 and could be proven by mathematical induction. \(\square \)

Propagation 4

: We consider \(\Delta =\{\delta _i: i=1,2,...,n\}\) as a universal set. Let \({\mathbb {I}_{\text {1}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) and \({\mathbb {I}_{\text {2}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) be two IFSs defined in \(\Delta \). If \(\widetilde{S}^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\dfrac{\widetilde{S}^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{1+\widetilde{S}^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is a similarity measure, then the convex combination of \(\widetilde{S}^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) and \(\widetilde{S}^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is also a similarity measure, i.e.,

\(\widetilde{S}_n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma \widetilde{S}^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\widetilde{S}^{n}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\gamma \widetilde{S}^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\gamma )\dfrac{\widetilde{S}^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}{1+\widetilde{S}^{n-1}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})}\) is a similarity measure \((0\le \gamma \le 1)\).

Proof

The proof is similar to the Propagation 2 and could be easily proven. \(\square \)

Propagation 5

: We consider \(\Delta =\{\delta _i: i=1,2,...,n\}\) as a universal set. Let \({\mathbb {I}_{\text {1}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {1}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {1}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) and \({\mathbb {I}_{\text {2}}}=\{\langle \delta _i,{\mathbb {M}_{\mathbb {I}_{\text {2}}}}(\delta _i),{\mathbb {N}_{\mathbb {I}_{\text {2}}}}(\delta _i)\rangle ;\delta _i \in \Delta \}\) be two IFSs defined in \(\Delta \). If \(S^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) and \(\widetilde{S}^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) are similarity measures, then the convex combination of \(S^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) and \(\widetilde{S}^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is also a similarity measure, i.e.,

\(\overline{S}_n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})=\eta S^n({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})+(1-\eta )\widetilde{S}^{n}({\mathbb {I}_{\text {1}}},{\mathbb {I}_{\text {2}}})\) is a similarity measure \((0\le \eta \le 1)\).

Proof

This follows directly from the preceding results, since a convex combination of two similarity measures preserves boundedness, symmetry, and the remaining defining properties. \(\square \)

4 Application in clustering problem for crime linkage

A well-functioning legal system is crucial for the stability of modern society. Unfortunately, crimes like robbery and murder have been on the rise. To maintain law and order effectively, it is essential to analyze and understand these crimes thoroughly. One significant challenge arises when trying to determine whether the same perpetrator is behind multiple similar crimes. Forensic science addresses this issue through crime linkage analysis, which involves examining a series of crimes to identify potential connections to a common criminal. Typically, when solid evidence like forensic findings, DNA, or fingerprints is available from crime scenes, linking the crimes is relatively straightforward. Without such conclusive evidence, however, the process of connecting crimes becomes considerably more challenging (Muller 2000).

As the utilization of automated systems for crime tracking and criminal identification has surged, analysts have collaborated closely with law enforcement and detectives to expedite crime resolution. Linking criminal investigations is nearly straightforward when substantial evidence, such as forensics, DNA, fingerprints, or pertinent digital data, is available. In the absence of such evidence, however, conducting a criminal investigation becomes a formidable challenge. While hard evidence of this kind is essential for resolving decision-making complexities, conventional information processing systems often struggle to handle the ambiguous and uncertain aspects of real-world problems. Offering a more effective approach, Zadeh (1965) introduced the concept of Fuzzy Sets, which has been pivotal in addressing these challenges. Since Zadeh’s groundbreaking work, Fuzzy Set theory, including Intuitionistic Fuzzy Sets (IFS) Atanassov (1983) comprising MD, ND, and HD, has undergone significant advancements. These fuzzy techniques and their adaptations have proven highly effective in mitigating uncertainties stemming from insufficient data or information.

The aim of clustering problems is to categorize data into different groups based on shared characteristics, and similarity measures have played a highly effective role in this regard. A fuzzy similarity relation exhibits the reflexivity, symmetry, and max-min transitivity properties. To tackle clustering challenges involving Intuitionistic Fuzzy Sets (IFSs), Xu et al. introduced the Intuitionistic Fuzzy Set Clustering (IFSC) technique, as described in their work Xu et al. (2008). Our study uses an updated version of this approach based on the proposed similarity measure.

In this section, we propose a methodology for crime linkage and an algorithm based on it. The time complexity of the algorithm is analyzed, and a case study demonstrates the applicability of the proposed approach.

4.1 Methodology

Let us assume that there are n crimes in a collection \(\{G_1,G_2,...,G_n\}\) and that we need to identify the crimes committed by a common offender. Let the activities of the unidentified perpetrator discovered during the crime scene investigation be \(\{B_1,B_2,...,B_m\}\). The following matrix represents the decision situation:

$$\begin{aligned} R_{n\times m}= \begin{bmatrix} {\mathbb {A}_{11}} & {\mathbb {A}_{12}} & \cdots & {\mathbb {A}_{1m}}\\ {\mathbb {A}_{21}} & {\mathbb {A}_{22}} & \cdots & {\mathbb {A}_{2m}}\\ \vdots & \vdots & \ddots & \vdots \\ {\mathbb {A}_{n1}} & {\mathbb {A}_{n2}} & \cdots & {\mathbb {A}_{nm}}\\ \end{bmatrix} \end{aligned}$$
(15)

where \({\mathbb {A}_{ ij }}\) are the IFSs expressed using linguistic variables, which describe the crime \(G_i\) with respect to the severity of the actions \(B_j\). As a result, each crime \(G_i\) can be described using the action items discovered at the scene of the crime and the corresponding intuitionistic fuzzy values, such as

\(G_i=\{\langle B_j, {\mathbb {A}_{ ij }}\rangle | B_j \in B\}\) where \(i=1,2,...,n\) and \(j=1,2,...,m\).

Linguistic variables are used because they express uncertain information more naturally. In the methodology, crime-scene information is first recorded using linguistic values, which are then converted to IFS form based on Table 5. The table considers five linguistic variables: Very High (VH), High (H), Medium (M), Low (L), and Very Low (VL), each assigned a fixed intuitionistic fuzzy value.

Table 5 Linguistic variables and their corresponding IF values

Now, a clustering method is applied in finding the link between the crimes to find a common offender. So, we discuss some terms, definitions and theorems to better understand the method.

After substituting the linguistic values, we obtain the decision matrix. From it, we construct the association matrix, which is defined as follows:

Definition 4.1

: Let \(G_i\) \((i=1,2,...,n)\) be n IFSs. Then \(C=(c_{ij})_{n\times n}\) is an association matrix, where \(c_{ij}=S_P(G_i,G_j)\) is the association coefficient of \(G_i\) and \(G_j\), with the following properties:

  1. \(0\le c_{ij}\le 1\), where \(i,j=1,2,...,n\);

  2. \(c_{ij}=1\) iff \(G_i=G_j\);

  3. \(c_{ij}=c_{ji}\), where \(i,j=1,2,...,n\).

The properties are immediate, as \(S_P(G_i,G_j)\) satisfies all the properties of a similarity measure.

Next, we define the composition matrix:

Definition 4.2

: Let \(C=(c_{ij})_{n\times n}\) be an association matrix. If \(C^2=C\circ C=(\overline{c_{ij}})_{n\times n}\), then \(C^2\) is called the composition matrix of C, where \(\overline{c_{ij}}=\max _k\{\min \{c_{ik},c_{kj}\}\}\) \((i,j=1,2,...,n)\).
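The max–min composition in Definition 4.2 can be sketched in a few lines of code. The following is a minimal illustration (the function name and the sample matrix are our own, not from the paper):

```python
def maxmin_compose(a, b):
    """Max-min composition: (a o b)[i][j] = max_k min(a[i][k], b[k][j])."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]

# A small 3x3 association matrix composed with itself.
C = [[1.0, 0.8, 0.3],
     [0.8, 1.0, 0.5],
     [0.3, 0.5, 1.0]]
C2 = maxmin_compose(C, C)  # entry (1,3) becomes min(0.8, 0.5) = 0.5
```

Note that the composition never introduces values other than the original entries, which is why the boundedness property in Theorem 4.1 below is preserved.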

Now, we discuss two important theorems to understand the basics of the method.

Theorem 4.1

: Let \(C=(c_{ij})_{n\times n}\) be an association matrix. Then the composition matrix \(C^2\) is also an association matrix.

Proof

  1. Since \(C=(c_{ij})_{n\times n}\) is an association matrix, \(0\le c_{ij}\le 1\). Thus, \(0\le \overline{c_{ij}}=\max _k\{\min \{c_{ik},c_{kj}\}\}\le 1\).

  2. Since \(c_{ij}=1\) iff \(G_i=G_j\), we have \(\overline{c_{ij}}=\max _k\{\min \{c_{ik},c_{kj}\}\}=1\) iff \(c_{ik}=c_{kj}=1\) for some k, i.e., iff \(G_i=G_k=G_j\) for some k, which holds iff \(G_i=G_j\) (take \(k=j\)).

  3. Since \(c_{ij}=c_{ji}\), we get \(\overline{c_{ij}}=\max _k\{\min \{c_{ik},c_{kj}\}\}=\max _k\{\min \{c_{ki},c_{jk}\}\}=\max _k\{\min \{c_{jk},c_{ki}\}\}=\overline{c_{ji}}\).

Hence, the composition matrix \(C^2\) is also an association matrix. \(\square \)

Based on the above theorem, a more general result is given as follows:

Theorem 4.2

: Let \(C=(c_{ij})_{n\times n}\) be an association matrix. Then, for any non-negative integer l, the composition matrix \(C^{2^{l+1}}=C^{2^{l}}\circ C^{2^{l}}\) is also an association matrix.

The proof follows from the above theorem by mathematical induction.

Definition 4.3

: Let \(C=(c_{ij})_{n\times n}\) be an association matrix. If \(C^2\subseteq C\), i.e., \(c_{ij}\ge \max _k\{\min \{c_{ik},c_{kj}\}\}\) for \(i,j = 1,2,...,n\), then C is called an equivalent association matrix.

If \(C^2\subseteq C\) does not hold, we have the following theorem:

Theorem 4.3

: Let \(C=(c_{ij})_{n\times n}\) be an association matrix. Then, after finitely many compositions \(C\rightarrow C^2\rightarrow C^4 \rightarrow .... \rightarrow C^{2^{l}}\rightarrow ....\), there exists a positive integer l such that \(C^{2^{l}}=C^{2^{l+1}}\), and \(C^{2^{l}}\) is an equivalent association matrix.

The proof is straightforward and is omitted.
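Theorem 4.3 suggests a simple computational procedure: repeatedly square the matrix under max–min composition until it stops changing. A minimal sketch (function names are our own):

```python
def maxmin_compose(a, b):
    """Max-min composition of two n x n matrices."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]

def equivalent_matrix(c, max_iter=64):
    """Square C (C -> C^2 -> C^4 -> ...) until a fixed point is
    reached; the fixed point is the equivalent association matrix."""
    for _ in range(max_iter):
        c2 = maxmin_compose(c, c)
        if c2 == c:
            return c
        c = c2
    raise RuntimeError("no fixed point reached")

# Example: here a single squaring already reaches the fixed point.
C = [[1.0, 0.9, 0.2],
     [0.9, 1.0, 0.6],
     [0.2, 0.6, 1.0]]
E = equivalent_matrix(C)
```

Since each composition only selects among the existing entries, exact equality of the floating-point matrices is a safe stopping test here.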

Finally, we need the \(\alpha \)-cutting matrix, which is defined as follows:

Definition 4.4

: Let \(C=(c_{ij})_{n\times n}\) be an equivalent association matrix. Then the \(\alpha \)-cutting matrix of C is given by \(C_\alpha =(\alpha c_{ij})_{n\times n}\), where \(\alpha c_{ij}=\left\{ \begin{array}{ll} 0, & \text {if } c_{ij}<\alpha \\ 1, & \text {if } c_{ij}\ge \alpha \end{array} \right. \)    \(i,j=1,2,...,n\)

Based on the above studies, the following is an explanation of the crime linkage algorithm:

Step 1::

Create the decision matrix \(R_{n\times m}=[{\mathbb {A}_{ ij }}]_{n\times m}\), (\(i=1,2,...n; j=1,2,...,m\)), where \({\mathbb {A}_{ ij }}\) represents Intuitionistic Fuzzy Sets (IFSs) via linguistic variables. These IFSs quantify the extent to which the actions \(B_j\) are associated with the crime \(G_i\) using linguistic expressions within the IFS framework.

Step 2::

Represent the relevant linguistic values within the decision matrix \(R_{n\times m}\) according to the values provided in Table 5.

Step 3::

Compute association coefficients denoted as \(c_{ij}=S_P(G_i,G_j)\), and create an association matrix \(C=(c_{ij})_{n\times n}\).

Step 4::

Compute the composition matrix \(C\circ C=C^2=(\overline{c_{ij}})_{n\times n}\), where \(\overline{c_{ij}}=\max _k\{\min \{c_{ik},c_{kj}\}\}\), \(i,j=1,2,...,n\).

Step 5::

The association matrix C is an equivalent association matrix if it satisfies \(C^2\subseteq C\), i.e., \(c_{ij}\ge \max _k\{\min \{c_{ik},c_{kj}\}\}\) for \(i,j = 1,2,...,n\). If \(C^2\subseteq C\) does not hold, compute \(C^{2^l}\) \((l=1,2,...)\) repeatedly until an equivalent association matrix is reached.

Step 6::

Then, by selecting \(\alpha \in [0,1]\), create the \(\alpha \)-cutting matrix \(C_\alpha =(\alpha c_{ij})_{n\times n}\), where \(\alpha c_{ij}=\left\{ \begin{array}{ll} 0, & \text {if } c_{ij}<\alpha \\ 1, & \text {if } c_{ij}\ge \alpha \end{array} \right. \)   \(i,j=1,2,...,n\)

Step 7::

If the ith and jth rows of \(C_{\alpha }\) have the same components, then \(G_i\) and \(G_j\) are of the same type. This allows us to categorize all the IFSs \(G_i\) \((i=1,2,...,n)\) into clusters.
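Steps 6 and 7 can be sketched as follows; the helper names and the sample matrix are ours, assuming the equivalent association matrix has already been computed:

```python
def alpha_cut(c, alpha):
    """Step 6: alpha-cutting matrix, 1 where c_ij >= alpha, else 0."""
    return [[1 if v >= alpha else 0 for v in row] for row in c]

def clusters(c, alpha):
    """Step 7: crimes whose rows of the alpha-cut matrix coincide
    belong to the same cluster."""
    groups = {}
    for i, row in enumerate(alpha_cut(c, alpha)):
        groups.setdefault(tuple(row), []).append(i)
    return list(groups.values())

# Example equivalent association matrix for three crimes.
E = [[1.0, 0.9, 0.6],
     [0.9, 1.0, 0.6],
     [0.6, 0.6, 1.0]]
```

For this E, clusters(E, 0.8) groups crimes 0 and 1 together and leaves crime 2 on its own, while lowering \(\alpha \) to 0.5 merges everything into a single cluster.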

The flowchart of the algorithm is given in Fig. 9.

Fig. 9: Flowchart of Algorithm 1

The time complexity of each step of the algorithm is given as follows:

Step 1: Creating the Decision Matrix R::

In this step, we create a matrix with dimensions \(n\times m\), where m is the number of actions \((B_j)\) and n is the number of crimes \((G_i)\). Initializing each cell takes constant time, and there are \(n\times m\) cells to initialize, so the overall time complexity is \(O(n\times m)\).

Step 2: Encoding Linguistic Values into Matrix R::

This step involves assigning the IF values from the table to the linguistic labels in the decision matrix R. Assigning a value to a single cell is a constant-time lookup of a predetermined value, so encoding all \(n\times m\) cells takes \(O(n\times m)\) time.

Step 3: Calculating Association Coefficients and Forming Matrix C::

Here, we calculate association coefficients \(c_{ij}\) for each pair of crimes \(G_i\) and \(G_j\). Since there are n crimes, we perform \(n\times n\) calculations, resulting in a time complexity of \(O(n^2)\) (treating the number of criteria m as a constant).

Step 4: Creating the Refined Association Matrix \({\textbf {CoC}} ={\textbf {C}}^{\textbf{2}}\)::

In this step, we compose the association matrix, which is of order \(n\times n\), with itself. Each of the \(n^2\) entries of \(C^2\) requires a maximum over n minima, so the time complexity is \(O(n^3)\).

Step 5: Checking for Equivalent Association Matrix C::

This step involves checking whether the matrix C is equivalent, and it might require multiple iterations (k of them, with k assumed constant) in which higher powers of the matrix (\(C^2, C^4, C^8\), etc.) are computed. Each max–min composition costs \(O(n^3)\), so the overall time complexity of this step is \(O(n^3)\).

Step 6: Generating the \({\alpha }\)-Cutting Matrix \({\textbf{C}}_{\alpha }\)::

Creating the \(\alpha \)-cutting matrix involves iterating through each of the \(n\times n\) elements of the matrix C and performing a single comparison per element, so the time complexity is \(O(n^2)\).

Step 7: Analyzing Components of \({\textbf{C}}_{{\alpha }}\)::

In this step, we compare pairs of crimes for similarity, which involves iterating through all possible pairs of crimes \(G_i\) and \(G_j\) (\(n\times n\) comparisons), with up to n element comparisons per pair. Thus, the worst-case time complexity is \(O(n^3)\), although grouping identical rows with a hash table reduces this step to \(O(n^2)\).

The overall time complexity is dominated by the composition steps (Steps 4 and 5), giving \(O(n^3)\) in total (treating the number of iterations k as constant). The other steps have lower complexities and do not significantly impact the overall complexity.

The algorithm’s reliability and stability in obtaining an equivalent association matrix for crime linkage analysis can be firmly established through the application of several crucial theorems.

Firstly, Theorem 4.1 demonstrates that when we compute higher-order composition matrices, starting with \(C^2\), these matrices remain association matrices. This is grounded in the fact that the original association matrix C contains elements bounded between 0 and 1, and the composition operation preserves these essential boundaries.

Moreover, Theorem 4.2, supported by mathematical induction, extends this idea by confirming that for any non-negative integer l, the composition matrix \(C^{2^{l+1}}=C^{2^{l}}oC^{2^{l}}\) remains an association matrix. This establishes a powerful pattern of stability in the computation of these matrices.

Additionally, Theorem 4.3 plays a pivotal role in assessing the equivalence and stability of the association matrix. An equivalent association matrix is one where \(C^2\subseteq C\), meaning that no new associations emerge. If this condition initially fails, Theorem 4.3 ensures that, through a series of finite compositions (\(C\rightarrow C^2\rightarrow C^4\rightarrow ...\rightarrow C^{2^{l}}\rightarrow ...\)), a positive integer l eventually emerges for which \(C^{2^{l}}=C^{2^{l+1}}\). This critical point signifies convergence to a stable equivalent association matrix \(C^{2^{l}}\) that faithfully captures and represents the intricate web of crime associations.

In summary, these theorems collectively underpin the algorithm’s remarkable ability to converge to a stable and equivalent association matrix, thereby ensuring the utmost accuracy and reliability in characterizing the complex associations between crimes.

4.2 Case study on crime linkage

In the case study, we consider real data from the crime scenes of victims killed by two sets of serial killers (O’Brien 2014; Keppel 2010). To examine the crime linkage, we consider five crime scenes of the victims before the suspects were caught. The first set of offenders commits two crimes, \(G_1\) and \(G_2\), while the second commits \(G_3\), \(G_4\), and \(G_5\).

Following the investigation of crimes \(G_1\) and \(G_2\), the offender apprehended was George Waterfield Russell Jr., also known as The Charmer, an American thief and serial killer responsible for the deaths of three women in Seattle during the summer of 1990. Russell engaged in the sexual assault and murder of three women between June and August 1990. After taking the lives of his victims, he subjected their corpses to horrifying mutilation and violation, arranging them in various grotesque positions before departing from the crime scenes. As a consequence of his actions, Russell received two life imprisonment terms and is presently incarcerated at Clallam Bay Corrections Center. Our focus in the case study revolves around the crime scenes involving two victims, namely Mary Anne Pohlreich and Carol Ann Beethe.

Subsequent to the investigation of crimes \(G_3\), \(G_4\) and \(G_5\), the offenders implicated are known as the Hillside Strangler (later referred to as the Hillside Stranglers), the media label given to a pair of American serial killers who instilled fear in Los Angeles, California, from October 1977 to February 1978. The moniker is derived from the discovery of numerous victims’ bodies in the hills surrounding the city. Initially, the belief was that a singular individual was behind the killings. However, law enforcement determined, based on the positioning of the bodies, that two perpetrators were collaborating, although this information was not disclosed to the media. The culprits were later identified as cousins Kenneth Bianchi and Angelo Buono Jr. They were subsequently convicted of the abduction, sexual assault, torture, and murder of 10 women and girls, ranging in age from 12 to 28. In our case study, we examine the crime scenes related to three victims: Judith Miller, Yolanda Washington, and Kimberly Martin.

The crime scenes are described as follows:

Crime scene 1 (\(G_1\))::

The fact that the body was found in a busy passageway suggests that the murderer wanted it found quickly. The corpse was set up in a certain manner to imply sexual degradation. The victim’s valuables were left behind as they were of little value. The victim showed strangulation wounds as well as defense wounds, indicating that she had resisted being attacked. The autopsy revealed a severe impact injury to the head. After a brutal rape and sodomization, seminal fluids were found on the victim. The facts surrounding the murder suggest that an inappropriate sexual interaction was the cause. After more investigation, it was discovered that her truck had been stolen.

Crime scene 2 (\(G_2\))::

The victim’s body was found in her bedroom. The murderer purposefully made the crime scene unpleasant. The victim had been brutally beaten with an unknown blunt object. She appeared to have resisted because she had two defense wounds. The victim was not just raped; objects were also inserted in her privates. Although the bedroom was not ransacked, jewelry and cash were taken. The perpetrator entered and exited the victim’s bedroom through a sliding glass door left open; the house entrance to the victim’s bedroom was closed and secured. Investigators also recovered the killer’s hair in some underwear lying on the ground.

Crime scene 3 (\(G_3\))::

A female’s naked body was discovered on a parkway. She was bound and strangled, evidenced by the ligature marks on her neck, wrists, and ankles. She must have been slain somewhere else because the body had been dumped. There were no indications of theft. She was both sodomized and raped. There was no evidence to suggest that the victim had been dragged to her position. So it was conceivable that more than one person had helped carry her from a car. The evidence points to a planned method of committing the crime.

Crime scene 4 (\(G_4\))::

The body was found on a road that followed a mountainside. The corpse was cleaned and then disposed of. There were slender indications of the rope’s presence where it had been wrapped around her neck, wrists, and ankles, where she had been tied. The victim was raped before being killed by strangulation. There was no indication that anything had been stolen. The evidence found there suggests that the crime was planned before it was committed.

Crime scene 5 (\(G_5\))::

The victim’s naked corpse was found on a parkway. The ligature marks on her neck, wrists, and ankles proved she had been bound and strangled. The fact that the body had been dumped indicated that she had been murdered somewhere else. No evidence of stealing was present. She was raped and sodomized. As far as was known, the victim had not been dragged to her position. This made it possible for more than one person to have assisted in carrying her from a vehicle. The evidence suggests that the crime was committed using a deliberate strategy.

The following six actions, or criteria, are used for the linkage analysis:

  (a) Probability that the victim’s cash or valuables were taken: \(B_1\)

  (b) Likelihood of planned actions: \(B_2\)

  (c) Savagery in their actions: \(B_3\)

  (d) Possibility of employing their own weapon: \(B_4\)

  (e) Likelihood of the victim being killed quickly by being overpowered: \(B_5\)

  (f) Likelihood of having forensic knowledge: \(B_6\)

Now, we have the real data of the crime scenes in the case study. We can explain the crime scenes using linguistic variables.

For example the Crime scene 1 can be explained as follows:

  (a) \(B_1\): Probability that the victim’s cash or valuables were taken: The crime suggests a high probability that the victim’s cash or valuables were taken during the crime, implying that theft or robbery was likely a significant motive for the offender. So, we use the linguistic variable ’High=H’ for this characteristic.

  (b) \(B_2\): Likelihood of planned actions: The crime scene indicates a very low likelihood that the actions of the offender were planned in advance, implying that the crime may have been impulsive rather than carefully premeditated. So, we use the linguistic variable ’Very Low=VL’ for this characteristic.

  (c) \(B_3\): Savagery in their actions: The crime scene suggests that the actions of the offender were characterized by extreme savagery, involving brutal and violent behavior. So, we use the linguistic variable ’Very High=VH’ for this characteristic.

  (d) \(B_4\): Possibility of employing own weapon: The crime scene indicates a very low possibility that the offender used their own weapon during the crime, implying that the weapon may have been improvised or obtained at the scene. So, we use the linguistic variable ’Very Low=VL’ for this characteristic.

  (e) \(B_5\): Likelihood of being killed quickly by being overpowered: The crime scene implies a high likelihood that the victim was killed quickly after being overpowered by the offender, suggesting that the offender had physical dominance. So, we use the linguistic variable ’High=H’ for this characteristic.

  (f) \(B_6\): Likelihood of having forensic knowledge: The crime scene indicates a very low likelihood that the offender had significant forensic knowledge, implying that the offender may not have taken steps to conceal evidence or avoid leaving traces behind. So, we use the linguistic variable ’Very Low=VL’ for this characteristic.

These linguistic variables provide qualitative interpretations of the criteria related to each crime scene. They help in characterizing and assessing the key elements of each scene based on the specified criteria.

Then, using the algorithm based on the proposed similarity measure, we can link the crimes as follows:

The decision matrix for Step 1 is provided in Table 6, where the five offenses are summarized with respect to the six criteria in IFS form, using linguistic variables based on Table 5.

Table 6 Matrix form of five crimes w.r.t each attribute via IFS

In Step 2, the linguistic values are encoded as IF values, as given in Table 7.

Table 7 Matrix form of five crimes w.r.t each attribute after plotting the linguistic values

The third step entails creating the association matrix \(C=(c_{ij})_{5\times 5}\), where \(c_{ij}=S_P(G_i,G_j);i,j=1,2,3,4,5\) are the association coefficients. The association matrix is

$$\begin{aligned} C= \begin{bmatrix} 1.0000 & 0.9660 & 0.4315 & 0.4753 & 0.4532\\ 0.9660 & 1.0000 & 0.5278 & 0.5747 & 0.5463\\ 0.4315 & 0.5278 & 1.0000 & 0.9845 & 0.9748\\ 0.4753 & 0.5747 & 0.9845 & 1.0000 & 0.9903\\ 0.4532 & 0.5463 & 0.9748 & 0.9903 & 1.0000\\ \end{bmatrix} \end{aligned}$$

Applying the composition operation of Step 4 produces the following association matrix.

$$\begin{aligned} C^2=C\circ C= \begin{bmatrix} 1.0000 & 0.9660 & 0.5278 & 0.5747 & 0.5463\\ 0.9660 & 1.0000 & 0.5747 & 0.5747 & 0.5747\\ 0.5278 & 0.5747 & 1.0000 & 0.9845 & 0.9845\\ 0.5747 & 0.5747 & 0.9845 & 1.0000 & 0.9903\\ 0.5463 & 0.5747 & 0.9845 & 0.9903 & 1.0000\\ \end{bmatrix} \end{aligned}$$

Since \(C^2\subseteq C\) does not hold, by Step 5,

$$\begin{aligned} C^4&=C^2\circ C^2= \begin{bmatrix} 1.0000 & 0.9660 & 0.5747 & 0.5747 & 0.5747\\ 0.9660 & 1.0000 & 0.5747 & 0.5747 & 0.5747\\ 0.5747 & 0.5747 & 1.0000 & 0.9845 & 0.9845\\ 0.5747 & 0.5747 & 0.9845 & 1.0000 & 0.9903\\ 0.5747 & 0.5747 & 0.9845 & 0.9903 & 1.0000\\ \end{bmatrix}\\ C^8&=C^4\circ C^4= \begin{bmatrix} 1.0000 & 0.9660 & 0.5747 & 0.5747 & 0.5747\\ 0.9660 & 1.0000 & 0.5747 & 0.5747 & 0.5747\\ 0.5747 & 0.5747 & 1.0000 & 0.9845 & 0.9845\\ 0.5747 & 0.5747 & 0.9845 & 1.0000 & 0.9903\\ 0.5747 & 0.5747 & 0.9845 & 0.9903 & 1.0000\\ \end{bmatrix} =C^4 \end{aligned}$$

Thus, \(C^8=C^4\). Therefore, \(C^4\) is an equivalent association matrix.
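This fixed-point computation can be reproduced numerically. The sketch below (function name ours; we take \(c_{35}=c_{53}=0.9748\), as the symmetry of an association matrix requires) confirms that \(C^8=C^4\) and recovers the clusters \(\{G_1,G_2\}\) and \(\{G_3,G_4,G_5\}\) at, for example, \(\alpha =0.9\):

```python
def compose(a, b):
    """Max-min composition of two square matrices."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]

C = [[1.0000, 0.9660, 0.4315, 0.4753, 0.4532],
     [0.9660, 1.0000, 0.5278, 0.5747, 0.5463],
     [0.4315, 0.5278, 1.0000, 0.9845, 0.9748],
     [0.4753, 0.5747, 0.9845, 1.0000, 0.9903],
     [0.4532, 0.5463, 0.9748, 0.9903, 1.0000]]

C2 = compose(C, C)
C4 = compose(C2, C2)
C8 = compose(C4, C4)
assert C8 == C4  # C^4 is the equivalent association matrix

# alpha-cut at 0.9 and grouping of identical rows (Steps 6 and 7).
cut = [[1 if v >= 0.9 else 0 for v in row] for row in C4]
groups = {}
for i, row in enumerate(cut):
    groups.setdefault(tuple(row), []).append(i)
# groups.values() -> [[0, 1], [2, 3, 4]], i.e. {G1, G2} and {G3, G4, G5}
```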

Steps 6 and 7 are then used to choose different \(\alpha \)-level sets and characterize the clusters for the IFSs. Following is a description of the clusters.

  (i) For \(0.0000<\alpha \le 0.5747\), all the crimes belong to a single cluster: \(\{G_1,G_2,G_3,G_4,G_5\}\)

  (ii) For \(0.5747< \alpha \le 0.9660\), the crimes form two clusters: \(\{G_1,G_2\},\{G_3,G_4,G_5\}\)

  (iii) For \(0.9660<\alpha \le 0.9845\), the crimes form three clusters: \(\{G_1\},\{G_2\},\{G_3,G_4,G_5\}\)

  (iv) For \(0.9845<\alpha \le 0.9903\), the crimes form four clusters: \(\{G_1\},\{G_2\},\{G_3\},\{G_4,G_5\}\)

  (v) For \(0.9903<\alpha \le 1.0000\), the crimes form five clusters: \(\{G_1\},\{G_2\},\{G_3\},\{G_4\},\{G_5\}\)

It is clear that the proposed algorithm correctly clusters the offenses based solely on the crime-scene details recorded before the killers were caught: the first set of offenders committed \(G_1\) and \(G_2\), while the second committed \(G_3\), \(G_4\), and \(G_5\), exactly as in the real case data considered in the case study.

5 Application in psychological profiling

Psychological profiling involves a systematic and informed effort to uncover and assess particular details about a suspect. It entails the examination of behavioral patterns, trends, and inclinations of an unknown offender, utilizing evidence gathered from crime scenes to create a comprehensive psychological profile. Following the methods employed by FBI profilers, the process of generating a profile serves as a means to identify essential personality traits and behavioral characteristics of an unidentified perpetrator through a thorough analysis of crime scenes Douglas and Burgess (1986).

There are basically four steps in building a profile (Douglas and Burgess 1986):

  (a) Input profiling

  (b) Decision-making process model

  (c) Analysis of crime

  (d) Assessment of the criminal profile

The profiler initiates the process by gathering and meticulously analyzing all pertinent information, facts, context, and the initial police report related to the crime scene. For example, in a homicide case where the victim was fatally shot, the investigator assembles all available evidence and examines details such as the cause of death, the type of weapon employed, the nature of injuries sustained, and so on. These pieces of information, essentially serving as the inputs for profiling, are then organized into meaningful patterns during the second step to enable accurate and effective interpretation. The third step involves reconstructing the sequence or patterns of events, encompassing an understanding of how the perpetrators planned, plotted, and ultimately executed the crime based on the data collected in the preceding stage. The fourth and final stage of creating a criminal profile entails the assessment of the potential suspect’s personality traits. This evaluation encompasses various factors, including physical attributes, routines, convictions, and behavioral patterns leading up to the commission of the offense.

Muller (2000) extensively explored the differentiation between organized and disorganized criminal behavior. This analysis took into consideration six specific behaviors associated with both organized and disorganized conduct, which are detailed in Table 8. As depicted in the table, a highly organized criminal typically engages in meticulous planning when committing their crimes, whereas a highly disorganized criminal lacks such planning and exhibits a more erratic crime pattern. Forensic expertise indicates that organized criminals tend to possess higher intelligence than their disorganized counterparts. A highly disorganized criminal is more likely to select a victim randomly, swiftly commit the crime using resources available at the scene, and leave both the body and the murder weapon behind. In contrast, a highly organized criminal takes measures to minimize the evidence left at the crime scene, reducing the likelihood of quick apprehension.

Table 8 Activities associated with organized and disorganized behavior

Fuzzy methodologies, integrating fuzzy logic and fuzzy set theory, prove invaluable in handling inherently uncertain or imprecise evidence in criminal investigations, allowing investigators to consider degrees of membership and possibility. This approach, exemplified by the study of Goala (2019), significantly advances the field of criminal investigation; our study uses an updated version of it based on the proposed similarity measure. In cases of serious criminal acts with a dearth of reliable evidence, profiling techniques provide crucial insights into the personality traits of unidentified perpetrators, aiding in suspect identification and the investigative process. In the realm of crime scene analysis, where evidence is often characterized by inherent uncertainty and linguistic terms, intuitionistic fuzzy linguistic term sets offer a fitting means of description. Additionally, contemporary investigative psychology theories enable the creation of optimal crime conditions for specific types of crimes, allowing investigators to deduce offender behavior by comparing crime scene actions with representations in Intuitionistic Fuzzy Sets (IFSs). These techniques facilitate a nuanced understanding of complex criminal scenarios, enhancing the efficiency of criminal investigations. This is the primary justification for the use of similarity measures based on IFSs in the portrayal of crimes.

In this section, we propose a methodology for psychological profiling and an algorithm based on it. The time complexity of the algorithm is analyzed, and a case study demonstrates the applicability of the proposed approach.

5.1 Methodology

In light of the information obtained from the crime scenes, the profile of the criminal is generated. Here, we determine whether the criminal exhibited organized or disorganized behavior. Muller (2000) analyzed in detail the difference between a criminal’s organized and disorganized behavior, considering six activities associated with each, which are given in Table 8.

For our study to determine the behavior of the offender in the case instances, the following six behaviors based on Table 8 are used as criteria:

  (a) Planning: \(a_1\)

  (b) Use of a personal weapon: \(a_2\)

  (c) Targeted victim: \(a_3\)

  (d) Manipulation to exert control: \(a_4\)

  (e) Body transport: \(a_5\)

  (f) Knowledge of forensics: \(a_6\)

The weighted vector of the activities is taken to be \(w= (w_1,w_2,....,w_6)^T\), where \(0\le w_i\le 1\) and \(\sum \limits _{i=1}^6 w_i=1\). Each activity is considered equally important, i.e., \(w_i=1/6\); \(i=1,2,...,6\).

The information about the activities obtained from the crime scene is uncertain in nature, so it is better expressed in IFS form using linguistic variables, which also make it easier to elicit values from the case study. The corresponding IF values of the linguistic labels can then be plotted using Table 9.

Table 9 Linguistic variables and their corresponding IF values

If we consider a case E for study, the activities are represented as an IFS via linguistic labels (LL):

$$\begin{aligned} E=\{(a_1,LL),(a_2,LL),(a_3,LL),(a_4,LL),(a_5,LL),(a_6,LL)\} \end{aligned}$$

Then, the corresponding IF values of the linguistic labels are assigned based on Table 9.

Further, the ideal organized and disorganized activities can be represented using Table 8, as suggested by Muller (2000):

$$\begin{aligned} O= & {} \{(a_1,VH),(a_2,VH),(a_3,VH),(a_4,VH),(a_5,VH),(a_6,VH)\}\\ D= & {} \{(a_1,VL),(a_2,VL),(a_3,VL),(a_4,VL),(a_5,VL),(a_6,VL)\} \end{aligned}$$

After assigning the corresponding IF values of the linguistic labels based on Table 9, we get

$$\begin{aligned} O= & {} \{(0.9,0.1),(0.9,0.1),(0.9,0.1),(0.9,0.1),(0.9,0.1),(0.9,0.1)\} \\ D= & {} \{(0.1,0.8),(0.1,0.8),(0.1,0.8),(0.1,0.8),(0.1,0.8),(0.1,0.8)\} \end{aligned}$$
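The construction of these profiles from linguistic labels can be sketched in Python. Note that only the labels VH, L, and VL are given explicit IF values in this section, so the remaining labels of Table 9 are omitted below; the sketch assumes only what the text states.

```python
# IF (membership, non-membership) pairs for the linguistic labels that
# appear explicitly in this section; the other labels of Table 9 are
# not reproduced here.
IF_VALUES = {
    "VH": (0.9, 0.1),  # Very High
    "L":  (0.3, 0.6),  # Low
    "VL": (0.1, 0.8),  # Very Low
}

def profile(labels):
    """Map a list of linguistic labels (one per activity a1..a6)
    to an IFS profile of (membership, non-membership) pairs."""
    return [IF_VALUES[lab] for lab in labels]

# Ideal organized (all VH) and ideal disorganized (all VL) profiles:
O = profile(["VH"] * 6)
D = profile(["VL"] * 6)
print(O[0], D[0])  # (0.9, 0.1) (0.1, 0.8)
```

Any case profile E is built the same way, one label per activity \(a_1,\ldots ,a_6\).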

Then we compute the similarity measures \(S_P(E,O)\) and \(S_P(E,D)\), which quantify the similarity between the pairs of IFSs (E, O) and (E, D), respectively.

Then, we find the relative similarities

$$\begin{aligned} S_R(O)=\dfrac{S_P(E,O)}{S_P(E,O)+S_P(E,D)}\hbox { and }S_R(D)=\dfrac{S_P(E,D)}{S_P(E,O)+S_P(E,D)} \end{aligned}$$

If, \(S_R(O)>S_R(D)\), then the criminal has organized behavior.

If, \(S_R(D)>S_R(O)\), then the criminal has disorganized behavior.

The following is an explanation of the algorithm:

Step 1: Define the criteria for assessing the offender's behavior.

Step 2: Construct the offender's profile (E) using linguistic terms.

Step 3: Assign the IF values of the linguistic labels to the offender's profile (E), the ideal organized profile (O), and the ideal disorganized profile (D).

Step 4: Calculate the similarity measure between the offender's profile (E) and the ideal organized profile (O) as \(S_P(E, O)\), and between E and the ideal disorganized profile (D) as \(S_P(E, D)\).

Step 5: Calculate the relative similarities:

$$\begin{aligned} S_R(O)=\dfrac{S_P(E,O)}{S_P(E,O)+S_P(E,D)}\hbox { and }S_R(D)=\dfrac{S_P(E,D)}{S_P(E,O)+S_P(E,D)} \end{aligned}$$

Step 6: Classify the offender:

If \(S_R(O)>S_R(D)\), classify the offender as having organized behavior.

If \(S_R(D)>S_R(O)\), classify the offender as having disorganized behavior.
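The steps above can be sketched as follows. The proposed measure \(S_P\) is defined earlier in the paper and is not restated in this section, so a simple normalized Hamming similarity stands in for it here purely to illustrate the control flow of Steps 4 to 6; it is not the paper's measure.

```python
def hamming_similarity(A, B):
    """Stand-in similarity between two IFS profiles, each a list of
    (membership, non-membership) pairs; NOT the paper's measure S_P."""
    total = sum(abs(ma - mb) + abs(na - nb)
                for (ma, na), (mb, nb) in zip(A, B))
    return 1 - total / (2 * len(A))

def classify(E, O, D, sim=hamming_similarity):
    """Steps 4-6: similarity measures, relative similarities, decision."""
    s_o, s_d = sim(E, O), sim(E, D)                       # Step 4
    r_o = s_o / (s_o + s_d)                               # Step 5
    r_d = s_d / (s_o + s_d)
    label = "organized" if r_o > r_d else "disorganized"  # Step 6
    return label, r_o, r_d

# A profile close to the ideal organized one is classified as organized:
E = [(0.3, 0.6)] + [(0.9, 0.1)] * 5
print(classify(E, [(0.9, 0.1)] * 6, [(0.1, 0.8)] * 6)[0])  # organized
```

Any IFS similarity measure with the same signature, including the paper's \(S_P\), can be passed in via the `sim` parameter.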

The flowchart is given in Fig. 10.

Fig. 10 Flowchart of Algorithm 2

The time complexity of each step of the algorithm is given as follows:

Step 1 (Define the criteria for assessing the offender's behavior): This step involves no significant computational operations; defining the criteria is a one-time setup. It can therefore be considered O(1), i.e., constant time.

Step 2 (Construct the offender's profile (E) using linguistic terms): Constructing the profile involves assigning linguistic values to specific attributes, which amounts to simple assignments: O(1).

Step 3 (Assign the IF values to profiles E, O, and D): As in Step 2, this step only assigns linguistic values to profiles and involves no complex computation: O(1).

Step 4 (Calculate the similarity measures \(S_P(E,O)\) and \(S_P(E,D)\)): If each profile contains n linguistic terms or attributes, calculating a similarity measure requires a computation per attribute, so the time complexity is O(n).

Step 5 (Calculate the relative similarities \(S_R(O)\) and \(S_R(D)\)): This involves only basic arithmetic operations such as division and does not depend on the size of the profiles: O(1).

Step 6 (Classify the offender based on relative similarities): Classification is a single comparison and does not depend on the size of the profiles: O(1).

Overall, the time complexity of the algorithm is dominated by Step 4, the calculation of the similarity measures, which is O(n) in the number of linguistic terms or attributes n. Since the remaining steps run in constant time, the algorithm as a whole runs in O(n).

5.2 A case study on psychological profiling

Supervisory Special Agents (SSAs) from the Child Abduction and Serial Murder Investigative Resources Center (CASMIRC), established under the National Center for the Analysis of Violent Crime (NCAVC), conducted a comprehensive study of seven criminals. The creation of CASMIRC in the early 2000s was prompted by a Congressional mandate, specifically the Protection of Children From Sexual Predators Act of 1998. Beasley (2004) meticulously examined seven American serial killers, aiming to consolidate and analyze information about them. His research forms part of an ongoing project on the commonalities and distinctions among these criminals, thereby enhancing our collective understanding of serial murder dynamics. The study delves into the perpetrators' backgrounds and explores their unique perspectives on themselves and the world around them.

In our case study, we examine the activities of the sixth offender, who manifested a lack of self-control, potentially indicative of impulsive self-preservation, a characteristic often associated with disorganized killers. The following outline summarizes the criminal behavior exhibited by the sixth offender.

The perpetrator was a Black male who had a very troubled upbringing. He had an IQ of 68 and a speech impairment, and was classified as mentally deficient. He had served time in state prison for burglary. Over the course of 18 months, when he was 33, he killed two men (one Black and one White) and three women (two White and one Hispanic). Although his low IQ was noted in prison records, he surprisingly exhibited, in some limited respects, a striking level of criminal knowledge. For instance, he used the same type of weapon for all five crimes but changed the specific weapon after each murder to prevent ballistic similarities from being used to connect the cases. His criminal activity displays a variety of motivations, including retaliation, profit, and sexual assault. His victims, aged 38 to 87, appear to have been chosen randomly. It appears that, rather than genuinely intending each murder, he killed only as a result of circumstances that developed while he was committing robberies, rapes, and break-ins. All of the victims died of gunshot wounds, and he neglected to move the victims' bodies after the murders. However, he was careful enough to conceal his murder weapons after the killings in the attic of his father's house.

We now have real data outlining the offender's criminal activity in the case study, and we can express the criteria using linguistic variables.

  (a) \(a_1\): Planning: The offender exhibited little planning in his crimes, so we use the linguistic variable 'Low = L' for this criterion.

  (b) \(a_2\): Use of a personal weapon: The offender used the same type of weapon for all five crimes, changing the specific weapon after each murder, and all of the victims died of gunshot wounds, so we use 'Very High = VH'.

  (c) \(a_3\): Targeted victim: Rather than genuinely intending each murder, the offender killed only as a result of circumstances that developed while he was committing robberies, rapes, and break-ins, and his victims, aged 38 to 87, appear to have been chosen randomly, so we use 'Very Low = VL'.

  (d) \(a_4\): Manipulation to exert control: The offender exhibited a lack of self-control, more indicative of impulsive self-preservation, so we use 'Very Low = VL'.

  (e) \(a_5\): Body transport: The offender neglected to move the victims' bodies after the murders, so we use 'Very Low = VL'.

  (f) \(a_6\): Knowledge of forensics: The offender changed his weapon after each murder to prevent ballistic similarities from being used to connect the cases, and was careful enough to conceal his murder weapons in the attic of his father's house, so we use 'Very High = VH'.

These linguistic variables provide qualitative interpretations of the criteria related to criminal behavior. They help in characterizing and assessing the key elements of criminal behavior based on the specified criteria.

Based on this outline, the activities are represented as an IFS via linguistic labels:

$$\begin{aligned} E=\{(a_1,L),(a_2,VH),(a_3,VL),(a_4,VL),(a_5,VL),(a_6,VH)\} \end{aligned}$$

Then, the corresponding IF values of the linguistic labels are assigned as

$$\begin{aligned} E=\{(0.3,0.6),(0.9,0.1),(0.1,0.8),(0.1,0.8),(0.1,0.8),(0.9,0.1)\} \end{aligned}$$

Further, the ideal organized and disorganized activities could be represented by

$$\begin{aligned} O= & {} \{(0.9,0.1),(0.9,0.1),(0.9,0.1),(0.9,0.1),(0.9,0.1),(0.9,0.1)\}\\ D= & {} \{(0.1,0.8),(0.1,0.8),(0.1,0.8),(0.1,0.8),(0.1,0.8),(0.1,0.8)\} \end{aligned}$$

Now, Relative similarity, \(S_R(O)=\dfrac{S_P(E,O)}{S_P(E,O)+S_P(E,D)}=0.4119\)

And, \(S_R(D)=\dfrac{S_P(E,D)}{S_P(E,O)+S_P(E,D)}=0.5881\)

i.e., the ranking on the basis of relative similarity is \(D>O\).
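As a check of the decision logic (not of the exact numbers), this computation can be reproduced with a stand-in measure. Since the proposed \(S_P\) is not restated in this section, the sketch below uses a normalized Hamming similarity in its place; the relative similarities it yields (about 0.427 and 0.573) differ from the reported 0.4119 and 0.5881, but the qualitative ranking \(D>O\) agrees.

```python
# Case profile E and ideal profiles O, D, as IF (membership,
# non-membership) pairs read off from the text above.
E = [(0.3, 0.6), (0.9, 0.1), (0.1, 0.8), (0.1, 0.8), (0.1, 0.8), (0.9, 0.1)]
O = [(0.9, 0.1)] * 6
D = [(0.1, 0.8)] * 6

def sim(A, B):
    # Stand-in normalized Hamming similarity (NOT the paper's S_P).
    return 1 - sum(abs(ma - mb) + abs(na - nb)
                   for (ma, na), (mb, nb) in zip(A, B)) / (2 * len(A))

s_o, s_d = sim(E, O), sim(E, D)
r_o, r_d = s_o / (s_o + s_d), s_d / (s_o + s_d)
print(round(r_o, 4), round(r_d, 4))  # 0.4267 0.5733 -> ranking D > O
```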

Hence, based on our proposed approach, it can be concluded from the outline of the offender's criminal activity alone that the criminal had disorganized behavior. This result is validated by the same conclusion reached by CASMIRC and Beasley about the offender after thorough investigation.

6 Discussion and comparative analysis

In this section, we discuss the results obtained from the above findings. First, we present a further analysis showing the advantages of our proposed measure. We then compare our proposed measure with several existing measures within the crime linkage and psychological profiling methodologies, showing the applicability of both. Finally, we carry out a sensitivity analysis of the crime linkage and psychological profiling methodologies.

6.1 Analysis on the proposed similarity measure

From the three examples considered in Sect. 3.3, we can see the limitations of the existing measures. The similarity measure \(S_{CD}^2\) gives the correct result in Examples 1 and 2 but an incorrect result in Example 3. The measure \(S_{HK}\) gives the correct result in Examples 1 and 3 but an incorrect result in Example 2. Measures such as \(S_{GR}\) and \(S_{L}\) give the correct result in Example 3 but incorrect results in Examples 1 and 2. Measures such as \(S_{HW}, S_{Ng1}, S_{Ng2}, S_{MP}, S_{CD}^1, S_{CD}^3, S_{G1}, S_{G2}, S_{G3}, S_{G4}\) give the correct result in Example 1 but incorrect results in Examples 2 and 3. Measures such as \(S_C, S_Y, S_{GK}, S_{JJ}, S_{DS}, S_{GR1}, S_{GR2}, S_{GR3}, S_{GR4}\) give incorrect results in all three examples.

In this section, we delve deeper into the analysis of our proposed measure against several existing measures, further solidifying its status as a superior alternative. The comprehensive results, covering the effectiveness of each measure in capturing similarity and dissimilarity, are summarized in the table below.

Table 10 Evaluation of similarity measures

Recent analyses have verified the reliability and effectiveness of proposed similarity measures, so in this segment we compare the proposed similarity measure with existing ones. The data set is displayed in Table 10, with six non-similar pairs of IFSs used as instructive examples. Table 10 presents our proposed measure along with the results of various other measures, and several drawbacks of the existing measures can be seen there: \(S_C\) produces a unit result for Profile 1, and \(S_Y\) produces a unit result for Profiles 1 and 4.

When only the MD and ND are considered, the IFSs of Profile 1 appear more similar than those of Profile 2, which are complements of one another. However, when the HD is also considered, it is clear that the IFSs of Profile 2 are more similar than those of Profile 1. In the case of \(S_{HK}\) and \(S_{MP}\), the similarity values are equal, whereas for \(S_{GK},S_{HW},S_{JJ},S_{Ng1},S_{Ng2},S_{DS},S_{GR},S_L,S_{CD}^1,S_{CD}^2,S_{CD}^3,S_{G1},S_{G2},S_{G3},S_{G4},S_{GR1},\) \(S_{GR2},S_{GR3},S_{GR4}\) the similarity values for Profile 2 are less than those for Profile 1, and \(S_C\) and \(S_Y\) produce a unit result for Profile 1. Nevertheless, with our proposed measure, the similarity values for Profile 2 are greater than those for Profile 1.

Again, the IFSs of Profiles 3 and 4 yield an unreasonable ordering for some similarity measures. For Profile 3, \(|{\mathbb {M}}_A-{\mathbb {M}}_B|+|{\mathbb {N}}_A-{\mathbb {N}}_B|+|{\mathbb {O}}_A-{\mathbb {O}}_B|=0.3\), whereas for Profile 4 the corresponding sum is 0.6. It is therefore logical to regard the IFSs of Profile 3 as more similar than those of Profile 4. In the case of \(S_{C},S_{HK},S_{JJ}\), the similarity values are equal, whereas for \(S_{CD}^1,S_{CD}^2,S_{G5},S_{G6},S_{GR1},S_{GR2},S_{GR3},S_{GR4}\) the similarity values for Profile 3 are less than those for Profile 4, and \(S_Y\) produces a unit result for Profile 4. With our proposed measure, however, the similarity values for Profile 3 are greater than those for Profile 4.

Again, the IFSs of Profiles 5 and 6 yield the same similarity values for some measures: \(S_C,S_{HK},S_{HW},S_{JJ},S_{Ng1},S_{Ng2},S_{MP},S_{DS},S_{GR},S_L,S_{CD}^1,S_{CD}^2,S_{CD}^3,S_{G3},S_{G4},S_{GR1},\) \(S_{GR2},S_{GR3},S_{GR4}\) all give identical values, implying that these measures fail to distinguish between positive and negative differences.

Hence, our proposed measure outperforms the other similarity measures and can serve as a better alternative to the existing ones.

6.2 Comparative analysis of similarity measures for crime linkage and psychological profiling

In this section, we present a comparative analysis of our proposed measure and several existing measures within the crime linkage and psychological profiling methodologies, showing the applicability of both.

First, we consider the crime linkage methodology with respect to different measures, using the case study in Sect. 4.2 for the comparison. In the analysis, we substitute the most recent existing measures into our proposed methodology. The comprehensive results are summarized in the table below, showing the effectiveness of the proposed methodology.

Table 11 Comparison of similarity measures for crime linkage in case study

In this study, we considered the recent measures to check the validity of the methodology. In Table 11, we can see that the existing measures give the same correct clustering result as our proposed measure. These results verify the reliability and effectiveness of the proposed methodology.

Next, we present a comparative analysis of our proposed measure and several of the most recent existing measures within the psychological profiling methodology, showing its applicability. We consider the case study in Sect. 5.2 for the comparison, substituting the different measures into our proposed methodology. The comprehensive results are summarized in the table below, showing the effectiveness of the proposed methodology.

Table 12 Comparison of similarity measures for psychological profiling in Case study

In this study, we considered the recent measures to check the validity of the methodology. In Table 12, we can see that the existing measures, along with our proposed measure, correctly characterize the criminal as a disorganized offender. These results verify the reliability and effectiveness of the proposed methodology.

6.3 Sensitivity analysis for different values of \({{\varvec{\lambda }}}\)

This section presents a sensitivity analysis over various values of \(\lambda \) for both crime linkage and psychological profiling.

We first consider the case study of five crime scenes of two serial killers (O'Brien 2014; Keppel 2010) for crime linkage. The first killer committed two crimes, \(G_1\) and \(G_2\), while the second committed \(G_3\), \(G_4\), and \(G_5\). Our proposed methodology recovers this grouping correctly. The sensitivity analysis for various values of \(\lambda \) is shown in Table 13.

Table 13 Sensitivity analysis for different values of \(\lambda \)

As shown in Table 13, for \(\lambda =1,2\) the offenses can be grouped into 1, 2, 3, 4, or 5 clusters for various values of \(\alpha \); up to 5 clusters can be identified for \(\lambda \le 307\). For \(\lambda =308\), the crimes can be clustered into only 1, 2, 3, or 4 clusters, and no value of \(\alpha \) separates all 5 crimes; this behavior persists up to \(\lambda \le 512\). From \(\lambda =513\), at most 3 clusters can be formed, which holds up to \(\lambda \le 1116\). Similarly, at most 2 clusters can be formed for \(\lambda \ge 1117\), which holds up to \(\lambda \le 38135\). Finally, for \(\lambda \ge 38136\), only a single cluster can be identified.

The choice of the \(\lambda \) parameter in the proposed similarity measure significantly affects the clustering results, as the sensitivity analysis demonstrates. \(\lambda \) should therefore be chosen according to the desired level of granularity in clustering: lower values of \(\lambda \) provide finer distinctions with more clusters, while higher values lead to more consolidated clusters. Investigators should select \(\lambda \) according to the specific requirements of their crime linkage task, weighing the trade-off between granularity and consolidation in the clustering results.
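The \(\alpha \)-dependent grouping described above amounts to thresholding a pairwise similarity matrix and taking the resulting connected groups. The sketch below illustrates this with a hypothetical similarity matrix for \(G_1,\ldots ,G_5\); the actual values depend on the proposed measure, \(\lambda \), and the case data, none of which are reproduced here.

```python
def alpha_cut_clusters(sim, alpha):
    """Group items whose pairwise similarity meets the threshold alpha.

    sim: symmetric n x n similarity matrix (list of lists).
    Returns clusters (sorted lists of indices) as the connected
    components of the alpha-cut graph.
    """
    n = len(sim)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        while stack:
            i = stack.pop()
            if i in seen:
                continue
            seen.add(i)
            comp.append(i)
            stack.extend(j for j in range(n)
                         if j not in seen and sim[i][j] >= alpha)
        clusters.append(sorted(comp))
    return clusters

# Hypothetical similarity matrix for five crime scenes G1..G5:
# G1, G2 resemble each other; G3, G4, G5 form a second group.
S = [[1.00, 0.90, 0.40, 0.35, 0.38],
     [0.90, 1.00, 0.42, 0.37, 0.36],
     [0.40, 0.42, 1.00, 0.88, 0.85],
     [0.35, 0.37, 0.88, 1.00, 0.91],
     [0.38, 0.36, 0.85, 0.91, 1.00]]

print(alpha_cut_clusters(S, 0.80))  # [[0, 1], [2, 3, 4]]
print(alpha_cut_clusters(S, 0.30))  # [[0, 1, 2, 3, 4]]
```

Raising \(\alpha \) splits the crimes into finer clusters, while lowering it merges them, mirroring the granularity trade-off discussed above.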

We now consider the case study of the sixth offender's criminal activity (Beasley 2004) for psychological profiling. In this case study, we found that the criminal had disorganized behavior, which is validated by the same conclusion reached by CASMIRC and Beasley about the offender.

A sensitivity analysis for different values of \(\lambda \) is shown in Table 14. As \(\lambda \) increases, the difference between \(S_R(O)\) and \(S_R(D)\) decreases, and the two finally become equal at \(\lambda =7023\). Hence, for \(\lambda \ge 7023\), \(S_R(O)=S_R(D)\), and the measure can no longer distinguish an organized from a disorganized offender.

Table 14 Sensitivity analysis for different values of \(\lambda \)

The choice of the \(\lambda \) parameter in the proposed similarity measure also significantly affects the profiling results, as the sensitivity analysis demonstrates. The choice of \(\lambda \) strongly influences the distinction between the offender and crime scene representations: as \(\lambda \) increases, this distinction diminishes, ultimately vanishing at a specific value. This observation suggests that there may be an optimal or stable parameter setting for profiling tasks, and further investigation is needed to determine the practical implications and generality of this finding across different scenarios and data sets.

Selecting the \(\lambda \) parameter in the proposed similarity measure is thus a nuanced process, informed by the sensitivity analyses conducted for crime linkage and psychological profiling. In summary, \(\lambda \) should be chosen thoughtfully, considering the objectives and trade-offs of each task, to achieve the desired clustering or profiling outcomes. Further exploration is nevertheless warranted to fully grasp the practical implications and applicability of these findings across various scenarios and data sets.

7 Conclusion

This study addresses the pressing challenge of expediting criminal investigations in the face of pervasive criminal activities, where the scarcity of reliable evidence often hampers law enforcement agencies. Leveraging the tools of crime linkage and psychological profiling, we introduce a novel generalized similarity measure rooted in the Intuitionistic Fuzzy Sets (IFS) framework. Positioned to surpass existing methodologies in precision and practicality, our proposed similarity measure offers heightened accuracy and applicability. Beyond theoretical constructs, our research extends the reach of this innovative measure into crime linkage clustering and psychological profiling, contributing to the advancement of efficiency and depth of insight in the pursuit of justice.

Examining the advantages and limitations of our work, the strengths lie in the development of a robust similarity measure and its practical applications in crime linkage and psychological profiling. These applications showcase the measure’s potential in real-world scenarios. However, it is essential to recognize the study’s limitations. While promising, the potential can only be fully realized through ongoing research, addressing identified limitations, and exploring nuances in criteria or attribute weights. Expert input should be sought for a holistic approach, and the development of new measures and methodologies remains crucial for effective decision-making in criminal investigation.

The proposed measure does have a minor flaw: for IFSs of the form \(\{\langle x, \mu _i, 0 \rangle \}_{i=1}^{\infty }\), the measure yields the identical value zero. It should be noted, however, that the uncertain parameters in criminal investigations (which do not always take the form \(\{\langle x, \mu _i, 0 \rangle \}_{i=1}^{\infty }\)) often lack precision, as the degree of a criminal's involvement is not always definitively known; our approach therefore considers both membership and non-membership, reflecting this uncertainty. This flaw will be addressed in our future work, and we are committed to improving the methodology to account for such uncertainties more effectively.

Looking ahead, future studies must focus on refining and expanding upon these findings. Diligent attention should be given to the identified limitations, and nuanced exploration of varying weights assigned to different criteria is warranted. A holistic approach, incorporating expert insights, is crucial. Ongoing research efforts should aim to develop new measures and methodologies, ensuring a continual evolution of tools that empower law enforcement agencies with superior capabilities for crime prediction and offender profiling.