1 Introduction

Clinical practice is a complex engineering system aimed at curing, alleviating, and preventing diseases through medical interventions. Clinical effectiveness serves as the fundamental value and driving force behind clinical practice. Evaluating the clinical effectiveness of real-world chronic diseases is particularly complex, as this evaluation requires comprehensive measurement and analysis at each medical visit, reassessment of the overall patient condition, disease progression diagnosis, and determination of corresponding treatment plans (as illustrated in Fig. 1). As time passes and medical visits progress, handling and analysing the information found in electronic health records (EHRs), which contain disease-related information such as physiological indicators (laboratory tests, imaging), symptoms, quality of life measures (physical functioning, mental health, social interaction), and societal value (work capacity, family contribution, social labour), become increasingly challenging.

Fig. 1 The decision-making process of the whole course cycle of clinical treatment

Evaluating clinical effectiveness for real-world chronic diseases involves modelling discrete, stochastic, and multisource heterogeneous data from EHRs, where information granulation, classification, indicators, and dynamic changes in treatment plans at each stage exhibit nonadditive phenomena, also known as nonadditivity. A series of steps is required to construct a method that can evaluate the long-term treatment effectiveness of chronic diseases. However, the traditional statistics-based clinical research paradigm suffers from information distortion, or leaves information unprocessed, when handling mixed information with temporal correlation. Therefore, in this paper, the long-term clinical efficacy evaluation of chronic diseases is treated as an uncertain decision-making problem with multigranular temporal correlation nonadditive attribute information. First, for the long-term diagnosis and treatment information of chronic diseases in practical applications, a decision-making model of multigranular temporal correlation nonadditive attribute information is constructed. Second, the real-world temporal correlation hybrid attribute information is processed into evaluation information that can be used in the traditional medical research paradigm by describing the decision-making process of diagnosing and treating chronic diseases in practical applications; that is, the information is classified into two control groups: information with significant efficacy and information without efficacy. Finally, the traditional medical research paradigm is used to conduct an empirical study on these two control groups. This paper thus provides preliminary research that verifies the feasibility of a data-driven clinical decision paradigm in the clinical decision field by combining it with the traditional medical research paradigm.

Granular computing (Yao 2009) is an effective paradigm for handling multigranularity and temporal correlation attribute information, representing a new computational approach in the field of intelligent information processing. Granular computing has been widely used in risk assessment (Liang et al. 2021), multicriteria group decision-making (Pedrycz and Song 2011; Morente-Molinera et al. 2020), and attribute selection (Zhang et al. 2022). Information granulation and computing with granules are two fundamental issues in granular computing research. Currently, three granulation strategies have been developed based on binary relations, clustering, and functional approximation. The binary relation-based granulation approach has gained attention from researchers worldwide due to its capability for effectively granulating complex data from the perspective of data features, given diverse data collection methods and rapid data growth. The equivalence relation (Pawlak 1982), similarity relation (Abo-Tabl 2011), neighbourhood relation (Hu et al. 2008; Wang et al. 2020), and dominance relation (Greco et al. 1999) are four commonly used binary relations in information granulation, providing effective approaches for accurately processing data with single-type features. To handle data that combine multiple features in real-world decision-making problems (e.g., numerical, symbolic, interval, set-valued, missing, and linguistic-valued data; Qin et al. 2023; Castillo et al. 2011; Slowiński 2012; Zhang et al. 2021), composite binary relation-based information granulation methods have been proposed (Ye et al. 2022). Despite the significant research achievements in information granulation methods based on different types of feature data, systematic and in-depth research on information granulation methods for complex data with inherent features such as fuzzy semantics, temporal features, correlation features, and the aforementioned diverse feature mixtures is still lacking (Yuan et al. 2022).

The rough set (Pawlak 1982) is a well-established research theory and tool in granular computing. Its basic idea is to use equivalence relations to define sets that characterize imprecise or uncertain concepts. Upper and lower approximation sets of a target concept can then be obtained, and the uncertainty of the target concept can be characterized through the boundary region between the upper and lower approximations. In the context of rough sets for temporal and correlated feature attributes, Wan et al. (2021) constructed a novel interactive and complementary feature selection approach based on a fuzzy multineighbourhood rough set model. Kumari and Acharjya (2023) introduced an incremental rough set shuffled frog leaping algorithm for knowledge inference in lung cancer diagnosis. Che et al. (2022) constructed a label correlation in a multilabel classification model using fuzzy rough sets. The application of rough set models in handling temporal and correlated feature attributes has also been explored in various domains, such as stock prediction (Podsiadlo and Rybinski 2016), green energy (Hou et al. 2021) and clinical diagnosis (San et al. 2014).

To fully reflect the randomness of decision data information and fuse the decision data information of correlated features, Choquet (1953) proposed the Choquet integral, which integrates bounded random variables with respect to a nonadditive measure. The extension models of the Choquet integral can be roughly classified into two categories. In the first, the traditional addition and subtraction operations are extended by incorporating t-norms or t-conorms, leading to a class of generalized Choquet integrals. In the second, the value range of the Choquet integral is extended, resulting in a set-valued Choquet integral (Jang et al. 1997), interval-valued Choquet integral (Pojala and Sengupta 2017), and fuzzy-valued Choquet integral (Hajek and Froelich 2019). These extensions improve the applicability of the constructed Choquet integral models to real-world management decision scenarios. Grabisch (1996) first introduced the Choquet integral as an aggregation operator to replace the traditional weighted average for solving multiattribute decision problems with attributes that exhibit correlation, constraint, and contradiction characteristics. Epstein (1999) established a risk-averse model based on preference relation capacity, providing a new theory and tool for handling decision problems with attribute information that exhibits preference relation characteristics. Nonadditive measures and the Choquet integral based on them provide new solutions and principles for addressing complex relationship management decision problems in practical applications. In medical decision-making, the literature focuses on the use of the Choquet integral as an aggregation operator to deal with interacting criteria, such as classifying the different degrees of diseases (Zhang et al. 2018), multi-band classification of motion-image brainwave information (Wieczynski et al. 2022), multiattribute prognostic decisions for breast cancer with multi-omics information (Dey et al. 2021), classification and prediction of COVID-19 based on lung image features (Palmal et al. 2023), clinical diagnosis of integrated traditional Chinese and Western medicine with multigrain diversity combined with nonadditive preference group characteristics (Ye et al. 2024) and healthcare management effectiveness decision problems (Krueger and Daziano 2022). In addition, nonadditive decision models and various extended theoretical models have been successfully applied to various real-world management decision problems, such as city development and car ownership assessment (Beliakov et al. 2020) and sustainable product assessment (Liao et al. 2023).

The primary focus of this study is the granular representation of temporal correlated qualitative and quantitative feature attribute information. First, building upon this foundation, we establish a temporal correlated feature rough set model based on the gradient and cosine similarity. Second, we explore a decision-making method based on the temporal correlated feature rough set and the Choquet integral. Ultimately, we aim to establish a theoretical framework for evaluating clinical treatment effectiveness in the context of chronic diseases. Building upon the theoretical model, we classify the data using the nonadditive rough set model proposed in this paper, focusing on EHRs, which involve diverse and randomly generated data from multiple sources. By integrating the clinical decision-making features related to chronic diseases, we create two sets of comparative information: information with significant treatment effectiveness and information without effectiveness. Subsequently, we empirically study these two sets of comparative information using traditional medical research paradigms to derive a collection of treatment options for specific groups characterized by distinct features. Finally, utilizing the temporal correlated feature rough set based on machine learning (ML) and the Choquet integral proposed in this paper, we rank the treatment options to identify the optimal treatment plan for specific groups with distinct features. The framework of this paper is illustrated in Fig. 2.

Fig. 2 The framework of this paper

The remainder of this paper is organized as follows. The preliminary knowledge of rough sets and the Choquet integral is introduced in Sect. 2. In Sect. 3, a temporal correlated feature rough set decision-making model based on machine learning and the Choquet integral is proposed, including a temporal correlated feature rough set model based on gradient and cosine similarity, and a decision scheme ranking method based on the Choquet integral and the temporal correlated feature rough set. In Sect. 4, we apply the theoretical model to evaluate traditional Chinese medicine (TCM) and Western medicine used to treat chronic renal failure (CRF). In Sect. 5, we conduct simulation experiments and an analysis of the constructed model based on clinical data to provide auxiliary support for clinical decision-making. Finally, a summary and research outlook are presented in Sect. 6.

2 Preliminaries

2.1 Rough set

Rough set theory is based on the assumption that each object in the universe has a certain amount of information (data or knowledge) that is represented by the attributes used to describe the object. Objects with the same description are indistinguishable (similar) as far as the available information is concerned, and the resulting indiscernibility relation forms the mathematical basis of rough sets. Rough sets operate on a data table consisting of objects (actions) described by a set of attributes, where rows represent objects, columns represent attributes, and each cell is the value of the object on the attribute.

Definition 1

(Pawlak 1982; Yao 2010) Given a nonempty finite set U, let \(R \subseteq U \times U\) be a binary relation over U; then \(A=(U,R)\) is a generalized approximation space. For any subset \(X \subseteq U\), the upper and lower approximation sets with respect to the approximation space \(A=(U,R) \) are defined as follows.

$$\begin{aligned} \begin{aligned} \overline{R}(X)&=\{x\in U|[x]_{IND(R)}\cap X\ne \emptyset \}.\\ \underline{R}(X)&=\{x\in U| [x]_{IND(R)} \subseteq X \}. \end{aligned} \end{aligned}$$
(1)

where \([x]_{IND(R)}\) indicates the equivalence class of x induced by the indiscernibility relation IND(R). If \(\overline{R}(X)=\underline{R}(X)\), X is called a definable set or crisp set in \((U,R)\), and if \(\overline{R}(X) \ne \underline{R}(X)\), X is called a rough set. The pair \((U,R)\) is called the knowledge space. Obviously, \(\underline{R}(X) \subseteq X \subseteq \overline{R}(X)\). The rough set method is an approximate computing methodology that uses crisp sets to describe an uncertain target set without any prior knowledge (Wang 2001).

Based on the rough set approximations of X defined by A, one can divide the universe U into three disjoint domains: the positive domain POS(X) indicating the union of all the equivalence classes defined by A that each for sure can induce the class X; the boundary domain BND(X) indicating the union of all the equivalence classes defined by A that each can induce a partial decision of X; and the negative domain NEG(X) which is the union of all equivalence classes that for sure cannot induce the decision class X (Yao and Zhao 2008).

$$\begin{aligned}&POS(X)=\underline{R}(X),\nonumber \\&NEG(X)=\sim \overline{R}(X)=\{x\in U|[x]_{IND(R)}\cap X= \emptyset \},\nonumber \\&BND(X)= \overline{R}(X)-\underline{R}(X). \end{aligned}$$
(2)
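As an illustration of Eqs. (1) and (2), the following minimal Python sketch computes the approximations and the three domains for the classical case in which R is an equivalence relation induced by identical attribute descriptions. The table `toy_table`, its attribute values, and the function names are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def equivalence_classes(table):
    """Group objects whose attribute descriptions are identical."""
    classes = defaultdict(set)
    for obj, description in table.items():
        classes[tuple(description)].add(obj)
    return list(classes.values())


def rough_regions(table, target):
    """Return the approximations of `target` and the POS/BND/NEG domains."""
    target = set(target)
    lower, upper = set(), set()
    for block in equivalence_classes(table):
        if block <= target:   # class entirely inside X: lower approximation
            lower |= block
        if block & target:    # class intersects X: upper approximation
            upper |= block
    universe = set(table)
    return {
        "lower": lower,
        "upper": upper,
        "POS": lower,
        "BND": upper - lower,
        "NEG": universe - upper,
    }


# Hypothetical toy data: object -> tuple of attribute values.
toy_table = {
    "x1": ("high", "yes"),
    "x2": ("high", "yes"),
    "x3": ("low", "no"),
    "x4": ("low", "yes"),
    "x5": ("low", "no"),
}
regions = rough_regions(toy_table, {"x1", "x2", "x3"})
```

In this toy example, x3 and x5 share a description but only x3 belongs to the target set, so both fall into the boundary domain.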

2.2 Choquet integral

Definition 2

(Choquet 1953) Let \(Y=\{y_1,y_2,\dots , y_n\}\) be a non-empty finite set and let \(P(Y)\) be the power set of Y. A set function \(\varphi : P(Y) \rightarrow [0,1]\) is called a fuzzy measure if it satisfies the following properties.

(1) \(\varphi (\emptyset )=0, \varphi (Y)=1 \),

(2) \(A,B \in P(Y) ~and~ A\subseteq B \Rightarrow \varphi (A) \le \varphi (B)\).

Definition 3

Let \(\varphi \) be a fuzzy measure on the non-empty finite set \(Y=\{y_1,y_2,\dots , y_n\}\) and let f be a real measurable function on Y. The Choquet integral of f with respect to \(\varphi \) is:

$$\begin{aligned} \int f d\varphi =\sum _{i=1}^{n} f(y_{(i)}^*)[\varphi (A_{(i)})-\varphi (A_{(i+1)})] \end{aligned}$$
(3)

where \((y_{(1)}^*,y_{(2)}^*,\dots ,y_{(n)}^*)\) is a permutation of \(Y=\{y_1,y_2,\dots , y_n\}\) such that \(f(y_{(1)}^*) \le f(y_{(2)}^*) \le \dots \le f(y_{(n)}^*)\), \(A_{(i)}=\{y_{(i)}^*,\dots ,y_{(n)}^*\}\) and \(A_{(n+1)}:=\emptyset \). Equivalently, \(\int f d\varphi =\sum _{i=1}^{n}[f(y_{(i)}^*)-f(y_{(i-1)}^*)]\varphi (A_{(i)})\) with \(f(y_{(0)}^*):=0\).
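The discrete Choquet integral can be sketched in a few lines of Python. The sketch below uses the standard increment form, in which the values of f are sorted in ascending order and each increment is weighted by the measure of the set of elements whose value is at least as large. The function name, the element labels, and the two example measures are assumptions made for illustration only.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of a non-negative function.

    values  : dict mapping each element y to f(y) >= 0
    measure : callable taking a frozenset of elements and returning its
              fuzzy measure (monotone, with measure(empty set) = 0)
    """
    ordered = sorted(values, key=values.get)   # ascending in f
    total, previous = 0.0, 0.0
    for i, y in enumerate(ordered):
        upper_set = frozenset(ordered[i:])     # elements with f >= f(y)
        total += (values[y] - previous) * measure(upper_set)
        previous = values[y]
    return total


# Hypothetical scores on three criteria.
scores = {"a": 0.2, "b": 0.6, "c": 0.9}

# Additive measure: the integral reduces to a weighted average.
weights = {"a": 0.5, "b": 0.3, "c": 0.2}
additive = choquet_integral(scores, lambda s: sum(weights[y] for y in s))

# Nonadditive measure (illustrative): squared relative cardinality.
nonadditive = choquet_integral(scores, lambda s: (len(s) / len(scores)) ** 2)
```

With an additive measure the result coincides with the ordinary weighted average, which is why the Choquet integral is regarded as a generalization of that operator; with a nonadditive measure, interactions between criteria change the aggregate.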

Murofushi and Sugeno (1991) generalized the range of fuzzy measures from [0, 1] to \([0,+\infty ]\), defining the Choquet integral of a non-negative measurable function as follows.

Definition 4

(Murofushi and Sugeno 1991) Let \(Y=\{y_1,y_2,\dots ,y_n\}\) be a non-empty finite set, \(P(Y)\) be the power set of Y, and \(\varphi :P(Y)\rightarrow [0,+\infty ]\) be a fuzzy measure, that is, a monotone and non-negative set function. The Choquet integral of the non-negative measurable function \(f:Y\rightarrow [0,+\infty ] \) with respect to \(\varphi \) is:

$$\begin{aligned} \int f d \varphi =\int _{0}^{\infty } \varphi (\{y|f(y)>r\})dr \end{aligned}$$
(4)

3 A nonadditive rough set decision model based on machine learning and the Choquet integral

This paper investigates an uncertain decision-making problem with temporal correlation and nonadditive attribute information. The original dataset involved in such medical decision-making problems cannot be used or processed directly by traditional medical research paradigms. Therefore, the paper focuses on the temporal correlation nonadditive attribute information and the decision-making process in such clinical decision-making problems. Using rough sets, machine learning and nonadditive measures, a nonadditive rough set model (NARST) is constructed within the framework of granular computing, and the temporal correlation nonadditive attribute decision information is processed into a dataset that can be handled by the traditional medical research paradigm. First, a temporal correlation composite binary relation based on gradient and cosine similarity is defined, and the information is divided into three disjoint regions (positive, negative and boundary) based on the upper and lower approximations of the composite binary relation. The positive domain represents effective decision information, the negative domain represents invalid decision information, and the boundary domain represents the decision information to be observed. This approach achieves effective classification of temporal correlation hybrid decision information. Second, the traditional medical statistical method is used to test the difference between effective and ineffective decision information, and the set of effective decision schemes corresponding to effective decision information is obtained. Finally, the Choquet integral and the temporal correlation feature rough set are used to rank the set of effective decision schemes. The integration of data-driven clinical decision paradigms with traditional medical research paradigms is shown in Fig. 3.

Fig. 3 The integration of data-driven clinical decision paradigms with traditional medical research paradigms

3.1 Temporal correlated feature rough set model based on gradient and cosine similarity

Existing research on granular computing (Qian et al. 2010) primarily focuses on decision information models that handle static, independent, or weakly correlated attributes (Mashinchi et al. 2023; Lin et al. 2022). In this section, we aim to construct a sound information granularity for temporal correlated hybrid attribute feature data, allowing this granularity to represent the underlying information contained in the original data, which is the fundamental principle of information granulation representation methods. The specific steps for information granulation under rough set theory involve constructing granules, identifying their attributes and attribute values, and subsequently establishing approximate spaces formed by granules of different sizes. From these approximate spaces, upper and lower approximation sets are derived, expressing the information that can be effectively utilized.

Definition 5

Let \(IS=(U,T,A,S)\) be a decision-making information system with a temporal correlated hybrid attribute feature, where \(U=\{x_1,x_2,\dots ,x_i,\dots ,x_M\}\) is a set of decision-making objects, \(T=\{t_1,t_2,\dots ,t_j,\dots ,t_N\}\) is a decision time point set, \(A=\{a_1,a_2,\dots ,a_k,\dots ,a_K\}\) is a set of attributes, and \(S=\{s_1,s_2,\dots ,s_l,\dots ,s_L\}\) is a set of decision schemes. The decision object \(x_i\) corresponds to a decision scheme \(s_l\) at each decision time point \(t_j\). For each object \(x\in U\), attribute \(a \in A\) and time point \(t \in T\), the corresponding information function is \(a:U\times T \rightarrow f_x (a,t)\). Let \(_{\bigtriangledown} F_x(a,t)\) be the gradient set of the decision object x on the attribute a and the time point t, \(i=1,2,\dots ,M;j=1,2,\dots ,N; k=1,2,\dots ,K;l=1,2,\dots ,L\).

Taking the examination and treatment process of CRF as an example, the granularity representation of clinical time series diagnosis and treatment decision information is provided as follows.

Example 1

The examination indices \( A=\{a_1,a_2,a_3,a_4,a_5\} \) of the decision object \(x_i\) (i.e. the patient) are assumed to represent urinary microalbumin (UM), creatinine (CR), urea nitrogen (UN), carbon dioxide (\(CO_2\)), and haemoglobin (HEM), respectively. During the intercepted treatment cycle, patient \(x_i\) made a total of 4 visits, i.e. \(T=\{t_1,t_2,t_3,t_4\}\). Each visit time corresponds to a treatment plan, that is, \(S=\{s_1,s_2,s_3,s_4\}\). The information set for the examination and treatment of patient \(x_i\) is shown in Table 1.

Table 1 The information set for the examination and treatment of patient \(x_i\)

Table 1 illustrates that the changes in the examination indicators of patient \(x_i\) are not always the same during the consultation cycle, and granulating the characteristic information of temporal correlated attributes is the primary problem. In this section, the granulation of temporal correlated mixed attribute features is divided into two steps. The first step is granulating the temporal mixed attribute features. The second is granulating the temporally correlated mixed attribute. The gradient is an effective method for describing the temporal correlated mixed attribute information, and the gradient direction can describe the change of decision objects on time series T. Since the direction of the gradient is consistent with the fastest growth rate of the attribute at the time point, the gradient of the attributes at a time point is defined.

Definition 6

Let \(IS=(U,T,A,S)\) be a decision-making information system with a temporal correlated hybrid attribute feature, where \(U=\{x_1,x_2,\dots ,x_i,\dots ,x_M\}\) is a set of decision-making objects, \(T=\{t_1,t_2,\dots ,t_j,\dots ,t_N\}\) is a decision time point set, \(A=\{a_1,a_2,\dots ,a_k,\dots ,a_K\}\) is a set of attributes, and \(S=\{s_1,s_2,\dots ,s_l,\dots ,s_L\}\) is a set of decision schemes. The decision object \(x_i\) corresponds to a decision scheme \(s_l\) at each decision time point \(t_j\). For each object \(x\in U\), attribute \(a \in A\) and time point \(t \in T\), the corresponding information function is \(a:U\times T \rightarrow f_x (a,t)\). Let \(_{\bigtriangledown} F_x(a,t)\) be the gradient set of the decision object x on attribute a and time point t. For \(x_i \in U\), the metric vector between temporal attributes, i.e., the gradient, is:

$$\begin{aligned} _{\bigtriangledown} F_{x_i}(a,t)=(f_a(a_k,t_j),f_t(a_k,t_j)) \cos \theta \end{aligned}$$
(5)

where \(f(a,t)=\{f_a(a_k,t_j),f_t(a_k,t_j)\}\), \(f_a(a_k,t_j)\) is the component of change along the attribute direction, and \(f_t(a_k,t_j)\) is the component of change along the time direction. \(\theta \) represents the angle at which the attribute value has the greatest change in the time point direction (\(\theta \in [0^\circ ,360^\circ ]\)).

Equation 5 establishes the gradient approximation space for the temporal correlated hybrid decision information system, measures the gradient direction difference between each decision object and the other decision objects, and intuitively displays the direction of information change over time. In effect, the above process calculates the longitudinal gradient of the dataset.

Remark 1

For a discrete dataset with temporal correlated hybrid attributes in a given period, the gradient approximation space is calculated as follows:

$$\begin{aligned} _{\bigtriangledown} F_{x_i}(a,t)=\frac{\partial (F_{x_i}(a,t))}{\partial t}= \frac{a_k(x_i,t_{j+1})-a_k(x_i,t_{j-1})}{t_{j+1}-t_{j-1}}, j=2,3,\dots , N-1 \end{aligned}$$
(6)
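The central difference in Remark 1 can be sketched as follows. Because the formula needs both neighbouring visits, the sketch evaluates it only at interior time points. The function name and the sample readings (visit days and creatinine values) are hypothetical and serve only to illustrate the calculation.

```python
def central_gradients(times, values):
    """Central-difference slope of one attribute at each interior time point.

    times  : strictly increasing visit times t_1, ..., t_N
    values : attribute readings a_k(x, t_1), ..., a_k(x, t_N)
    Returns N - 2 slopes, one per interior time point.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    return [
        (values[j + 1] - values[j - 1]) / (times[j + 1] - times[j - 1])
        for j in range(1, len(times) - 1)
    ]


# Hypothetical readings for one patient over four visits.
visit_days = [0, 30, 90, 120]
creatinine = [400.0, 380.0, 350.0, 360.0]
slopes = central_gradients(visit_days, creatinine)
```

Dividing by the actual time gap, rather than assuming equal spacing, matters here because the visit intervals of real patients are irregular.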

Next, to construct temporal correlated attribute feature gradient approximation space, the cosine similarity is used to measure the similarity between objects.

Definition 7

Let \(IS=(U,T,A,S)\) be a decision-making information system with a temporal correlated hybrid attribute feature. For each object \(x\in U\), attribute \(a \in A\) and time point \(t \in T\), the corresponding information function is \(a:U\times T \rightarrow f_x (a,t)\). Let \(_{\bigtriangledown} F_x(a,t)\) be the gradient set of the decision object x on the attribute a and the time point t, and let \(\rho \) be the degree of correlation between attributes. The similarity between decision objects \(x_i,x_m \in U, (i,m=1,2,\dots ,M)\) is then defined as follows:

$$\begin{aligned} \tau _{(x_i,x_m)}(t)=\cos (_{\bigtriangledown} F_{x_i}(a,t),_{\bigtriangledown} F_{x_m}(\rho a,t)) \end{aligned}$$
(7)

where \(\rho \in [-1,1]\), \( \rho <0\) indicates that the attributes are negatively correlated, \( \rho >0\) indicates that the attributes are positively correlated, and \(\rho =0\) indicates that the attributes are independent of each other. For the cosine of the angle between two decision objects, the smaller the angle is, the larger the cosine value, and the higher the similarity.
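A minimal sketch of the similarity in Eq. (7) is given below. It rests on two assumptions that are not stated in the text: the gradient of the scaled attribute \(\rho a\) is taken to be \(\rho \) times the gradient of a, and a zero gradient vector is assigned similarity 0. The function names and sample vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0   # assumption: a flat trend carries no direction
    return dot / (norm_u * norm_v)


def trend_similarity(grad_i, grad_m, rho=1.0):
    """Assumed reading of Eq. (7): scale the second gradient vector by rho."""
    return cosine_similarity(grad_i, [rho * g for g in grad_m])


# Hypothetical gradient vectors of two patients over the same time points.
patient_i = [-0.5, -0.2]
patient_m = [-1.0, -0.4]
same_direction = trend_similarity(patient_i, patient_m, rho=1.0)
opposite = trend_similarity(patient_i, patient_m, rho=-1.0)
independent = trend_similarity(patient_i, patient_m, rho=0.0)
```

Under this reading, only the sign of \(\rho \) affects the result, since the cosine is invariant to positive scaling: a negative \(\rho \) flips the direction and \(\rho =0\) yields zero similarity.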

In the decision-making problem of clinical efficacy evaluation, the evaluation results of clinical efficacy are obtained from the changes in test indicators (attributes). Three kinds of relationships exist among attributes: negative correlation, positive correlation and mutual independence. A positive attribute value increment indicates that the treatment scheme has a positive curative effect, which is called positive therapeutic effect evaluation, and \(\rho >0\). A negative attribute value increment indicates that the treatment scheme has a negative curative effect, which is called negative curative effect evaluation, and \(\rho <0\).

The binary relation is an important tool for information granulation and the core of the rough set and its extended models. Therefore, to effectively classify original data with mixed temporal correlation features and no manual labels, we construct a composite binary relation and its rough set model based on the gradient and cosine similarity.

Definition 8

Let \(IS=(U,T,A,S)\) be a decision-making information system with a temporal correlated hybrid attribute feature, and let \(R_{(a,t)}(X)\) be a temporal correlated composite binary relation, where \(X \subseteq U\) and \(x_i, x_m \in X\). Then

$$\begin{aligned} R_{(a,t)}(X)=\{(x_m,t)\in U\times T|(x_i,x_m)\in \tau _{(x_i,x_m)}(t)\} \end{aligned}$$
(8)

Definition 8 completes the temporal correlated mixed attribute feature information representation and granulation process. For any time point \(t \in T\) and attribute \(a \in A\), there is a relation \(R_ {(a, t)} (x_i, x_m)\) between objects \(x_i\) and \(x_m\).

For the temporal correlated composite binary relation \(R_{(a,t)}(X)\) we have the following properties:

(1) Reflexivity: \(R_{(a,t)}(x,x)\), \(x\in X\).

(2) Symmetry: \(R_{(a,t)}(x,y)=R_{(a,t)}(y,x)\), \(x,y \in X\).

(3) Nonadditivity: \(R_{(a,t)}(x,y) \cup R_{(a,t)}(y,z) \ne R_{(a,t)}(x,z)\), \(x,y,z \in X\).

Proof

They are straightforward. \(\square \)

In the following sections, using the description of the decision-making information system with temporal correlated hybrid attribute features as a reference, we investigate the rough approximation of temporal correlated decision-making objects over the decision-making information system.

Definition 9

Let \(IS=(U,T,A,S)\) be a decision-making information system with a temporal correlated hybrid attribute feature, where \(U=\{x_1,x_2,\dots ,x_i,\dots , x_M\}\) is a set of decision-making objects, \(T=\{t_1,t_2,\dots ,t_j,\dots ,t_N\}\) is a decision time point set, and \(R_{(a,t)}(X)\) is a temporal correlated composite binary relation on \(U \times T\). For \(X\subseteq U\) and \(x_i\in X\), the upper and lower approximations of the rough set model based on the temporal correlation mixed information system are as follows:

$$\begin{aligned} \begin{aligned}&\overline{R_{(a,t)}}(X)=\{(x_i,t)\in U \times T | R_{(a,t)}(X) \cap X \ne \emptyset \}.\\&\underline{R_{(a,t)}}(X)=\{(x_i,t)\in U \times T | R_{(a,t)}(X) \subseteq X\}. \end{aligned} \end{aligned}$$
(9)

Moreover, for any \(X\subseteq U\) on the time series T, the exact lower approximation is composed of three disjoint domains: the positive domain \(POS_{(a,t)} (X)\), the negative domain \(NEG_{(a,t)} (X)\) and the boundary domain \(BND_{(a,t)} (X)\).

Definition 10

Let \(IS=(U,T,A,S)\) be a decision-making information system with a temporal correlated hybrid attribute feature, where U is a set of decision-making objects, T is a decision time point set, and \(R_{(a,t)}(X)\) is a temporal correlated composite binary relation on \(U \times T\). The exact lower approximation is composed of three disjoint domains, namely the positive domain \(POS_{(a,t)} (X)\), the negative domain \(NEG_{(a,t)} (X)\) and the boundary domain \(BND_{(a,t)} (X)\), \(x_i, x_m \in X\).

$$\begin{aligned} \begin{aligned}&POS_{(a,t)} (X)=\{\underline{R_{(a,t)}}(X) \cap ((R_{(a,t)}(x_i)(x_m))>0)\}=\{\underline{R_{(a,t)}}(X) \cap \{\tau _{(x_i, x_m)}(t)>0\}\}.\\&NEG_{(a,t)} (X)=\{\underline{R_{(a,t)}}(X) \cap ((R_{(a,t)}(x_i)(x_m))<0)\}=\{\underline{R_{(a,t)}}(X) \cap \{\tau _{(x_i, x_m)}(t)<0\}\}.\\&BND_{(a,t)} (X)=U-POS_{(a,t)} (X)-NEG_{(a,t)} (X). \end{aligned} \end{aligned}$$
(10)

where \(\{\tau _{(x_i, x_m)}(t)>0\}\) is the set of objects that satisfy \(\tau _{(x_i, x_m)}(t)>0\), and \( R_{(a,t)}(x_i)(x_m)\) is the temporal correlated hybrid binary class of object \(x_i\) with respect to the temporal correlated composite binary relation \(R_{(a,t)}(X)\).

Based on the composite binary relation \(R_{(a,t)}(X)\), the unannotated raw information with mixed temporal correlation features is divided into three pairwise disjoint regions. The information with the same positive direction is put into the positive domain (\(POS_{(a,t)} (X)\)), the information with the same negative direction is put into the negative domain (\(NEG_{(a,t)} (X)\)), and the uncertain information is put into the boundary domain (\(BND_{(a,t)} (X)\)). This classification method exposes the uncertainty in the information and provides as much effective information as possible for subsequent data processing and calculation.

Remark 2

In clinical efficacy evaluation decision-making, \(POS_{(a,t)} (X)\) represents the set of effective schemes, \(NEG_{(a,t)} (X)\) represents the set of ineffective schemes, and \(BND_{(a,t)} (X)\) represents the set of schemes whose efficacy is uncertain.
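The three-way split can be illustrated with a deliberately simplified Python sketch. It assumes that a similarity value \(\tau \) has already been computed for every object and assigns objects by the sign of \(\tau \) alone; it omits the intersection with the lower approximation that appears in Eq. (10). The `tolerance` parameter and the patient labels are hypothetical additions for illustration.

```python
def classify_by_trend(similarities, tolerance=0.0):
    """Split objects into POS / NEG / BND by the sign of their similarity.

    similarities : dict mapping object -> precomputed similarity tau
    tolerance    : hypothetical dead band around zero; values with
                   |tau| <= tolerance are sent to the boundary domain
    """
    pos = {x for x, tau in similarities.items() if tau > tolerance}
    neg = {x for x, tau in similarities.items() if tau < -tolerance}
    bnd = set(similarities) - pos - neg
    return pos, neg, bnd


# Hypothetical similarity values for five patients.
tau_values = {"p1": 0.92, "p2": -0.75, "p3": 0.03, "p4": 0.40, "p5": -0.01}
pos, neg, bnd = classify_by_trend(tau_values, tolerance=0.05)
```

A nonzero tolerance is one possible way to keep near-zero similarities out of the effective and ineffective groups; whether such a band is appropriate depends on the data and is not prescribed by the model.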

In this section, we introduce a machine learning-based rough set model for classifying information without manual labels. For the specific decision-making problem of clinical efficacy evaluation, temporally correlated mixed information is classified to group attributes (examination indicators) with similar differences over the course of long-term treatment. Within clinical efficacy evaluation, the positive domain represents the set of schemes with deterministic efficacy, the negative domain represents the set of schemes that are deterministically ineffective, and the boundary domain represents the set of schemes with uncertain efficacy. On this basis, identifying and ranking the effective treatment plans from the classified set is the last step in handling the decision-making problem of clinical efficacy evaluation. In short, to improve the accuracy and efficiency of decision-making, two steps are used to handle the evaluation decision-making problem for unannotated raw information with mixed temporal correlation features. The first step is to effectively classify the temporal correlated mixed attribute information, and the second is to rank the classified set on this basis. This leads to the research content of the following sections.

3.2 A nonadditive rough set based on temporal correlated feature rough set and the Choquet integral

Based on the classification of temporal correlated mixed attribute information in Sect. 3.1, this section focuses on the ordering problem of the decision schemes in the positive domain, which represents the set of schemes with deterministic efficacy. The Choquet integral based on a nonadditive measure is introduced to construct a nonadditive rough set (NARST) and then calculate the correlated feature attributes. First, the calculation method of the fuzzy measure is given to characterize the correlated feature. On this basis, the procedure for the Choquet integral based on the fuzzy measure is given.

The period of disease progression control is uncertain because each patient’s visit cycle is not uniform. Furthermore, at the beginning of the diagnosis and treatment process, the clinic may not be able to provide all possible examinations and treatments at once but may gradually increase the examination indicators and adjust the treatment plan to ensure desirable treatment effects. Therefore, countable additivity fails, that is, nonadditive phenomena are exhibited, in both the time points of treatment and the attributes of diagnostic indicators. To address this, we introduce the Choquet integral to compute the value domain \(V_A (T,S)\) with the nonadditive correlated attribute A on the time series T using the scheme S.

Definition 11

Let \(IS=(U,T,A,S)\) be a decision-making information system with a temporal correlated hybrid attribute feature, where U is a set of decision-making objects, and T is a decision time point set. Let \(\varphi (t_i \dots t_j):U\rightarrow [0,+\infty ]\) be an attribute correlated fuzzy measure.

$$\begin{aligned} \varphi (t_i\dots t_j)=\frac{\varepsilon }{\lambda }\left[ \prod _{i=1}^{n}(1+\lambda \varphi (t_i\dots t_{j-1}))-1 \right] \end{aligned}$$
(11)

where \(\lambda \in [0,1]\) is a fuzzy coefficient, and \(\varepsilon \) is the disturbance parameter proportional to the number of decision objects.

Property 1. The attribute correlated fuzzy measure \(\varphi (t_i \dots t_j):U\rightarrow [0,+\infty ]\) satisfies the following:

(1) Nonnegativity: \(\varphi (\emptyset )=0, \varphi (T)=+\infty \).

(2) Monotonicity: \(A\subseteq B \Rightarrow \varphi (A) \le \varphi (B)\).

Definition 12

Let \(IS=(U,T,A,S)\) be a decision-making information system with temporal correlated hybrid attribute feature, where U is a set of decision-making objects, T is a decision time point set, A is a set of attributes, and S is a set of decision schemes. Let \(V_A (T,S)=\{V_{a_1}(T,S),V_{a_2}(T,S), \dots , V_{a_k}(T,S),\dots , V_{a_K}(T,S)\}\) be a set of value domains with the nonadditive correlated attribute A on the time series T using the scheme S, where \(V_{a_k}(T,S)\) and \(V_A (T,S)\) are defined as follows.

$$\begin{aligned} V_{a_k} (T,S)= & {} \int A d \varphi = \sum _{j=1}^{n}\varphi (t_1,\dots ,t_j) _{\bigtriangledown} F_{x}(a,t) \end{aligned}$$
(12)
$$\begin{aligned} V_A (T,S)= & {} \int V_{a_k}(T,S) d \varphi =\sum _{k=1}^{K} \varphi (T) V_{a_k}(T,S) \end{aligned}$$
(13)

For a discrete dataset with temporally correlated attributes in a given period, the value domain is as follows:

$$\begin{aligned} V_{a_k}(T,S)=\int A d \varphi =\sum _{i,j=1}^{n} \varphi (t_i \dots t_j) \left[ \frac{a_k(x,t_{j})-a_k(x,t_{j-1})}{t_{j}-t_{j-1}}\right] \end{aligned}$$
(14)
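
The bracketed term in Eq. 14 is a finite-difference gradient over irregular visit intervals. The following is a minimal sketch of one possible reading of Eq. 14, in which each gradient between consecutive visits is weighted by a measure value supplied by the caller. The function names, the readings, and the measure values are hypothetical and are not taken from the paper's data.

```python
from typing import List, Sequence


def gradients(values: Sequence[float], times: Sequence[float]) -> List[float]:
    """Return (a(t_j) - a(t_{j-1})) / (t_j - t_{j-1}) for consecutive visits."""
    if len(values) != len(times):
        raise ValueError("values and times must have the same length")
    result = []
    for j in range(1, len(values)):
        dt = times[j] - times[j - 1]
        if dt <= 0:
            raise ValueError("visit times must be strictly increasing")
        result.append((values[j] - values[j - 1]) / dt)
    return result


def value_domain(values: Sequence[float], times: Sequence[float],
                 phi: Sequence[float]) -> float:
    """Weighted sum of gradients.

    Assumption: phi[j] is the measure value attached to the j-th interval;
    how phi is obtained (Eq. 11) is left outside this sketch.
    """
    grads = gradients(values, times)
    if len(phi) != len(grads):
        raise ValueError("one measure value is needed per visit interval")
    return sum(w * g for w, g in zip(phi, grads))


# Hypothetical indicator readings at irregular visit days.
readings = [120.0, 110.0, 104.0]
visit_days = [0.0, 10.0, 40.0]
print(gradients(readings, visit_days))
print(value_domain(readings, visit_days, phi=[0.5, 1.0]))
```

Dividing by the actual interval length, rather than assuming unit spacing, is what lets the sketch handle non-uniform visit cycles.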

The calculation of the Choquet integral in the value domain of the correlated attribute is explained in Example 2, continued from Example 1.

Example 2

The gradient values of Table 1 derived from Eq. 12 are shown in Table 2.

Table 2 The gradient values of \(x_i\)

\(V_{a_i}(T,S)=V_{a_i}(t_1,S) \times \varphi (t_1)+V_{a_i}(t_2,S) \times [\varphi (t_1,t_2)-\varphi (t_1)]+V_{a_i}(t_3,S) \times [\varphi (t_1,t_2,t_3)-\varphi (t_1,t_2)]+\dots +V_{a_i}(t_i,S) \times [\varphi (t_1,t_2,t_3,\dots ,t_i)-\varphi (t_1,t_2,\dots ,t_{i-1})]\) according to Eq. 12.

The initial values \(\varphi (t_1)=0.1\), \(\lambda =0.3\), and \(\varepsilon =0.05\) are given according to Eq. 10.

The value domain of attribute A of decision object \(x (x\in U)\) is calculated after scheme S is used on time series T, as shown in Table 3.

Table 3 The value domain of \(x_i\)
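
The expansion used in Example 2 multiplies each value by the increment of the cumulative measure. A minimal sketch of that expansion is given below, assuming the cumulative measure values \(\varphi (t_1), \varphi (t_1,t_2), \dots \) are already available; the numbers are hypothetical and do not reproduce Tables 2 and 3. Note that the standard Choquet integral first orders the integrand values, whereas this sketch simply follows the order in which the values are passed.

```python
from typing import Sequence


def choquet_expansion(values: Sequence[float],
                      cumulative_phi: Sequence[float]) -> float:
    """Sum of values[j] * (phi(t_1..t_{j+1}) - phi(t_1..t_j)), with phi(empty) = 0."""
    if len(values) != len(cumulative_phi):
        raise ValueError("values and cumulative_phi must have the same length")
    total = 0.0
    previous = 0.0  # measure of the empty set
    for value, phi in zip(values, cumulative_phi):
        if phi < previous:
            # Monotonicity (Property 1) requires a non-decreasing cumulative measure.
            raise ValueError("cumulative measure must be non-decreasing")
        total += value * (phi - previous)
        previous = phi
    return total


# Hypothetical per-time-point values and cumulative measure values.
vals = [2.0, 1.0, 4.0]
phis = [0.1, 0.25, 0.45]
print(choquet_expansion(vals, phis))  # 2*0.1 + 1*0.15 + 4*0.2, about 1.15
```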

In this section, we use the rough set under the granular computing framework to classify and order the temporal correlation mixed feature information. In the following section, detailed decision principles and steps based on this model are provided for real-world clinical efficacy evaluation problems, and real clinical big data are used for simulation experiments and data analysis to support clinical decision-making.

The proposed NARST model is applicable not only to long-term medical decision-making problems but also to a wide range of management decision-making scenarios with temporal correlation characteristics, such as stock investment decisions, medical drug reactions, and traffic flow.

4 Evaluation model of TCM and Western medicine for the treatment of CRF based on NARST

Let \(IS=(U,T,A,S)\) be a decision-making information table with temporal correlated hybrid attribute features, where \(U=\{x_1,x_2,\dots ,x_i,\dots ,x_M\}\) is a set of patients, \(T=\{t_1,t_2,\dots ,t_j,\dots ,t_N\}\) is a set of visit time points, \(A=\{a_1,a_2,\dots ,a_k,\dots ,a_K\}\) is a set of laboratory test indicators, and \(S=\{s_1,s_2,\dots ,s_l,\dots ,s_L\}\) is a set of treatment schemes. Patient \(x_i\) corresponds to a treatment scheme \(s_l\) at each visit time point \(t_j\). For each object \(x\in U\), attribute \(a \in A\), and time point \(t \in T\), the corresponding information function is \(a:U\times T \rightarrow f_x (a,t)\). Let \(_{\bigtriangledown} F_x(a,t)\) be the gradient set of the decision object x on the attribute a and the time point t, where \(i=1,2,\dots ,M;j=1,2,\dots ,N; k=1,2,\dots ,K;l=1,2,\dots ,L\). The decision principles and steps of clinical efficacy evaluation based on the Choquet integral and temporal correlation feature rough set model are as follows:

Step 1. Information classification based on the temporal correlation feature rough set.

(1) The gradient calculation of temporal attributes is achieved according to Eq. 6.

(2) The similarity between patients is calculated according to Eq. 7.

(3) The set of schemes with deterministic efficacy \(POS_{(a,t)} (X)\), the set of schemes with deterministic inefficacy \(NEG_{(a,t)} (X)\), and the set of uncertain efficacy schemes \(BND_{(a,t)} (X)\) are calculated according to Eq. 10.

Step 2. Ranking of treatment schemes based on the classified dataset.

(4) The value domain \(V_{(a_k)} (T,S) \) of the attribute of an individual patient \(x_i\) at all visit points T is calculated to obtain the single laboratory test indicator value in the visit cycle according to Eq. 12.

(5) The fusion calculation of all laboratory test indicator values of an individual patient during the visit cycle is achieved according to Eq. 13.

Finally, a ranking of treatment options can be obtained.
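
Step 2 can be sketched as follows, assuming that the per-indicator value domains \(V_{a_k}(T,S)\) have already been computed for each scheme. The fusion follows the summation form of Eq. 13 with a finite weight `phi_T` supplied by the caller. The finite weight, the scheme names, the scores, and the convention that a larger fused value ranks first are all illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def fuse(indicator_values: Sequence[float], phi_T: float) -> float:
    """Fuse per-indicator value domains as sum_k phi(T) * V_{a_k}(T, S)."""
    return sum(phi_T * v for v in indicator_values)


def rank_schemes(scheme_values: Dict[str, Sequence[float]],
                 phi_T: float) -> List[Tuple[str, float]]:
    """Return (scheme, fused value) pairs, largest fused value first (assumed order)."""
    scores = {name: fuse(vals, phi_T) for name, vals in scheme_values.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-indicator value domains for three schemes.
example = {
    "s1": [0.4, 0.1],
    "s2": [0.9, 0.3],
    "s3": [-0.2, 0.1],
}
for name, score in rank_schemes(example, phi_T=1.0):
    print(name, round(score, 3))
```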

5 Clinical real data experiment and analysis

Following the decision principles and steps for clinical efficacy evaluation discussed in Sect. 4, in this section, simulation experiments and data analysis using a dataset of 80,139 medical records from 3094 patients with CRF are described. These experiments are used to verify the performance of the proposed NARST model and provide decision support and recommendations for clinical efficacy evaluation. The flow chart of the data experiment and analysis is shown in Fig. 4.

Fig. 4 The flow chart of data experiment and analysis

CRF refers to the progressive impairment of kidney function caused by various renal diseases. As the disease progresses, patients experience a gradual loss of residual kidney function, ultimately leading to end-stage renal disease (Lv and Zhang 2019). The mortality rate among patients with CRF accounts for \(1.5\%\) of global mortality and has become a significant global public health issue (Covic et al. 2018). Therefore, research on the prevention and treatment of CRF holds great scientific significance and practical value.

CRF not only leads to complications such as electrolyte imbalance and metabolic acidosis but also affects the functions of multiple systems, including the cardiovascular, nervous, digestive, and immune systems; CRF may even be life-threatening. Current treatment strategies for CRF mainly focus on controlling the underlying causes and high-risk factors that influence disease progression, thus delaying the progression of chronic renal insufficiency. Treatment also involves managing various complications. End-stage renal disease requires renal replacement therapy, such as haemodialysis, peritoneal dialysis, or kidney transplantation (Bazeley and Wish 2021).

Next, in Sect. 5.1, the characteristics of the original dataset are described. In Sect. 5.2, the classification performance evaluation of the proposed NARST model is presented. In Sect. 5.3, the integration of the nonoperator aggregation-based group decision-making methodology utilizing the Choquet integral and multigranularity rough sets with traditional medical statistical methods is described, and real clinical big data are used for simulation experiments and data analysis. Finally, in Sect. 5.3.3, clinical decision support and recommendations are provided based on the experimental findings.

5.1 Dataset

In this section, we utilize a dataset consisting of 80,139 medical records from 3,094 patients with chronic renal disease collected from the outpatient department of Zhongshan Traditional Chinese Medicine Hospital. The data cover the period from January 2019 to April 2022. Each patient’s records include a maximum of 45 laboratory test indicators. Additionally, a total of five diagnosis codes were manually annotated by senior attending physicians with over 30 combined years of clinical practice experience. These codes include microalbuminuria (quantitative measurement of 24-h urinary protein), creatinine, blood urea nitrogen, haemoglobin, and carbon dioxide (as shown in Fig. 5).

Fig. 5 Clinical real datasets

The green line represents the mean, and the purple line represents the median. The reference ranges for the indicators are as follows: creatinine (41–84  μmol/L), microalbuminuria (< 20.0 mg/L), blood urea nitrogen (2.6–7.5 mmol/L), carbon dioxide (22.0–31.0 mmol/L), and haemoglobin (female: 115–150 g/L, male: 130–175 g/L). The sex and age distribution of the overall dataset are shown in Table 4.

Table 4 The sex and age distribution of the overall dataset

5.2 Comparison experiment

To verify the performance of the proposed NARST model, we invited two nephrologists to label all the data and divide the data into three categories based on treatment effectiveness: effective, ineffective, and further observation.

The performance of the NARST model is evaluated comprehensively in terms of accuracy, precision, recall, and F1-score. Six temporal feature models, namely naive Bayes, multilayer perceptron (MLP), logistic regression, decision tree, support vector machine (SVM), and random forest, are compared with the proposed NARST model. Figure 6 shows the classification performance of the seven models in the whole universe (i.e., the whole dataset), negative domain (ineffective), positive domain (effective), and boundary domain (further observation).

For the whole universe, the proposed NARST generally outperformed all the benchmark methods in terms of accuracy, precision, recall, and F1-score (Fig. 6a). For the negative domain, the proposed NARST outperformed all the benchmark methods in terms of accuracy (Fig. 6b). For the positive domain, the proposed NARST outperformed all the benchmark methods in terms of both accuracy and precision (Fig. 6c). For the boundary domain, the proposed NARST outperformed all the benchmark methods in terms of precision, recall, and F1-score (Fig. 6d). Therefore, in clinical decision-making, the proposed NARST generally outperformed all the benchmark methods in classifying effective treatment alternatives.
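
The four metrics reported per domain can be computed with a one-vs-rest count of true and false positives and negatives. The following is a minimal sketch under that assumption; the label names and the toy label sequences are hypothetical and are not the nephrologists' annotations.

```python
from typing import Dict, Sequence


def per_class_metrics(y_true: Sequence[str], y_pred: Sequence[str],
                      label: str) -> Dict[str, float]:
    """One-vs-rest accuracy, precision, recall, and F1-score for one label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical expert labels and model predictions.
truth = ["effective", "effective", "ineffective", "observe", "effective", "ineffective"]
preds = ["effective", "ineffective", "ineffective", "observe", "effective", "effective"]
print(per_class_metrics(truth, preds, "effective"))
```

Metrics that would divide by zero are reported as 0.0 here, which is a convention chosen for this sketch.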

Fig. 6 Classification performance evaluation of the NARST model

5.3 Clinical efficacy evaluation of CRF in real-world settings

5.3.1 CRF data calculation

The laboratory test indicator values of groups \(x_1, x_2,...,x_i,...,x_M\) are calculated as shown in Table 5 according to Eq.  11 continued from Example 2.

Table 5 Laboratory test indicators values of groups \(x_1, x_2,...,x_i,...,x_M\)

The laboratory test indicator values of individual patients are sorted in ascending order. Then, the laboratory test indicator value of each \(x_i\) is calculated according to Eq. 12, and the corresponding treatment scheme of group patients is sorted according to \(V_A (T,S)\) value, as shown in Table 6.

Table 6 Treatment scheme effect ranking results

A given \(\rho >0\) indicates that the scheme is valid. \(\rho \le 0\) indicates that the scheme is uncertain (or invalid). The number of groups in the dataset for which the treatment regimen was effective during this period was 1445, accounting for approximately 61\(\% \) of the total population. The number of people who did not respond to treatment was 934, or approximately 39\(\%\) of the total population. We next test the difference.

5.3.2 Chi-square test for differences

Next, we use a contingency table to perform chi-square tests for differences in treatment outcomes across age, sex, and comorbidities.

(1) Sex and age differences in treatment outcomes.

Table 7 Sex and age differences in treatment outcomes

\(P > 0.05\) indicates that no significant difference in treatment effect exists by sex or age, as shown in Table 7.

(2) The difference in treatment effect in the combination of underlying diseases.

Samples with renal insufficiency (chronic renal insufficiency) combined with underlying diseases such as gout (rheumatism), hypertension, diabetes, hyperuricaemia or abnormal liver function are selected for the Pearson chi-square test, as shown in Table 8.

Table 8 The combination diseases differences in treatment outcomes

* means \(P < 0.05\), indicating a significant difference in the combined effects of hypertension and diabetes mellitus.

Hence, during this therapeutic period, the treatment outcomes are more favourable for the patient population with renal insufficiency (chronic renal insufficiency) accompanied by hypertension, diabetes, or hyperuricaemia. However, no therapeutic effects have been observed yet in the group with renal insufficiency (chronic renal insufficiency) complicated by gout (rheumatism) or hepatic dysfunction.
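
The Pearson chi-square test on a contingency table can be sketched as follows. The sketch computes the statistic from observed and expected counts and, for a 2 × 2 table (one degree of freedom), obtains the P value from the complementary error function. The counts are hypothetical and are not those of Tables 7 and 8, and no continuity correction is applied.

```python
import math
from typing import Sequence


def pearson_chi_square(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                raise ValueError("expected count of zero; test is undefined")
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def p_value_df1(chi2: float) -> float:
    """Upper-tail P value of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical 2 x 2 table: rows = with / without a comorbidity,
# columns = effective / not effective.
counts = [[30, 10], [20, 40]]
chi2 = pearson_chi_square(counts)
p = p_value_df1(chi2)
print(round(chi2, 4), p < 0.05)
```

For tables larger than 2 × 2, the degrees of freedom are (rows − 1) × (columns − 1), and the P value requires the general chi-square distribution rather than this one-degree-of-freedom shortcut.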

5.3.3 Ranking treatment schemes

The treatment effect will be ranked according to the value calculated by the NARST model. By combining these results with the analysis results presented in Sect. 5.3.2, the ranking results of the treatment effect were obtained as follows: no complications > CKD combined with hypertension > CKD combined with diabetes. S is the set of decision plans corresponding to the decision time point, and the treatment plan can be summarized as follows:

(1) Treatment of CRF combined with diabetes.

The Western medicine schemes are as follows:

  • Scheme 1: Recombinant erythropoietin injection, compound alpha-ketone tablets, furosemide tablets, raloxifene dispersible tablets, Uraemic Kang capsule, Haiqun Shenni capsule, hydrochlorothiazide tablets, etc.

  • Scheme 2: Lai Fu insulin injection, Ganjing insulin injection (Insulin), Uraemic Kang capsule, compound alpha-ketone tablets, calcitriol soft capsules, calcium carbonate/vitamin D3 tablets, etc.

  • Scheme 3: Lai Fu insulin injection, Ganjing insulin injection (Insulin), compound alpha-ketone tablets, calcitriol soft capsules, calcium carbonate/vitamin D3 tablets, etc.

  • Scheme 4: Raloxifene dispersible tablets, Uraemic Kang capsule, compound alpha-ketone tablets, recombinant erythropoietin injection, calcitriol soft capsules, calcium acetate capsules, etc.

In addition, the herbal combinations are as follows:

  • Scheme 1: Poria, Astragalus root, Chinese yam, white atractylodes, coptis, salvia, cinnamon twig, etc.

  • Scheme 2: Poria, Astragalus root, Chinese yam, white atractylodes, prepared rehmannia, dogwood fruit, roasted licorice, etc.

  • Scheme 3: Coptis, bamboo shavings, steamed tangerine peel, processed pinellia, processed atractylodes, Poria, licorice, Job’s tears, etc.

  • Scheme 4: Atractylodes, achyranthes, prepared rhubarb, common knotgrass, Job’s tears, steamed tangerine peel, licorice, etc.

(2) Treatment of CRF combined with hypertension.

The Western medicine schemes are as follows:

  • Scheme 1: Recombinant human erythropoietin injection, sodium bicarbonate tablet, nifedipine controlled release tablet (CCB), compound A-ketoic acid tablet, carvedilol tablet (Beta-blocker), urine Dukang mixture, etc.

  • Scheme 2: Nifedipine controlled release tablets (CCB), carvedilol tablets (Beta-blocker), calcitriol softgel capsules, calcium carbonate/vitamin D3 tablets, sodium bicarbonate tablets, etc.

  • Scheme 3: Nifedipine controlled release tablets (CCB), carvedilol tablets (Beta-blocker), metoprolol sustained-release tablets, calcitriol softgel capsules, calcium carbonate/vitamin D3 tablets, lanthanum carbonate chewable tablets, recombinant human erythropoietin injection, etc.

  • Scheme 4: Nifedipine controlled release tablets (CCB), carvedilol tablets (Beta-blocker), terazosin capsules, compound A-ketoic acid tablets, sodium bicarbonate tablets, recombinant human erythropoietin injection, etc.

In addition, the herbal combinations are as follows:

  • Scheme 1: Poria, Astragalus, yam, white art, Salvia miltiorrhiza, cassia twigs, etc.

  • Scheme 2: Astragalus, cooked rhubarb, ulmus, salviorrhiza, Tuckahoe, etc.

  • Scheme 3: Coptis, bamboo shavings, steamed orange peel, pinellia, atractylodes, licorice, coix seed, etc.

  • Scheme 4: Steamed orange peel, pinellia, Tuckahoe, licorice, white art, cooked rhubarb, atractylodes, etc.

The authoritative clinical medical literature also provides treatment schemes for CRF complications that are largely consistent with the results of this paper, as shown in Table 9. The remaining inconsistencies may be due to personalized clinical medication strategies. In addition, a comparison of the results calculated by the proposed NARST model with those in the clinical medical literature suggests that the progression of CKD is not easily alleviated in real-world settings and that non-first-line therapies used as the primary treatment options have a better clinical effect than first-line therapies.

Table 9 Clinical medical literatures

6 Conclusion and outlook

In this study, a clinical effectiveness evaluation model and method for the long-term diagnosis and treatment of chronic diseases in practical situations are proposed. The traditional paradigm of clinical decision-making research follows a problem-driven statistical approach, where the data come after the problem. In contrast, the decision-making problem of chronic disease efficacy evaluation based on a real-world dataset is investigated in this paper, integrating granular computing with machine learning and nonadditive measures. A temporal correlation feature rough set model based on the Choquet integral and machine learning is constructed. The theoretical model is then applied to the decision-making problem of long-term clinical effectiveness evaluation in chronic kidney failure in a real-world setting. The theoretical model is used to analyse various potential associative patterns in clinical real-world data and to conduct an empirical analysis of clinical decision-making practices based on data. Finally, a research paradigm and a “data first, problem second” approach are formed.

The research in this paper makes two main contributions. First, it establishes the general theory of decision making and the decision-making method of a temporal correlation hybrid feature rough set model. Second, it integrates data-driven clinical decision-making paradigms with traditional medical research paradigms, achieving preliminary explorations of the feasibility of data-driven clinical decision-making in the field of clinical decision-making. Additionally, the proposed methods provide a technical and applied approach for the classification of groups in cohort studies and controlled studies in medical real-world research.

Clinical practice is a complex engineering system, and the unique characteristics of the clinical decision-making process and the information generated (such as interval data), as well as the new features of the big data era, render traditional medical statistical methods ineffective in analysing these data. Furthermore, the comprehensive analysis of data used for clinical decision-making is even more complex and challenging. Future research will focus on the integration of multimodal data, including interval data, images, and signals (such as pulse), in clinical decision-making problems.