Combining machine learning and metaheuristics algorithms for classification method PROAFTN

Supervised learning classification algorithms are among the most successful techniques for ambient assisted living environments. However, the usual supervised classification approaches face issues that limit their application, especially in dealing with knowledge interpretation and with very large, unbalanced labeled data sets. To address these issues, the fuzzy classification method PROAFTN was proposed. PROAFTN belongs to the family of learning algorithms and determines fuzzy resemblance measures by generalizing the concordance and discordance indexes used in outranking methods. The main goal of this chapter is to show how combining meta-heuristics with inductive learning techniques can improve the performance of the PROAFTN classifier. The improved PROAFTN classifier is described and compared to well-known classifiers in terms of learning methodology and classification accuracy. Throughout this chapter we show the ability of metaheuristics, when embedded in the PROAFTN method, to solve classification problems efficiently.



Introduction

In this chapter we introduce and compare various algorithms that have been used to enhance the performance of the classification method PROAFTN. PROAFTN is a supervised learning method that learns from a training set and builds a set of prototypes to classify new objects [10,11]. Supervised learning classification methods have been applied extensively in Ambient Assisted Living (AAL) on sensor-generated data [36]. The enhanced algorithm can be used, for instance, for activity recognition and behavior analysis in AAL from sensor data [43], or for the classification of daily living activities in a smart home using the generated sensor data [36]. Hence, the enhanced PROAFTN classifier can be integrated into active and assisted living systems as well as into smart-home health care monitoring frameworks, like any of the classifiers used in the comparative study presented in this chapter [47]. This chapter is concerned with supervised learning methods, where the given samples or objects have known class labels (the training set), and the target is to build a model from these data to classify unlabeled instances (the testing data). We focus on classification problems in which classes are identified by discrete, or nominal, values indicating for each instance the class to which it belongs, among the classes residing in the data set [21,60]. Supervised classification problems require a classification model that identifies the behaviors and characteristics of the available objects or samples in the training set. This model is then used to assign a predefined class to each new object [31]. A variety of research disciplines, such as statistics [60], Multiple Criteria Decision Aid (MCDA) [11,22] and artificial intelligence, have addressed the classification problem [39]. The field of MCDA [10,63] includes a wide variety of tools and methodologies developed for the purpose of helping a decision maker (DM) to select from finite sets of alternatives according to two or more criteria [62]. In MCDA, classification problems can be distinguished from other classification problems within the machine learning framework from two perspectives [2]. The first concerns the characteristics describing the objects, which are assumed to have the form of decision criteria, providing not only a description of the objects but also some additional preferential information associated with each attribute [22,51]. The second concerns the nature of the classification pattern, which is defined both as ordinal, known as sorting [35], and as nominal, known as multicriteria classification [10,11,63]. Classification-based machine learning models usually fail to tackle these issues, focusing basically on the accuracy of the results obtained from the classification algorithms [62].

This chapter is devoted to the classification method based on the preference relational models known as outranking relational models, as described by Roy [52] and Vincke [59]. The method presented here employs a partial comparison between the objects to be classified and the prototypes of the classes on each attribute. Then, it applies a global aggregation using the concordance and non-discordance principle [45]. It therefore avoids resorting to a conventional distance that aggregates the scores of all attributes in the same value unit. Hence, it helps to overcome some difficulties encountered when data are expressed in different units and to find the correct data preprocessing and normalization methods. The PROAFTN method uses the concordance and non-discordance principle that belongs to the MCDA field developed by Roy [52,54]. Moreover, Zopounidis and Doumpos [63] divide classification problems based on MCDA into two categories: sorting problems, for methods that use a preferential ordering of classes, and multicriteria classification (nominal sorting), where there is no preferential ordering of classes. In the MCDA field, the PROAFTN method is considered a nominal sorting or multicriteria classification method [10,63]. The main characteristic of multicriteria classification is that the classification models do not result automatically from the training set alone but depend also on the judgment of an expert. In this chapter we will show how techniques from machine learning and optimization can determine accurate parameters for the fuzzy classification method PROAFTN [11]. When applying the PROAFTN method, we need to learn the values of several parameters; in the case of our proposed method, these include the boundaries of the intervals that define the prototype profiles of the classes, the attributes' weights, etc. To determine the attributes' intervals, PROAFTN applies the discretization technique described by Ching et al. [20] to a set of pre-classified objects representing a training set [13]. Even though these approaches offer good-quality solutions, they still need considerable computational time. The focus of this chapter is the application of different optimization techniques based on meta-heuristics for learning the PROAFTN method. To apply PROAFTN to very large data, there are many parameters to be set. If one were to use exact optimization methods to infer these parameters, the required computational effort would be an exponential function of the problem size. Therefore, it is sometimes necessary to abandon the search for the optimal solution using deterministic algorithms, and simply seek a good solution in a reasonable computational time using meta-heuristic algorithms. In this chapter, we will show how an inductive learning method based on meta-heuristic techniques can lead to efficient multicriteria classification data analysis.

The major characteristics of the multicriteria classification method compared with other well-known classifiers can be summarized as follows:

- The PROAFTN method can apply two learning approaches: deductive (knowledge-based) and inductive learning. In the deductive approach, the expert has the role of establishing the required parameters for the studied problem; for example, the experts' knowledge or rules can be expressed as intervals, which can be implemented easily to build the prototypes of the classes. In the inductive approach, the parameters and the classification models are obtained and learned automatically from the training dataset.

- PROAFTN uses outranking and preference modeling as proposed by Roy [52], and it can hence be used to gain understanding about the problem domain.

- PROAFTN uses fuzzy sets for deciding whether an object belongs to a class or not. The fuzzy membership degree gives an idea about its weak and strong membership to the corresponding classes.

The overriding goal of this study is to present a generalized framework to learn the classification method PROAFTN, and then to compare the performance and efficiency of the learned method against well-known machine learning classifiers.

We shall conclude that the integration of machine learning techniques and meta-heuristic optimization into the PROAFTN method leads to a significantly more robust and efficient data classification tool.

The rest of the chapter is organized as follows: Sect. 2 overviews the PROAFTN methodology and its notations. Section 3 explains the generalized learning framework for PROAFTN. In Sect. 4 the results of our experiments are reported. Finally, conclusions and future work are drawn in Sect. 5.


PROAFTN Method

This section describes the PROAFTN procedure, which belongs to the class of supervised learning methods for solving classification problems. Based on fuzzy relations between the objects being classified and the prototypes of the classes, it seeks to define a membership degree between the objects and the classes of the problem [11]. The PROAFTN method is based on an outranking relation, used as an alternative to the Euclidean distance, through the calculation of an indifference index between the object to be assigned and the prototypes of the classes obtained through the training phase. Hence, to assign an object to a class, PROAFTN follows the rule known as the concordance and non-discordance principle used by the outranking relations: if the object a is judged indifferent or similar to a prototype of the class according to the majority of attributes (the "concordance principle"), and no attribute uses its veto against the affirmation "a is indifferent to this prototype" (the "non-discordance principle"), then the object a is considered indifferent to this prototype and it should be assigned to the class of this prototype [11,52].

PROAFTN has been applied to the resolution of many real-world practical problems, such as acute leukemia diagnosis [14], asthma treatment [56], cervical tumor segmentation [50], Alzheimer diagnosis [18], e-Health [15], optical fiber design [53], and astrocytic and bladder tumor grading by means of a computer-aided diagnosis image analysis system [12]; it was also applied to image processing and classification [1]. PROAFTN has also been applied to intrusion detection and the analysis of cyber-attacks [24,25].


PROAFTN Notations

The PROAFTN notations used in this chapter are presented in Table 1.


Fuzzy Intervals

For each attribute g_j in the set of m attributes {g_1, g_2, ..., g_m}, an interval [S1_j(b^h_i), S2_j(b^h_i)] is defined, where S2_j(b^h_i) ≥ S1_j(b^h_i). Two thresholds d1_j(b^h_i) and d2_j(b^h_i) are introduced to define the fuzzy intervals: the pessimistic interval [S1_j(b^h_i), S2_j(b^h_i)] and the optimistic interval [S1_j(b^h_i) − d1_j(b^h_i), S2_j(b^h_i) + d2_j(b^h_i)].
The pessimistic intervals are determined by applying discretization techniques to the training set, as described in [26,28]. Classical data mining techniques, such as decision trees, discretize numerical domains (continuous numeric values) into intervals, and the discretized intervals are treated as ordinal "discretized" values during induction. Ramírez-Gallego et al. [29] present more details on the different approaches used for data discretization in machine learning. In our case the discretized intervals are treated as intervals; they are not reduced to discrete values. As a result, PROAFTN avoids losing information in the induction process and can use both inductive and deductive learning without transforming the continuous values to discrete data. In deductive learning, the rules can also be given by interacting with the expert in the form of ranges or intervals, and then optimized during the learning process. Figure 2 depicts the representation of PROAFTN's intervals.

To apply PROAFTN, the pessimistic interval [S1_jh, S2_jh] and the optimistic interval [q1_jh, q2_jh] [13] of each attribute in each class need to be determined. When evaluating a certain quantity or measure with a regular (crisp) interval, there are two extreme cases which we should try to avoid. It is possible to make a pessimistic evaluation, but then the interval will be wider. It is also possible to make an optimistic evaluation, but then there is a risk that the output measure falls outside the limits of the resulting narrow interval, so that the reliability of the obtained results becomes doubtful. To overcome this problem we have introduced a fuzzy approach to feature (criteria) evaluation, as presented in Fig. 1 [16]. Fuzzy intervals permit having simultaneously both pessimistic and optimistic representations of the studied measure [23]. This is why we introduce the outer interval (q1 to q2) as the necessary limits, while the kernel (S1 to S2) contains the most truth-like values [61]. The intervals are related as follows:

q1_jh = S1_jh − d1_jh,   q2_jh = S2_jh + d2_jh   (1)

subject to:

q1_jh ≤ S1_jh,   q2_jh ≥ S2_jh   (2)

Hence, S1_jh = S1_j(b^h_i), S2_jh = S2_j(b^h_i), q1_jh = q1_j(b^h_i), q2_jh = q2_j(b^h_i), d1_jh = d1_j(b^h_i), and d2_jh = d2_j(b^h_i).
The following subsections explain the stages required to apply PROAFTN.

Indifference Relation

The initial stage of the classification procedure is performed by calculating the fuzzy indifference relation, a fuzzy resemblance measure. The fuzzy indifference relation is based on the concordance and non-discordance principle, which compares the object a to the prototype b^h_i according to the attribute g_j, where

C^i_jh(a, b^h_i) = min{C^i1_jh(a, b^h_i), C^i2_jh(a, b^h_i)},   (4)

with

C^i1_jh(a, b^h_i) = (d1_j(b^h_i) − min{S1_j(b^h_i) − g_j(a), d1_j(b^h_i)}) / (d1_j(b^h_i) − min{S1_j(b^h_i) − g_j(a), 0})

and

C^i2_jh(a, b^h_i) = (d2_j(b^h_i) − min{g_j(a) − S2_j(b^h_i), d2_j(b^h_i)}) / (d2_j(b^h_i) − min{g_j(a) − S2_j(b^h_i), 0}).

D^i_jh(a, b^h_i) is the discordance index that measures the difference between the object a and the prototype b^h_i according to the attribute g_j. Two veto thresholds v1_j(b^h_i) and v2_j(b^h_i) [11] are used to define this value, beyond which the object a is considered perfectly different from the prototype. Three comparative cases between the object a and the prototype b^h_i according to the attribute g_j are obtained (Fig. 2):

- case 1 (strong indifference): C^i_jh(a, b^h_i) = 1 ⇔ g_j(a) ∈ [S1_jh, S2_jh] (i.e., S1_jh ≤ g_j(a) ≤ S2_jh);
- case 2 (no indifference): C^i_jh(a, b^h_i) = 0 ⇔ g_j(a) ≤ q1_jh or g_j(a) ≥ q2_jh;
- case 3 (weak indifference): the value of C^i_jh(a, b^h_i) ∈ (0, 1) is calculated based on Eq. (4) (i.e., g_j(a) ∈ [q1_jh, S1_jh] or g_j(a) ∈ [S2_jh, q2_jh]).
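The three cases above can be illustrated with a single trapezoidal function. The following is a hedged sketch, not the original implementation; the function and argument names are ours:

```python
def partial_indifference(g_a, s1, s2, d1, d2):
    """Trapezoidal partial indifference of value g_a w.r.t. the pessimistic
    interval [s1, s2] widened by the thresholds d1 (left) and d2 (right)."""
    q1, q2 = s1 - d1, s2 + d2          # optimistic interval (Eq. 1)
    if s1 <= g_a <= s2:                # case 1: strong indifference
        return 1.0
    if g_a <= q1 or g_a >= q2:         # case 2: no indifference
        return 0.0
    if g_a < s1:                       # case 3: weak indifference, left slope
        return (d1 - (s1 - g_a)) / d1
    return (d2 - (g_a - s2)) / d2      # case 3: weak indifference, right slope
```

For example, with the kernel [4, 6] and d1 = d2 = 1, a value inside the kernel scores 1, a value outside [3, 7] scores 0, and intermediate values fall on the slopes.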
The partial fuzzy indifference relation is represented by the trapezoidal membership function. This type of function is well studied in [42] and [9]. Table 2 presents the performance matrix used to evaluate the prototypes of the classes.

Evaluation of the Membership Degree

The membership degree δ(a, C_h) between the object a and the class C_h is calculated based on the indifference degrees between a and the prototypes of the class C_h. To calculate the degree of membership of the object a to the class C_h, PROAFTN applies the formula given by Eq. (6):

δ(a, C_h) = max{I(a, b^h_1), I(a, b^h_2), ..., I(a, b^h_Lh)}   (6)

Assignment of an Object to the Class

Once the membership degree of the testing (unlabeled) object a is calculated, the PROAFTN classifier assigns this object to the right class C_h by following the decision rule given by Eq. (7):

a ∈ C_h ⇔ δ(a, C_h) = max{δ(a, C_i) | i ∈ {1, ..., k}}   (7)
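Eqs. (6) and (7) amount to a max over prototypes followed by an argmax over classes. A minimal sketch, assuming the per-prototype indifference values I(a, b^h_i) have already been computed (the dictionary layout is an illustrative assumption):

```python
def classify(indifferences):
    """indifferences maps each class label to the list of I(a, b_i^h) values.
    Eq. (6): membership = max over prototypes; Eq. (7): argmax over classes."""
    memberships = {c: max(scores) for c, scores in indifferences.items()}
    return max(memberships, key=memberships.get)
```

For instance, `classify({"C1": [0.2, 0.9], "C2": [0.5, 0.4]})` picks C1, whose best prototype scores 0.9.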
3 Introduced Meta-heuristic Algorithms for Learning PROAFTN

The classification procedure used by PROAFTN to assign objects to the preferred classes is summarized in Algorithm 1.


Algorithm 1. PROAFTN classification procedure.

Input: A: set of objects; k: the number of classes; w^h_j: the weight of attribute j in class h. A is divided into training and testing sets.

Step 1: Build the classification model for PROAFTN: assign relative importance weights w^h_j, j = 1, ..., m; h = 1, ..., k, and build the prototypes' intervals as presented in Fig. 2.

Step 2: Compute the indifference index between the object a and each prototype b^h_i of class h:

I(a, b^h_i) = Σ_{j=1}^{m} w^h_j C_j(a, b^h_i),   (9)

where

C_j(a, b^h_i) = min{C1_j(a, b^h_i), C2_j(a, b^h_i)},

C1_j(a, b^h_i) = (d1_j(b^h_i) − min{S1_j(b^h_i) − g_j(a), d1_j(b^h_i)}) / (d1_j(b^h_i) − min{S1_j(b^h_i) − g_j(a), 0}),

C2_j(a, b^h_i) = (d2_j(b^h_i) − min{g_j(a) − S2_j(b^h_i), d2_j(b^h_i)}) / (d2_j(b^h_i) − min{g_j(a) − S2_j(b^h_i), 0}).

Step 3: Evaluate the membership degree:

δ(a, C_h) = max{I(a, b^h_1), I(a, b^h_2), ..., I(a, b^h_Lh)}   (10)

Step 4: Assign the object a to the class:

a ∈ C_h ⇔ δ(a, C_h) = max{δ(a, C_i) | i ∈ {1, ..., k}}   (11)
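The steps of Algorithm 1 can be sketched end to end as follows. This is a minimal illustration assuming the simplified indifference index of Eq. (9), a weighted concordance without the discordance term; the names and the prototype layout (one tuple (S1, S2, d1, d2) per attribute) are ours:

```python
def c_j(g_a, s1, s2, d1, d2):
    """Partial indifference of value g_a w.r.t. interval [s1, s2]."""
    if s1 <= g_a <= s2:
        return 1.0
    if g_a < s1:
        return max(0.0, (d1 - (s1 - g_a)) / d1) if d1 > 0 else 0.0
    return max(0.0, (d2 - (g_a - s2)) / d2) if d2 > 0 else 0.0

def indifference(a, prototype, weights):
    """Step 2: I(a, b) = sum_j w_j * C_j(a, b)  (Eq. 9)."""
    return sum(w * c_j(x, *iv) for w, x, iv in zip(weights, a, prototype))

def proaftn_classify(a, prototypes, weights):
    """Steps 3-4: membership per class (Eq. 10), then argmax (Eq. 11)."""
    membership = {h: max(indifference(a, b, weights[h]) for b in protos)
                  for h, protos in prototypes.items()}
    return max(membership, key=membership.get)
```

With one prototype per class and equal weights, an object falling inside a class's kernel intervals receives membership 1.0 for that class and is assigned to it.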
The rest of the chapter presents the different methodologies based on machine learning and metaheuristic techniques for learning the classification method PROAFTN from data. The goal of developing such methodologies is to obtain, from the training data set, the PROAFTN parameters that achieve the highest classification accuracy when applying Algorithm 1. For this purpose, the different learning methodologies are summarized in the following subsections.


Learn and Improve PROAFTN Based on Machine Learning Techniques

In [7,13], an induction approach was introduced to compose PROAFTN prototypes to be used for classification. To evaluate the performance of the proposed approaches, a general comparative study was carried out between decision tree (DT) algorithms (C4.5 and ID3) and PROAFTN based on the proposed learning techniques. That portion of the study concluded that PROAFTN and the DT algorithms (C4.5 and ID3) share a very important property: they are both interpretable. In terms of classification accuracy, PROAFTN was able to outperform DT [16].

A superior technique for learning PROAFTN was introduced using genetic algorithms (GA). More particularly, the developed technique, called GAPRO, integrates k-means and a genetic algorithm to establish PROAFTN prototypes automatically from data in near-optimal form. The purpose of using GA was to automate and optimize the selection of the number of clusters and the thresholds for refining the prototypes. Based on the results generated on 12 typical classification problems, it was noticed that the newly proposed approach enabled PROAFTN to outperform widely used classification methods [13]. A GA is an adaptive metaheuristic search algorithm based on the concepts of natural selection and biological evolution. GA principles are inspired by Charles Darwin's theory of "survival of the fittest": the strong tend to adapt and survive while the weak tend to vanish. The GA was first introduced by John H. Holland in the 1970s and further developed in 1975 to allow computers to evolve solutions to difficult search and combinatorial problems, such as function optimization and machine learning. As reported in the literature, GA represents an intelligent exploitation of a random search used to solve optimization problems. In spite of its stochastic behavior, GA is generally quite effective for rapid global searches of large, non-linear and poorly understood spaces; it exploits historical information to direct the search into regions of better performance within the search space [32,49].

In this work, GA is utilized to approximately obtain the best values for the threshold β and the number of clusters κ.The threshold β represents the ratio of the total number of objects from training set within each interval of each attribute in each class.As discussed earlier, to apply the discretization k-Means, the best κ value is required to obtain the intervals:
[S1_j(b^h_i), S2_j(b^h_i)] and [d1_j(b^h_i), d2_j(b^h_i)] and the threshold β, as illustrated in Algorithm 4. In addition, the best value of β is also required to build the classification model that contains the best prototypes, as described in Algorithm 4. Furthermore, since each dataset may have different values for κ and β, finding the best values of β and κ to compose PROAFTN prototypes is considered a difficult optimization task. As a result, GA is utilized to obtain these values. Within this framework, the value of β varies between 0 and 1 (i.e., β ∈ [0, 1]), and the value of κ ranges from 2 to 9 (κ ∈ {2, ..., 9}). The formulation of the optimization problem, which is based on the two parameters (κ and β), is defined as:
P: Maximize (100/n) Σ_{r=1}^{n} f_r(κ, β)   (12)

subject to: κ ∈ {2, ..., 9}; β ∈ [0, 1]

where the objective (fitness) function f depends on the classification accuracy and n represents the number of training objects/samples to be assigned to the different classes. The procedure for calculating the fitness function f is described in Algorithm 3. In this regard, the result of the optimization problem defined in Eq. (12) can vary within the interval [0, 100].
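A GA over the two meta-parameters of Eq. (12) can be sketched as below. This is a hedged illustration: `fitness` is a stand-in for the real cross-validated accuracy of the PROAFTN prototypes built with (κ, β), and the selection/crossover/mutation choices are ours, not necessarily those of GAPRO:

```python
import random

def fitness(kappa, beta):
    # Placeholder objective (assumption): peaks at kappa = 4, beta = 0.6.
    return 100 - 10 * abs(kappa - 4) - 50 * abs(beta - 0.6)

def ga_search(pop_size=20, generations=50, seed=0):
    rng = random.Random(seed)
    # Individuals are (kappa, beta) pairs within the feasible region.
    pop = [(rng.randint(2, 9), rng.random()) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(*ind), reverse=True)
        parents = pop[:pop_size // 2]                     # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            (k1, b1), (k2, b2) = rng.sample(parents, 2)
            k, b = rng.choice([k1, k2]), (b1 + b2) / 2    # crossover
            if rng.random() < 0.2:                        # mutation
                k = rng.randint(2, 9)
                b = min(1.0, max(0.0, b + rng.gauss(0, 0.1)))
            children.append((k, b))
        pop = parents + children
    return max(pop, key=lambda ind: fitness(*ind))
```

The returned pair stays inside the constraints κ ∈ {2, ..., 9} and β ∈ [0, 1] by construction.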

Algorithm 3. Procedure to calculate objective function f .



Learning PROAFTN Using Particle Swarm Optimization

A new methodology based on the particle swarm optimization (PSO) algorithm was introduced to learn PROAFTN. First, an optimization model was formulated, and thereafter PSO was used to solve it. PSO was proposed to induce the classification model for PROAFTN, in the so-called PSOPRO, by inferring the best parameters from data with high classification accuracy. It was found that PSOPRO is an efficient approach for data classification. The performance of PSOPRO applied to different classification datasets was compared with the well-known classification methods.

PSO is an efficient evolutionary optimization algorithm that uses the social behavior of living organisms to explore the search space. Furthermore, PSO is easy to code and requires few control parameters [17]. The proposed approach employs PSO for training and improving the efficiency of the PROAFTN classifier. In this perspective, the optimization model is first formulated, and thereafter a PSO algorithm is used for solving it. During the learning stage, PSO uses training samples to induce the best PROAFTN parameters in the form of prototypes. Then these prototypes, which represent the classification model, are used for assigning unknown samples. The target is to obtain the set of prototypes that maximizes the classification accuracy on each dataset.

The general description of the PSO methodology and its application is given in [6]. As discussed earlier, to apply PROAFTN, the pessimistic interval [S1_jh, S2_jh] and the optimistic interval [q1_jh, q2_jh] for each attribute in each class need to be determined, where:

q1_jh = S1_jh − d1_jh,   q2_jh = S2_jh + d2_jh   (13)

subject to:

q1_jh ≤ S1_jh,   q2_jh ≥ S2_jh   (14)

Hence, S1_jh = S1_j(b^h_i), S2_jh = S2_j(b^h_i), q1_jh = q1_j(b^h_i), q2_jh = q2_j(b^h_i), d1_jh = d1_j(b^h_i), and d2_jh = d2_j(b^h_i).
As mentioned above, to apply PROAFTN, the intervals [S1_jh, S2_jh] and [q1_jh, q2_jh] satisfying the constraints in Eq. (14), together with the weight w_jh, must be obtained for each attribute g_j in class C_h. To simplify the constraints in Eq. (14), a variable substitution based on Eq. (13) is used. As a result, the parameters d1_jh and d2_jh are used instead of q1_jh and q2_jh, respectively. Therefore, the optimization problem, which is based on maximizing the classification accuracy by providing the optimal parameters S1_jh, S2_jh, d1_jh, d2_jh and w_jh, is defined as:

P: Maximize f(S1_jh, S2_jh, d1_jh, d2_jh, w_jh)   (15)

The procedure for calculating the fitness function f(S1_jh, S2_jh, d1_jh, d2_jh, w_jh) is described in Table 3. To solve the optimization problem presented in Eq. (15), PSO is adopted here. The problem dimension D (i.e., the number of parameters in the optimization problem) is described as follows: each particle x is composed of the parameters S1_jh, S2_jh, d1_jh, d2_jh and w_jh, for all j = 1, 2, ..., m and h = 1, 2, ..., k. Therefore, each particle in the population is composed of D = 5 × m × k real values (i.e., D = dim(x)).
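The particle layout D = 5 × m × k can be made concrete with a small encode/decode sketch; the flattening order (classes outer, attributes inner) is an illustrative assumption:

```python
def particle_dim(m, k):
    """Five parameters (S1, S2, d1, d2, w) per attribute j and class h."""
    return 5 * m * k

def decode(x, m, k):
    """Unflatten a particle into params[h][j] = (S1, S2, d1, d2, w)."""
    assert len(x) == particle_dim(m, k)
    it = iter(x)
    return [[tuple(next(it) for _ in range(5)) for _ in range(m)]
            for _ in range(k)]
```

For example, with m = 3 attributes and k = 2 classes, each particle carries 30 real values, and `decode(x, 3, 2)[1][2]` yields the five parameters of attribute 3 in class 2.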


Differential Evolution for Learning PROAFTN

A new learning algorithm for PROAFTN based on Differential Evolution (DE) was proposed; the strategy is called DEPRO. DE is an efficient metaheuristic optimization algorithm based on a simple mathematical structure that mimics a complex process of evolution. Based on results generated from a variety of public datasets, DEPRO provides excellent results, outperforming the most common classification algorithms.

In this direction, a new learning approach based on DE is proposed for learning the PROAFTN method. More particularly, DE is introduced here to solve the optimization problem introduced in Eq. (15). The new proposed learning technique, called DEPRO, utilizes DE to train and improve the PROAFTN classifier. In this context, DE is utilized as an inductive learning approach to infer the best PROAFTN parameters from the training samples. The generated parameters are then used to compose the prototypes, which represent the classification model that will be used for assigning unknown samples. The target is to find the prototypes that maximize the classification accuracy on each dataset. The full description of the DE methodology and its application to learning PROAFTN is given in [4]. The general procedure of the DE algorithm is presented in Algorithm 5.

The fitness function f(S1_jh, S2_jh, d1_jh, d2_jh, w_jh) is calculated as before. The mutation and crossover operation applied to each element of the population is given by Eq. (16):

v_ihjτ = x_r1hjτ + F(x_r2hjτ − x_r3hjτ), if (rand_τ < κ) or (ρ = τ); otherwise v_ihjτ = x_ihjτ.   (16)

i, r1, r2, r3 ∈ {1, ..., N_pop}, i ≠ r1 ≠ r2 ≠ r3; h = 1, ..., k; j = 1, ..., m; τ = 1, ..., D

where F ∈ [0, 2] is the mutation factor and κ is the crossover factor. This modified operation (Eq. (16)) forces the mutation and crossover process to be applied to each gene τ, selected randomly for each set of 5 parameters S1_jh, S2_jh, d1_jh, d2_jh and w_jh in v_i, for all j = 1, 2, ..., m and h = 1, 2, ..., k.
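The trial-vector construction of Eq. (16) can be sketched as follows. This is a hedged illustration of standard DE mutation/crossover (flat vectors rather than the per-class indexing of Eq. (16)); the names are ours:

```python
import random

def de_trial(pop, i, F=0.5, cr=0.9, rng=random):
    """Build a DE trial vector for individual i.

    For each gene tau, take the mutated value x_r1 + F*(x_r2 - x_r3) when a
    uniform draw falls below the crossover factor cr (or tau is the forced
    index rho), and keep the current value x_i otherwise, as in Eq. (16)."""
    D = len(pop[i])
    r1, r2, r3 = rng.sample([r for r in range(len(pop)) if r != i], 3)
    forced = rng.randrange(D)  # rho: one gene is always mutated
    return [pop[r1][t] + F * (pop[r2][t] - pop[r3][t])
            if (rng.random() < cr or t == forced) else pop[i][t]
            for t in range(D)]
```

In a full DE loop, the trial vector replaces individual i only if its fitness is at least as good, which is the selection step of Algorithm 5.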


A Hybrid Metaheuristic Framework for Establishing PROAFTN Parameters

As discussed earlier, there are different ways to classify the behavior of metaheuristic algorithms based on their characteristics. One of these major characteristics is whether the evolution strategy is based on population-based search or single-point search. Population-based methods deal in every iteration with a set of solutions rather than with a single solution. As a result, population-based algorithms have the capability to efficiently explore the search space, whereas the strength of single-point methods is that they provide a structured way to explore a problem's neighborhood; the two can be combined in hybrid methods to improve the search mechanism. While the use of population-based methods ensures an exploration of the search space, the use of single-point techniques helps to identify good areas in the search space. One of the most popular ways of hybridization concerns the use of single-point search methods inside population-based methods.

Thus, a hybridization that in some way manages to combine the advantages of population-based methods with the strengths of single-point methods is often very successful, which is the motivation for this work. In many applications, hybrid metaheuristics have proved to be quite beneficial in improving the fitness of individuals [37,38,57]. In this methodology, new hybrids of metaheuristic approaches were introduced to obtain the best PROAFTN parameter configuration for a given problem. The two proposed hybrid approaches are: (1) Particle Swarm Optimization (PSO) with Reduced Variable Neighborhood Search (RVNS), called PSOPRO-RVNS; and (2) Differential Evolution (DE) with RVNS, called DEPRO-RVNS. Based on the results generated on both training and testing data, it was shown that the performance of PROAFTN is significantly improved compared with the approaches presented in the previous sections (Sects. 3.2 and 3.3). Furthermore, the experimental study demonstrated that PSOPRO-RVNS and DEPRO-RVNS strongly outperform well-known machine learning classifiers on a variety of problems. RVNS is a variation of the metaheuristic Variable Neighborhood Search (VNS) [33,34]. The basic idea of the VNS algorithm is to find a solution in the search space through a systematic change of neighborhood. The basic VNS is very useful for approximating solutions to many combinatorial and global optimization problems; however, its major limitation is that it is very time consuming, because of the local search routine it employs. RVNS uses a different approach: the solutions are drawn randomly from their neighborhoods, and the incumbent solution is replaced if a better solution is found. RVNS is simple, efficient and provides good results at low computational cost [30,34]. In RVNS, two procedures are used: shake and move. Starting from the initial solution x (e.g., the position of prematurely converged individuals), the algorithm selects a random solution x′ from the initial solution's neighborhood. If the generated x′ is better than x, it replaces x and the algorithm starts all over again with the same neighborhood; otherwise, the algorithm continues with the next neighborhood structure. The pseudo-code of RVNS is given in Algorithm 6.

Algorithm 6. Reduced Variable Neighborhood Search steps.


Require: neighborhood structures N_k, k = 1, 2, ..., k_max, to be used in the search; an initial solution x; a stopping condition.

k ← 1
while k ≤ k_max do
  Shaking: generate a point x′ at random from the k-th neighborhood of x (x′ ∈ N_k(x))
  Move or not:
  if x′ is better than the incumbent x then
    x ← x′; k ← 1
  else
    k ← k + 1
  end if
end while
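The shake/move loop of Algorithm 6 can be sketched on a one-dimensional toy problem. This is a hedged illustration: the objective, the neighborhood radii, and the restart-per-iteration structure are our assumptions:

```python
import random

def rvns(f, x0, radii=(0.1, 0.5, 1.0, 2.0), iters=200, seed=0):
    """Minimize f from x0: shake by sampling uniformly from neighborhoods of
    increasing radius, move only on improvement (Algorithm 6)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    for _ in range(iters):
        k = 0
        while k < len(radii):
            x_new = x + rng.uniform(-radii[k], radii[k])   # shaking
            f_new = f(x_new)
            if f_new < fx:                                  # move
                x, fx = x_new, f_new
                k = 0                                       # restart neighborhoods
            else:
                k += 1                                      # next neighborhood
    return x, fx
```

For instance, minimizing (x − 3)² from x0 = 10 only ever accepts improving moves, so the returned objective value never exceeds the initial one.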
In [13] the RVNS heuristic is used to learn the PROAFTN classifier by optimizing its parameters, which are presented as intervals, namely the pessimistic and optimistic intervals. In this light, a hybrid of metaheuristics is proposed here for training the PROAFTN method. In this regard, the two hybrid approaches, PSO augmented with RVNS (called PSOPRO-RVNS) and DE augmented with RVNS (called DEPRO-RVNS), are proposed: the population-based techniques presented in Sects. 3.2 and 3.3 are integrated with the single-point search RVNS to improve the performance of PROAFTN. The details of how DE and RVNS are used together to learn the PROAFTN classifier are described in [5], and the application of PSO with RVNS to learn PROAFTN follows the same scheme. Around the better solution provided by PSO or DE in each iteration, the following equations are used to update the boundaries for the previous solution x containing the parameters (S1_jh, S2_jh, d1_jh, d2_jh):

l_λjbh = x_λjbh − (k/k_max) x_λjbh   (17)

u_λjbh = x_λjbh + (k/k_max) x_λjbh   (18)

where l_λjbh and u_λjbh are the lower and upper bounds for each element λ ∈ [1, ..., D]. The factor k/k_max is used to define the boundary for each element, and x_λjbh is the previous solution for each element λ ∈ [1, ..., D] provided by PSO. The use of the hybrid PSO/DE augmented with RVNS for learning PROAFTN is explained here; for more details please see [5]. Using PSO, the elements of each particle position x_i, consisting of the parameters S1_jh, S2_jh, d1_jh and d2_jh, are updated using:
x_iλjbh(t + 1) = x_iλjbh(t) + v_iλjbh(t + 1)   (19)

where the velocity update v_i for each element, based on P_Best_i and G_Best, is formulated as:

v_iλjbh(t + 1) = ω(t) v_iλjbh(t) + τ1 ρ1 (P_Best_iλjbh − x_iλjbh(t)) + τ2 ρ2 (G_Best_λjbh − x_iλjbh(t)),   i = 1, ..., N_pop   (20)

where ω(t) is the inertia weight that controls the exploration of the search space. τ1 and τ2 are the individual and social components/weights, respectively. ρ1 and ρ2 are random numbers between 0 and 1. P_Best_i(t) is the personal best position of particle i, and G_Best(t) is the neighborhood best position of particle i. Algorithm 6 demonstrates the required steps to evolve the velocity v_i and the particle position x_i for each particle containing PROAFTN parameters. The shaking phase randomly generates the elements of x using:

x_λjbh = l_λjbh + (u_λjbh − l_λjbh) · rand[0, 1]   (21)

A comparison of the various approaches introduced throughout this research for learning PROAFTN (GAPRO, PSOPRO, DEPRO, PSOPRO-RVNS and DEPRO-RVNS) is presented in Table 5. One can see that DEPRO-RVNS and PSOPRO-RVNS perform the best. Table 7 summarizes and gives a robust analysis of a comparison of the developed approaches for learning the PROAFTN classifier against other classifiers. As observed, both DEPRO-RVNS and PSOPRO-RVNS strongly outperform the other classifiers. Therefore, the developed approaches can be classified into three groups, based on their performance:
- Best approaches: DEPRO-RVNS and PSOPRO-RVNS, in terms of classification accuracy and computation speed. One of the advantages of these methods is that they often converge faster and with more certainty than other methods. Furthermore, utilizing RVNS inside DE and PSO improved the search for good solutions in a shorter time (Table 5).

The comparison was done against implementations provided in WEKA [27] for the neural network multilayer perceptron (NN MLP), naive Bayes (NB), decision trees (PART), C4.5 and k-nearest neighbour (kNN). We used H2O for deep learning (h2o DL) [19] and generalized linear models (h2o GLM) [44]. We used R's implementation of random forest (RFOREST) [41] with n = 500 trees. PROAFTN and decision trees share a very important property: both use a white-box model. Decision trees and PROAFTN can generate classification models which can be easily explained and interpreted. However, when evaluating any classification method there is another important factor to consider: classification accuracy. Based on the experimental study presented in Sect. 4, the PROAFTN method has proven to generate higher classification accuracy than decision trees such as C4.5 [46] and other well-known learning algorithms including naive Bayes, support vector machines (SVM), neural networks (NN), k-nearest neighbour (k-NN) and rule learners (see Table 6). This can be explained by the fact that PROAFTN uses fuzzy intervals. A general comparison between PROAFTN based on the learning approaches adopted in this chapter (PRO-BPLA) and other machine learning classifiers is summarized in Table 8. The comparisons made in this table are based on evidence from existing empirical studies presented in [40]. We have also added some evidence based on the results obtained using the learning methodology developed in this research study. As a summary, Table 8 compares the properties of some well-known machine learning classifiers against those of the classification method PROAFTN. In this chapter, we have presented the implementation of machine learning and metaheuristics algorithms for the parameter training of a multicriteria classification method. We have shown that learning techniques based on metaheuristics prove to be a successful approach for optimizing the learning
of the PROAFTN classification method, thus greatly improving its performance. As has been demonstrated, every classification algorithm has its strengths and limitations. More particularly, whether a method is strong or weak depends on the situation or on the problem. For instance, assume the problem at hand is a medical dataset and the interest is in a classification method for medical diagnostics. Suppose the executives and experts are looking for a high level of classification accuracy and, at the same time, are keen to know more about the classification process (e.g., why a patient is classified into a particular disease category). In such circumstances, classifiers such as deep learning networks, k-NN, or SVM may not be an appropriate choice, because of the limited interpretability of their classification models. Although deep learning networks have been successfully applied to some healthcare applications, particularly in medical imaging, they suffer from limitations: the limited interpretability of their classification results; the need for a very large balanced labeled dataset; and the preprocessing or change of input domain often required to bring all the input data to the same scale [48]. Thus, there is a need for classifiers that can reason about their outputs and still generate good classification accuracy, such as DTs (C4.5, ID3), NB, or PROAFTN.

Based on the experimental and comparative study presented in Table 8, the PROAFTN method based on our proposed learning approaches has good accuracy in most instances and can deal with all types of data without sensitivity to noise. PROAFTN uses pairwise comparison and therefore there is no need to look for a suitable data normalization technique, as is the case for other classifiers. Furthermore, PROAFTN is a transparent and interpretable classifier: it is easy to generalize classification rules from the obtained prototypes. It can use both deductive and inductive learning, which allows historical data and expert judgment to be combined in composing the classification model. To sum up, there is no complete or comprehensive classification algorithm that can handle or fit all classification problems. In response to this deficiency, the major task of this work is to review an integration of methodologies from three major fields, MCDA, machine learning, and optimization-based metaheuristics, through the aforementioned classification method PROAFTN. The target of this study is to exploit machine learning techniques and optimization approaches to improve the performance of PROAFTN. The aim is to find a suitable and comprehensible (interpretable) classification procedure that can be applied efficiently in many applications, including ambient assisted living environments.


Conclusions and Future Work

The target of this chapter is to exploit machine learning techniques and optimization approaches to improve the performance of PROAFTN. The aim is to find a suitable and comprehensible (interpretable) classification procedure that can be applied efficiently in health applications, including ambient assisted living environments. This chapter describes the ability of metaheuristics, when embedded in the classification method PROAFTN, to classify new objects. To this end, we compared the improved PROAFTN methodology with those previously reported on the same data and with the same validation technique (10-fold cross-validation). In addition to reviewing several approaches to modeling and learning the classification method PROAFTN, this chapter also presents new ideas for further research in the areas of data mining and machine learning. Below are some possible directions for future research.

1. PROAFTN has several parameters to be obtained for each attribute and for each class, which provides more information for assigning objects to the closest class. However, in some cases this may limit the speed of learning, particularly when using metaheuristics, as presented in this chapter. Possible future solutions can be summarized as follows:

- Utilizing different approaches for obtaining the weights. One possible direction is to use a feature-ranking approach, employing strong algorithms that perform well at dimensionality reduction.
- Determining interval bounds for more than one prototype before performing optimization. This would involve establishing the intervals' bounds a priori by using some clustering techniques, hence speeding up the search and improving the likelihood of finding the best solutions.

2. As is well known, the performance of metaheuristic approaches depends on the choice of control parameters and varies from one application to another. However, in this work the control parameters were fixed for all applications. A better choice of control parameters for the metaheuristics-based PROAFTN algorithms will be investigated.

3. To speed up the PROAFTN learning process, a possible improvement is to use parallel computation. Different processors can each handle a fold independently in the cross-validation process. Parallelism can also be applied in the composition of the prototypes of each class.

4. In this chapter, inductive learning is presented to build the classification models for the PROAFTN method. PROAFTN can also apply deductive learning, which allows given knowledge to be introduced when setting PROAFTN parameters, such as intervals and/or weights, to build the class prototypes.
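The fold-level parallelism suggested in item 3 can be sketched with Python's standard library. The dataset, the `train_and_score` callback, and the contiguous fold split are illustrative assumptions; a real implementation would plug in the PROAFTN training loop.

```python
from concurrent.futures import ThreadPoolExecutor

def kfold_indices(n, k):
    """Split range(n) into k contiguous folds (illustrative split strategy)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, k, train_and_score):
    """Evaluate each fold in its own worker, as item 3 proposes.

    train_and_score -- hypothetical callback: (train_set, test_set) -> accuracy
    """
    folds = kfold_indices(len(data), k)
    def run(test_idx):
        test_set = set(test_idx)
        train = [data[i] for i in range(len(data)) if i not in test_set]
        test = [data[i] for i in test_idx]
        return train_and_score(train, test)
    # Each fold is independent, so the k evaluations can run concurrently
    with ThreadPoolExecutor(max_workers=k) as pool:
        scores = list(pool.map(run, folds))
    return sum(scores) / k
```

Because the folds share no state, the same structure would also work with process-based workers for CPU-bound PROAFTN training.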

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Fig. 1. Fuzzy approach for features evaluation


Fig. 2. Graphical representation of the partial indifference concordance index between the object a and the prototype b h i, represented by intervals.
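The partial indifference concordance index pictured in Fig. 2 can be sketched as a trapezoidal membership over the prototype's intervals. The parameter names follow the chapter's (S1, S2, d1, d2), but the piecewise-linear formula below is a common trapezoidal reading, offered as an assumption rather than the chapter's exact definition.

```python
def concordance(value, s1, s2, d1, d2):
    """Partial indifference concordance of an attribute value vs. a prototype.

    Full agreement (1.0) inside the interval [s1, s2]; linear decrease to 0
    over the discrimination thresholds d1 (below s1) and d2 (above s2).
    """
    if s1 <= value <= s2:
        return 1.0
    if value < s1:
        gap = s1 - value
        return max(0.0, 1.0 - gap / d1) if d1 > 0 else 0.0
    gap = value - s2
    return max(0.0, 1.0 - gap / d2) if d2 > 0 else 0.0
```

With d1 = d2 = 0 the index degenerates to a crisp interval test, which is exactly what the fuzzy thresholds are there to soften.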


For all a ∈ A:
   Step 1: Apply the classification procedure according to Algorithm 1
   Step 2: Compare the value of the new class with the true class C h
           - Identify the number of misclassified and unrecognized objects
           - Calculate the classification accuracy (i.e. the fitness value):
             f = (number of correctly classified objects) / n
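The fitness evaluation above reduces to counting correct assignments. In this sketch, `classify` is a hypothetical placeholder for the PROAFTN assignment procedure (Algorithm 1); it may return None for unrecognized objects, which then count against the fitness.

```python
def fitness(objects, true_labels, classify):
    """Fraction of correctly classified objects: f = correct / n.

    classify -- stand-in for the PROAFTN assignment procedure (Algorithm 1);
                misclassified and unrecognized objects both lower the score.
    """
    correct = sum(1 for obj, label in zip(objects, true_labels)
                  if classify(obj) == label)
    return correct / len(objects)
```

This is the scalar that the metaheuristics (GA, PSO, DE, RVNS) maximize when searching the parameter space.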


Table 1. Notations and parameters used by the PROAFTN method

A                  Set of objects with known labels {a1, a2, ..., an}, the preassigned objects (training set)
{g1, g2, ..., gm}  Set of m attributes
Ω                  Set of
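The notation of Table 1, together with the per-attribute, per-prototype parameters (S 1 jh, S 2 jh, d 1 jh, d 2 jh) optimized throughout the chapter, might be mirrored in code as follows; the class names and structure are an illustrative assumption, not the chapter's implementation.

```python
from dataclasses import dataclass

@dataclass
class AttributeInterval:
    """Interval parameters of one attribute j for one prototype of class h."""
    s1: float  # lower bound of the indifference interval (S 1 jh)
    s2: float  # upper bound of the indifference interval (S 2 jh)
    d1: float  # discrimination threshold below s1 (d 1 jh)
    d2: float  # discrimination threshold above s2 (d 2 jh)

@dataclass
class Prototype:
    """One prototype of a class: one interval per attribute g1..gm."""
    class_label: str
    intervals: list  # list[AttributeInterval]
```

A trained PROAFTN model is then just a collection of such prototypes per class, which is what makes its rules easy to read back out.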