Probabilistic neural network training procedure based on Q(0)-learning algorithm in medical data classification

Kusy, Maciej; Zajdel, Roman

doi:10.1007/s10489-014-0562-9

Open access
Published: 06 August 2014

Volume 41, pages 837–854, (2014)
Applied Intelligence

Maciej Kusy¹ &
Roman Zajdel¹

Abstract

In this article, an iterative procedure is proposed for the training process of the probabilistic neural network (PNN). In each stage of this procedure, the Q(0)-learning algorithm is utilized for the adaptation of PNN smoothing parameter (σ). Four classes of PNN models are regarded in this study. In the case of the first, simplest model, the smoothing parameter takes the form of a scalar; for the second model, σ is a vector whose elements are computed with respect to the class index; the third considered model has the smoothing parameter vector for which all components are determined depending on each input attribute; finally, the last and the most complex of the analyzed networks, uses the matrix of smoothing parameters where each element is dependent on both class and input feature index. The main idea of the presented approach is based on the appropriate update of the smoothing parameter values according to the Q(0)-learning algorithm. The proposed procedure is verified on six repository data sets. The prediction ability of the algorithm is assessed by computing the test accuracy on 10 %, 20 %, 30 %, and 40 % of examples drawn randomly from each input data set. The results are compared with the test accuracy obtained by PNN trained using the conjugate gradient procedure, support vector machine algorithm, gene expression programming classifier, k–Means method, multilayer perceptron, radial basis function neural network and learning vector quantization neural network. It is shown that the presented procedure can be applied to the automatic adaptation of the smoothing parameter of each of the considered PNN models and that this is an alternative training method. PNN trained by the Q(0)-learning based approach constitutes a classifier which can be treated as one of the top models in data classification problems.

1 Introduction

Probabilistic neural network (PNN) is an example of the radial basis function based model effectively used in data classification problems. It was proposed by Donald Specht [37, 38] and, as the data classifier, draws the attention of researchers from the domain of data mining. For example, it is applied in medical diagnosis and prediction [23, 25, 28], image classification and recognition [7, 20, 27], bearing fault detection [32], digital image watermarking [45], earthquake magnitude prediction [1] or classification in a time-varying environment [31].

PNN is a feed-forward neural network with a complex structure. It is composed of an input layer, a pattern layer, a summation layer and an output layer. Despite its complexity, PNN has only a single training parameter: the smoothing parameter of the probability density functions (PDFs) which are utilized for the activation of the neurons in the pattern layer. Thereby, the training process of PNN requires only a single input-output signal pass in order to compute the network response. However, the model generalizes well only when the smoothing parameter (σ) is close to its optimal value. The value of σ must therefore be estimated on the basis of the PNN’s classification performance, which is usually done in an iterative manner.

Within the process of the smoothing parameter estimation two issues must be addressed. The first one pertains to the selection of σ in PDF for the pattern layer neurons of PNN. Four possible approaches are applied, i.e. a single parameter for the whole model [37, 38], a single parameter for each class [1], a single parameter for each data attribute [11, 39], and a single parameter for each attribute and a class [7, 11, 12].

The second problem related to the smoothing parameter estimation for PNN is concerned with the computation of the σ value. In the literature, different procedures have been developed. For example, in [39], the conjugate gradient descent (ascent) is used to find iteratively the set of σ’s which maximize the optimization criterion. Chtioui et al. [7] exploit the conjugate gradient method and the approximate Newton algorithm to determine the smoothing parameters associated with each data attribute and class. In [12], the authors utilize the particle swarm optimization algorithm to estimate the matrix of the smoothing parameters for the probability density functions in the pattern layer. An interesting study is presented in [47], where the gap-based approach for smoothing parameter adaptation is proposed. The authors provide the formula for σ on the basis of the gap computed between the two nearest points of the data set. The solution is applied to PNN for which the smoothing parameter takes the form of a scalar and of a vector whose elements are associated with each data feature.

As one can observe, the choice of the smoothing parameter plays a crucial role in the training process of the probabilistic neural network. This fact is of particular importance when PNN has different σ for: each class, each attribute, and each class and attribute. The task of smoothing parameter selection can then be considered as a high-dimensional function optimization problem. The reinforcement learning (RL) algorithm is an efficient method for solving problems of this type, e.g. finding extrema of some family of functions [46] or the computation of the set of optimal weights for multilayer perceptron [40]. The RL method is also frequently applied in various engineering tasks. It is used in nonstationary serial supply chain inventory control [18], adaptive control of nonlinear objects [43], adjusting robot behavior for autonomous navigation system [26] or path planning for improving positioning accuracy of a mobile microrobot [22]. There are also studies which propose the use of RL in non-technical domains, e.g. in the explanation of dopamine neuronal activity [5] or in an educational system to improve the pedagogical policy [16].

In this work, we introduce a novel procedure for the computation of the smoothing parameter of the PNN model. This procedure uses the Q(0)-learning algorithm. The method adjusts the smoothing parameter according to four different strategies: single σ for the whole network, single σ for each class, single σ for each data attribute and single σ for each data attribute and each class. The results of our proposed solution are compared to the outcomes of PNN for which the smoothing parameter is calculated using the conjugate gradient procedure and, additionally, to the support vector machine classifier, gene expression programming algorithm, k–Means clustering method, multilayer perceptron, radial basis function neural network and learning vector quantization neural network in medical data classification problems.

The authors of the present study have already proposed the application of the reinforcement learning algorithm to the computation of the smoothing parameter of radial basis function based neural networks [19]. In that work, the stateless Q-learning algorithm was used for the adaptive computation of the smoothing parameter of the networks.

This paper is organized as follows. Section 2 discusses the probabilistic neural network highlighting its basics, structure, principle of operation and the problem of smoothing parameter selection. Section 3 presents the basis of one of reinforcement learning algorithms which is applied in this work, namely the Q(0)-learning algorithm. In Section 4, we present the proposed procedure. Here the problem statement is provided, a general idea of applying the Q(0)-learning algorithm to the choice of the smoothing parameter is described and, finally, the details of the algorithm are given. Section 5 presents the data sets used in this research, the algorithm settings and the obtained empirical results along with the illustration of the PNN training process. In this part of our work, we compare the performance of our method with the efficiency of the PNN whose σ is determined by means of the conjugate gradient method and, additionally, to the efficiency of the reference classifiers and neural networks. Finally, in Section 6, we conclude our work.

2 Probabilistic neural network

Probabilistic neural network is a data classification model which implements the Bayesian decision rule. This rule is defined as follows. Assume that: (1) there is a data pattern $\mathbf{x}\in\mathbb{R}^{n}$ which is included in one of the predefined classes g=1,…,G; (2) the probability of x belonging to the class g equals p _g; (3) the cost of classifying x into class g is c _g; (4) the probability density functions y ₁(x),y ₂(x),…,y _G(x) for all classes are known. Then, according to the Bayes theorem, the vector x is classified to the class g if p _g c _g y _g(x)>p _h c _h y _h(x) for every h≠g. Usually p _g=p _h and c _g=c _h, in which case x is classified to the class g if y _g(x)>y _h(x).
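The decision rule above can be illustrated with a minimal Python sketch. The function name `bayes_decide` and all numeric values are assumptions made for illustration only; the densities are supplied directly rather than estimated.

```python
def bayes_decide(priors, costs, densities):
    """Return the class g with the largest product p_g * c_g * y_g(x)."""
    return max(densities, key=lambda g: priors[g] * costs[g] * densities[g])


# Made-up density values y_g(x) for one pattern x and two classes.
densities = {1: 0.12, 2: 0.30}
equal_costs = {1: 1.0, 2: 1.0}

# With equal priors and costs the rule reduces to comparing y_g(x).
print(bayes_decide({1: 0.5, 2: 0.5}, equal_costs, densities))

# A strongly unequal prior can change the decision.
print(bayes_decide({1: 0.9, 2: 0.1}, equal_costs, densities))
```

With equal priors and costs the products differ only through y _g(x), which is the simplification used in the remainder of this section.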

In real data classification problems, no knowledge of the probability density functions y _g(x) is available since a data set distribution is usually unknown. Therefore, some approximation of the PDF must be determined. Such an approximation can be obtained using the Parzen method [29]. Commonly, the Gaussian function is chosen for the PDF since it satisfies the conditions required by Parzen’s method.

The assumption of using the Gaussian density for PDF gives the possibility of constructing a feed-forward classifier. It is composed of the input layer represented by the attributes of x, the pattern layer and the summation layer consisting of G neurons where each one computes the signal only for patterns which belong to g-th class

$$ y_{g}\left( \mathbf{x};\sigma\right) =\frac{1}{l_{g}\left( 2\pi\right) ^{n/2}\sigma^{n}}{\displaystyle\sum \limits_{i=1}^{l_{g}}}\exp\left( -{\displaystyle\sum\limits_{j=1}^{n}} \frac{\left( x_{ij}^{(g)}-x_{j}\right)^{2}}{2\sigma^{2}}\right) , $$

(1)

where l _g is the number of examples of class g, σ denotes the smoothing parameter, $x_{ij}^{(g)}$ is the j-th element of the i-th training vector (i=1,…,l _g) which is contained in the class g and x _j is the j-th coordinate of the unknown vector x. Finally, the output layer estimates the class of x in accordance with the Bayes’s decision rule based on the outputs of all the summation layer neurons

$$ G^{\ast}\left( \mathbf{x}\right) =\arg\underset{g}{\max}\left\{ y_{g}\left( \mathbf{x}\right) \right\}, $$

(2)

where $G^{\ast }\left (\mathbf {x}\right ) $ denotes the predicted class of the pattern x. Since y _g defined in (1) depends on scalar σ, this type of PNN is henceforth named PNNS. The architecture of the probabilistic neural network is depicted in Fig. 1.
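As an illustration of (1) and (2), the following minimal Python sketch evaluates the summation layer signal and the output layer decision of PNNS. The function names and the toy data are assumptions introduced here; they are not taken from the original study.

```python
import math


def pnns_signal(x, class_patterns, sigma):
    """Summation layer signal y_g(x; sigma) of Eq. (1) for a single class g."""
    n = len(x)
    l_g = len(class_patterns)
    norm = l_g * (2.0 * math.pi) ** (n / 2.0) * sigma ** n
    total = 0.0
    for pattern in class_patterns:
        sq_dist = sum((p_j - x_j) ** 2 for p_j, x_j in zip(pattern, x))
        total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / norm


def pnns_classify(x, patterns_by_class, sigma):
    """Output layer of Eq. (2): the class with the largest y_g(x)."""
    return max(patterns_by_class,
               key=lambda g: pnns_signal(x, patterns_by_class[g], sigma))


# Toy training patterns in R^2 (made up), keyed by class label.
toy = {
    1: [(0.0, 0.0), (0.5, 0.2)],
    2: [(3.0, 3.0), (2.5, 3.2), (3.1, 2.7)],
}
print(pnns_classify((0.2, 0.1), toy, sigma=1.0))
print(pnns_classify((3.0, 3.0), toy, sigma=1.0))
```

Note that no weights are fitted: the training patterns themselves form the pattern layer, and σ is the only quantity left to tune.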

If we consider that the patterns of particular classes differ in their densities, then the summation layer signal defined in (1) has a different shape depending on the value of the smoothing parameter in relation to the class (such a model is called PNNC)

$$ y_{g}\left( \mathbf{x};\boldsymbol{\sigma}_{C}\right) =\frac{1}{l_{g}\left( 2\pi\right) ^{n/2}\left( \sigma^{(g)}\right)^{n}}{\displaystyle\sum \limits_{i=1}^{l_{g}}}\exp\left( -{\displaystyle\sum\limits_{j=1}^{n}} \frac{\left( x_{ij}^{(g)}-x_{j}\right)^{2}}{2\left( \sigma^{(g)}\right)^{2}}\right), $$

(3)

where $\boldsymbol {\sigma }_{C}=\left [ \sigma ^{(1)},\ldots ,\sigma ^{(G)}\right ]^{T} $ is the smoothing parameter vector consisting of σ ^(g) elements associated with g-th class.

It is also possible to differentiate the smoothing parameter with respect to each attribute of the input data. In such a case, the formula in (1) takes the following form

$$ y_{g}\left( \mathbf{x};\boldsymbol{\sigma}_{V}\right) =\frac{1}{l_{g}\left( 2\pi\right) ^{n/2}{\displaystyle\prod\limits_{j=1}^{n}}\sigma_{j}}{\displaystyle\sum \limits_{i=1}^{l_{g}}}\exp\left( -{\displaystyle\sum\limits_{j=1}^{n}} \frac{\left( x_{ij}^{(g)}-x_{j}\right)^{2}}{2{\sigma_{j}^{2}}}\right) , $$

(4)

where $\boldsymbol {\sigma }_{V}=\left [ \sigma _{1},\ldots ,\sigma _{n}\right ] $ is the smoothing parameter vector consisting of σ _j elements associated with the j-th input variable. PNN with the smoothing parameters different for each variable is denoted as PNNV.

Finally, if one regards a PNN model, whose smoothing parameter is different for each data variable and each class, the network’s summation layer signal can be expressed in the most general form (such a model is named PNNVC)

$$ y_{g}\left( \mathbf{x};\boldsymbol{\sigma}_{VC}\right) =\frac{1}{l_{g}\left( 2\pi\right) ^{n/2}{\displaystyle\prod\limits_{j=1}^{n}}\sigma_{j}^{(g)}}{\displaystyle\sum \limits_{i=1}^{l_{g}}}\exp\left( -{\displaystyle\sum\limits_{j=1}^{n}} \frac{\left( x_{ij}^{(g)}-x_{j}\right)^{2}}{2 \left(\sigma_{j}^{(g)}\right)^{2}}\right), $$

(5)

where

$$ \boldsymbol{\sigma}_{VC}=\left[ \begin{array} [c]{ccccc} \sigma_{1}^{(1)}, & \ldots, & \sigma_{j}^{(1)}, & \ldots, & \sigma_{n}^{(1)}\\ {\vdots} & {\ddots} & {\vdots} & {\ddots} & \vdots\\ \sigma_{1}^{(g)}, & \ldots, & \sigma_{j}^{(g)}, & \ldots, & \sigma_{n}^{(g)}\\ {\vdots} & {\ddots} & {\vdots} & {\ddots} & \vdots\\ \sigma_{1}^{(G)}, & \ldots, & \sigma_{j}^{(G)}, & \ldots, & \sigma_{n}^{(G)} \end{array} \right] $$

(6)

is the matrix of the smoothing parameters where each $\sigma _{j}^{(g)}$ element is associated with the j-th input variable and g-th class.

Taking into account the four above possibilities of computing summation layer signal, the PNN output defined in (2) is generalized to the following form

$$ G^{\ast}\left( \mathbf{x}; sigma \right) =\arg\underset{g}{\max}\left\{ y_{g}\left( \mathbf{x}; sigma \right) \right\} , $$

(7)

where

$$ sigma=\left\{ \begin{array}{lll} \sigma & \text{for\quad PNNS}\\ \boldsymbol{\sigma}_{C} & \text{for\quad PNNC}\\ \boldsymbol{\sigma}_{V} & \text{for\quad PNNV}\\ \boldsymbol{\sigma}_{VC} & \text{for\quad PNNVC} \end{array} \right. , $$

(8)

and y _g is computed according to (1), (3), (4) and (5) for PNNS, PNNC, PNNV and PNNVC, respectively.
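Because PNNS, PNNC and PNNV are special cases of PNNVC, all four models can be evaluated with one routine once the smoothing parameter is expanded to a G×n matrix. The Python sketch below shows one way to do this; the helper names (`as_matrix`, `summation_signal`, `predict`), the 0-based class indexing and the toy data are assumptions of this illustration.

```python
import math


def summation_signal(x, class_patterns, sigmas):
    """y_g(x) with one smoothing parameter per attribute, as in Eq. (5)."""
    n = len(x)
    norm = len(class_patterns) * (2.0 * math.pi) ** (n / 2.0) * math.prod(sigmas)
    total = 0.0
    for pattern in class_patterns:
        total += math.exp(-sum((p - xj) ** 2 / (2.0 * s ** 2)
                               for p, xj, s in zip(pattern, x, sigmas)))
    return total / norm


def as_matrix(model, sigma, n, G):
    """Expand the smoothing parameter of Eq. (8) to a G x n matrix."""
    if model == "PNNS":
        return [[sigma] * n for _ in range(G)]
    if model == "PNNC":
        return [[sigma[g]] * n for g in range(G)]
    if model == "PNNV":
        return [list(sigma) for _ in range(G)]
    if model == "PNNVC":
        return [list(row) for row in sigma]
    raise ValueError(f"unknown model: {model}")


def predict(x, patterns, sigma_matrix):
    """Eq. (7); classes are indexed 0..G-1 in this sketch."""
    signals = [summation_signal(x, patterns[g], sigma_matrix[g])
               for g in range(len(patterns))]
    return signals.index(max(signals))


# Toy data with G = 2 classes and n = 2 attributes (made up).
patterns = [[(0.0, 0.0), (0.5, 0.2)], [(3.0, 3.0), (2.5, 3.2)]]
examples = [("PNNS", 1.0), ("PNNC", [0.5, 1.5]), ("PNNV", [1.0, 2.0]),
            ("PNNVC", [[0.5, 1.0], [1.5, 2.0]])]
for model, sigma in examples:
    print(model, predict((0.4, 0.3), patterns, as_matrix(model, sigma, n=2, G=2)))
```

When every row of the matrix holds the same value in all columns, the product of the σ's equals σ raised to the power n, so the routine reproduces (1) and (3).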

Figure 2 shows the difference between the PNN models expressed in terms of the smoothing parameter selection and computation for the summation neuron output signal y _g defined by the formulas (1), (3), (4) and (5). In this figure, we can see five points in $\mathbb {R}^{2}$ space which belong to two classes. The summation layer signals y ₁ and y ₂ are only marked for PNNS. One can observe that a single σ for the whole network (PNNS) does not enable the consideration of the input data densities of each class. Different shapes of PDFs for a PNNC model allow the dispersion of data of each class to be taken into account. When the smoothing parameter is different for each variable (PNNV), the particular PDFs take an elliptical form. This approach does not consider the input data densities of each class but provides information about the influence of input attribute values on y _g. Finally, PNNVC is the network which integrates data classes densities and the influence of the particular input features. This is the most general form of the model since PNNS, PNNC and PNNV are special cases of PNNVC.

3 Reinforcement learning

3.1 Introduction

Reinforcement learning addresses the problem of the agent that must learn to perform a task through a trial and error interaction with an unknown environment. The agent and the environment interact continuously until the terminal state is reached. The agent senses the environment and selects an action to perform. Depending on the effect of its action, the agent obtains a reward. Its goal is to maximize the discounted sum of future reinforcements r _t received in the long run in any time step t, which is usually formalized as $\sum \nolimits _{t=0}^{\infty }\gamma ^{t}r_{t}$, where $\gamma \in \left [ 0,1\right ] $ is the agent’s discount rate [41].

The mathematical model of the reinforcement learning method is a Markov Decision Process (MDP). MDP is defined as the quadruple $\langle S, A, P_{s_{t}s_{t+1}}^{a_{t}}, r_{t} \rangle $ where S is a set of states, A is a set of actions, $P_{s_{t}s_{t+1}}^{a_{t}}$ denotes the probability of the transition to the state $s_{t+1}\in S$ after the execution of the action a _t∈A in the state s _t∈S, and r _t is the reinforcement signal received in time step t.

3.2 Q(0)-learning

Different types of reinforcement learning algorithms exist. The Q(0)-learning proposed by Watkins [44] is one of the most often used. This algorithm computes the table of all $Q\left (s,a\right )$ values (called Q–table) by successive approximations. $Q\left (s,a\right ) $ represents the expected pay-off that an agent can obtain in state s after it performs action a. In time step t, the Q–table is updated for the state-action pair $\left (s_{t},a_{t}\right ) $ according to the following formula [44]

$$ Q_{t+1}\left( s_{t},a_{t}\right) =Q_{t}\left( s_{t},a_{t}\right) +\alpha\left( r_{t}+\gamma\max_{a}Q_{t}\left( s_{t+1},a\right) -Q_{t}\left( s_{t},a_{t}\right) \right), $$

(9)

where the maximization operator refers to the action value a which may be performed in the next state $s_{t+1}$ and $\alpha \in \left (0,1\right ] $ is the learning rate. The formula (9) will be used as the basis of the algorithm for the PNN’s smoothing parameter optimization presented in the next section.
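A single application of the update (9) can be sketched in Python as follows. The dictionary-based Q–table, the state and action labels and the numeric values are assumptions chosen only to make the arithmetic easy to follow.

```python
def q_update(Q, s, a, r, s_next, alpha, gamma):
    """Apply Eq. (9) in place to a Q-table stored as a dict of dicts."""
    best_next = max(Q[s_next].values())          # max over actions in s_{t+1}
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q[s][a]


# Tiny made-up table: two states, two actions.
Q = {0: {"inc": 0.0, "dec": 0.0}, 1: {"inc": 0.5, "dec": 0.2}}
new_value = q_update(Q, s=0, a="inc", r=1.0, s_next=1, alpha=0.1, gamma=0.9)
print(new_value)
```

Here the temporal-difference term is 1.0 + 0.9·0.5 − 0 = 1.45, so with α = 0.1 the entry moves from 0 to 0.145 (up to floating-point rounding).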

4 Application of Q(0)-learning based procedure to the adaptation of PNN’s smoothing parameter

4.1 Problem statement

Assume we are given a training set in the form of the pairs $\langle \mathbf {x}_{i}, \hat {G_{i}}\rangle $, i=1,…,l, where x _i is the i-th input element and $\hat {G_{i}} $ is its corresponding output. Assume furthermore the following measure of the accuracy

$$ Acc(sigma)=\frac{1}{l_{T}}{\displaystyle\sum\limits_{i=1} ^{l_{T}} }c_{i}(sigma), $$

(10)

where l _T is the cardinality of a training set, and c _i is the indicator of the classification’s correctness defined as follows

$$ c_{i}(sigma)=\left\{ \begin{array} [c]{lll} 1 & \text{ if } & G^{\ast}(\mathbf{x}_{i};sigma)=\hat{G_{i}}\\ 0 & \text{ if } & G^{\ast}(\mathbf{x}_{i};sigma)\neq\hat{G_{i}} \end{array} \right. , $$

(11)

where $G^{\ast }(\mathbf {x}_{i};sigma)$ is defined in (7).
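The accuracy measure (10) with the indicator (11) amounts to counting correct predictions. A minimal Python sketch is given below; the stand-in classifier `threshold_classifier` and the data are hypothetical and merely replace a trained PNN.

```python
def accuracy(classify, inputs, targets):
    """Eq. (10): mean of the correctness indicators c_i of Eq. (11)."""
    correct = sum(1 for x, g in zip(inputs, targets) if classify(x) == g)
    return correct / len(inputs)


def threshold_classifier(x):
    """Hypothetical stand-in for G*(x): class 1 below 1.0, class 2 otherwise."""
    return 1 if x[0] < 1.0 else 2


inputs = [(0.2,), (0.8,), (1.5,), (0.9,)]
targets = [1, 1, 2, 2]
print(accuracy(threshold_classifier, inputs, targets))  # 0.75
```

Three of the four examples are classified correctly, so the accuracy is 3/4.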

The task is to find the optimal value of the smoothing parameter which maximizes the accuracy (10). For PNNC, PNNV and PNNVC models, this is a multivalued optimization problem. As the solution, we propose a new procedure, which is based on the Q(0)-learning algorithm. The set of system states S, the set of actions A and the reinforcement signal r which are required by the Q(0)-learning method will be defined along with the description of the algorithm.

4.2 General idea

For the adaptation of the smoothing parameter, the procedure based on the Q(0)-learning algorithm is proposed for PNNS, PNNC, PNNV and PNNVC models. The introduction of the Q(0)-learning algorithm is based on the assumption that in the PNN training process it is possible to distinguish two elements which interact with each other: the environment and the agent. The environment is composed of the data set used for the training process, the PNN model and the accuracy measure. The agent, on the basis of the policy which is represented by the action value function Q, chooses an action a _t in a state s _t. The action a _t is used to modify the smoothing parameter. In this work, the state is represented by the accuracy measure. This has some natural interpretation since the state defined in such a way is the function of PNN output, which depends on the smoothing parameter. The output of PNN is computed for the training and test set in order to determine the training and test accuracies. On the basis of the training accuracy, the next state $s_{t+1}$ and the reinforcement signal r _t are computed. The reinforcement signal provides information about the change of the training accuracy taking the negative value when the accuracy decreases and the positive value when the accuracy increases. The effect of interaction between the agent and the environment results in both the modification of the action value function Q, and the change of the smoothing parameter.

The main assumption of the proposed procedure is to perform the training of the PNN model on the training set in order to maximize the training accuracy (10). Additionally, PNN is tested by computing the accuracy on the test set. The highest test accuracy and its corresponding value of the smoothing parameter are stored. Finding the highest test accuracy of PNN will provide the optimal smoothing parameter in terms of the prediction ability.

The proposed procedure for the adaptation of the smoothing parameter of the network is illustrated in the form of the flowchart in Fig. 3. We only present the application of the procedure for the PNNV model, but it performs in a comparable manner for the remaining networks taking the cardinality of the smoothing parameters into account.

As shown, the procedure consists of M stages. In the first stage, the smoothing parameter vector σ _V of PNNV is initialized with ones. Such an initialization is proposed in the exemplary experiments in [8]. The actions from the set A ⁽¹⁾ are assigned some values which should be large. The definition of the action along with the details concerning action set selection will be provided later in subsection 4.3. Then the model is trained on the training set using the proposed Algorithm 1. This algorithm, which will be explained later in detail, finds the smoothing parameter vector $\boldsymbol {\sigma }^{(1)}_{\max }$ which maximizes the training accuracy. Algorithm 1 is performed for each time step t and, in short, for the m-th stage of the procedure, consists of the following steps:

choose a _t action using an actual policy derived from the action value function Q;
update a single element of σ _V with the value of a _t;
compute training and test accuracy $Acc_{t}$ according to (10);
update the maximal test accuracy $Acc_{\max }$ and the corresponding $\boldsymbol {\sigma }_{\max }^{(m)}$;
calculate the reinforcement signal r _t on the basis of training accuracy;
update the action value function Q.

Once the first stage of the procedure is completed, the second one begins. Here, $\boldsymbol {\sigma }_{\max }^{(2)}$ is initialized with the optimal value from the previous stage and the action set is changed. In our approach, this change relies on the decrease of all action values by an order of magnitude. The PNNV model is trained using Algorithm 1 and the smoothing parameter vector which maximizes the training accuracy is updated.

The procedure is performed M times, each time: (1) – updating $\boldsymbol {\sigma }_{\max }^{(m)}$ on the basis of $\boldsymbol {\sigma }_{\max }^{(m-1)}$, (2) – decreasing the absolute values of the actions and (3) – finding new smoothing parameter values which maximize training accuracy. Such a type of approach, where the initial absolute values of the actions are large, gives the possibility of the selection of the smoothing parameter values within a broad range. The iterative decrease of actions in subsequent stages makes $\boldsymbol {\sigma }_{\max }^{(m)}$ narrow its values. This, in turn, allows us to search for a better parameter of the PNNV model.

Once all M stages are performed, the highest test accuracy $Acc_{\max }$ is computed for $\boldsymbol {\sigma }_{\max }^{(m)}$. Such a solution provides the highest prediction ability of PNNV.
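The coarse-to-fine structure of the M stages can be sketched in Python as below. Only the staging logic is illustrated: `toy_stage` is a hypothetical stand-in for Algorithm 1 that greedily maximizes a made-up score peaked at σ = 2.34, so no PNN and no Q–table are involved here. The action values follow the pattern of decreasing all actions by an order of magnitude between stages.

```python
def multistage_search(run_stage, sigma_init, action_sets):
    """Run one stage per action set, warm-starting each from the previous result."""
    sigma = list(sigma_init)
    history = []
    for m, actions in enumerate(action_sets, start=1):
        sigma, score = run_stage(sigma, actions)
        history.append((m, list(sigma), score))
    return sigma, history


def toy_stage(sigma, actions):
    """Hypothetical stand-in for Algorithm 1 (greedy search on a made-up score)."""
    def score(s):
        return -abs(s[0] - 2.34)

    improved = True
    while improved:
        improved = False
        for a in actions:
            candidate = [sigma[0] + a]
            if score(candidate) > score(sigma):
                sigma, improved = candidate, True
    return sigma, score(sigma)


ACTION_SETS = [
    [-10.0, -1.0, -0.1, 0.1, 1.0, 10.0],      # stage 1: broad search
    [-1.0, -0.1, -0.01, 0.01, 0.1, 1.0],      # stage 2: one order of magnitude finer
    [-0.1, -0.01, -0.001, 0.001, 0.01, 0.1],  # stage 3: fine adjustment
]
final_sigma, history = multistage_search(toy_stage, [1.0], ACTION_SETS)
for stage, sigma, score in history:
    print(stage, sigma, score)
```

Each stage starts from the result of the previous one, so the later, smaller actions only refine a candidate that the larger actions have already located.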

As shown, the above procedure utilizes RL in the problem of smoothing parameter adjustment to perform a classification task. However, it is also possible to combine PNN and RL in the other way. In the work of Heinen and Engel [15], a new incremental probabilistic neural network (IPNN) is proposed. The matrix of the smoothing parameters of IPNN is used for the action selection in the RL problem. IPNN is therefore utilized as the action value function approximator.

4.3 Application of the Q(0)-learning algorithm to adaptive computation of the smoothing parameter

In this subsection, we explain the details concerning the application of the Q(0)-learning algorithm to the problem of σ _V adaptive selection for the PNNV classifier. As mentioned before, the algorithm is solely highlighted for this type of network since Q(0)-learning works in a similar manner for PNNS, PNNC and PNNVC. The only difference is related to the number of the smoothing parameters which have to be updated. For PNNV, there are n parameters of the model while for PNNS, PNNC and PNNVC, there exist 1, G and n×G smoothing parameters, respectively.

The use of the Q(0)-learning algorithm for the choice of σ _V parameter requires the definition of the set of the system states, the action set and the reinforcement signal.

Definition 1

The set of system states is defined by the accuracy measure: $S=\left \lbrace 0,\frac {1}{l_{T}},\frac {2}{l_{T}},\ldots ,\frac {l_{T}-1}{l_{T}},1\right \rbrace $. S takes the real values from the interval [0,1]. The total number of states is therefore l _T+1.
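Since each state is an accuracy value k/l _T, it can be mapped to an integer row index of the Q–table. The Python sketch below shows one possible mapping; the helper name `state_index` and the value l _T=5 are assumptions made for illustration.

```python
def state_index(acc, l_T):
    """Map an accuracy value k / l_T to the integer index k in 0..l_T."""
    return round(acc * l_T)


l_T = 5                                      # made-up training set size
states = [k / l_T for k in range(l_T + 1)]   # the set S of Definition 1
print(states)
print([state_index(s, l_T) for s in states])
```

Rounding guards against floating-point error when the accuracy is computed as a ratio.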

Definition 2

A ^(m) is the symmetric set of actions of the following form: $A^{(m)}=\left \lbrace -a_{1}^{(m)},-a_{2}^{(m)},\ldots ,-a_{p^{(m)}}^{(m)},a_{p^{(m)}}^{(m)},\ldots ,a_{2}^{(m)},a_{1}^{(m)}\right \rbrace $ where p ^(m) denotes the half of the cardinality of this set in stage m of the procedure.

Since p ^(m) in each iteration of the proposed procedure may be different, the cardinality of A ^(m) may vary and equals 2p ^(m). The action set should be chosen so that $\max \left (A^{(1)}\right ) >\ldots >\max \left (A^{(m)}\right ) >\ldots >\max \left (A^{(M)}\right ) $ holds. In our work, we assume the number of stages to be M=3 which provides three action sets. For each m-th set, the following action values are proposed

$$ \begin{array} [c]{l} A^{(1)}=\left\lbrace -10,-1,-0.1,0.1,1,10\right\rbrace,\\ A^{(2)}=\left\lbrace -1,-0.1,-0.01,0.01,0.1,1\right\rbrace,\\ A^{(3)}=\left\lbrace -0.1,-0.01,-0.001,0.001,0.01,0.1\right\rbrace. \end{array} $$

(12)

In each stage of the procedure, the smoothing parameters of PNNV are increased or decreased by the element values of A ^(m) action set. The proposition of the action set in the first stage (A ⁽¹⁾) allows the modification of σ _V with large values. This gives the possibility of searching for optimal parameters inside a broad range of values. Maximally, the elements of σ _V can be modified by the value of ±10. The first stage of the procedure ends up with finding a candidate for the optimum of the smoothing parameter. Subsequent decreases of absolute values of actions in A ⁽²⁾ shrink the domain of possible optimal parameter values. Finally, in A ⁽³⁾, the absolute values of the actions are small enough that the smoothing parameters of PNNV change only slightly. A large change of σ _V in the third stage of the procedure is not required because the optimal modification route has already been established (in the first two stages).
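The action sets in (12) can be built from their positive magnitudes, as the following Python sketch shows. The helper name `symmetric_action_set` is an assumption; the magnitudes are those listed in (12).

```python
def symmetric_action_set(magnitudes):
    """Build the symmetric set of Definition 2 from positive magnitudes a_1 > ... > a_p."""
    mags = sorted(magnitudes, reverse=True)
    return [-a for a in mags] + mags[::-1]


MAGNITUDES = {
    1: [10.0, 1.0, 0.1],
    2: [1.0, 0.1, 0.01],
    3: [0.1, 0.01, 0.001],
}
ACTION_SETS = {m: symmetric_action_set(v) for m, v in MAGNITUDES.items()}

for m, actions in ACTION_SETS.items():
    print(m, actions)

# Ordering condition on the largest action of each stage.
largest = [max(ACTION_SETS[m]) for m in (1, 2, 3)]
print(largest[0] > largest[1] > largest[2])
```

Each set has cardinality 2p ^(m)=6, and the largest action decreases from stage to stage as required.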

In order to maximize the training accuracy of PNNV, the actual reinforcement signal r _t should reward an agent when the training accuracy increases and punish an agent when the accuracy decreases. This idea can be simply formalized as follows.

Definition 3

For the accuracy computed on the training set in the actual and previous step $Acc_{t}^{train}\left (\boldsymbol {\sigma }_{V,t}\right )$ and $Acc_{t-1}^{train}\left (\boldsymbol {\sigma }_{V,t-1}\right )$, the reinforcement signal is realized as follows

$$ r_{t}=Acc_{t}^{train}\left(\boldsymbol{\sigma}_{V,t}\right)-Acc_{t-1}^{train}\left(\boldsymbol{\sigma}_{V,t-1}\right). $$

(13)

Since the training accuracy is normalized, r _t∈[−1,1].

Such a form of the reinforcement signal, combined with the action value function update, strengthens the confidence in whether the choice of an action is beneficial or not.

Algorithm 1 shows the application of the Q(0)-learning method to the adaptive choice of σ _V for the PNNV classifier. This algorithm is executed in each m-th stage of the procedure shown in Fig. 3.

The algorithm starts with the initialization of $Acc_{\max }$ on the basis of the smoothing parameter values found in the previous stage of the procedure, except for the first stage, when σ _V is initialized with ones. $Acc_{\max }$ will store the maximal test accuracy computed on the test set during the training process. Then, in step 2, the action value function Q is set to zero.

The main loop begins in step 4, which runs over the maximum number of training steps $t_{\max }$. Since the PNNV training process is considered, in step 5 the inner loop begins which iterates over the number of input features. At the beginning of this loop, the actual state s _t is observed on the basis of the training accuracy (step 6). Next, on the basis of the Q–table values, the actual action a _t is chosen at the state s _t using the ε-greedy method. Then, in step 8, the smoothing parameter is updated by adding the value of the action a _t as follows

$$ \sigma_{j,t}=\sigma_{j,t-1}+a_{t}. $$

(14)

Modification of $\sigma_{j,t}$ through the addition of the action value allows us to find the optimal smoothing parameter within the range determined by the extreme values of A ^(m) multiplied by the maximum number of training steps $t_{\max }$. Once a new value of the smoothing parameter is determined, the training accuracy $Acc_{j,t}^{train}\left (\boldsymbol {\sigma }_{V,t}\right )$ is calculated (step 9), which then (step 10) becomes the state of the system in the t+1 time step. Next, the test accuracy $Acc_{j,t}^{test}\left (\boldsymbol {\sigma }_{V,t}\right )$ is computed on the test set. If the current test accuracy is greater than the maximum one, both $\boldsymbol {\sigma }_{\max }^{(m)}$ and $Acc_{\max }$ are updated (steps 12–15). Afterwards, the reinforcement signal is calculated using (13) and the action value function is updated (steps 16 and 17, respectively). Finally, if the current training accuracy reaches the value of 1, the algorithm stops and the next stage of the procedure begins. If the algorithm is not able to reach this value ($Acc_{j,t}^{train}\left (\boldsymbol {\sigma }_{V,t}\right ) < 1$), the condition in step 18 is never fulfilled. In such a case, the (m+1)-th stage of the procedure starts after $t_{\max }$ training steps of PNNV have been performed.
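The steps described above can be put together in a compact Python sketch of one stage for PNNV. It is an illustration under stated assumptions and not the authors' implementation: the values of `alpha`, `gamma` and `epsilon` are placeholders, the lower bound `sigma_min` (which rejects updates that would make σ non-positive) is an addition of this sketch, and the toy data are made up.

```python
import math
import random


def pnnv_predict(x, train, sigmas):
    """Eqs. (4) and (7): class with the largest summation layer signal."""
    best_class, best_score = None, -1.0
    for g, patterns in train.items():
        score = 0.0
        for p in patterns:
            score += math.exp(-sum((pj - xj) ** 2 / (2.0 * s * s)
                                   for pj, xj, s in zip(p, x, sigmas)))
        score /= len(patterns) * (2.0 * math.pi) ** (len(x) / 2.0) * math.prod(sigmas)
        if score > best_score:
            best_class, best_score = g, score
    return best_class


def accuracy(data, train, sigmas):
    """Eq. (10) on `data`, a dict mapping class label to a list of patterns."""
    pairs = [(x, g) for g, xs in data.items() for x in xs]
    return sum(pnnv_predict(x, train, sigmas) == g for x, g in pairs) / len(pairs)


def q0_stage(train, test, sigmas, actions, t_max=100, alpha=0.1, gamma=0.9,
             epsilon=0.1, sigma_min=1e-3, seed=0):
    """One stage of the procedure; alpha, gamma, epsilon are placeholder values."""
    rng = random.Random(seed)
    l_T = sum(len(v) for v in train.values())
    Q = [[0.0] * len(actions) for _ in range(l_T + 1)]    # Q-table set to zero
    sigmas = list(sigmas)
    best_sigmas, acc_max = list(sigmas), accuracy(test, train, sigmas)
    acc_train = accuracy(train, train, sigmas)
    for _ in range(t_max):
        for j in range(len(sigmas)):                       # one pass per input feature
            s = round(acc_train * l_T)                     # state from training accuracy
            if rng.random() < epsilon:                     # epsilon-greedy choice
                a = rng.randrange(len(actions))
            else:
                a = max(range(len(actions)), key=lambda k: Q[s][k])
            candidate = sigmas[j] + actions[a]             # Eq. (14)
            if candidate >= sigma_min:                     # assumption: keep sigma positive
                sigmas[j] = candidate
            new_acc = accuracy(train, train, sigmas)
            s_next = round(new_acc * l_T)
            acc_test = accuracy(test, train, sigmas)
            if acc_test > acc_max:                         # remember the best test result
                acc_max, best_sigmas = acc_test, list(sigmas)
            r = new_acc - acc_train                        # Eq. (13)
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])   # Eq. (9)
            acc_train = new_acc
            if acc_train == 1.0:                           # early stop
                return best_sigmas, acc_max
    return best_sigmas, acc_max


train = {1: [(0.0, 0.0), (0.4, 0.1), (0.1, 0.5)], 2: [(3.0, 3.0), (2.6, 3.1), (3.2, 2.7)]}
test = {1: [(0.2, 0.2)], 2: [(2.9, 2.9)]}
best, acc = q0_stage(train, test, [1.0, 1.0], [-10.0, -1.0, -0.1, 0.1, 1.0, 10.0], t_max=5)
print(best, acc)
```

On this well-separated toy problem the initial σ _V already classifies everything correctly, so the stage terminates after the first update; real data sets would exercise the loop far longer.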

Table 1

It is worth noting that the type of the PNN model influences the number of smoothing parameter updates. For PNNS, PNNC, PNNV and PNNVC, the number of the smoothing parameter updates is equal to $t_{\max }$, $t_{\max }\times G$, $t_{\max }\times n$ and $t_{\max }\times n\times G$, respectively.
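These counts are easy to tabulate. The Python sketch below does so for a problem with n=34 attributes and G=6 classes (the dimensions of the dermatology data used later) and $t_{\max }=100$; the function name `n_sigma_updates` is an assumption of this illustration, and the figures hold for a single stage when the early stop at training accuracy 1 is not triggered.

```python
def n_sigma_updates(model, t_max, n, G):
    """Smoothing parameter updates in one stage when no early stop occurs."""
    per_step = {"PNNS": 1, "PNNC": G, "PNNV": n, "PNNVC": n * G}[model]
    return t_max * per_step


for model in ("PNNS", "PNNC", "PNNV", "PNNVC"):
    print(model, n_sigma_updates(model, t_max=100, n=34, G=6))
```

The most general model therefore needs n×G times as many updates per stage as PNNS, which is the price of its additional flexibility.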

5 Experiments

In this section, we present the simulation results in the classification of medical databases obtained by PNNS, PNNC, PNNV and PNNVC trained by the proposed procedure. These results are compared with the outcomes obtained by PNN trained using the conjugate gradient procedure (PNNVC–CG), support vector machine (SVM) algorithm, gene expression programming (GEP) classifier, k–Means method, multilayer perceptron (MLP), radial basis function neural network (RBFN) and learning vector quantization neural network (LVQN). The data sets used in these experiments are also briefly described and the adjustments of the algorithm are provided. Moreover, the illustration of the PNNS training process is presented.

5.1 Data sets used in the study

In the simulations, six UCI machine learning repository medical data sets are used:

Wisconsin breast cancer database [24] that consists of 683 instances with 9 attributes. The data is divided into two groups: 444 benign cases and 239 malignant cases.
Pima Indians diabetes data set [36] that includes 768 cases having 8 features. Two classes of data are considered: samples tested negative (500 records) and samples tested positive (268 records).
Haberman’s survival data [21] that contains 306 patients who underwent surgery for breast cancer. For each instance, 3 variables are measured. The 5-year survival status establishes two input classes: patients who survived 5 years or longer (225 records) and patients who died within 5 years (81 records).
Cardiotocography data set [3] that comprises 2126 cardiotocograms, each described by 22 attributes measuring fetal heart rate and uterine contraction features, classified by expert obstetricians. The classes are coded into three states: normal (1655 cases), suspect (295 cases) and pathological (176 cases).
Dermatology data [13] that includes 358 instances, each with 34 features. Six data classes are considered: psoriasis (111 cases), lichen planus (71 cases), seborrheic dermatitis (60 cases), chronic dermatitis (48 cases), pityriasis rosea (48 cases) and pityriasis rubra pilaris (20 cases).
Statlog heart database [3] that consists of 270 instances and 13 attributes. There are two classes to be predicted: absence (150 cases) or presence (120 cases) of heart disease.

5.2 Algorithms’ settings

In the case of the proposed algorithm, the initial values of the action value function Q are set to zero. Three six–element action sets proposed in (12) are used. The maximum number of the training steps $t_{\max }=100$ is assumed. We apply such a value of $t_{\max }$ in order to show that at a relatively small number of training steps it is possible to achieve satisfactory results. Additionally, the Q(0)-learning algorithm requires appropriate selection of its intrinsic parameters: the greedy parameter, the update rate and the discount factor.

The greedy parameter ε determines the probability of random action selection and must be taken from the interval $\left [ 0,1\right ] $. If ε=0.05, on average 5 actions out of 100 are chosen randomly from the action set. The remaining 95 % of action selections are performed according to the learned policy represented by the Q–table. If the elements of the Q–table are the same (initial iterations of Algorithm 1), the actions are selected randomly. In this work, the greedy parameter is chosen experimentally from the set $\left \lbrace 0.5, 0.05, 0.005 \right \rbrace $. Unfortunately, for $t_{\max }=100$, the use of ε=0.5 does not yield repeatable results. In turn, for ε=0.005, it is observed that some actions have never been selected. Therefore, ε=0.05 is utilized in the experiments.
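The selection rule described above is the standard ε-greedy policy. A minimal Python sketch is given below; the function name `epsilon_greedy` and the representation of one Q–table row as a plain list are assumptions made for illustration, not details taken from Algorithm 1.

```python
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """Return the index of the action to take in the current state.

    q_row   -- list of action values for the current state (one Q-table row)
    epsilon -- probability of choosing an action uniformly at random
    rng     -- source of randomness (module `random` or a random.Random)
    """
    if rng.random() < epsilon:
        # Exploration: any action, regardless of its learned value.
        return rng.randrange(len(q_row))
    # Exploitation: pick the best action; equal values (e.g. the all-zero
    # initial table) are resolved by a random choice among the ties.
    best = max(q_row)
    ties = [i for i, q in enumerate(q_row) if q == best]
    return rng.choice(ties)


rng = random.Random(0)
q_row = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # six actions, Q initialised to zero
print(epsilon_greedy(q_row, 0.05, rng))  # random, since all values are equal
```

With the random tie-breaking, an all-zero Q–table yields purely random actions, which matches the behaviour described for the initial iterations.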

The α parameter determines the update rate for the action value function Q. The small value of this factor increases the time of the training process. Its large value introduces the oscillations of Q elements [34]. The proper selection of α has a significant influence on the convergence of the training process. From the theoretical point of view, one requires that α is large enough to overcome any initial conditions or possible random fluctuations and it should decrease its value in time. However, in practical applications, the constant values of this factor are mostly used. Admittedly, this approach does not assure the convergence of the learning process, but a stable policy can be reached. In our study, we choose α experimentally from the set $\left \lbrace 0.1, 0.01, 0.001 \right \rbrace $. For all three parameter values, similar results are obtained. In the final simulation, we assume α=0.01.

The discount factor γ determines the relative importance of short-term and long-term rewards. This parameter is mostly picked arbitrarily near 1, e.g. 0.8 [6], 0.9 [2], [33] or 0.95 [4]. In this contribution, γ=0.95.
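The roles of α and γ can be seen in the textbook one-step Q-learning update, $Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]$. The Python sketch below implements this generic rule; it assumes that states are already mapped to integer indices (the discretization is hypothetical) and is not claimed to reproduce the exact update equation of the proposed procedure.

```python
def q_update(Q, state, action, reward, next_state, alpha=0.01, gamma=0.95):
    """Apply one textbook Q-learning update in place and return the new value.

    Q is a list of rows (one per state); each row is a list of action values.
    alpha is the update rate, gamma the discount factor.
    """
    # Target: immediate reward plus discounted value of the best next action.
    td_target = reward + gamma * max(Q[next_state])
    # Move the current estimate a fraction alpha towards the target.
    Q[state][action] += alpha * (td_target - Q[state][action])
    return Q[state][action]


Q = [[0.0, 0.0], [0.0, 0.0]]        # two states, two actions, zero-initialised
print(q_update(Q, state=0, action=1, reward=1.0, next_state=1))
```

A small α changes each entry only slightly per step, which is consistent with the longer training time noted above, while γ close to 1 gives future rewards almost the same weight as immediate ones.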

PNNVC–CG used in the simulations is the probabilistic neural network trained by the conjugate gradient procedure. The model is a built-in tool of DTREG predictive modeling software [35]. In the experiments, we use the network for which the smoothing parameter is adapted for each input feature and class separately. The starting values of the smoothing parameters for PNNVC–CG are between 0.0001 and 10 [35].

The SVM algorithm [42] is used in this work as the data classifier. The model is trained by means of the SMO algorithm [30] available in Matlab’s Bioinformatics Toolbox. Multiclass classification problems are solved by applying the one-against-all method. In all data classification cases, the radial basis function kernel is applied with an experimental grid search for both the $C$ constraint and the $sc$ spread constant: $C=\left \{10^{-1}, 10^{0}, 10^{1}, 10^{2}, 10^{3}, 10^{4}, 10^{5}, 10^{6}\right \}$ and $sc=\left \{0.08, 0.2, 0.3, 0.5, 0.8, 1.2, 1.5, 2, 5, 10, 50, 80, 100, 200, 500\right \}$, respectively.
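The grid above contains 8 × 15 = 120 parameter pairs. The following Python sketch enumerates them and keeps the best pair; the `evaluate` callback is hypothetical and stands in for training and scoring an SVM with the given values, since the selection criterion and the Matlab calls are not detailed here.

```python
from itertools import product

# Grid values as listed in the text.
C_GRID = [10.0 ** e for e in range(-1, 7)]           # 10^-1 ... 10^6
SC_GRID = [0.08, 0.2, 0.3, 0.5, 0.8, 1.2, 1.5, 2, 5,
           10, 50, 80, 100, 200, 500]


def grid_search(evaluate):
    """Return (best_score, C, sc) over the full grid.

    evaluate(C, sc) is a user-supplied function (an assumption of this
    sketch) that returns a score where higher is better.
    """
    return max((evaluate(C, sc), C, sc) for C, sc in product(C_GRID, SC_GRID))


# Toy scoring function with a known optimum at C = 100, sc = 2.
best = grid_search(lambda C, sc: -abs(C - 100.0) - abs(sc - 2))
print(len(C_GRID) * len(SC_GRID), best[1], best[2])
```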

The classification of considered data sets with the use of the GEP algorithm [9, 10] is performed in GeneXproTools software. For the simulation purposes, the GEP’s parameters are chosen on the basis of Table 1. In all experiments, the number of chromosomes in the population is set to 30. For genetic computations, we use 10 random floating point constants per gene from the range $\left [-1000,1000 \right ]$. Evolution is performed until 10000 generations are reached.

Table 1 The head size, the number of genes within each chromosome, the linking functions between genes, the computing functions in the head, the fitness functions and the genetic operators used for the GEP classifier

The k–Means clustering algorithm [14] is used in the comparison for classification purposes. The predictions are made for the unknown cases by assigning them the category of the closest cluster center. In the simulations, the number of clusters that provides the highest test accuracy is selected.
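The prediction rule can be sketched as a nearest-center lookup. In the Python fragment below, the Euclidean distance and the way each cluster center carries a category label (`center_labels`) are assumptions of this illustration; the text does not specify how categories are attached to clusters.

```python
import math


def nearest_center_label(x, centers, center_labels):
    """Assign x the category of the closest cluster center.

    x             -- feature vector of the unknown case
    centers       -- list of cluster-center vectors (e.g. from k-means)
    center_labels -- category attached to each center (assumed given)
    """
    distances = [math.dist(x, c) for c in centers]   # Euclidean distances
    return center_labels[distances.index(min(distances))]


centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 6.0)]
labels = ["benign", "malignant", "benign"]
print(nearest_center_label((4.0, 4.5), centers, labels))
```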

MLP neural network is simulated with one or two hidden layers activated by the transfer functions from the set {linear, hyperbolic tangent, logistic}. The same set of transfer functions is applied for the output neurons, for which the sum squared error function is calculated. The number of hidden layer neurons is optimized in order to minimize the network error. The model is trained with gradient descent with momentum and adaptive learning rate backpropagation algorithms [8].

For RBFN and LVQN neural networks, the number of hidden neurons is selected empirically from the set {2,4,6,…,100}. The optimal number of hidden neurons is taken so that the sum squared error for each model is minimized. The spread constant in RBFN hidden layer activation function is chosen experimentally from the interval [0.1,10] with the step size 0.1.

5.3 Empirical results

In this study, the performance of PNN models, for which the smoothing parameter is determined using the Q(0)-learning based procedure, is evaluated on the input data partitioned in the following way. Firstly, the testing subsets are created by applying a random extraction of 10 %, 20 %, 30 % and 40 % of cases out of the input database. Then, the training sets are created using the rest of the patterns, i.e. 90 %, 80 %, 70 % and 60 % of data, respectively. This type of data division is introduced on purpose since considering all possible training–test subsets is complex from a computational point of view – the number of ways of dividing l training patterns into v sets, each of size k, is large and equals $ l! / \left (v! \cdot (k!)^{v}\right ) $ [17].
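To give a sense of how quickly the number of possible divisions grows, the count $l! / \left (v! \cdot (k!)^{v}\right )$ can be evaluated directly. The helper below is an illustrative sketch (the name `count_partitions` is hypothetical) and assumes $l = v \cdot k$, as implied by dividing l patterns into v sets of size k.

```python
from math import factorial


def count_partitions(l, v, k):
    """Number of ways to divide l patterns into v unordered sets of size k."""
    if v * k != l:
        raise ValueError("the formula requires l == v * k")
    # Integer division is exact here because the result is a whole number.
    return factorial(l) // (factorial(v) * factorial(k) ** v)


print(count_partitions(6, 3, 2))     # 6 patterns, 3 sets of 2 -> 15
print(count_partitions(20, 4, 5))    # already in the hundreds of millions
```

For six patterns split into three pairs the count is 720 / (6 · 8) = 15, while data sets with hundreds of records give astronomically large values, which motivates the use of a few random partitions instead.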

The remaining classifiers used in the comparative research: PNNVC–CG, SVM, GEP, k–Means, MLP, RBFN and LVQN are trained and validated on the same data subsets. The use of the same training/test sets for all the models makes the obtained results comparable.
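One simple way to guarantee that all models see identical subsets is to draw the test indices once with a fixed random seed and reuse them. The Python sketch below illustrates this idea; the function name, the seed handling and the rounding of the test-set size are assumptions of this example rather than details reported for the experiments.

```python
import random


def holdout_indices(n_cases, test_fraction, seed=0):
    """Randomly split case indices into (train, test) index lists.

    The same seed always yields the same split, so every classifier can be
    trained and validated on identical subsets.
    """
    rng = random.Random(seed)
    indices = list(range(n_cases))
    rng.shuffle(indices)
    n_test = round(n_cases * test_fraction)   # rounding rule is an assumption
    test = sorted(indices[:n_test])
    train = sorted(indices[n_test:])
    return train, test


# Four partitions of a 683-case data set: 10 %, 20 %, 30 % and 40 % test data
for fraction in (0.1, 0.2, 0.3, 0.4):
    train, test = holdout_indices(683, fraction, seed=42)
    print(fraction, len(train), len(test))
```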

Tables 2, 3, 4, 5, 6 and 7 show the test accuracy values, computed as the percentage of correctly classified examples, for PNNS, PNNC, PNNV and PNNVC with the smoothing parameter adapted by the proposed procedure for particular training–test set partitions on each of the six databases. Additionally, for comparison purposes, the results are presented for PNNVC–CG, SVM, GEP, k–Means, MLP, RBFN and LVQN. For all models, the maximum (max), average (avr) and standard deviation (sd) values are provided. The results presented in the tables lead to the following observations:

1. In the classification of Wisconsin breast cancer data, out of all compared models, PNNC and PNNVC reach the highest average test accuracy, which is equal to 99.0 %. In the case of Haberman and dermatology data classification problems, the highest average test accuracy is obtained for PNNVC (81.2 %) and PNNV (97.6 %), respectively.
2. The SVM model provides the highest average test accuracy in the classification of the Pima Indians diabetes data set (77.2 %) and cardiotocography database (97.2 %). In these two classification tasks, PNNV is the second best model with the test accuracy lower by 0.5 % and 1.8 %, respectively. For the Statlog classification problem, the GEP algorithm yields the highest average test accuracy, which equals 94.6 %. This result is followed by the outcomes of PNNVC, PNNV and PNNC.
3. Except for the dermatology classification problem, PNNVC–CG turns out to be the worst classifier. The k–Means algorithm and the remaining reference neural networks (MLP, RBFN and LVQN) achieve lower test accuracy than the PNNV, PNNVC, SVM and GEP classifiers.

Table 2 The test accuracy values (in %) determined for four considered training-test subsets for Wisconsin breast cancer data set

Table 3 The test accuracy values (in %) determined for four considered training-test subsets for Pima Indians diabetes data set

Table 4 The test accuracy values (in %) determined for four considered training-test subsets for Haberman survival data set

Table 5 The test accuracy values (in %) determined for four considered training-test subsets for cardiotocography data set

Table 6 The test accuracy values (in %) determined for four considered training-test subsets for dermatology data set

Table 7 The test accuracy values (in %) determined for four considered training-test subsets for Statlog heart data set

5.4 Illustration of the PNN training process

Figures 4, 5, 6, 7, 8 and 9 illustrate the changes of $Acc^{train}$, $Acc^{test}$, σ and r as a function of time steps for the six data set classification problems. These changes are shown for only one exemplary data set partition. The plots are depicted for PNNS since for this model the smoothing parameter takes the form of a scalar. In each figure, we mark the maximum values of $Acc^{train}$ and $Acc^{test}$ and the corresponding smoothing parameter.

We can observe that the changes of the smoothing parameter values within the training process result from the implementation of the proposed procedure. The magnitude of these changes becomes smaller in subsequent stages of the procedure (e.g. Figs. 4, 6 and 9). Large modifications of the smoothing parameter provide the possibility of either finding optimal σ after a small number of steps (t=6 in Fig. 5), or narrowing the range of its possible optimal values (Fig. 9).

Another interesting feature worth noting is that the reinforcement signal follows the changes of the training accuracy: r becomes negative when $Acc^{train}$ decreases and takes a positive value when $Acc^{train}$ increases.

On the basis of the figures, the following observations can also be made: (i) In the dermatology and Statlog heart data classification tasks, the maximal value of $Acc^{test}$ is obtained for $Acc^{train}=100$ %. In the remaining classification problems, the maximal value of $Acc^{train}$ does not guarantee the highest value of the test accuracy; (ii) Only for the Haberman survival data set classification problem is it impossible to achieve 100 % training accuracy; (iii) The classification problems of the Haberman survival and Statlog heart data sets confirm that it is necessary to perform all stages of the procedure. In the first case, 100 % training accuracy is not reached in any stage. In the second one, the maximum value of $Acc^{test}=92.6$ % is obtained in the third stage of the procedure.

6 Conclusions

In this article, a procedure based on the Q(0)-learning algorithm was proposed for the adaptive selection and computation of the smoothing parameters of the probabilistic neural network. All possible classes of the PNN models were regarded. These models differed in the way the smoothing parameters are represented. The application of the procedure based on the Q(0)-learning algorithm to PNN parameter tuning is the element of novelty. It is worth noting that a comparison of all types of probabilistic neural networks has not been presented in the literature.

The proposed approach was tested on six data sets and compared with PNN trained by the conjugate gradient procedure, SVM algorithm, GEP classifier, k–Means method, multilayer perceptron, radial basis function neural network and learning vector quantization neural network. In three classification problems, at least one of the PNNC, PNNV or PNNVC models trained by the proposed procedure provided the highest average accuracy. Four out of six times, PNNS was the second to last data classifier. This means that the representation of the smoothing parameter in terms of either a vector or a matrix contributes to a higher prediction ability of PNN. As one can observe, for PNN trained by the conjugate gradient procedure the lowest accuracy was obtained for all six data classification cases. Thus, the proposition of any alternative method for probabilistic neural network training is by all means justified.

References

1. Adeli H, Panakkat A (2009) A probabilistic neural network for earthquake magnitude prediction. Neural Netw 22:1018–1024
2. Asadpour M, Siegwart R (2004) Compact Q-learning optimized for micro-robots with processing and memory constraints. Robot Auton Syst 48(1):49–61
3. Bache K, Lichman M (2013) UCI Machine Learning Repository. http://archive.ics.uci.edu/ml. University of California School of Information and Computer Science. Irvine
4. Barto AG, Sutton RS, Anderson CW (1983) Neuronlike adaptive elements that can solve difficult learning problem. IEEE Trans SMC 13:834–847
5. Bertin M, Schweighofer N, Doya K (2007) Multiple model-based reinforcement learning explains dopamine neuronal activity. Neural Netw 20:668–67
6. Braga APS, Arauno AFR (2003) A topological reinforcement learning agent for navigation. Neural Comput Applic 12:220–236
7. Chtioui Y, Panigrahi S, Marsh R (1998) Conjugate gradient and approximate Newton methods for an optimal probabilistic neural network for food color classification. Opt Eng 37:3015–3023
8. Demuth H, Beale M (1994) Neural network toolbox user’s guide. The Mathworks Inc.
9. Ferreira C (2001) Gene expression programming: a new adaptive algorithm for solving problems. Compl Syst 13(2):87–129
10. Ferreira C (2006) Gene expression programming: mathematical modeling by an artificial intelligence. Springer, Berlin
11. Georgiou LV, Pavlidis NG, Parsopoulos KE et al (2006) New self-adaptive probabilistic neural networks in bioinformatic and medical tasks. Int J Artificial Intell Tools 15:371–396
12. Georgiou LV, Alevizos PD, Vrahatis MN (2008) Novel approaches to probabilistic neural networks through bagging and evolutionary estimating of prior probabilities. Neural Process Lett 27:153–162
13. Guvenir HA, Demiroz G, Ilter N (1998) Learning differential diagnosis of Erythemato-Squamous diseases using voting feature intervals. Artif Intell Med 13:147–165
14. Hartigan JA, Wong MA (1979) A k-means clustering algorithm. J R Stat Soc Ser C 1:100–108
15. Heinen MR, Engel PM (2010) An incremental probabilistic neural network for regression and reinforcement learning tasks. In: Diamantaras K, Duch W, Iliadis LS (eds) Lecture notes in computer science, vol 6353. Springer, Berlin, Heidelberg, pp 170–179
16. Iglesias A, Martinez P, Aler R, Fernandez F (2009) Learning teaching strategies in an adaptive and intelligent educational system through reinforcement learning. Appl Intell 31:89–106
17. Jonathan P, Krzanowski WJ, McCarthy WV (2000) On the use of cross-validation to assess performance in multivariate prediction. Stat Comput 10:209–229
18. Kim CO, Kwon I–H, Baek J–G (2008) Asynchronous action-reward learning for nonstationary serial supply chain inventory control. Appl Intell 28:1–16
19. Kusy M, Zajdel R (2014) Stateless Q-learning algorithm for training of radial basis function based neural networks in medical data classification. In: Korbicz J, Kowal M (eds) Advances in intelligent systems and computing, vol 230. Springer, Berlin / Heidelberg, pp 267–278
20. Kyriacou E, Pattichis MS, Pattichis CS et al (2009) Classification of atherosclerotic carotid plaques using morphological analysis on ultrasound images. Appl Intell 30:3–23
21. Landwehr JM, Pregibon D, Shoemaker AC (1984) Graphical methods for assessing logistic regression models. J Am Stat Assoc 79:61–71
22. Li J, Li Z, Chen J (2011) Microassembly path planning using reinforcement learning for improving positioning accuracy of a 1 cm3 omni-directional mobile microrobot. Appl Intell 34:211–225
23. Maglogiannis I, Zafiropoulos E, Anagnostopoulos I (2009) An intelligent system for automated breast cancer diagnosis and prognosis using SVM based classifiers. Appl Intell 30:24–36
24. Mangasarian OL, Street WN, Wolberg WH (1995) Breast cancer diagnosis and prognosis via linear programming. Oper Res 43(4):570–577
25. Mantzaris D, Anastassopoulos G, Adamopoulos A (2011) Genetic algorithm pruning of probabilistic neural networks in medical disease estimation. Neural Netw 24:831–835
26. Mendonca M, Arruda LVR, Neves F Jr (2012) Autonomous navigation system using Event Driven-Fuzzy Cognitive Maps. Appl Intell 37:175–188
27. Nebti S, Boukerram A (2013) Handwritten characters recognition based on nature-inspired computing and neuro-evolution. Appl Intell 38:146–159
28. Orr RK (1997) Use of a probabilistic neural network to estimate the risk of mortality after cardiac surgery. Med Decis Making 17:178–185
29. Parzen E (1962) On estimation of a probability density function and mode. Ann Math Stat 36:1065–1076
30. Platt JC (1999) Sequential minimal optimization: a fast algorithm for training support vector machines. In: Schölkopf B, Burges J C, Smola J (eds) Advances in kernel methods - support vector learning. MIT Press, Cambridge, pp 185–208
31. Rutkowski L (2004) Adaptive probabilistic neural networks for pattern classification in time-varying environment. IEEE Trans Neural Netw 15:811–827
32. Samanta B, Al-Balushi KR, Al-Araimi SA (2006) Artificial neural networks and genetic algorithm for bearing fault detection. Soft Comput 10:264–271
33. Schoknecht R, Riedmiller M (2003) Reinforcement learning on explicitly specified time scales. Neural Comput & Applic 12(2):61–80
34. Schweighofer N, Doya K (2003) Meta-learning in reinforcement learning. Neural Netw 16:5–9
35. Sherrod PH (2013) DTREG predictive modelling software. http://www.dtreg.com. Accessed 26 September 2013
36. Smith JW, Everhart JE, Dickson WC, Knowler WC, Johannes RS (1988) Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In: Proceedings of the symposium on computer applications and medical care. IEEE Computer Society Press, pp 261–265
37. Specht DF (1990) Probabilistic neural networks and the polynomial adaline as complementary techniques for classification. IEEE Trans Neural Netw 1:11–121
38. Specht DF (1990) Probabilistic neural networks. Neural Netw 3(1):109–118
39. Specht DF (1994) Experience with adaptive probabilistic neural networks and adaptive general regression neural networks. In: IEEE international conference on neural networks. USA, Orlando, pp 1203–1208
40. Starzyk JA, Liu Y, Batog S (2010) A novel optimization algorithm based on reinforcement learning. In: Tenne Y, Goh C-K (eds) Computational intelligence in optimization, ALO, vol 7, pp 27–47
41. Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge
42. Vapnik V (1995) The nature of statistical learning theory. Springer-Verlag, New York
43. Vien NA, Ertel W, Chung TC (2013) Learning via human feedback in continuous state and action spaces. Appl Intell 39:267–278
44. Watkins C (1989) Learning from delayed rewards. PhD Dissertation. University of Cambridge, England
45. Wen XB, Zhang H, Xu XQ, Quan JJ (2009) A new watermarking approach based on probabilistic neural network in wavelet domain. Soft Computing 13:355–360
46. Wu QH, Liao HL (2010) High-dimensional function optimisation by reinforcement learning. In: IEEE congress on evolutionary computation (CEC), Barcelona, pp 1–8
47. Zhong M, Coggeshall D, Ghaneie E et al (2007) Gap-based estimation: choosing the smoothing parameters for probabilistic and general regression neural networks. Neural Comput 19(10):2840–2864

Acknowledgements

This work was supported in part by Rzeszow University of Technology Grant No. U–235/DS and U–8613/DS.

Author information

Authors and Affiliations

Faculty of Electrical and Computer Engineering, Rzeszow University of Technology, al. Powstancow Warszawy 12, 35-959, Rzeszow, Poland
Maciej Kusy & Roman Zajdel

Corresponding author

Correspondence to Maciej Kusy.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

About this article

Cite this article

Kusy, M., Zajdel, R. Probabilistic neural network training procedure based on Q(0)-learning algorithm in medical data classification. Appl Intell 41, 837–854 (2014). https://doi.org/10.1007/s10489-014-0562-9

Published: 06 August 2014
Issue Date: October 2014
DOI: https://doi.org/10.1007/s10489-014-0562-9
