International Conference on Information and Software Technologies

Information and Software Technologies, pp. 400–411

Rough Deep Belief Network - Application to Incomplete Handwritten Digits Pattern Classification

  • Wojciech K. Mleczko
  • Tomasz Kapuściński
  • Robert K. Nowicki
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 538)


The rough deep belief network (RDBN) is a new modification of the well-known deep belief network. Thanks to elements adopted from Pawlak's rough set theory, RDBNs are suited to processing incomplete patterns. In this paper we present the results of adapting this class of networks to the classification of handwritten digits. The pattern samples used in the learning and working processes are randomly corrupted. This allows us to study the robustness of the classifier for various levels of incompleteness.


Keywords: Deep belief network · Rough set · Missing features

1 Introduction

The Restricted Boltzmann Machine (RBM) [7, 27] is one of the more sophisticated types of neural networks; it can model probability distributions and is applied to filtering, image recognition, and modelling [4]. The deep belief network (DBN) [2, 9] is a structure built from RBMs. Like other types of computational intelligence systems, DBNs process real data, which often contain imperfections such as noise, inexactness, uncertainty and incompleteness. The easiest way to use such data is some form of preprocessing. In the case of incompleteness there are two general approaches, imputation and marginalization. They can take into consideration the class of incompleteness, for example MCAR (Missing Completely At Random), MAR (Missing At Random), or MNAR (Missing Not At Random) [19]. An interesting way to process data with a variable set of available input feature values is the rough set theory proposed by Pawlak [25, 26]. It defines the approximation of a set in the form of a pair of sets, called the rough set, consisting of the lower and upper approximations. The quality of the approximation depends on the usefulness of the available knowledge. The theory has been extended by defining rough fuzzy sets, fuzzy rough sets [5, 6], covering rough sets [30, 31] and others. It allows us to extend various types of fuzzy systems [3, 12, 13, 14, 21, 22], the nearest neighbor classifier [23], the decision tree [20] and others [2, 24] to work with missing data. The resulting systems have been called rough fuzzy systems, rough k-NN classifiers, etc. In some solutions, missing values are replaced by an appropriate interval which can cover the whole domain of the feature (MCAR) or a part of it (MAR, MNAR). The answer of such systems is represented as an interval or, in the case of classification, as information about the assignment to one of the three regions defined in rough set theory, i.e. the positive, boundary and negative regions.
This means that, using the available input information, the classifier can decide that the object being classified definitely belongs to a class (positive region), definitely does not belong to a class (negative region), or that the input information is insufficient to make a decision (boundary region). It also allows us to start the classification process with a limited description of the classified object and complement it until the answer is either positive or negative.
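The three-region decision described above can be sketched in a few lines; this is an illustrative fragment in Python rather than the paper's Matlab, and the function name and the 0.5 threshold are our assumptions, not taken from the paper.

```python
# A minimal sketch (our names, assumed 0.5 threshold) of mapping an
# interval class membership [lo, hi] to the three rough-set regions.

def rough_region(lo, hi, threshold=0.5):
    if lo >= threshold:
        return "positive"   # definitely belongs to the class
    if hi < threshold:
        return "negative"   # definitely does not belong
    return "boundary"       # information insufficient to decide
```

A wide interval straddling the threshold yields "boundary", which is exactly the case where more input features should be collected before answering.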

In this paper we introduce the Rough Deep Belief Network (RDBN), which is created in a similar way. It is a structure that contains rough restricted Boltzmann machines (RRBM) capable of processing information in the form of intervals as well as incomplete data. It should be noted that the vast majority of network-like architectures are suitable for various parallel implementations. They can be realized using many signal processors connected by a dedicated serial bus [1] or multicore CPU architectures [28, 29]. Nowadays, networks are even implemented in structures made of single molecules [17], for example distributed in a mesoporous silica matrix [15, 16]. The RDBN, like other rough hybrids, answers "unknown" when the input information is too incomplete to give a credible answer. In the same situation, other classifiers give an answer with a low level of credibility, which is frequently incorrect.

The paper is organized as follows. Section 2 introduces the architecture of the DBN, and Sect. 3 describes the RDBN. The following section describes the MNIST database of handwritten digits which was used in testing. Then, the obtained results are presented. Section 6 summarizes the work.

2 Deep Belief Network Architecture

Deep Belief Networks (DBN) are neural networks composed of multiple layers of latent stochastic variables which use Restricted Boltzmann Machines (RBMs) [27] as their basic building blocks. Deep Belief Networks were proposed by Hinton et al. along with an unsupervised greedy learning algorithm for constructing a network one layer at a time [9].
Fig. 1.

Schematic representation of a Restricted Boltzmann Machine (RBM)

In summary, an RBM contains a set of stochastic hidden units h that are fully connected in an undirected model to a set of stochastic visible units v, as shown in Fig. 1. The l-th RBM defines the following energy function:
$$\begin{aligned} E^{(l)} \left( \mathbf v,\mathbf h \right) = - \sum _{i \in visible} b^{(l)}_{vi}v^{(l)}_i - \sum _{j \in hidden} b^{(l)}_{hj}h^{(l)}_j - \sum _{i,j} v^{(l)}_i h^{(l)}_j w^{(l)}_{ij}, \end{aligned}$$
where \(v_i\), \(h_j\) are the binary states of visible unit i and hidden unit j, \(b_{vi}\), \(b_{hj}\) are their biases, and \(w_{ij}\) is the weight between them. Via this energy function, the network assigns a probability to every possible pair of visible and hidden vectors:
$$\begin{aligned} p^{(l)} \left( \mathbf v,\mathbf h \right) = \frac{1}{Z^{(l)}} e^{-E^{(l)} \left( \mathbf v,\mathbf h \right) }, \end{aligned}$$
where the partition function \(Z^{(l)}\) is given by summing over all possible pairs of visible and hidden vectors:
$$\begin{aligned} Z^{(l)} = \sum _{\mathbf v,\mathbf h} e^{-E^{(l)} \left( \mathbf v,\mathbf h \right) }, \end{aligned}$$
The probability which the network assigns to a visible vector \(\mathbf v\) is given by summing over all possible hidden vectors:
$$\begin{aligned} p^{(l)} \left( \mathbf v \right) = \frac{1}{Z^{(l)}} \sum _{\mathbf h} e^{-E^{(l)} \left( \mathbf v,\mathbf h \right) }, \end{aligned}$$
Given a random input configuration \(\mathbf v\), the state of the hidden unit j is set to 1 with probability:
$$\begin{aligned} P^{(l)}\left( h^{(l)}_{j} = 1 |\mathbf v^{(l)}\right) = \sigma \left( b^{(l)}_{hj} + \sum _{i} v^{(l)}_i w^{(l)}_{ij} \right) \!, \end{aligned}$$
where \(\sigma (x)\) is the logistic sigmoid function \(\frac{1}{1+\exp (-x) }\). Similarly, given a random hidden vector, the state of the visible unit i can be set to 1 with probability:
$$\begin{aligned} P^{(l)}\left( v^{(l)}_{i} = 1 |\mathbf h^{(l)}\right) = \sigma \left( b^{(l)}_{vi} + \sum _{j} h^{(l)}_j w^{(l)}_{ij} \right) \!. \end{aligned}$$
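The two conditional probabilities above can be sketched compactly in Python/NumPy (the paper's own implementation was in Matlab; the function names here are ours, for illustration only):

```python
# Illustrative sketch of Eqs. 5-6: conditional activation probabilities
# of an RBM, with W of shape (visible, hidden).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_hidden_given_visible(v, W, b_h):
    # P(h_j = 1 | v) = sigma(b_hj + sum_i v_i w_ij)
    return sigmoid(b_h + v @ W)

def p_visible_given_hidden(h, W, b_v):
    # P(v_i = 1 | h) = sigma(b_vi + sum_j h_j w_ij)
    return sigmoid(b_v + h @ W.T)
```

With zero weights and biases both probabilities are exactly 0.5, which is the fully uninformed state of a binary unit.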
The probability which the network assigns to a training image can be raised by adjusting the weights and biases to lower the energy of that image and to raise the energy of other images, especially those that have low energies and therefore make a big contribution to the partition function. The derivative of the log probability of a training vector with respect to a weight is surprisingly simple:
$$\begin{aligned} \frac{\partial \log p^{(l)}(\mathbf v)}{\partial w^{(l)}_{ij}} = \langle v^{(l)}_i h^{(l)}_j\rangle _0 - \langle v^{(l)}_i h^{(l)}_j\rangle _\infty , \end{aligned}$$
where \( \langle \cdot \rangle _0 \) denotes the expectation under the data distribution \( (p_0)\) and \( \langle \cdot \rangle _\infty \) denotes the expectation under the model distribution \((p^{(l)}_\infty )\) [18]. The latter can be obtained by starting at any random state of the visible units and performing alternating Gibbs sampling for a very long time. One iteration of alternating Gibbs sampling consists of updating all hidden units in parallel using Eq. 5, followed by updating all visible units in parallel using Eq. 6 [7].
Since such prolonged sampling is impractical, Hinton proposed a much faster learning procedure, the Contrastive Divergence algorithm [7, 8]. This procedure can be applied in order to correct the weights and biases of the network:
$$\begin{aligned} {\varDelta }w^{(l)}_{ij} = \eta \left( \langle v^{(l)}_i h^{(l)}_j\rangle _0 - \langle v^{(l)}_i h^{(l)}_j\rangle _\infty \right) \!, \end{aligned}$$
$$\begin{aligned} {\varDelta }b^{(l)}_{\mathrm {v}i} =\eta (v^{(l)}_{i0} - v^{(l)}_{i\infty }), \end{aligned}$$
$$\begin{aligned} {\varDelta }b^{(l)}_{\mathrm {h}j} =\eta (h^{(l)}_{j0} - h^{(l)}_{j\infty }), \end{aligned}$$
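The update rules in Eqs. 7–9 can be sketched as a single CD-1 step, again in Python/NumPy rather than the paper's Matlab; here a single Gibbs step stands in for the intractable model expectation \( \langle \cdot \rangle _\infty \), and all names are our assumptions:

```python
# A minimal CD-1 sketch of the updates in Eqs. 7-9 for one RBM layer,
# with W of shape (visible, hidden) and learning rate eta.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_v, b_h, eta=0.1):
    # positive phase: hidden probabilities and a binary sample
    ph0 = sigmoid(b_h + v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # negative phase: one step of alternating Gibbs sampling
    pv1 = sigmoid(b_v + h0 @ W.T)
    ph1 = sigmoid(b_h + pv1 @ W)
    # corrections for the weights and both bias vectors
    W = W + eta * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    b_v = b_v + eta * (v0 - pv1)
    b_h = b_h + eta * (ph0 - ph1)
    return W, b_v, b_h
```

In practice the same step is applied over mini-batches and repeated for many epochs; the sketch shows only the direction of each correction.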
Fig. 2.

An example of a DBN architecture

Each successive layer is stacked on top of the DBN as shown in Fig. 2. The training process is performed in an unsupervised manner, allowing the system to learn complex functions mapping the input to the output directly from data. All weights of a DBN are first pre-trained layer by layer with RBM training. After pre-training, the weights of the DBN are fine-tuned by the standard back-propagation and steepest descent algorithms, as in the Multi-Layer Perceptron (MLP). For this purpose we create an additional layer on the output of the last RBM layer, which represents a logistic regression model in the form of a probabilistic classifier. The additional layer forms a bilayer network whose output units are defined as follows:
$$\begin{aligned} y^{(L)}_{j} = softmax_j(w^{(L)}_{ij} h^{(L)}_i + b^{(L)}_{j}), \end{aligned}$$
where L denotes the additional layer, \(y^{(L)}_{j}\) is the output of the network, \(w^{(L)}_{ij}\) and \( b^{(L)}_{j}\) are the weight and bias of the extra layer (their initial values are set to 0), and \(h^{(L)}_i\) is the value obtained from the last RBM layer of the DBN. The softmax function is calculated as follows:
$$\begin{aligned} softmax_j(w^{(L)}_{ij} h^{(L)}_i + b^{(L)}_{j}) = \frac{e^{w^{(L)}_{ij} h^{(L)}_i + b^{(L)}_{j}}}{\sum _k e^{w^{(L)}_{ik} h^{(L)}_i + b^{(L)}_{k}}} \end{aligned}$$
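The softmax in Eq. 12 is commonly computed in a numerically stable form; this is an implementation detail not prescribed by the paper:

```python
# Numerically stable softmax: subtracting the maximum leaves the
# result unchanged (the factor cancels) but avoids overflow in exp.
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()
```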

3 Rough Deep Belief Network

For the purpose of processing data in the form of intervals, we propose a version of the RBM architecture which consists of a pair of machines, both using a common weight matrix \(\mathbf {W}=\left\{ w_{ij}\right\} \). The part responsible for processing the bottom ends of the output intervals is called the lower engine, and the part responsible for processing the upper ends of the output intervals the upper engine. The unavailable (missing) input values \(v_i\) are replaced by appropriate intervals \(\left[ \underline{v}_i,\overline{v}_i\right] \) which can cover the whole domain of the feature (MCAR) or a part of it (MAR, MNAR). Thus the data is processed as in the case of two separate classic RBMs, with a few exceptions. The first one is the way the linear output of the neurons is calculated. Both the lower and the upper machine can be given the lower or the upper end of an interval; the choice depends on the sign of the appropriate weight. Thus, the lower linear output of the neurons in the hidden layer is calculated as follows:
$$\begin{aligned} \underline{s}_{\mathrm {h}j}(t) = \sum _{i=0 \atop w_{ij}(t) > 0}^{N} w_{ij}(t) \cdot \underline{v}_i(t) + \sum _{i=0 \atop w_{ij}(t) < 0}^{N} w_{ij}(t) \cdot \overline{v}_i(t) + b_{\mathrm {h}j}(t) \text {.} \end{aligned}$$
The upper value is derived with the opposite conditions:
$$\begin{aligned} \overline{s}_{\mathrm {h}j}(t) = \sum _{i=0 \atop w_{ij}(t) > 0}^{N} w_{ij}(t) \cdot \overline{v}_i(t) + \sum _{i=0 \atop w_{ij}(t) < 0}^{N} w_{ij}(t) \cdot \underline{v}_i(t) +b_{\mathrm {h}j}(t) \end{aligned}$$
A similar methodology is applied in the visible layer, i.e.
$$\begin{aligned} \underline{s}_{\mathrm {v}i}(t) = \sum _{j=0 \atop w_{ij}(t) > 0}^{N} w_{ij}(t) \cdot \underline{h}_j(t) + \sum _{j=0 \atop w_{ij}(t) < 0}^{N} w_{ij}(t) \cdot \overline{h}_j(t) + b_{\mathrm {v}i}(t) \text {,} \end{aligned}$$
$$\begin{aligned} \overline{s}_{\mathrm {v}i}(t) = \sum _{j=0 \atop w_{ij}(t) > 0}^{N} w_{ij}(t) \cdot \overline{h}_j(t) + \sum _{j=0 \atop w_{ij}(t) < 0}^{N} w_{ij}(t) \cdot \underline{h}_j(t) +b_{\mathrm {v}i}(t) \text {.} \end{aligned}$$
The output of the j-th neuron in the hidden layer of the lower RBM is denoted by \(\underline{h}_j(t)\), and by \(\overline{h}_j(t)\) in the case of the upper RBM. They are drawn with the probability given by the non-linear output of the neurons as follows:
$$\begin{aligned} P\left( \underline{h}_{0j}(t) = 1 | \underline{y}_{\mathrm {h}j}(t)\right) = \underline{y}_{\mathrm {h}j}(t), \end{aligned}$$
$$\begin{aligned} P\left( \overline{h}_{0j}(t) = 1 | \overline{y}_{\mathrm {h}j}(t)\right) = \overline{y}_{\mathrm {h}j}(t), \end{aligned}$$
$$\begin{aligned} \underline{h}_{0j}(t)\le \overline{h}_{0j}(t)\text {.} \end{aligned}$$
The output of the neurons in visible layers is derived in the same way.
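The sign-dependent routing of interval ends in Eqs. 14–17 can be expressed compactly by splitting the common weight matrix into its positive and negative parts; a Python/NumPy sketch (function and variable names are ours, for illustration only):

```python
# Sign-dependent interval propagation (cf. Eqs. 14-17): a positive
# weight takes the matching interval end, a negative weight the
# opposite one, so s_lo <= s_hi always holds.
import numpy as np

def interval_linear(x_lo, x_hi, W, b):
    Wp = np.maximum(W, 0.0)   # positive part of the weights
    Wn = np.minimum(W, 0.0)   # negative part of the weights
    s_lo = x_lo @ Wp + x_hi @ Wn + b
    s_hi = x_hi @ Wp + x_lo @ Wn + b
    return s_lo, s_hi
```

For a degenerate interval (x_lo equal to x_hi, i.e. no missing values) both outputs collapse to the ordinary linear output of a classic RBM.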

The common weights \(w_{ij}\), biases in hidden layers \(b_{\mathrm {h}j}(t)\) and visible layers \(b_{\mathrm {v}i}(t)\) are corrected using correction values \({\varDelta }w_{ij}\), \({\varDelta }b_{\mathrm {h}j}(t)\) and \({\varDelta }b_{\mathrm {v}i}(t)\) which come from both upper and lower RBMs.

The outputs of the classifier in the additional layer are defined as follows:
$$\begin{aligned} \underline{y}^{(L)}_{j} = softmax_j(\underline{s}^{(L)}_{i}(t)), \end{aligned}$$
$$\begin{aligned} \overline{y}^{(L)}_{j} = softmax_j(\overline{s}^{(L)}_{i}(t)), \end{aligned}$$
where L denotes the additional layer, \(\underline{y}^{(L)}_{j}\) and \(\overline{y}^{(L)}_{j}\) are the outputs of the network, and \(\underline{s}^{(L)}_{i}(t)\) and \(\overline{s}^{(L)}_{i}(t)\) are the values obtained from the last RBM layer of the DBN. The softmax function is calculated using Eq. 12.

4 The MNIST Database of Handwritten Digits

In our work we used the MNIST database, which contains samples of handwritten digits. The samples are commonly used for testing machine learning and pattern recognition techniques and their implementations. The database was created from NIST's databases and is divided into 60,000 training samples and 10,000 testing samples. Each sample is a 28 by 28 gray-scale image representing a single handwritten digit. All samples have been scaled down to fit a 20 by 20 bounding box while preserving their aspect ratio, and positioned so that the center of mass of the pixels is at the center of the 28 by 28 image.

The data sets are stored in four files. Two files contain the images for the data sets; the two remaining files contain the labels for the corresponding images. Images and labels are stored in a custom, easy-to-read binary format. All images from a given data set are kept as a sequence of bytes organized row-wise. Each byte defines the color of a single pixel: value 0 means white (paper) and value 255 means black (ink). Some samples are shown in Fig. 3.
Fig. 3.

Example samples taken from MNIST Database

The files containing labels have a very similar format, but instead of image data they contain only a single byte value for each image. The values are in the range from 0 to 9 and describe which digit the corresponding image represents. The labels are specified in the same order as the image data set.
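Readers who want to load the files can follow the layout described above; a hedged Python sketch follows. The big-endian header and the magic numbers 2051/2049 come from the public MNIST (IDX) format description, not from the paper itself, and the function names are ours:

```python
# Sketch of reading MNIST files in the standard IDX layout:
# a big-endian header followed by row-major unsigned bytes.
import struct
import numpy as np

def read_idx_images(path):
    with open(path, "rb") as f:
        magic, n, rows, cols = struct.unpack(">IIII", f.read(16))
        assert magic == 2051, "not an IDX image file"
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(n, rows, cols)   # one byte per pixel

def read_idx_labels(path):
    with open(path, "rb") as f:
        magic, n = struct.unpack(">II", f.read(8))
        assert magic == 2049, "not an IDX label file"
        return np.frombuffer(f.read(), dtype=np.uint8)
```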

5 Implementation and Experimental Results

For the purpose of our study, the solution was implemented in Matlab, which allowed us to compare results with the implementation presented by Karpathy [10, 11]. In all our tests we used 6000 samples of handwritten digits from the data set described in Sect. 4.

5000 samples were used for training the system, and the remaining 1000 samples for testing it. No rotation of the samples was applied in the experiments. Missing data were generated completely at random by assigning pseudo-random values from the interval [0,1]. The DBN and RDBN networks had a 784-500-500-100-10 architecture. The last layers of the tested networks used the softmax activation function and all remaining layers used the sigmoid function.
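The corruption procedure described above can be sketched as follows; the helper is illustrative (our names), with the pseudo-random imputation used for the DBN input and, following Sect. 3, the full-domain interval [0, 1] used for the RDBN input under MCAR:

```python
# Illustrative MCAR corruption: a random fraction of pixels goes
# missing; the DBN sees imputed pseudo-random values, the RDBN sees
# the full-domain interval [0, 1] for each missing pixel.
import numpy as np

rng = np.random.default_rng(42)

def corrupt(sample, fraction):
    mask = rng.random(sample.shape) < fraction   # which pixels go missing
    imputed = sample.astype(float).copy()
    imputed[mask] = rng.random(int(mask.sum()))  # DBN input
    lo = sample.astype(float).copy()
    hi = sample.astype(float).copy()
    lo[mask], hi[mask] = 0.0, 1.0                # RDBN interval input
    return imputed, lo, hi, mask
```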
Fig. 4.

Comparison of incorrectly recognized digits

In the RDBN, the result of digit recognition is based on a comparison of the results from the upper and lower systems. Those systems are created at the beginning of the algorithm. For complete data, the results from those systems are identical and similar to the results from the DBN; small differences originate from the stochastic nature of the algorithm. When incomplete data is provided to the systems, the results are calculated in accordance with the model described in Eqs. 13–21. As a result, we obtain two systems that adequately recognize digits from the testing data set.
Fig. 5.

Comparison of correctly recognized digits

Fig. 6.

Samples which were incorrectly classified by RDBN with \(5\,\%\) of the missing data

Fig. 7.

Samples which were considered unknown by RDBN at \(5\,\%\) of the missing data

A tested image is assigned a class when the lower and upper systems provide the same answer. If the answers differ, the RDBN informs us that it does not recognize the digit. In this way the system refrains from making a possibly incorrect classification and reduces the total number of mistakes. The resulting information is thereby more reliable and more accurate. The details of how digit recognition is performed are shown in Algorithm 1.
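The agreement rule can be sketched as follows; this is an illustration with our names, not a reproduction of the paper's Algorithm 1:

```python
# Classify only when the lower and upper engines agree on the
# most probable class; otherwise answer "unknown".
import numpy as np

def rdbn_decide(y_lo, y_hi):
    c_lo = int(np.argmax(y_lo))
    c_hi = int(np.argmax(y_hi))
    return c_lo if c_lo == c_hi else None   # None encodes "unknown"
```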

The system is able to correctly classify most images when little information is missing. When the amount of missing information exceeds \(25\,\%\), the system becomes unable to correctly classify the majority of samples and the number of "unknown" answers increases. Unknown answers represent a different type of classification and are therefore not counted in the diagrams representing correct and incorrect answers. Thanks to extending the range of possible answers with the response "unknown", the total number of mistakes of the RDBN system has been significantly reduced, which is shown in Fig. 4. The results show that the RDBN system provides noticeably fewer incorrect answers than the DBN. A comparison of the correct classification results between the RDBN and the DBN is shown in Fig. 5. The RDBN gives a similar number of correctly classified digits as the DBN, which proves a comparable level of efficiency.

Some of the samples used for testing are particularly difficult to classify due to irregularities of handwriting. Randomly removing information can turn a digit into a different one or make it unrecognizable. Figure 6 shows tested samples with \(5\,\%\) of the information randomly removed which the RDBN could not recognize and for which it gave an incorrect answer. Figure 7 shows samples which the system could not classify and for which it instead gave the answer "unknown".

6 Conclusions and Future Work

In this paper we examined the rough deep belief network as a system for the recognition of handwritten digits in samples with missing values. The investigation was conducted for various levels of missing input information to evaluate the robustness of the classifier. The obtained results confirm again that rough set theory is useful for extending traditional computational intelligence systems. The digits were recognized even with quite a high level of missing pixels. An indisputable advantage of the RDBN and other systems extended using rough set theory is the possibility to use incomplete information also in the development (e.g. learning) phase. The next step in the investigation is to use the RDBN with data containing other forms of imperfection, for example patterns with erroneous values and noise.


References

  1. Bilski, J.: Momentum modification of the RLS algorithms. In: Rutkowski, L., Siekmann, J.H., Tadeusiewicz, R., Zadeh, L.A. (eds.) ICAISC 2004. LNCS (LNAI), vol. 3070, pp. 151–157. Springer, Heidelberg (2004)
  2. Chu, J.L., Krzyzak, A.: The recognition of partially occluded objects with support vector machines and convolutional neural networks and deep belief networks. J. Artif. Intell. Soft Comput. Res. 4(1), 5–19 (2014)
  3. Cpałka, K., Nowicki, R., Rutkowski, L.: Rough-neuro-fuzzy systems for classification. In: The First IEEE Symposium on Foundations of Computational Intelligence (FOCI 2007) (2007)
  4. Dourlens, S., Ramdane-Cherif, A.: Modeling & understanding environment using semantic agents. J. Artif. Intell. Soft Comput. Res. 1(4), 301–314 (2011)
  5. Dubois, D., Prade, H.: Rough fuzzy sets and fuzzy rough sets. Int. J. Gen. Syst. 17(2–3), 191–209 (1990)
  6. Dubois, D., Prade, H.: Putting rough sets and fuzzy sets together. In: Słowiński, R. (ed.) Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory, pp. 203–232. Kluwer, Dordrecht (1992)
  7. Hinton, G.: Training products of experts by minimizing contrastive divergence. Neural Comput. 14(8), 1771–1800 (2002)
  8. Hinton, G.: A practical guide to training restricted Boltzmann machines. Momentum 9(1), 926 (2010)
  9. Hinton, G., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
  10. Karpathy, A.: Code for training restricted Boltzmann machines (RBM) and deep belief networks in MATLAB
  11. Karpathy, A.: CPSC 540 project: Restricted Boltzmann machines
  12. Korytkowski, M., Nowicki, R., Rutkowski, L., Scherer, R.: AdaBoost ensemble of DCOG rough-neuro-fuzzy systems. In: Jędrzejowicz, P., Nguyen, N.T., Hoang, K. (eds.) ICCCI 2011, Part I. LNCS, vol. 6922, pp. 62–71. Springer, Heidelberg (2011)
  13. Korytkowski, M., Nowicki, R., Scherer, R.: Neuro-fuzzy rough classifier ensemble. In: Alippi, C., Polycarpou, M., Panayiotou, C., Ellinas, G. (eds.) ICANN 2009, Part I. LNCS, vol. 5768, pp. 817–823. Springer, Heidelberg (2009)
  14. Korytkowski, M., Nowicki, R., Scherer, R., Rutkowski, L.: Ensemble of rough-neuro-fuzzy systems for classification with missing features. In: Proc. World Congr. Comput. Intell., pp. 1745–1750 (2008)
  15. Laskowski, L., Laskowska, M.: Functionalization of SBA-15 mesoporous silica by Cu-phosphonate units: probing of synthesis route. J. Solid State Chem. 220, 221–226 (2014)
  16. Laskowski, L., Laskowska, M., Balanda, M., Fitta, M., Kwiatkowska, J., Dzilinski, K., Karczmarska, A.: Mesoporous silica SBA-15 functionalized by nickel-phosphonic units: Raman and magnetic analysis. Microporous Mesoporous Mater. 200, 253–259 (2014)
  17. Laskowski, Ł., Laskowska, M., Jelonkiewicz, J., Boullanger, A.: Spin-glass implementation of a Hopfield neural structure. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2014, Part I. LNCS, vol. 8467, pp. 89–96. Springer, Heidelberg (2014)
  18. Le Roux, N., Bengio, Y.: Representational power of restricted Boltzmann machines and deep belief networks. Neural Comput. 20(6), 1631–1649 (2008)
  19. Little, R., Rubin, D.: Statistical Analysis with Missing Data. Wiley, New York (1987)
  20. Nowak, B.A., Nowicki, R.K., Mleczko, W.K.: A new method of improving classification accuracy of decision tree in case of incomplete samples. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2013, Part I. LNCS, vol. 7894, pp. 448–458. Springer, Heidelberg (2013)
  21. Nowicki, R.: Rough-neuro-fuzzy structures for classification with missing data. IEEE Trans. Syst. Man Cybern. Part B: Cybern. 39(6), 1334–1347 (2009)
  22. Nowicki, R.: On combining neuro-fuzzy architectures with the rough set theory to solve classification problems with incomplete data. IEEE Trans. Knowl. Data Eng. 20(9), 1239–1253 (2008)
  23. Nowicki, R.K., Nowak, B.A., Woźniak, M.: Rough k nearest neighbours for classification in the case of missing input data. In: Proceedings of the 9th International Conference on Knowledge, Information and Creativity Support Systems, Limassol, Cyprus, pp. 196–207, November 2014
  24. Pawlak, M.: Kernel classification rules from missing data. IEEE Trans. Inf. Theory 39, 979–988 (1993)
  25. Pawlak, Z.: Rough sets. Int. J. Comput. Inform. Sci. 11(5), 341–356 (1982)
  26. Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning About Data. Kluwer, Dordrecht (1991)
  27. Smolensky, P.: Information processing in dynamical systems: foundations of harmony theory. In: Rumelhart, D.E., McClelland, J.L. (eds.) Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations, pp. 194–281. MIT Press, Cambridge (1986)
  28. Staff, C.I., Reinders, J.: Parallel Programming and Optimization with Intel® Xeon Phi™ Coprocessors: Handbook on the Development and Optimization of Parallel Applications for Intel® Xeon Coprocessors and Intel® Xeon Phi™ Coprocessors. Colfax International, Sunnyvale (2013)
  29. Szustak, L., Rojek, K., Gepner, P.: Using Intel Xeon Phi coprocessor to accelerate computations in MPDATA algorithm. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds.) PPAM 2013, Part I. LNCS, vol. 8384, pp. 582–592. Springer, Heidelberg (2014)
  30. Zhu, W., Wang, F.Y.: Reduction and axiomization of covering generalized rough sets. Inform. Sci. 152, 217–230 (2003)
  31. Zhu, W., Wang, F.Y.: On three types of covering-based rough sets. IEEE Trans. Knowl. Data Eng. 19(8), 1131–1144 (2007)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Wojciech K. Mleczko (1)
  • Tomasz Kapuściński (1)
  • Robert K. Nowicki (1)

  1. Institute of Computational Intelligence, Czestochowa University of Technology, Czestochowa, Poland
