Influence of graphical weights’ interpretation and filtration algorithms on generalization ability of neural networks applied to digit recognition
Abstract
In this paper, a method for the graphical interpretation of single-layer network weights is introduced. It is shown that the network parameters can be converted into an image in which the individual weights are the pixels. For this purpose, a weight-to-pixel conversion formula is used. Moreover, a new weight modification method is proposed: the weight coefficients are recomputed from pixel values to which image filtration algorithms have been applied. The approach is applied to the weights of three types of models: a single-layer network, a two-layer backpropagation network and a hybrid network. The performance of the models is then compared on two independent data sets. The experiments show that adjusting the weights to the new values can decrease the test error compared to the error obtained for the initial set of weights.
Keywords
Weights, Neural network, Filtration, Digit recognition
1 Introduction
The character recognition problem has enjoyed great attention for several decades. As early as 1972, a method for the automatic recognition of handwritten characters was described [1]. In the mid-eighties, a method of "learning" character sets and various feature extraction techniques were proposed [2]. Machine learning techniques, such as neural networks, have played a very important role in this domain [3]. In 1990, the application of a backpropagation neural network to the recognition of handwritten US Postal Service zip codes was presented [4]. The input data differed significantly in writing style, character size, overlapping numerals, postmarks, horizontal bars and marks on the envelope, which made the recognition process more difficult. Even so, the performance on zip code digits was 92% recognition, 1% substitution and 7% rejects [4]. Research on character classification did not focus only on offline handwriting recognition, which is performed after the writing is complete. Effort was also devoted to online, i.e. dynamic, handwriting recognition, in which the machine recognizes the characters while the user writes [5]. Transducers (converters, e.g. a tablet) were used for this purpose, but the process was strongly dependent on the power of contemporary computers. It should be emphasized that neural networks are not the only models that have been used in the classification of handwritten patterns. Other methods of computational intelligence have also been applied. Many contributions present the use of distance classifiers [6, 7, 8], support vector machines [8, 9] or decision trees [10].
Although the task of character recognition has been thoroughly explored, it still attracts many researchers, and a great number of them still apply neural networks for this purpose. The recognition of characters of particular languages, e.g. Chinese characters [11, 12], Persian digits [13] or Indian numerals [8, 14, 15], receives increasing attention. For these particular cases, backpropagation neural networks, particle swarm optimization neural networks, single-layer perceptrons and probabilistic neural networks were used. A considerable amount of work has been done on benchmarking Arabic digits (e.g. CENPARMI released by Concordia University, CEDAR released by CEDAR-SUNY Buffalo or MNIST extracted from the NIST database), where various neural models (multilayer perceptrons, radial basis function networks, learning vector quantization networks and polynomial networks) were tested against state-of-the-art machine learning techniques such as nearest neighbor classifiers, naive Bayes, rule-based learning or support vector machines [16, 17, 18].
In this work, the concept of the graphical interpretation of single-layer neural network weights is proposed. The model is designed to classify all digits; thus, it is equipped with 10 neurons, each of which is responsible for the recognition of a single numeral. Once the training process of the network is completed, it is shown that the weights can be converted to pixel values in order to transform the model parameters into images. Because the network's weights can be regarded as an image, filtration algorithms are applied to the pixels obtained from the weight values. The filtered pixels then serve to compute new weights. The idea is tested on two data sets using three types of models: a single-layer network, a two-layer backpropagation network and a hybrid network. The efficiency of the networks with the weights computed from the filtered images is compared with the performance of the models having the original set of parameters. For computational purposes, all the models, image transformations and filtration algorithms were implemented in the authors’ own software.
The main motivation for this research is to understand how the neural model "perceives" input data, how it responds to image filtration and whether it is capable of generalizing to unknown examples.
The paper is organized as follows. Section 2 describes the handwritten digit data sets used for recognition. In Sect. 3, the neural networks employed for classification are briefly described. Section 4 highlights the graphical interpretation of the single-layer network weights. Later on, in Sect. 5, the filtration algorithms applied to image pixels and the new weight modification method are discussed. Section 6 verifies the performance of the neural networks in two digit classification tasks. Finally, Sect. 7 presents the conclusions.
2 Input data sets
Two digit databases are considered in this work. The first set contains numerals entered by means of a Wacom CTE440/S graphics tablet. Its working area covered the A6 format (127.6 × 92.8 mm), and the device resolution reached 2,000 dpi (787 lines/cm). Input patterns were entered with a wireless pen. The input data comprised 1,000 handwritten digits (\(0,1,\ldots ,9\)), which were, in turn, converted to 30 × 40 size. To unify all digit patterns, several transformation operations were carried out. First, all the digits had to be rescaled [19], since their size varied as they were written on the tablet. Then, each image underwent binarization [20] to convert it from color to gray scale. Additionally, in order to limit the information carried by characters written with thick lines and to extract the parts that represent the relevant elements of the images, the patterns were thinned using a skeletonization algorithm [21, 22]. Finally, because the placement of the digits within the frame of the device’s screen differed, all the characters had to be centered for proper representation in the 30 × 40 pixel pattern.
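The binarization and centering steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the threshold of 128, the function names `binarize` and `center_in_frame`, the use of the bounding box for centering, and the reading of 30 × 40 as 30 pixels wide by 40 pixels high are all assumptions. Rescaling and skeletonization are omitted.

```python
import numpy as np

def binarize(gray, threshold=128):
    """Return 1 for ink pixels (darker than the assumed threshold), 0 for background."""
    return (np.asarray(gray) < threshold).astype(np.uint8)

def center_in_frame(binary, height=40, width=30):
    """Place the bounding box of the ink pixels in the middle of an empty frame."""
    rows = np.flatnonzero(binary.any(axis=1))  # rows that contain ink
    cols = np.flatnonzero(binary.any(axis=0))  # columns that contain ink
    frame = np.zeros((height, width), dtype=np.uint8)
    if rows.size == 0:
        return frame  # nothing was drawn
    glyph = binary[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    gh, gw = glyph.shape
    if gh > height or gw > width:
        raise ValueError("glyph must be rescaled to fit the frame first")
    top = (height - gh) // 2
    left = (width - gw) // 2
    frame[top:top + gh, left:left + gw] = glyph
    return frame

# Hypothetical input: a dark vertical stroke drawn near the top-left corner
gray = np.full((40, 30), 255)
gray[0:10, 2:4] = 0
centered = center_in_frame(binarize(gray))
```

After the call, the stroke occupies the middle of the 40-row by 30-column frame regardless of where it was drawn.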
The second set was the MNIST database [7], a subset of a larger set available from the National Institute of Standards and Technology (NIST). It consisted of 60,000 training examples and 10,000 test examples. The digits were size-normalized and centered in a fixed-size 28 × 28 image. The images contained gray levels as a result of the anti-aliasing technique used by the normalization algorithm. The regular 28 × 28 database, along with a description of its content and performance results for some computational intelligence methods, is available at [23].
3 Neural networks used in digit recognition
Three types of neural networks were analyzed in the research: a single-layer network, a two-layer backpropagation network and a hybrid network. All the models are briefly discussed in the following subsections.
3.1 Single-layer network
3.2 Two-layer backpropagation network
3.3 Hybrid network
4 Graphical interpretation of single-layer network’s weights
The initial idea behind the interpretation of the weights of the single-layer network was to understand how the particular neurons of the artificial model "see" their coefficients. Since each input pattern consists of 30 × 40 elements for the tablet data and 28 × 28 elements for the NIST database, connected to m = 10 neurons, one may ask whether it is possible to generate pictures of the same size that show the weight values computed for each neuron after the training process. Can the weights be perceived as an image of a digit?
To find the answer, a method is needed for converting the weights (a set of real numbers) into values that correspond to pixels of some brightness. Once such a transformation is determined, the set of calculated pixels can be displayed as an image of the same resolution as the input. In this section, such a weight visualization is proposed. The following example highlights the idea.
As shown, the neural network parameters determined after the training process do not have to be treated only as numbers that allow the model to classify input examples. The weights can also be interpreted as images and, as shown in this section, such an interpretation can illustrate the effect of the network’s training and the way the artificial model "understands" digit recognition.
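The weight-to-pixel conversion formula itself is not reproduced in this section. As a minimal sketch of the idea, assume that the weights of each neuron are rescaled linearly (min-max) onto 8-bit gray levels 0..255; the paper's own formula may differ. The function name `weights_to_pixels` and the random weight matrix `W` are hypothetical and serve only as illustration.

```python
import numpy as np

def weights_to_pixels(w, height, width):
    """Linearly rescale one neuron's weights to gray levels 0..255 (assumed mapping)."""
    w = np.asarray(w, dtype=float)
    w_min, w_max = w.min(), w.max()
    if w_max == w_min:
        # All weights equal: no contrast to display
        return np.zeros((height, width), dtype=np.uint8)
    pixels = np.rint(255.0 * (w - w_min) / (w_max - w_min))
    return pixels.astype(np.uint8).reshape(height, width)

# Hypothetical weight matrix: 10 neurons, one weight per pixel of a 28 x 28 input
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 28 * 28))
images = [weights_to_pixels(W[j], 28, 28) for j in range(10)]
```

Under this assumed mapping, the smallest weight of a neuron becomes a black pixel and the largest a white one, so each of the 10 neurons yields one image of the same size as the input pattern.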
5 Image filtration and the weight adjustment
As shown in Sect. 4, the neural network weights can be interpreted in a graphical form. Each appropriately normalized coefficient is then treated as an element of the image seen by the neuron. After the model training process, this image resembles a digit to a large degree. However, the picture is not as "perfect" as the original input pattern. For example, the neuron weights obtained from the NIST digits, illustrated in Fig. 1 in the form of pixels, are blurry. The weights calculated from the tablet numerals, viewed as an image, do not have a strong signal (white pixels) but are more distinct. For these reasons, high-pass and low-pass filtration algorithms were applied to the pixels computed from the set of optimal weights found during network training on the tablet and NIST data sets. On the basis of the filtered pixels, a modification of the neural network weights was introduced. The following subsections describe the filtration algorithms and the way the weight parameters are adjusted on the basis of the new pixel values.
5.1 Filtration algorithms
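The high-pass and low-pass filtration of the weight image can be sketched as a 3 × 3 mask applied to every pixel. The coefficients of the HP3, LP3 and LPG masks are not given in this section, so the masks below are common textbook choices (a sharpening mask, a box average and a Gaussian approximation) and are assumptions, as are the function name `filter_image` and the border handling by pixel replication.

```python
import numpy as np

# Assumed 3 x 3 masks (common textbook choices; the paper's own masks may differ)
MASKS = {
    "LP3": np.ones((3, 3)) / 9.0,
    "LPG": np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16.0,
    "HP3": np.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]], dtype=float),
}

def filter_image(pixels, mask):
    """Apply a 3 x 3 mask to a gray-level image, replicating the border pixels."""
    img = np.asarray(pixels, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    h, w = img.shape
    # Sum the nine shifted copies of the image, each scaled by one mask coefficient.
    # The masks are symmetric, so correlation and convolution give the same result.
    for di in range(3):
        for dj in range(3):
            out += mask[di, dj] * padded[di:di + h, dj:dj + w]
    # Keep the result in the valid gray-level range
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Hypothetical image: a single bright pixel on a dark background
impulse = np.zeros((5, 5))
impulse[2, 2] = 160
blurred = filter_image(impulse, MASKS["LPG"])
```

All three assumed masks sum to one, so a uniform region keeps its brightness; the low-pass masks spread a bright pixel over its neighbors, while the high-pass mask emphasizes local differences.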
5.2 Weights adjustment
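The mapping used to update the weights from the filtered pixels (referred to later as mapping (13)) is not reproduced in this section. A minimal sketch, assuming it simply inverts a linear min-max weight-to-pixel conversion: each filtered gray level is mapped back onto the range spanned by the trained weights of the same neuron. The function name `pixels_to_weights` and the example values are hypothetical.

```python
import numpy as np

def pixels_to_weights(pixels, w_min, w_max):
    """Map gray levels 0..255 back onto the weight range [w_min, w_max] (assumed inverse mapping)."""
    p = np.asarray(pixels, dtype=float).ravel()
    return w_min + (p / 255.0) * (w_max - w_min)

# Hypothetical filtered 2 x 2 image and the weight range of the trained neuron
filtered = np.array([[0, 255], [51, 153]], dtype=np.uint8)
new_weights = pixels_to_weights(filtered, w_min=-1.0, w_max=4.0)
```

With this choice, an unfiltered image would reproduce the original weights up to the rounding introduced by the 8-bit quantization, so any change in the test error can be attributed to the filter.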
6 The performance of neural networks on digit data sets
In this part of the work, a comparative efficiency analysis was conducted for the single-layer network, the two-layer backpropagation network and the hybrid network in the classification of the tablet and NIST data sets. The comparison was carried out by measuring two performance indicators: the test error, calculated by the models with the weights obtained after the training process, and the test error after filtration, determined by the networks with the weights updated according to the mapping (13). Both indicators were computed on a test set different from the patterns used in the training process: 200 numerals for the tablet data set and 10,000 digits for the NIST database. The errors were measured as a function of the network training parameters. The two following subsections highlight the results obtained on each data set. Afterward, a short summary is added.
6.1 Tablet data set

Gaussian filtration (LPG mask) provided the lowest test error for the single-layer network (10.425%), the two-layer backpropagation network (9.468%) and the hybrid network (17.234%),

the lowest overall test error (9.468%) was achieved by the two-layer backpropagation network with the set of weights modified by the Gaussian filtration algorithm,

the highest reduction in the error rate equalled 2.925 percentage points; it was obtained by applying Gaussian filtration to the weights of the backpropagation network,

for the hybrid network, all filtration algorithms decreased the test error of 19.894%, by 1.276 (HP3), 2.021 (LP3) and 2.659 (LPG) percentage points,

HP3 mask filtration increased the test error by 0.106 and 0.851 percentage points for the single-layer and backpropagation networks, respectively.
The lowest percentage of test error (Test) and test errors after filtration (HP3, LP3, LPG) found for the single-layer network, two-layer backpropagation network and hybrid network in tablet digits classification (minimum error values, in %)

| Model | Test | HP3 | LP3 | LPG |
| --- | --- | --- | --- | --- |
| Single-layer network | 12.127 | 12.234 | 11.117 | 10.425 |
| Backpropagation network | 12.394 | 13.245 | 10.585 | 9.468 |
| Hybrid network | 19.894 | 18.617 | 17.872 | 17.234 |
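The margins quoted in the list above follow from the table by subtraction. The short check below enters the table values by hand; the names `tablet_errors` and `reductions` are only illustrative. The differences computed from the rounded table entries agree with the quoted margins to within 0.001 (for example 12.394 − 9.468 = 2.926 against the quoted 2.925), which presumably reflects rounding of the underlying errors.

```python
# Minimum error values [%] copied from the tablet data table
tablet_errors = {
    "Single-layer network": {"Test": 12.127, "HP3": 12.234, "LP3": 11.117, "LPG": 10.425},
    "Backpropagation network": {"Test": 12.394, "HP3": 13.245, "LP3": 10.585, "LPG": 9.468},
    "Hybrid network": {"Test": 19.894, "HP3": 18.617, "LP3": 17.872, "LPG": 17.234},
}

def reductions(errors):
    """Test error minus error after filtration, in percentage points (positive = improvement)."""
    return {
        model: {f: round(row["Test"] - row[f], 3) for f in row if f != "Test"}
        for model, row in errors.items()
    }

for model, row in reductions(tablet_errors).items():
    print(model, row)
```

A negative entry marks a filter that made the test error worse, as happens for the HP3 mask with the single-layer and backpropagation networks.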
6.2 NIST database

the HP3 and LP3 masks decreased the test error for each neural network,

for the hybrid network, all filtration algorithms decreased the test error of 18.174%, by 0.424 (HP3), 0.382 (LP3) and 0.286 (LPG) percentage points,

LP3 mask filtration applied to the backpropagation network weights provided the lowest test error among all classifiers,

the Gaussian filtration algorithm increased the test error for the single-layer network by 0.111 percentage points.
The lowest percentage of test error (Test) and test errors after filtration (HP3, LP3, LPG) recorded for the single-layer network, two-layer backpropagation network and hybrid network in NIST database pattern recognition (minimum error values, in %)

| Model | Test | HP3 | LP3 | LPG |
| --- | --- | --- | --- | --- |
| Single-layer network | 9.376 | 9.172 | 9.294 | 9.488 |
| Backpropagation network | 9.078 | 9.068 | 8.966 | 9.078 |
| Hybrid network | 18.174 | 17.750 | 17.792 | 17.888 |
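Two of the observations listed above, namely which model and mask give the lowest error and which masks help every network, can be read off the table mechanically. A small check with the table values entered by hand (the names `nist_errors`, `best` and `helps_every_network` are only illustrative):

```python
# Minimum error values [%] copied from the NIST database table
nist_errors = {
    "Single-layer network": {"Test": 9.376, "HP3": 9.172, "LP3": 9.294, "LPG": 9.488},
    "Backpropagation network": {"Test": 9.078, "HP3": 9.068, "LP3": 8.966, "LPG": 9.078},
    "Hybrid network": {"Test": 18.174, "HP3": 17.750, "LP3": 17.792, "LPG": 17.888},
}

# Lowest error over all models and all weight sets (unfiltered and filtered)
best = min(
    (err, model, column)
    for model, row in nist_errors.items()
    for column, err in row.items()
)

# For each mask: does it strictly lower the test error of every network?
helps_every_network = {
    mask: all(row[mask] < row["Test"] for row in nist_errors.values())
    for mask in ("HP3", "LP3", "LPG")
}

print(best)
print(helps_every_network)
```

The Gaussian mask is the only one that fails the second check: it raises the error of the single-layer network and leaves that of the backpropagation network unchanged.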
6.3 Summary
The HP3 mask applied to the weights obtained from training on the NIST digits made the test error after filtration lower than the error determined with the unfiltered coefficients for the single-layer network and the hybrid network. LPG filtration, as shown in Figs. 2 and 3, provided worse results here for the single-layer and two-layer networks, respectively. This can be explained by the fact that the set of weights, viewed as an image, was already blurry (Fig. 1). On the other hand, for the hybrid network (Fig. 4), all filtration algorithms decreased the test error rate for each number of hidden neurons.
7 Conclusion
In this article, a method for interpreting the weights of a single-layer neural network was proposed. The network was designed to recognize digits from the range \(0,1,\ldots ,9\); therefore, it was built of 10 neurons, each of which recognized a single numeral. It was shown that, after the model has been trained, it is possible to transform the weight values into image pixels. The idea was tested on two data sets: 1,000 digits of 30 × 40 resolution entered by means of a graphics tablet and 60,000 digits of 28 × 28 size from the NIST database available on the web. Furthermore, high-pass and low-pass filtration algorithms were applied to the pixels computed from the weight values. On the basis of the filtered pixels, the weights of the network were adjusted. This approach was then verified on three types of models: a single-layer network, a two-layer backpropagation network and a hybrid network, by comparing the performance of the models having the weights computed from the filtered images with that of the models using the coefficients obtained after the training process. The analyses were carried out on both data sets. The results showed that, in both classification tasks, filtration could decrease the test error relative to the error obtained with the weights determined by training. In particular, in tablet data recognition, the use of the LPG mask (Gaussian low-pass filter) provided the lowest test error for the single-layer network (10.425%), the two-layer backpropagation network (9.468%) and the hybrid network (17.234%). Moreover, this filtration algorithm applied to the weights of the backpropagation network reduced the test error rate by 2.925 percentage points, which yielded the lowest test error among all models (9.468%).
The improvement in the test error rate obtained by the considered networks for NIST digit recognition was not as large. The highest reduction of this indicator (0.424 percentage points) was obtained when high-pass filtration (HP3 mask) was applied to the weights of the hybrid network. The application of both the HP3 and LP3 masks did decrease the test error value, but the gain was small. This can be explained by the fact that this particular data set was taken from the web and no further image preprocessing was applied to it. In contrast, the tablet data set images, before being fed as input to the networks, underwent the skeletonization process that extracted the shape of the digit patterns.
The entire process of computing the pixels from the optimal network weights, applying a filtration algorithm to the calculated pixels and, finally, updating the weights on the basis of the filtered pixels can increase the generalization ability of the neural network. However, it is important to add that such an improvement is obtained only if an appropriate image filtration is applied. Finding it may sometimes amount to a trial-and-error approach. Moreover, some data preprocessing has to be performed, since the images to be classified usually contain a lot of information, which misleads the network in the process of generalization.
Notes
Acknowledgments
This research was partially supported by Rzeszow University of Technology Grant No. U8255/DS and NN 514 705540 from the National Science Centre. The authors are grateful for the valuable comments of the anonymous reviewer. All the remarks significantly improved the quality of the manuscript.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
1. Harmon LD (1972) Automatic recognition of print and script. Proc IEEE 60(10):1165–1176
2. Davis RH, Lyall J (1986) Recognition of handwritten characters - a review. J Image Vis Comput 4(4):208–218
3. Bishop M (1995) Neural networks for pattern recognition. Oxford University Press, New York
4. LeCun Y, Matan O, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jacket LD, Baird HS (1990) Handwritten zip code recognition with multilayer networks. In: Proceedings of 10th international conference on pattern recognition, vol 2, pp 35–40, Atlantic City, USA
5. Tappert CC, Suen CY, Wakahara T (1990) The state of the art in online handwriting recognition. IEEE Trans Pattern Anal Mach Intell 12(8):787–808
6. Weideman WE, Manry MT, Yau HC, Gong W (1995) Comparisons of a Neural Network and a Nearest-Neighbor Classifier via the Numeric Handprint Recognition Problem. IEEE Trans Neural Net 6(6):1524–1530
7. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
8. Shrivastava SK, Gharde SS (2010) Support Vector Machine for Handwritten Devanagari Numeral Recognition. Int J Comput Appl 7(11):9–14
9. Vapnik VN (1998) Statistical learning theory. Wiley, New York
10. Wen-Li J, Zheng-Xing S, Bo Y, Wen-Tao Zheng, Wen-Hui X (2006) User-independent online handwritten digit recognition. In: International conference on machine learning and cybernetics, pp 3359–3364, Dalian, China
11. Cheng-Lin L, Jaeger S, Nakagawa M (2004) Online recognition of Chinese characters: the state-of-the-art. IEEE Trans Pattern Anal Mach Intell 26(2):198–203
12. Zhitao G, Jinli Y, Yongfeng D, Junhua G (2009) Handwritten Chinese characters recognition based on PSO neural networks. In: Second international conference on intelligent networks and intelligent systems, pp 350–353
13. Pourmohammad A, Ahadi SM (2009) Using single-layer neural network for recognition of isolated handwritten Persian digits. In: 7th International conference on Information. Communicat Sig Proc, pp 1–4, Macau
14. Al-Omari FA, Al-Jarrah O (2004) Handwritten Indian numerals recognition system using probabilistic neural networks. Adv Eng Inform 18:9–16
15. Desai AA (2010) Gujarati handwritten numeral optical character reorganization through neural network. Pattern Recogn 43:2582–2589
16. Liu C, Nakashima K, Sako H, Fujisawa H (2003) Handwritten digit recognition: benchmarking of state-of-the-art techniques. Pattern Recogn 36:2271–2285
17. Al-Omari S, Sumari P, Al-Taweel SA, Husain AJA (2009) Digital recognition using neural network. J Comput Sci 5(6):427–434
18. El-Alfy EM (2010) Offline recognition of handwritten numeral characters with polynomial neural networks using topological features. Lect Notes Comput Sci 6085:173–183
19. Watkins CD, Sadun A, Marenka S (1995) Nowoczesne metody przetwarzania obrazu. WNT, Warszawa
20. Trier OD, Taxt T (1995) Evaluation of binarization methods for document images. IEEE Trans Pattern Anal Mach Intell 17(3):312–315
21. Gonzales R, Woods RE (2002) Digital image processing. Prentice Hall, New Jersey
22. Gupta R, Kaur R (2008) Skeletonization algorithm for numeral patterns. Int J Sig Proc Image Proc Pattern Recogn 63–72
23. LeCun Y. NIST data base repository. http://yann.lecun.com/exdb/mnist/
24. Tadeusiewicz R (1993) Sieci neuronowe. Akademicka Oficyna Wydawnicza, Warszawa
25. Widrow B, Hoff ME (1960) Adaptive switching circuits. IRE WESCON Conv Rec 4:96–104
26. Kohonen T (1995) Self-organizing maps. Springer, Berlin
27. Osowski S (2006) Sieci neuronowe do przetwarzania informacji. WNT, Warszawa
28. Martinetz M, Berkovich S, Schulten K (1993) Neural-gas network for vector quantization and its application to time-series prediction. IEEE Trans Neural Net 4(4):558–569
29. Tadeusiewicz R, Korohoda P (1997) Algorytmy i metody komputerowej analizy i przetwarzania obrazow. Poldex, Krakow