Exploratory Analysis of MNIST Handwritten Digit for Machine Learning Modelling

  • Mohd Razif Shamsuddin
  • Shuzlina Abdul-Rahman
  • Azlinah Mohamed
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 937)


This paper investigates the MNIST dataset, a subset of the NIST data pool. The MNIST dataset contains handwritten digit images derived from a larger collection of NIST handwritten-digit data. All images are formatted as 28 × 28 pixel grayscale images. MNIST has often been cited in leading research and has thus become a benchmark for image recognition and machine learning studies. Many researchers have attempted to identify appropriate models and pre-processing methods for classifying the MNIST dataset. However, very little attention has been given to comparing binary and normalized pre-processed datasets and their effects on model performance. The pre-processing results are presented as input datasets for machine learning modelling. The trained models are validated with 4200 random test samples over four different models. Results show that the normalized images performed best with the Convolution Neural Network model at 99.4% accuracy.
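The two pre-processing variants compared in the paper can be sketched as follows. This is a minimal illustration, not the authors' exact pipeline: a random array stands in for a 28 × 28 MNIST sample, the normalization rescales grayscale intensities to [0, 1], and the binarization threshold of 128 is an assumed midpoint value.

```python
import numpy as np

# Stand-in for one 28x28 MNIST grayscale image with pixel values in [0, 255]
# (random data here purely for illustration).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28)).astype(np.float32)

# Normalized pre-processing: rescale intensities to the [0, 1] range.
normalized = image / 255.0

# Binary pre-processing: threshold each pixel so it becomes 0 or 1
# (128 is an assumed midpoint threshold, not taken from the paper).
binary = (image >= 128).astype(np.float32)
```

Either array can then be flattened or kept as a 2-D grid and fed to a classifier; the paper's finding is that the normalized form worked best with a Convolution Neural Network.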


Keywords: Convolution Neural Network · Handwritten digit images · Image recognition · Machine learning · MNIST



The authors are grateful to the Research Management Centre (RMC) UiTM Shah Alam for the support under the national Fundamental Research Grant Scheme 600-RMI/FRGS 5/3 (0002/2016).



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Malaysia
