PSO-based optimized CNN for Hindi ASR
The Convolutional Neural Network (CNN) is one of the most successful deep learning algorithms and has shown its effectiveness in a variety of vision tasks. The performance of this network depends directly on its hyperparameters. However, designing CNN architectures requires either expert knowledge of their intrinsic structure or a great deal of trial and error. To overcome these issues, there is a need to design the optimal architecture of a CNN automatically, without human intervention. We therefore remove the constraints that traditional architectures impose on the number and type of convolutional and pooling layers. Biologically inspired approaches have not been extensively exploited for this task. This paper attempts to automatically optimize the hyperparameters of a CNN architecture for a speech recognition task using particle swarm optimization (PSO), a population-based stochastic optimization technique. The proposed method is evaluated by designing a CNN architecture for speech recognition on a Hindi dataset. The experimental results show that the proposed method designs a competitive CNN architecture that performs comparably to other state-of-the-art methods.
Keywords: CNN · Hyperparameter selection · PSO · Optimization
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.