Multi-level Dense Capsule Networks

  • Sai Samarth R. Phaye
  • Apoorva Sikka
  • Abhinav Dhall
  • Deepti R. Bathula
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11365)


The past few years have witnessed an exponential growth of interest in deep learning methodologies, with rapidly improving accuracy and reduced computational complexity. In particular, architectures using Convolutional Neural Networks (CNNs) have produced state-of-the-art performance on image classification and object recognition tasks. Recently, Capsule Networks (CapsNets) achieved a significant increase in performance by addressing an inherent limitation of CNNs in encoding pose and deformation. Inspired by this advancement, we propose Multi-level Dense Capsule Networks (multi-level DCNets). The proposed framework customizes CapsNet by adding multi-level capsules and replacing the standard convolutional layers with densely connected convolutions. A single-level DCNet essentially adds a deeper convolutional network, so that the primary capsules are formed from the discriminative feature maps learned by different layers. Additionally, multi-level DCNets use a hierarchical architecture to learn new capsules from former capsules that represent spatial information in a fine-to-coarse manner, which makes them more efficient at learning complex data. Experiments on image classification tasks using benchmark datasets demonstrate the efficacy of the proposed architectures. DCNet achieves state-of-the-art performance (99.75%) on the MNIST dataset with an approximately twenty-fold decrease in total training iterations over the conventional CapsNet. Furthermore, multi-level DCNet performs better than CapsNet on the SVHN dataset (96.90%), and outperforms an ensemble of seven CapsNet models on CIFAR-10 by +0.31% with a seven-fold decrease in the number of parameters. Source codes, models and figures are available at
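The two ingredients the abstract combines can be illustrated in a few lines: dense connectivity (each layer sees the concatenation of all earlier feature maps) and the capsule "squash" non-linearity that maps feature groups to vectors of length below one. The sketch below is only a toy illustration, not the paper's implementation: random linear maps stand in for convolutions, and `dense_block`, `squash`, the growth rate, and all shapes are assumed for demonstration.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # CapsNet squashing non-linearity: preserves direction,
    # maps the vector norm into [0, 1).
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dense_block(x, num_layers=3, growth=4, rng=None):
    # Dense connectivity: each "layer" (a random linear map standing in
    # for a 3x3 convolution) receives the concatenation of the input and
    # all previously produced feature maps.
    rng = rng or np.random.default_rng(0)
    features = [x]
    for _ in range(num_layers):
        inp = np.concatenate(features, axis=-1)
        w = rng.standard_normal((inp.shape[-1], growth)) * 0.1
        features.append(np.maximum(inp @ w, 0.0))  # ReLU
    return np.concatenate(features, axis=-1)

# Toy input: a batch of 2 spatial positions with 8 channels each.
x = np.random.default_rng(1).standard_normal((2, 8))
feats = dense_block(x)                 # 8 + 3*4 = 20 channels
caps = squash(feats.reshape(2, 5, 4))  # group channels into 5 capsules of dim 4
print(caps.shape)                      # (2, 5, 4)
```

Grouping the densely accumulated channels into capsule vectors and squashing them mirrors, in miniature, how a DCNet forms primary capsules from feature maps contributed by multiple depths.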


Keywords: Deep learning · Computer vision · Image recognition



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Sai Samarth R. Phaye
  • Apoorva Sikka
  • Abhinav Dhall
  • Deepti R. Bathula

  1. Indian Institute of Technology Ropar, Rupnagar, India
