XnODR and XnIDR: Two Accurate and Fast Fully Connected Layers for Convolutional Neural Networks

  • Regular paper
  • Published in: Journal of Intelligent & Robotic Systems

Abstract

Capsule Networks are powerful at modeling the positional relationships between features in deep neural networks for visual recognition tasks, but they are computationally expensive and unsuitable for running on mobile devices. The bottleneck lies in the computational complexity of the Dynamic Routing mechanism used between the capsules. On the other hand, XNOR-Net is fast and computationally efficient, though it suffers from low accuracy due to information loss during binarization. To address the computational burden of the Dynamic Routing mechanism, this paper proposes new Fully Connected (FC) layers that xnorize the linear projection outside or inside the Dynamic Routing within the CapsFC layer. Specifically, our proposed FC layers come in two versions: XnODR (Xnorize the Linear Projection Outside Dynamic Routing) and XnIDR (Xnorize the Linear Projection Inside Dynamic Routing). To test the generalization of both XnODR and XnIDR, we insert them into two different networks, MobileNetV2 and ResNet-50. Our experiments on three datasets, MNIST, CIFAR-10, and MultiMNIST, validate their effectiveness. The results demonstrate that both XnODR and XnIDR help networks achieve high accuracy with lower FLOPs and fewer parameters (e.g., 96.14% accuracy with 2.99M parameters and 311.74M FLOPs on CIFAR-10).
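Since the abstract names the mechanism only at a high level, a rough sketch may help. Below is a minimal PyTorch sketch, under the assumption that the xnorized projection follows the XNOR-Net recipe (sign binarization plus mean-absolute-value rescaling) and feeds standard routing-by-agreement. The names (`xnor_linear`, `squash`, `dynamic_routing`, `XnODRLayer`) and the shared-weight simplification are ours, not the authors' implementation.

```python
# A minimal sketch of the idea described in the abstract; illustrative only.
import torch
import torch.nn.functional as F


def xnor_linear(x, weight):
    """XNOR-Net-style linear projection: binarize inputs and weights
    with sign(), then rescale by their mean absolute values."""
    alpha = weight.abs().mean()                  # scalar weight scale
    beta = x.abs().mean(dim=-1, keepdim=True)    # per-vector input scale
    return (torch.sign(x) @ torch.sign(weight).t()) * alpha * beta


def squash(s, dim=-1, eps=1e-8):
    """Capsule non-linearity: preserves direction, maps length into (0, 1)."""
    norm_sq = (s * s).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)


def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement (Sabour et al., 2017) over prediction
    vectors u_hat of shape (batch, in_caps, out_caps, out_dim)."""
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)    # routing logits
    for _ in range(num_iters):
        c = F.softmax(b, dim=2)                       # coupling coefficients
        s = (c.unsqueeze(-1) * u_hat).sum(dim=1)      # weighted sum per output capsule
        v = squash(s)                                 # output capsule vectors
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)  # agreement update
    return v


class XnODRLayer(torch.nn.Module):
    """Capsule FC layer whose projection to prediction vectors is
    xnorized *outside* the routing loop (the XnODR idea). For brevity
    the projection weights are shared across input capsules, and the
    forward pass is inference-style: training would additionally need a
    straight-through estimator so gradients can flow through sign()."""

    def __init__(self, in_caps, in_dim, out_caps, out_dim):
        super().__init__()
        self.in_caps, self.out_caps, self.out_dim = in_caps, out_caps, out_dim
        self.weight = torch.nn.Parameter(0.01 * torch.randn(out_caps * out_dim, in_dim))

    def forward(self, u):  # u: (batch, in_caps, in_dim)
        u_hat = xnor_linear(u, self.weight)  # binarized projection, outside routing
        u_hat = u_hat.view(-1, self.in_caps, self.out_caps, self.out_dim)
        return dynamic_routing(u_hat)        # (batch, out_caps, out_dim)
```

For the XnIDR variant, the binarized projection would instead sit inside the routing loop; only the outside (XnODR) placement is sketched here.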



Acknowledgements

We would like to thank Ms. Druselle May, who helped us proofread the manuscript.

Funding

Not applicable

Author information


Contributions

Jian Sun: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data Curation, Writing - Original Draft, Writing - Review & Editing, Visualization. Ali Pourramezan Fard: Writing - Review & Editing. Mohammad H. Mahoor: Resources, Writing - Review & Editing, Supervision, Project administration.

Corresponding author

Correspondence to Mohammad H. Mahoor.

Ethics declarations

Conflicts of interest

The authors have no relevant financial or non-financial interests to disclose.

Ethics approval

Not applicable

Consent to participate

Not applicable

Consent for publication

Not applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sun, J., Fard, A.P. & Mahoor, M.H. XnODR and XnIDR: Two Accurate and Fast Fully Connected Layers for Convolutional Neural Networks. J Intell Robot Syst 109, 17 (2023). https://doi.org/10.1007/s10846-023-01952-w

