On the Exploration of Convolutional Fusion Networks for Visual Recognition

Conference paper

DOI: 10.1007/978-3-319-51811-4_23

Part of the Lecture Notes in Computer Science book series (LNCS, volume 10132)
Cite this paper as:
Liu Y., Guo Y., Lew M.S. (2017) On the Exploration of Convolutional Fusion Networks for Visual Recognition. In: Amsaleg L., Guðmundsson G., Gurrin C., Jónsson B., Satoh S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham


Abstract

Despite recent advances in multi-scale deep representations, their usefulness is limited by expensive parameters and weak fusion modules. Hence, we propose an efficient approach to fusing multi-scale deep representations, called convolutional fusion networks (CFN). By using 1 × 1 convolutions and global average pooling, CFN can generate the side branches efficiently while adding few parameters. In addition, we present a locally-connected fusion module, which learns adaptive weights for the side branches and forms a discriminatively fused feature. CFN models trained on the CIFAR and ImageNet datasets demonstrate remarkable improvements over plain CNNs. Furthermore, we generalize CFN to three new tasks: scene recognition, fine-grained recognition, and image retrieval. Our experiments show that it obtains consistent improvements on these transfer tasks.
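The abstract's main components can be illustrated with a small sketch: each side branch applies a 1 × 1 convolution followed by global average pooling, and a locally-connected fusion step assigns a separate learnable weight to every element of every branch vector. This is a minimal NumPy illustration of those ideas, not the authors' implementation; all shapes, weights, and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    """1x1 convolution: a per-pixel linear map over channels.
    x: (C_in, H, W), w: (C_out, C_in) -> (C_out, H, W)."""
    return np.einsum('oc,chw->ohw', w, x)

def global_average_pool(x):
    """Average each channel over all spatial positions: (C, H, W) -> (C,)."""
    return x.mean(axis=(1, 2))

# Three hypothetical intermediate feature maps from different depths of a
# plain CNN, standing in for the side-branch inputs.
feats = [rng.standard_normal((8, 4, 4)) for _ in range(3)]

# Each side branch: 1x1 conv to a common width (16 here), then GAP.
# This adds only C_out * C_in parameters per branch, which is why the
# abstract describes the side branches as cheap.
w_side = [rng.standard_normal((16, 8)) * 0.1 for _ in range(3)]
branch_vecs = [global_average_pool(conv1x1(f, w)) for f, w in zip(feats, w_side)]

# Locally-connected fusion: instead of one shared scalar per branch, each
# element of the fused vector has its own weight for each branch, so the
# fusion can emphasize different branches for different feature dimensions.
fusion_w = rng.standard_normal((3, 16)) * 0.1   # (num_branches, feature_dim)
fused = sum(w * v for w, v in zip(fusion_w, branch_vecs))  # shape (16,)
```

The contrast worth noting is between this locally-connected fusion and simpler alternatives such as summation or a single scalar weight per branch: the per-element weights let the fused feature be discriminative without requiring a fully-connected fusion layer.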


Keywords: Multi-scale deep representations · Locally-connected fusion module · Transferring deep features · Visual recognition

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. LIACS Media Lab, Leiden University, Leiden, The Netherlands
