Intelligent video surveillance is an important topic in computer vision and pattern recognition. Significant progress has been achieved over the last decades, from object detection, tracking, and parsing to activity recognition and video understanding. Visual multimedia learning, which can be regarded as the most significant breakthrough of the past 10 years, has greatly influenced the methodology of computer vision and has driven remarkable progress in both academia and industry. Visual multimedia learning was first adopted in the ImageNet competition for object categorization, where it achieved a 12% improvement in 2012 and confirmed the advantage of deep learning for computer vision applications. Since then, deep learning has been adopted in all kinds of computer vision applications, and many breakthroughs have been achieved in sub-areas, such as DeepFace on the LFW benchmark for face verification and GoogLeNet in the ImageNet competition for object categorization. It can be expected that more and more computer vision applications will benefit from visual multimedia learning.

The 56 submitted manuscripts were reviewed by experts from both academia and industry. After two rounds of review, 34 manuscripts were accepted for this special issue.