The goal of this special issue was to bring together experts working on applying models, principles, and knowledge of human audio-visual perception and cognition to optimize video processing algorithms and applications. This is an area where applying our deeper understanding of visual perception to video processing will lead to new breakthroughs in visual information processing.

Attentional focus is an important aspect of perception that helps in understanding user interest and intent. Determining points of saliency is computationally complex, so compressed-domain methods become valuable tools for developing computationally efficient and practical solutions. In “Compressed-Domain Correlates of Human Fixations in Dynamic Scenes” (10.1007/s11042-015-2802-3), the authors present a method for detecting points of fixation in H.264/AVC video using motion vectors, block coding modes, and coded residuals parsed from an H.264/AVC bitstream.

Egocentric videos are captured using wearable cameras and reflect the wearer’s point of view. The amount of video captured using wearable cameras is increasing, and saliency detection in such videos enables applications such as efficient video summarization. In “Geometrical Cues in Visual Saliency Models for Active Object Recognition in Egocentric Videos” (10.1007/s11042-015-2803-2), the authors use geometrical cues to improve saliency detection in egocentric videos.

Action recognition from video is a challenging task with many applications, including surveillance and social behavior understanding. Inspired by models of neural response to visual input, the paper entitled “Deep Learning Human Actions from Video via Sparse Filtering and Locally Competitive Algorithms” (10.1007/s11042-015-2808-x) presents a method that combines sparse filtering with locally competitive algorithms for action recognition.

The discovery that user attention can be primed using momentary flashes of light inspired the work presented in “Method and Experiments of Subliminal Cueing for Real-World Images” (10.1007/s11042-015-2804-1). The authors report the use of subliminal cues to retarget attention in images. Such cues can redirect user attention to target areas that need more attention, such as unexpected events in surveillance video.

Neural correlates of visual perception can help in understanding visual information processing in the brain and enable a new class of applications. Neural response, measured using EEG, is an indicator of user response to visual information processing tasks. In “Improving Object Segmentation by Using EEG Signals and Rapid Serial Visual Presentation” (10.1007/s11042-015-2805-0), the authors show that the neural response is strong enough to detect when subjects find target images presented rapidly, and that using the neural response can improve object segmentation.

Eye movements during a visual performance task carry rich information that can be exploited to understand visual perception and user response to visual stimuli. In “The Influence of Color during Continuity Cuts in Edited Movies: An Eye-Tracking Study” (10.1007/s11042-015-2806-z), the authors report that color aids saccades after continuity cuts. These findings can be applied to more efficient compression of movie videos.

We hope this special issue highlights the potential of applying models of human visual perception in video processing applications. The guest editors would like to sincerely thank the many reviewers who provided valuable input in shaping this special issue.