The International Conference on Internet Multimedia Computing and Services (ICIMCS) is an annual conference series sponsored by the ACM SIGMM China Chapter, focused on the latest technologies and applications in Internet multimedia management and processing. ICIMCS 2013 was held in Huangshan, China, in July 2013. The conference attracted over 100 participants, including researchers from academia and industry across more than ten countries/regions, to share their recent work. In total, 94 regular submissions were received, covering a wide range of topics in multimedia content analysis, multimedia signal processing and communications, and multimedia applications and services. After a rigorous review process, 69 papers were accepted.

For this special issue, we invited the authors of the ten papers with the highest review scores from the conference to submit extended versions of their work. After at least two rounds of rigorous review, nine of them were accepted. These papers cover key issues in multimedia computing, including modeling for visual classification, image saliency analysis, object detection and tracking, and multimedia copy detection and search.

The first part contains two papers on image classification. The first paper, “A New Discriminative Coding Method for Image Classification”, introduces a locality discriminative coding method in which a feature point is converted into a code vector using the local feature-to-class distance instead of k-means quantization. This approach is demonstrated to outperform conventional feature coding methods. In the second paper, “Large-margin Multi-view Gaussian Process”, Xu et al. introduce a large-margin Gaussian process approach for discovering a discriminative latent subspace shared by multiple features. The method combines the Gaussian process with the large-margin principle, so the latent subspace learned for high-dimensional multiple features is more discriminative and effective for the subsequent image classification task.

The second part contains two papers on visual saliency analysis. The first paper, “Image Aesthetics Enhancement Using Composition Based Saliency Detection”, introduces a novel saliency detection method that uses knowledge of photographic composition as a prior to improve saliency detection results. In addition, an online parameter selection method is employed to achieve better saliency segmentation results. In the second paper, “Camouflage Texture Evaluation Using Saliency Map”, Xue et al. introduce a histogram-based contrast method to measure saliency, in which the saliency map is computed simply from the color and texture separation of the whole image. The saliency map is then used to evaluate camouflage effectiveness.

The third part of the special issue comprises two papers focusing on object detection and tracking. The paper “Efficient Human Detection in Crowded Environment” introduces an efficient method for detecting humans in crowded environments by combining motion and appearance cues. Integral images of edge information are used for efficient template matching, and sparse human templates are constructed based on a point distribution model and regression analysis. The second paper, “Soft-assigned bag of features for object tracking”, proposes a novel soft-assigned bag-of-features tracking approach, in which soft assignment is utilized to improve the robustness and discriminative power of the bag-of-features object representation.

The fourth part contains three papers related to multimedia copy detection and search. The first paper, “Redundancy Filtering and Fusion Verification for Video Copy Detection”, introduces a novel frame fusion-based video copy detection scheme. The continuous similarity among neighboring frames is learned to guide the design of a smart frame filtering method that greatly reduces the redundancy among frames. Two effective path verification methods, which utilize a cross-clip verification strategy, are then investigated to remove false alarms. In the second paper, “Click-boosting Multi-modality Graph-based Reranking for Image Search”, Yang et al. introduce an approach that combines multi-modality information and user click-through data for image search reranking. In the method, an initial ranked list is first boosted using the click data of the corresponding images and then refined through multi-modality graph-based learning, which iteratively produces the reranked result from the fusion of multiple image graphs and the click-boosted list. The third paper, “Multi-order Visual Phrase for Scalable Partial-Duplicate Visual Search”, proposes a multi-order visual phrase for partial-duplicate visual search. The multi-order visual phrase contains two complementary cues: the center visual word quantized from the local descriptor of each image keypoint, and the visual and spatial cues of multiple nearby keypoints. Two multi-order visual phrases are flexibly matched by first matching their center visual words and then estimating a match confidence by checking the spatial and visual consistency of their neighboring keypoints.