The International Conference on Internet Multimedia Computing and Services (ICIMCS) is an annual conference sponsored by the ACM SIGMM China Chapter. The conference is especially interested in the latest technologies and applications for web-scale processing and management of heterogeneous Internet data in support of multimedia computing and services. ICIMCS 2011 was held in Chengdu, China, the home of the giant panda. The conference attracted around 80 participants, including researchers from academia and industry across ten countries and regions, who shared their recent work on topics ranging from visual information analysis and mining, through query processing and search, to multimedia privacy and security.

This special issue comprises extended versions of five papers from ICIMCS 2011: two best papers and three papers from the regular and special sessions. These papers cover key issues in multimedia computing, including the leveraging of Internet resources for multi-modality fusion and visual classifier learning, the preprocessing and indexing of million-scale Internet data, and the sharing and streaming of Internet videos. Some of these techniques also demonstrate applications for emerging Internet services, such as video recommendation and product search systems, by processing and modeling the heterogeneous resources associated with multimedia data.

The first two papers are about the web-scale processing of Internet data. The paper entitled “Video Recommendation over Multiple Information Sources”, presented by Meng Wang and his colleagues from the National University of Singapore, proposes a unified framework that explores heterogeneous information sources for video recommendation. The framework, based on multi-task SVM learning, aggregates multiple ranked lists generated from personal data, social networks, and video metadata into a single optimized list for recommendation. The framework is evaluated on a large video dataset built from one month of social activity by 76 users on Facebook and YouTube. The second paper, entitled “Multi-label Multi-instance Learning with Missing Object Tags”, presented by Yi Shen and his colleagues from the University of North Carolina at Charlotte, proposes learning object classifiers at web scale, for free, from a collection of as many as 10 million user-tagged Flickr images. In particular, the paper addresses three important issues on the way to fully automatic learning: scalable filtering of spam tags by distributed image clustering; joint modeling of loose tags and missing tags by multiple instance learning that is capable of tag prediction; and structural learning that takes object relationships into account to train discriminative classifiers.

The next two papers address the search and mining of visual instances. The paper entitled “Combining Global and Local Matching of Multiple Features for Precise Item Image Retrieval”, co-authored by Haojie Li and his colleagues from Dalian University of Technology and Nanjing University of Science and Technology, presents the iSearch system, which aims to facilitate online shopping by allowing users to submit pictures as queries. Unlike commercial products such as Google Goggles, the system searches for fashion items such as skirts and shoes, which are non-rigid objects. The paper focuses on the feature extraction and representation techniques that enable precise search for the style and pattern of the items specified by users. Different types of codebooks, including those for local texture, shape, and region, are learned for feature representation, while triangle-relation-based geometric verification is incorporated for precise online search. The paper entitled “Weakly-supervised Object Localization in Unlabeled Image Collection”, co-authored by Yanyun Qu and her colleagues from Xiamen University, presents an algorithm that mines and localizes semantic objects in class-specific datasets. The algorithm segments the images in a database into regions and clusters them using measures defined on their saliency and commonality properties. A multiple instance learning algorithm is then applied to locate class-specific objects in the database.

Finally, the paper entitled “Robust Wireless Sharing of Internet Video Streams”, co-authored by Steven Nichols and his colleagues from the University of Central Florida, proposes a distributed video sharing technique, named Dynamic Stream Merging (DSM), for improving the scalability and robustness of video streaming in a wireless mesh access network environment. DSM functions independently of video servers and clients, and has the intelligence to handle sudden spikes in demand for certain videos caused by specific events. The technique is demonstrated by sharing video data from YouTube in a wireless mesh network.