Collection

Intelligent Information Processing in Mobile Multimedia

With the prevalence of mobile devices (e.g., smartphones, tablets, digital cameras, wearables, and IoT devices), text, sound, images, and video have become the main modalities of information exchanged in daily life. Emerging technologies, such as natural language processing, machine translation, speech understanding, mobile TV, 3D video, augmented reality, and virtual reality, have received significant research interest from both academia and industry. They enable exciting mobile multimedia services and applications in monitoring, entertainment, education, public safety, and healthcare, as well as in smart homes, cities, manufacturing, transportation, etc. Mobile multimedia data are usually collected by mobile devices from different sensors. They therefore have a complex structure and are composed of heterogeneous information. Noise in the data, the non-universality of any single modality, and the performance limits of each modality make such data extremely difficult to process and utilize without an effective approach. Intelligent information processing has revolutionized text analysis, speech recognition, image and video understanding, and natural language processing over the past thirty years, each involving a single modality in the input data. However, many mobile services and applications involve more than one modality. It is therefore of broad interest to study the more complex and difficult problem of intelligent mobile multimedia processing across multiple modalities. Many issues remain open, and three major challenges stand out: "What-To-Process", "How-To-Process", and "How-To-Use". Firstly, for mobile multimedia tasks, collecting parallel multimedia data across all modalities can be quite difficult. Hence, leveraging pre-trained representations with the desired natural properties of different modalities is often an effective solution to this problem.
Secondly, we should focus on specialized architectures, workflows, and methodologies for integrating unimodal representations for a particular task. Thirdly, selected areas of broad interest for future applications will be explored, such as image captioning, text-to-image generation, visual question answering, augmented reality (AR), and virtual reality (VR). This special section in Springer Mobile Networks & Applications aims to provide an opportunity for researchers to publish their original work on new standards, technologies, developments, and applications of intelligent information processing in mobile multimedia. The special section will also include selected research articles from the 16th EAI International Conference on Mobile Multimedia Communications (MOBIMEDIA 2023).

Editors

  • Yun Lin

    Harbin Engineering University, China, linyun@hrbeu.edu.cn

  • Meiyu Wang

    Harbin Engineering University, China, hrbeu_meiyu@hrbeu.edu.cn

  • Junyi Wang

    Guilin University of Electronic Technology, China, wangjy@guet.edu.cn

Articles (4 in this collection)