Multimedia Tools and Applications, Volume 77, Issue 7, pp 9189–9201

Guest Editorial: Multimedia for Predictive Analytics

  • Sanjay Kumar Singh
  • Amit Kumar Singh
  • Basant Kumar
  • Subir Kumar Sarkar
  • Karm Veer Arya

1 Introduction

Recent technological advancements have led to a deluge of multimedia data from distinct domains. The growth in multimedia data generated by the Internet, health care, scientific sensors, and financial and manufacturing companies has profoundly transformed our society and will continue to attract attention from both technological experts and society in general. Using multimedia data for predictive analytics means analyzing the data to find meaningful patterns for predictive modeling. Organizations and individuals use predictive analytics to grow, compete, enforce, improve, satisfy and learn. Researchers are now beginning to adapt advanced machine learning and pattern recognition techniques, such as ensemble learning, manifold learning, sparse representation, low-rank representation, compressive sensing and deep learning, to solve related problems in this complex domain.

The main goal of this special issue is to bring forward recent advancements in multimedia for data analytics, which consists of data collection, storage, preprocessing, classification, and visualization. More specifically, it seeks contributions to the state of the art in approaches, methodologies and systems for the design, development, deployment and innovative use of multimedia data. This special issue attracted 87 manuscripts. Each submission was rigorously reviewed by at least three reviewers, drawn from the guest editors and external reviewers, and 40 high-quality articles were accepted in the end. The issue was published in hard copy in Multimedia Tools and Applications, Volume 76, Issue 18, September 2017.

2 Summary of the accepted papers

The accepted articles are classified into seven groups: healthcare, biometrics, steganography/watermarking, object recognition/video quality, social networks, finance, and advertising and others. Below, we briefly summarize the highlights of each paper in that order, beginning with the health-related articles, followed by the biometrics-related articles, and so on.

In A Novel and Secure IoT based Cloud centric architecture to perform predictive analysis of users activities in sustainable health centre (https://doi.org/10.1007/s11042-016-4050-6), Gupta et al. present an Internet-of-Things (IoT) based cloud-centric architecture for predictive analysis of users' physical activities in sustainable health centers. The architecture relies on the sensors embedded in the equipment. It is composed of a cloud data center, a public cloud and a private cloud, and uses XML Web services for secure and fast communication of information. The proposed architecture is evaluated for its adoption, prediction analysis of physical activities, efficiency, and security. In Unsupervised Detection of Density Changes through Principal Component Analysis for Lung Lesion Classification (https://doi.org/10.1007/s11042-017-4414-6), Netto et al. present a methodology for analyzing, quantifying and visualizing changes in lung lesions, detected without supervision through principal component analysis. From the change regions, the authors extract texture features to classify lesions as benign or malignant. For this purpose, two databases with distinct behavior are used: one of malignant lesions under treatment and another of indeterminate, but likely benign, lesions. The simulation results demonstrate that the density changes of lesions in the public database of malignant lesions under treatment are greater than those in the private database of benign lung nodules. Another contribution is the visualization of changes occurring in the lesions over time. In addition, the authors quantify these changes and analyze the entire set through volumetry, the technique most commonly used to evaluate the progression of lung lesions. In Lung nodules diagnosis based on evolutionary convolutional neural network (https://doi.org/10.1007/s11042-017-4480-9), da Silva et al. propose a deep learning technique combined with a genetic algorithm to classify lung nodules as malignant or benign without computing shape and texture features. The methodology is tested on computed tomography (CT) images from the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI). The simulation results demonstrate the promising performance of the proposed technique in terms of sensitivity, specificity, accuracy and area under the ROC curve. In A Hybrid Hierarchical Framework for Classification of Breast Density using Digitized Film Screen Mammograms (https://doi.org/10.1007/s11042-016-4340-z), Kumar et al. develop a hybrid hierarchical framework for classification of breast density using digitized film screen mammograms. Five hierarchical classifier designs are proposed, consisting of 3 PCA-kNN, 3 PCA-PNN, 3 PCA-ANN, 3 PCA-NFC and 3 PCA-SVM classifiers, respectively. The PCA-NFC design performs better than the other hierarchical classifiers. The results achieved by the proposed framework are promising, and the framework could be used in a clinical environment to differentiate between breast density patterns.

In Automatic Classification of Thyroid Histopathology Images using Multi-classifier System (https://doi.org/10.1007/s11042-017-4363-0), Jothi J and Rajam V propose a computer-aided diagnosis (CAD) system to automatically classify hematoxylin and eosin (H&E)-stained thyroid histopathology images as either normal thyroid (NT) images or papillary thyroid carcinoma (PTC) images. The CAD system incorporates a multi-classifier system to maximize diagnostic accuracy. Classification based on shape and texture features is studied using support vector machine (SVM), naive Bayes (NB), k-nearest neighbor (k-NN) and closest matching rule (CMR) classifiers, either standalone or combined into multi-classifier systems. The multi-classifier system that provides the best accuracy is determined experimentally. In Classification of Acute Lymphoblastic Leukemia using Hybrid Hierarchical Classifiers (https://doi.org/10.1007/s11042-017-4478-3), Rawat et al. address the problem of segmenting a microscopic blood image into different regions and then analyze those regions to localize immature lymphoblast cells. They further investigate the use of geometrical, chromatic and statistical texture features for the nucleus as well as the cytoplasm, together with pattern recognition techniques, for subtyping immature acute lymphoblasts according to the French–American–British (FAB) classification. Exhaustive experiments are conducted on a large number of microscopic blood images selected from the ALL-IDB database. The proposed technique consists of a segmentation module that segments the nucleus and cytoplasm of each leukocyte cell, a feature extraction module, a feature dimensionality reduction module that uses principal component analysis (PCA) to map the higher-dimensional feature space to a lower-dimensional one, and a classification module that employs standard classifiers such as support vector machines, smooth support vector machines, k-nearest neighbour, probabilistic neural network and adaptive neuro-fuzzy inference system. In Texture Based on Geostatistic for Glaucoma Diagnosis From Fundus Eye Image (https://doi.org/10.1007/s11042-017-4608-y), de Sousa et al. propose a method that analyzes the texture of the optic disc region of the image to diagnose glaucoma. The local binary pattern (LBP) is used to represent the optic disc region, and geostatistical functions are used to describe texture patterns. The resulting texture features are then classified with a support vector machine. The performance of the method is measured in terms of sensitivity, accuracy and specificity in the diagnosis of glaucoma. The simulation results indicate that the texture descriptors (LBP and geostatistical functions) perform well and achieve a good success rate.

In Automatic method for quantitative automatic evaluation in dynamic renal scintilography images (https://doi.org/10.1007/s11042-017-4715-9), dos Santos et al. present an automatic technique for quantitative analysis based on renal scintigraphy. The proposed technique segments the image regions associated with the kidney, background, and aorta. From parameters obtained from these segmented images, the glomerular filtration rate, renogram and renal transit time can be computed by deconvolution. The results are compared with those of a manual analysis made by a specialist and are promising for clinical use. In Automatic mass detection in mammographic images using particle swarm optimization and functional diversity indices (https://doi.org/10.1007/s11042-017-4710-1), Neto et al. propose a computational method to assist in detecting masses in dense and non-dense breasts on mammography images. The proposed methodology is divided into several stages. The experiments demonstrate that the best results (measured in terms of sensitivity, specificity, accuracy, false positive rate per image and free-curve receiver operating characteristic) were obtained for dense breast tissue with functional diversity indexes combined with masks (internal and external) and the synthetic minority oversampling technique. The proposed methodology can thus assist in mass detection, providing the radiologist with a second opinion for the early detection of breast cancer. In Computer-Aided Detection and Quantification in Glistenings on Intra-Ocular Lenses (https://doi.org/10.1007/s11042-017-4474-7), Jitpakdee and Uyyanonvara describe glistenings detection and quantification software that helps clinicians analyze experimental results and better understand the effect of glistenings under different parameters and conditions. The proposed glistenings detection is based on a hybrid technique combining mathematical morphology with fine segmentation, using two classifiers to compare the results. Twenty features are proposed based on the characteristics of glistenings. The simulation results show that precision, recall, and F-measure increase after fine segmentation with both the k-NN and naive Bayes classifiers.
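Several of the healthcare papers summarized above report sensitivity, specificity, accuracy, precision, recall and F-measure. As a reader aid, the following minimal Python sketch shows how these figures are conventionally derived from confusion-matrix counts. The function name and the example counts are illustrative assumptions and are not taken from any of the accepted papers.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common diagnostic metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives (hypothetical inputs).
    """
    def ratio(numerator, denominator):
        # Guard against empty classes instead of raising ZeroDivisionError.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)          # also called recall
    specificity = ratio(tn, tn + fp)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    precision = ratio(tp, tp + fp)
    f_measure = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f_measure": f_measure,
    }


# Illustrative counts only: 13 diseased and 87 healthy cases.
metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```

Reporting sensitivity and specificity alongside accuracy matters for medical datasets such as these, where one class is often much rarer than the other and accuracy alone can look deceptively high.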

In Prediction of Pain Intensity using Multimedia Data (https://doi.org/10.1007/s11042-017-4718-6), Singh et al. design a framework for predicting the pain intensity of a patient. The framework covers four aspects. First, it provides an efficient approach to computational pain quantification. Second, it investigates medical practitioners' perception of pain along with the readings of the various tools available. Third, the psychological aspect is taken into consideration to predict how pain is perceived by observers and experts (physicians). Fourth, genes involved in pain and no-pain conditions are classified using machine learning algorithms. For testing, the publicly available McMaster UNBC Pain Archive database of spontaneous pain expressions is used. In Learning correlations for human action recognition in videos (https://doi.org/10.1007/s11042-017-4416-4), Yi et al. propose a Learning-Correlation Fusion (LCF) framework, which learns the correlations among different action categories and multiple features in videos. The LCF approach is compared with five other benchmark fusion methods on two challenging datasets, HMDB51 and UCF101. The experimental results demonstrate that the proposed approach is superior to its competitors in action recognition accuracy. The method is easy to implement, and the code of LCF will be made available to the public. In Verification of Static Signature Pattern based on Random Subspace, REP Tree and Bagging (https://doi.org/10.1007/s11042-017-4531-2), Chandra and Maheshkar propose an effective method for static signature recognition from spontaneous handwritten text images using Bagging, Random Subspace (RS) and REP tree classifiers. The method relies on different aspects of writing: the presence of redundant patterns in the writing and its features. The results indicate that RS combined with decision trees of different sizes is more accurate than the other classifiers considered. Bagging achieves the same accuracy as RS but with the lowest statistical errors. The experiments are evaluated on the MCYT-75 database. In Compressed Sensing Based Acoustic Event Detection in Protected Area Networks with Wireless Multimedia Sensors (https://doi.org/10.1007/s11042-016-4241-1), Singh et al. propose a method for detecting tree-cutting events in a forest area by identifying the acoustic pattern generated when an axe hits a tree bole, with the help of wireless multimedia sensors. A series of operations using the Hamming window, Wiener filter, Otsu thresholding and mathematical morphology removes unwanted clutter from the spectrogram obtained from such events. Exploiting the sparse nature of the acoustic signals, a compressed-sensing-based, energy-efficient data gathering scheme is devised for accurate event reporting. A network of Mica2 motes is deployed in a real forest area to test the validity of the proposed scheme. Analytical and experimental results support the efficacy of the proposed scheme, which is tested and verified for acoustic event detection.

In A realistic 3D articulatory animation system for emotional visual pronunciation (https://doi.org/10.1007/s11042-017-4578-0), Yu et al. build a realistic 3D articulatory animation system for emotional visual pronunciation. First, the articulatory movements, recorded by Electro-Magnetic Articulography (EMA), are used to train a Hidden Markov Model (HMM), with a fully context-dependent model that makes full use of rich linguistic features. Second, owing to the independence in the manipulation of articulators, the articulatory movements are adjusted more markedly to express different emotions. Third, the emotional speech is generated by adjusting neutral speech parameters, such as fundamental frequency (F0), duration and intensity, using Praat. The corresponding articulatory movements are then synthesized by the HMM prediction rules and used to drive the head mesh model in synchrony with the emotional speech. The experiments aim to synthesize accurate articulator animation synchronized with emotional speech at the phoneme level. In Local Centre of Mass Face for Face Recognition under varying Illumination (https://doi.org/10.1007/s11042-017-4579-z), Kar et al. propose a technique, called local centre of mass face (LCMF), to extract illumination-insensitive features for face recognition. In the LCMF approach, the gradient angle between the centre of mass and the centre pixel of a selected neighborhood is extracted. Using the Illumination Reflectance Model (IRM), this feature is shown theoretically to be illumination invariant and robust to different illumination variations. Moreover, the technique does not involve any explicit computation of the luminance (L) component, and because the centre of mass is an inherent feature of a mass distribution, its slope with respect to the centre pixel of the neighborhood has local edge-preserving capabilities. The technique is tested on databases with illumination variation, such as AR, CMU-PIE, and extended Yale B, using standard protocols, and its performance is compared with recently reported methods in terms of rank-1 recognition accuracy. The method is also applied to a sketch-photo pair database, CUHK. For a fair performance evaluation, sensitivity and specificity are also measured on all the databases. The proposed method outperforms other recent face recognition methods in accuracy. In DeepKnuckle: Revealing the Human Identity (https://doi.org/10.1007/s11042-017-4475-6), Jaswal et al. propose a finger-knuckle-print based biometric system that could be deployed where a large share of the user base is rural. First, the Region of Interest (ROI) of the finger knuckle image is extracted, enhanced and transformed using the proposed Bubble ordinal pattern (BOP), STAR ordinal pattern (SOP), and Image ray transform (IRT) based locally adapted procedures. A DeepMatching technique is then used to perform non-rigid distortion free matching between multiple features of two Finger Knuckle Images (FKI). Finally, the performance of the proposed system is evaluated using a score-level fusion rule, which reveals an improvement in the results.

In Newborn face recognition using Deep Convolutional Neural Network (https://doi.org/10.1007/s11042-016-4342-x), Singh and Om propose a method for newborn face recognition using a deep convolutional neural network (CNN). The method is tested on the IIT(BHU) newborn database. The experimental results demonstrate that the deep CNN method performs better than other state-of-the-art techniques. In Secure and imperceptible digital image steganographic algorithm based on diamond encoding in DWT domain (https://doi.org/10.1007/s11042-016-3930-0), Atawneh et al. present an efficient embedding algorithm in the wavelet domain of digital images based on the diamond encoding (DE) scheme. Experimental results show that the proposed algorithm embeds more efficiently than other methods in terms of embedding payload and image quality. Moreover, the proposed algorithm is attacked with well-known steganalysis software. The results show that it is secure against the powerful universal steganalyzer “ensemble classifier” and the histogram attack, and that it is robust against image processing attacks such as compression, added noise, and cropping. In SVM based robust watermarking for enhanced medical image security (https://doi.org/10.1007/s11042-016-4215-3), Rai and Singh present a watermarking model in which double-layer security is introduced to ensure the robustness of the embedded data. In this model, a support vector machine (SVM) classifies a medical image into Non Region of Interest (NROI) and Region of Interest (ROI) regions. The watermark information is embedded into the NROI part of the medical image. Simulations measure the Peak Signal to Noise Ratio (PSNR) for imperceptibility and the Structural Similarity Index (SSIM) to test robustness.

In Hybrid image watermarking in redundant wavelet domain (https://doi.org/10.1007/s11042-017-4570-8), Singh et al. highlight the usefulness of redundant wavelet transforms in image watermarking using the non-subsampled contourlet transform (NSCT), redundant discrete wavelet transform (RDWT) and singular value decomposition (SVD). NSCT, RDWT and SVD improve the performance of the proposed method against geometrical and image processing attacks for single and dual image watermarking. Qualitative and quantitative evaluations in terms of peak signal to noise ratio (PSNR), correlation coefficient (CC), bit error rate (BER) and structural similarity index metric (SSIM) show that the proposed method is suitable for single and dual image watermarking and outperforms existing methods. In Image Steganography for Authenticity of Visual Contents in Social Networking (https://doi.org/10.1007/s11042-017-4420-8), Muhammad et al. propose a secure crystographic framework for the authenticity of visual contents using image steganography, utilizing color model transformation, a three-level encryption algorithm (TLEA), and Morton scanning (MS)-directed least significant bit (LSB) substitution. The method uses the I-plane of the input image in the HSI color model for embedding secret data with MS-directed LSB substitution. Furthermore, the secret data is encrypted using TLEA prior to embedding, adding a further level of security for secure authentication. The qualitative and quantitative results verify the better performance of the proposed scheme, which provides one of the best mechanisms for the authenticity of visual contents in social networks.
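The watermarking and steganography papers above evaluate imperceptibility with PSNR and embed data by LSB substitution. The following minimal Python sketch illustrates both ideas under simplifying assumptions: it uses plain sequential LSB substitution on a flat list of 8-bit pixel values, not the Morton-scanning-directed order or the TLEA encryption of Muhammad et al., and all function names and sample values are hypothetical.

```python
import math


def psnr(original, modified, peak=255):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def embed_lsb(pixels, bits):
    """Write each secret bit into the least significant bit of one pixel."""
    if len(bits) > len(pixels):
        raise ValueError("payload does not fit in the cover image")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return stego


def extract_lsb(pixels, n_bits):
    """Read the first n_bits least significant bits back out."""
    return [p & 1 for p in pixels[:n_bits]]


# Illustrative 4-pixel cover and 4-bit payload (made-up values).
cover = [120, 121, 122, 123]
secret = [1, 0, 1, 1]
stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego, len(secret))
quality_db = psnr(cover, stego)
```

Because LSB substitution changes each pixel by at most one grey level, the PSNR between cover and stego image stays high, which is why PSNR is a common imperceptibility measure for such schemes.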

In Detecting Ground Control Points via Convolutional Neural Network for Stereo Matching (https://doi.org/10.1007/s11042-016-3932-y), Zhong et al. present a technique to detect ground control points (GCPs) for the stereo matching problem. The performance of the method is evaluated on the KITTI 2012 stereo benchmark dataset. The experiments show that the proposed technique significantly improves the accuracy of disparity maps. In Content Aware Video Quality Prediction model for HEVC Encoded Bitstream (https://doi.org/10.1007/s11042-017-4574-4), Wang et al. present a content-based video quality prediction model for high efficiency video coding (HEVC) encoded video streams, which takes into account the quantization parameter (QP) and a newly proposed content type classification (CTC) metric. The experimental results demonstrate that the proposed prediction model achieves better correlation between the actual peak signal-to-noise ratio (PSNR) and the predicted PSNR in both training and testing, and outperforms other existing prediction methods in terms of accuracy. Furthermore, subjective testing results show good consistency between the proposed prediction metric and the subjective rankings.

In A biologically inspired spatio-chromatic feature for color object recognition (https://doi.org/10.1007/s11042-016-4252-y), Tian et al. extend the robust scale invariant feature transform (SIFT) feature to processed opponent color channels to obtain a spatio-chromatic descriptor for color object recognition. The color information processing is implemented under a biologically inspired hierarchical framework, where cone cells, single-opponent cells and double-opponent cells are simulated to mimic the color perception of the primate visual system. The biologically inspired method is tested on an object recognition task on two public datasets, and the results support the potential of the proposed technique.

In Depth estimation from single monocular images using deep hybrid network (https://doi.org/10.1007/s11042-016-4200-x), Grigorev et al. design a deep hybrid neural network composed of convolutional and recurrent (ReNet) layers, where each ReNet layer is composed of Long Short-Term Memory (LSTM) units. In the proposed network, the ReNet layers aim to enrich the feature representation by directly capturing global context. The effective integration of ReNet and convolutional layers in a common CNN framework allows the hybrid network to be trained end to end. Experimental evaluation on benchmark datasets demonstrates that the hybrid network achieves state-of-the-art results without any post-processing steps. Moreover, the composition of recurrent and convolutional layers provides more satisfying results. In No-reference image contrast measure using image statistics and random forests (https://doi.org/10.1007/s11042-016-4335-9), De and Masilamani propose a no-reference image quality measure with respect to contrast using random forest regression. Experimental results on standard datasets show that the method gives promising results compared with existing no-reference techniques and correlates highly with human opinion scores.

In Retina Inspired No-reference Image Quality Assessment for Blur and Noise (https://doi.org/10.1007/s11042-017-4418-2), Joshi and Prakash present a blind image quality assessment technique that needs no training. The technique considers two important types of distortion, noise and blur, for estimating the quality of an image. It is motivated by two significant phenomena of perception in the retina of the eye: the center-surround retinal receptive field and the existence of multiple spatial frequency channels. The technique requires neither training with distorted or pristine images nor subjective human scores to predict the quality of an image. The authors evaluate the technique on the LIVE, CSIQ and TID08 databases and observe that the results match human subjective opinions very well. The proposed technique outperforms the latest no-training, no-reference (NR) image quality assessment techniques. In Improved Self-snake based Anisotropic Diffusion Model for Image Denoising using Structure Tensor (https://doi.org/10.1007/s11042-016-4341-y), Kamalaveni et al. present an anisotropic diffusion algorithm that extends the self-snake diffusion filter and estimates edge and gradient directions as eigenvectors of a structure tensor matrix. The unique feature of this algorithm is that the diffusion rate at various parts of an image matches the speed of the level set flow. The algorithm uses an efficient edge indicator function that depends on the trace of the structure tensor matrix. Its performance is compared with classical diffusion filters such as conventional Perona-Malik diffusion and conventional self-snake diffusion. In A Novel Large-scale Multimedia Image Data Classification Algorithm based on Deep Learning (https://doi.org/10.1007/s11042-017-4364-z), Li et al. propose a large-scale multimedia image data classification algorithm based on deep learning. The performance of the algorithm is verified experimentally: compared with other state-of-the-art techniques, it achieves better efficiency and accuracy.

In Community-based Link Prediction (https://doi.org/10.1007/s11042-016-4270-9), Biswas and Biswas demonstrate a community-based link prediction approach for identifying missing links or links that are likely to appear in the near future. The authors incorporate the notion of community structure into link prediction and design an algorithm that accounts for the influence of communities. The method uses recently developed edge centrality measures to compute likelihood scores of missing links. The performance of the proposed algorithm is analyzed in terms of three metrics and execution time on both real-world and synthetic networks for which ground truth communities are already defined. The time complexity of the proposed algorithm is also analyzed. In CyVOD: A Novel Trinity Multimedia Social Network Scheme (https://doi.org/10.1007/s11042-016-4162-z), Zhang et al. propose a comprehensive framework for multimedia social networks and realize a cross-platform multimedia social network (MSN) prototype system that supports two kinds of digital rights management (DRM) modes. The proposed framework effectively protects copyrighted multimedia contents against piracy and supports a more efficient recommendation system through its better handling of the tradeoff between multimedia security and ease of use.

In Stock Prediction using Deep Learning (https://doi.org/10.1007/s11042-016-4159-7), Singh and Srivastava introduce deep learning for stock prediction and evaluate its performance on Google stock price multimedia data (chart) from NASDAQ. Experimental results demonstrate that the proposed method is more accurate than the Radial Basis Function Neural Network (RBFNN) method. The results of the proposed model are also compared with those of a Recurrent Neural Network (RNN), and the Hit Rate accuracy is observed to improve. The correlation coefficient between the actual and predicted returns is better for the Deep Neural Network (DNN) than for RBFNN and RNN. In A Novel Macroeconomic Forecasting Model based on Revised Multimedia Assisted BP Neural Network Model and Ant Colony Algorithm (https://doi.org/10.1007/s11042-016-4319-9), Kuang et al. propose a macroeconomic forecasting model based on the revised multimedia assisted BP neural network model and the ant colony algorithm. Experiments on stock, gold, exchange and inflation data indicate that the proposed model predicts prices well, with satisfactory results.
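The stock prediction summary above mentions the hit rate and the correlation coefficient between actual and predicted returns. The minimal Python sketch below shows one common way to compute both. It assumes that hit rate means the fraction of periods in which the predicted return has the same sign as the actual return, which may differ from the exact definition used by Singh and Srivastava; the function names and sample returns are hypothetical.

```python
import math


def hit_rate(actual, predicted):
    """Fraction of periods where predicted and actual returns share a sign.

    Assumed definition; zero returns are treated as non-positive.
    """
    hits = sum(1 for a, p in zip(actual, predicted) if (a > 0) == (p > 0))
    return hits / len(actual)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Made-up daily returns for illustration only.
actual_returns = [0.01, -0.02, 0.03, -0.01]
predicted_returns = [0.02, -0.01, -0.01, -0.02]
rate = hit_rate(actual_returns, predicted_returns)
corr = pearson(actual_returns, predicted_returns)
```

The two measures answer different questions: hit rate only checks the direction of the move, while the correlation coefficient also reflects how well the magnitudes of predicted and actual returns track each other.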

In Multimedia Tool as a Predictor for Social Media Advertising - A YouTube Way (https://doi.org/10.1007/s11042-016-4249-6), Gupta et al. evaluate the importance of YouTube as a multimedia tool. A content analysis of one hundred YouTube advertisements from the beauty segment is carried out to identify their critical success factors, such as audio content, visual category, content category, message appeals and viewers’ response measured through the number of views and likes. Based on this analysis, a framework is proposed that may be useful for managers who develop promotional strategies for organizations. The attention, interest, desire and action (AIDA) model is used to validate the framework. In Analysis of EEG Signals and Its Application to Neuromarketing (https://doi.org/10.1007/s11042-017-4580-6), Yadava et al. propose a predictive modeling framework to understand consumer choice towards e-commerce products in terms of “likes” and “dislikes” by analyzing EEG signals. The EEG signals of volunteers of varying age and gender were recorded while they browsed through various consumer products. The accuracy of choice prediction was measured using a user-independent testing approach with a Hidden Markov Model (HMM) classifier. The prediction results are promising, and the framework can be used to build better business models.

In Predicting the Online Performance for Video Service Providers in the Internet (https://doi.org/10.1007/s11042-017-4460-0), You et al. design a measurement system and deploy it in eleven provinces and cities in China to monitor two popular websites, Youku and Tudou. The analysis indicates that the performance of these two service providers follows daily patterns, such as rush-hour traffic and lower service workloads at midnight, which is consistent with user behavior. Several machine learning models are implemented and compared as baseline models. In addition, a hybrid model, generated by combining different machine learning models, is also studied as a baseline. The investigation shows that time series models are much better suited to this prediction problem than the baseline models in most situations. To alleviate the data sparseness problem in training the predictor, a new predictor that combines different information sources is proposed, improving prediction precision. Furthermore, the predictor is quite stable, and the authors find that the average performance estimate is more accurate if the model is updated within 2–3 days, which is useful in video source analysis and recommendation systems.

In Development of a Self Reliant Humanoid Robot for Sketch Drawing (https://doi.org/10.1007/s11042-017-4358-x), Singh et al. demonstrate the capability of a humanoid robot in the field of sketch drawing. An H25 NAO humanoid robot is used as the test-bed to conduct the experiment and illustrate the whole process. A calibration matrix is defined that transforms image points to the NAO body coordinate system, while inverse kinematics is solved using a gradient descent numerical method. The Denavit-Hartenberg (DH) parameters of the system are defined in order to measure the working envelope of the right hand and to avoid singularities. In Towards an Ontology based Framework for Searching Multimedia Contents on the Web (https://doi.org/10.1007/s11042-017-4350-5), Shrivastav et al. present a framework for generating annotations of multimedia content from user-entered descriptions, embedded audio, image analysis and optical character recognition, and by gathering further information from the Web. The framework also offers the user various search options, such as search by image, search by video, search by audio, search by face, and an improved search with text as the query. The system uses an RDF (Resource Description Framework) ontology to store descriptions about the contents. A prototype system based on the proposed architecture has been implemented, and the work has been evaluated through comparative analysis and the performance of the underlying techniques.

In Compound fault prediction of rolling bearing using multimedia data (https://doi.org/10.1007/s11042-017-4419-1), Singh et al. propose a method that uses Combined Mode Functions (CMF) to select the intrinsic mode functions (IMFs), instead of the maximum cross-correlation coefficient based EEMD technique, combined with Convolutional Neural Networks (CNN), which are deep neural nets, as fault classifiers. This composite CNN-CMF-EEMD method overcomes the deficiencies of other approaches, such as the inability to learn the complex non-linear relationships in fault diagnosis problems and in fine compound faults like those occurring on the small balls of the bearing. Difficult compound faults can be separated effectively with the CNN-CMF-EEMD method, which makes the fault features easier to extract and identify. Experimental results confirm the effectiveness of the CNN-CMF-EEMD technique for fine compound faults. A comparison of CNN-CMF-EEMD with the Artificial Neural Network (ANN) based ANN-CMF-EEMD shows the capability of CNN as a powerful classifier for compound fault features of rolling bearings.

3 Conclusion

The contributions of these 40 selected articles reflect new achievements in the field of predictive analytics for multimedia, and we hope they can provide a solid foundation for future approaches and applications. Finally, we would like to thank all authors for their contributions and the reviewers for reviewing these high-quality papers.

Notes

Acknowledgements

First and foremost, the guest editors are heartily grateful to Prof. Borko Furht, Editor-in-Chief, Multimedia Tools and Applications (MTAP), for his support, encouragement and guidance throughout the process. We would like to express our gratitude to all the authors who submitted their work for publication in this special issue. We are also thankful to all the reviewers who offered their time and competence to provide valuable peer reviews. Finally, the editors are grateful to all the editorial staff and the publisher of the journal for their kind support in processing the publication of this special issue.

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  • Sanjay Kumar Singh (1)
  • Amit Kumar Singh (2)
  • Basant Kumar (3)
  • Subir Kumar Sarkar (4)
  • Karm Veer Arya (5)
  1. Department of Computer Science & Engineering, Indian Institute of Technology (BHU), Varanasi, India
  2. Department of Computer Science & Engineering, Jaypee University of Information Technology, Solan, India
  3. Department of Electronics and Communication Engineering, Motilal Nehru National Institute of Technology, Allahabad, India
  4. Department of Electronics & Telecommunication Engineering, Jadhavpur University, Kolkata, India
  5. ABV, Indian Institute of Information Technology & Management, Gwalior, India
