An Approach for Processing Large and Non-uniform Media Objects on MapReduce-Based Clusters
Cloud computing enables us to create applications that take advantage of large computer infrastructures on demand. Data intensive computing frameworks leverage these technologies in order to generate and process large data sets on clusters of virtualized computers. MapReduce provides an highly scalable programming model in this context that has proven to be widely applicable for processing structured data. In this paper, we present an approach and implementation that utilizes this model for the processing of audiovisual content. The application is capable of analyzing and modifying large audiovisual files using multiple computer nodes in parallel and thereby able to dramatically reduce processing times. The paper discusses the programming model and its application to binary data. Moreover, we summarize key concepts of the implementation and provide a brief evaluation.
Unable to display preview. Download preview PDF.