Shot Classification and Keyframe Detection for Vision Based Speakers Diarization in Parliamentary Debates
- Cite this paper as:
- Marín-Reyes P.A., Lorenzo-Navarro J., Castrillón-Santana M., Sánchez-Nielsen E. (2016) Shot Classification and Keyframe Detection for Vision Based Speakers Diarization in Parliamentary Debates. In: Luaces O. et al. (eds) Advances in Artificial Intelligence. CAEPIA 2016. Lecture Notes in Computer Science, vol 9868. Springer, Cham
Automatic labelling of speakers is an essential task for speaker diarization in parliamentary debates, given the huge amount of video data to annotate. In this paper, we address the speaker diarization problem as a visual speaker re-identification problem, with special emphasis on the analysis of different shot types. We propose two approaches that make use of convolutional neural networks (CNNs) and biometric traits for keyframe extraction. The approaches are evaluated on challenging real-world datasets from the Canary Islands Parliament and compared with a similar approach that does not analyze the shot type. Results show that using CNNs for shot classification, together with biometric traits, improves re-identification performance by 9.8% on average.