Multimedia Tools and Applications, Volume 75, Issue 11, pp 6321–6345

MORE – a multimodal observation and analysis system for social interaction research

  • Anja Keskinarkaus
  • Sami Huttunen (email author)
  • Antti Siipo
  • Jukka Holappa
  • Magda Laszlo
  • Ilkka Juuso
  • Eero Väyrynen
  • Janne Heikkilä
  • Matti Lehtihalmes
  • Tapio Seppänen
  • Seppo Laukka


The MORE system is designed for observation and machine-aided analysis of social interaction in real-life situations, such as classroom teaching scenarios and business meetings. The system takes a multichannel approach to data collection: multiple streams of data in several different modalities are obtained from each situation. Typically, the system collects a 360-degree video feed and audio feeds from multiple microphones set up in the space. The system includes an advanced server backend component that performs video processing, feature extraction and archiving operations on behalf of the user. The feature extraction services form a key part of the system and rely on advanced signal analysis techniques, such as speech processing, motion activity detection and facial expression recognition, to speed up the analysis of large data sets. The provided web interface weaves the multiple streams of information together, uses the extracted features as metadata on the audio and video data, and lets the user dive into analyzing the recorded events. The objective of the system is to facilitate easy navigation of multimodal data and to enable the analysis of the recorded situations for purposes such as behavioral studies, teacher training and business development. A further distinguishing feature of the system is its low setup overhead and high portability: the lightest MORE setup requires only a laptop computer and the selected set of sensors on site.


Spherical video, Audio, Database, Social interaction, Collaboration, Metadata, Computer vision, Speech analysis, Web technologies


Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Anja Keskinarkaus (1)
  • Sami Huttunen (2, email author)
  • Antti Siipo (3)
  • Jukka Holappa (2)
  • Magda Laszlo (1)
  • Ilkka Juuso (1)
  • Eero Väyrynen (1)
  • Janne Heikkilä (2)
  • Matti Lehtihalmes (4)
  • Tapio Seppänen (1)
  • Seppo Laukka (3)

  1. Department of Computer Science and Engineering, Faculty of Information Technology and Electrical Engineering, University of Oulu, Oulu, Finland
  2. Center for Machine Vision Research, Faculty of Information Technology and Electrical Engineering, University of Oulu, Oulu, Finland
  3. Research Unit of Psychology, Learning Research Laboratory (LearnLab), Faculty of Education, University of Oulu, Oulu, Finland
  4. Logopedics, Faculty of Humanities, University of Oulu, Oulu, Finland
