Pattern Recognition and Image Analysis, Volume 25, Issue 2, pp 237–254

A software system for the audiovisual monitoring of an intelligent meeting room in support of scientific and education activities

  • A. L. Ronzhin
  • A. A. Karpov
Software and Hardware for Pattern Recognition and Image Analysis


This paper presents an analytical review of prototypes of intelligent spaces for scientific and educational activities equipped with automatic recording tools, and describes the distinguishing features of the intelligent meeting room developed at the St. Petersburg Institute for Informatics and Automation of RAS. A functional model for automating the audiovisual monitoring of participants is proposed; it is based on spatiotemporal structuring of the data that describe participant behavior within the room. The model is implemented in the smart conference room using modern digital signal processing and pattern recognition technologies. New methods have been developed, in particular a method for registering participants and a method for audiovisual recording of their presentations. The paper presents a software system for audiovisual monitoring that automates support for research and educational activities held in a smart conference room. The main goal of the system is to identify events in the room, such as the moment a new user enters, a speech begins, or an audience member is given the floor. Experimental data were collected in two settings: simulated activities, in which users held a meeting according to a given scenario, and real research and educational events, in which participants were informed that their behavior was being recorded audiovisually but this did not affect their planned activities. During tests of the participant registration method, more than 21,000 photographs were taken. The average time required to photograph a participant was 1.3 s, the average displacement of the participant’s face relative to the center of the photograph was 9%, and the face occupied on average 30% of the photograph’s area. In addition, the accumulated experimental data made it possible to identify the places in the conference room from which questions were asked most frequently. The accuracy of pointing the video camera at the speaker, both in the presentation area and in the rows of seats, was assessed by the size and position of the speaker’s face in the frame over the entire process and averaged 90%.
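The two framing measures reported above (face displacement from the center of the photograph and the share of the photograph occupied by the face) can be illustrated with a minimal sketch. The abstract does not state how the authors compute them, so the definitions here are assumptions: displacement is taken as the distance between the face-box center and the frame center, normalised by half the frame diagonal, and the area share is the ratio of the face bounding-box area to the frame area. The names FaceBox, center_displacement, and face_area_fraction are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceBox:
    """Axis-aligned face bounding box in pixels (hypothetical helper type)."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


def face_area_fraction(box: FaceBox, frame_w: float, frame_h: float) -> float:
    """Share of the frame area covered by the face box (assumed definition)."""
    return (box.width * box.height) / (frame_w * frame_h)


def center_displacement(box: FaceBox, frame_w: float, frame_h: float) -> float:
    """Distance from the face-box center to the frame center,
    normalised by half the frame diagonal (assumed definition)."""
    face_cx = box.x + box.width / 2
    face_cy = box.y + box.height / 2
    dx = face_cx - frame_w / 2
    dy = face_cy - frame_h / 2
    half_diagonal = math.hypot(frame_w, frame_h) / 2
    return math.hypot(dx, dy) / half_diagonal


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper's data.
    box = FaceBox(x=420, y=140, width=200, height=200)
    print(f"displacement: {center_displacement(box, 640, 480):.2%}")
    print(f"area share:   {face_area_fraction(box, 640, 480):.2%}")
```

Averaging these two values over all photographs would give summary figures of the same kind as those quoted in the abstract, under the stated assumptions.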


multichannel processing of audiovisual signals; computer vision; recording automation of presentations; spatiotemporal structuring of data; intelligent space





Copyright information

© Pleiades Publishing, Ltd. 2015

Authors and Affiliations

  1. St. Petersburg Institute for Informatics and Automation, Russian Academy of Sciences, St. Petersburg, Russia
  2. ITMO University, St. Petersburg, Russia
