1 Introduction

In today’s globalized society, research studies, surveys, and discussions routinely span multiple regions. This is especially true in the environmental field, which deals with both global-scale phenomena and region-specific issues: groups for educational research, joint instruction, joint research, joint surveys, and research discussions are commonly formed across regional and national borders. However, researchers who travel for courses or discussions face physical and financial limitations that reduce how often such communication can take place. For example, when participants are spread across different countries and regions, it is simply not possible to hold weekly lectures with everyone present at the same time, or to have the kind of short discussions and meetings that are so frequent in a normal research laboratory.

Meanwhile, the worldwide availability of the Internet and of information and communication technologies (ICT) has helped to break down such location-specific barriers. One example is massive open online courses (MOOCs) (Broun 2013; Mackness et al. 2010), which make courses publicly available to many regions and people. However, MOOCs offer no two-way capability and only transmit courses one way for consumption. They are best suited to courses with large numbers of students situated around the world, and less well suited to average-sized courses that involve close interaction between students and instructors, such as courses that use group work, discussions, or presentations to encourage vigorous real-time debate.

While many so-called e-learning tools have been proposed over the last several years (Keremidchieva and Yankov 2001; Nagataki et al. 2011; Nakabayashi 2011), we present here the Interactive Multimedia Education System (IMES), a tool that emphasizes the interactivity described above. It lets instructors and students see each other’s faces, shares discussion materials in high definition for lectures and discussions, and lets each party point at the presentation materials (Realmedia 2013). The system shares camera images and audio in both directions in real time for both instructors and students. It also shares presentation materials (such as PowerPoint files) together with laser pointer tracks, so that instructors can show students what they are pointing at as they speak. Students can use laser pointers as well, which supports two-way communication for questions and discussion. A key characteristic of the system is that it enables comprehensive collaboration and equal participation from all sites connected to the course.

We used IMES once a week to conduct remote lectures connecting nine universities in nine countries in Africa and Asia, demonstrating the system’s effectiveness in actual multipoint, multi-region international lecture courses.

This paper explains the key technical elements of IMES for conducting multipoint, two-way remote courses with universities in Asia and Africa, and then describes the most important of these: the “basic video transmission technology,” the “presentation material sharing system,” the “screen/pointer action sharing system,” and the “offline features including recording and editing.”

2 Key Technical Elements of IMES

Figure 22.1 shows the basic transmit/receive concept of IMES.

Fig. 22.1 The basic transmit/receive concept of IMES

Two screens are essentially provided for each site, one screen for displaying the camera images of the instructor and other course sites at the same time, and another screen for displaying PowerPoint or other presentation materials used for the course.

IMES is comprised of the following four basic technologies. Items in brackets [] are for multipoint use.

  1. Enable two-way camera video between instructors and student(s) [video from all sites is combined and shared on a single screen]

  2. Students can hear instructors, but instructors can also hear audio of student questions [if multiple parties are speaking at once, audio from all sites is heard as mixed together]

  3. Presentation materials (PowerPoint, etc.) can be shared with sufficiently high screen quality that even small print or graphics are legible

  4. It is also possible to share where a student or instructor is pointing on presentation materials [all sites can use pointers at the same time]

High definition video conferencing system products have come onto the market in recent years (Sony 2008), and have enjoyed good adoption rates for uses such as connecting branch offices of companies. However, they are not well suited in some ways for use in courses connecting multiple locations for educational institutions such as universities. For example, the following can be cited as problems in realizing the four key elements mentioned above.

  • The system must work over ordinary Internet connections that are available to anyone. In reality, some connections are fast, some are slow, and some are very slow, so the system must work even over slow connections between universities in different countries.

    • (Commercial videoconferencing systems presume the availability of higher speed circuits than are generally available at universities, and there is little consideration for sharing presentations such as PowerPoint. For making a presentation, for example, even though a high speed connection is required, the graphics quality is typically poor, and gets worse for multipoint conferences.)

  • Connecting more than ten locations generally requires specialized equipment (a Multipoint Control Unit, or MCU), which is extremely expensive.

In other words, a system is needed that overcomes these issues while realizing points (1) to (4) above.

3 Basic Video Transmission

Video is one of the most basic and persuasive tools for sharing information, but because it consumes so much bandwidth, there are technological difficulties to its successful implementation, with much research and development on how to overcome this.

Because large volumes of data (video/audio) travel in one direction, or in multiple directions to multiple destinations over the Internet, the basic technological requirement is to be able to compress, transmit and play back in real time.

Conventional analog television resolution is equivalent to 640 × 480 pixels when shown on a computer screen, normally transmitted at 30 frames/s. High definition television transmits 1,920 × 1,080 pixels (varying slightly depending on the codec used) at 30 frames/s, a dramatic increase in video clarity.

But the data size of such video is extremely large. Three bytes are required to represent each pixel in natural color (16.77 million colors). A basic calculation gives approximately 216 Mbit/s for conventional television and 1,458 Mbit/s for high-definition television (HDTV), much higher than the transmission rate of a typical Internet connection (100 Mbit/s for even the fastest circuits). For this reason, technologies are needed that compress the original image with an acceptable loss of image information. Digital terrestrial broadcasts can be recorded in high definition today using the HDV format at 25 Mbit/s, and this can be further reduced to 2–8 Mbit/s using the H.264 profile included in recent-model hard disk video recorders, while still producing video that appears natural to the human eye. The Internet connections typically available to us at overseas course locations carry in the range of 1–3 Mbit/s, so it is this type of advanced encoding/decoding technology that has made it possible to send HDTV-grade video in real time over the Internet.
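The arithmetic behind these raw bitrates can be checked with a short sketch. The function and variable names below are illustrative and not part of IMES. The quoted figures of 216 and 1,458 Mbit/s appear to correspond to counting one megabit as 1,024,000 bits; with decimal megabits the same calculation gives about 221 and 1,493 Mbit/s, which does not change the conclusion.

```python
def raw_bitrate(width, height, fps, bytes_per_pixel=3):
    """Uncompressed video bitrate in bits per second."""
    return width * height * bytes_per_pixel * 8 * fps


def compression_ratio(raw_bits_per_s, target_mbit_per_s):
    """How many times the raw stream must shrink to fit a target rate (decimal Mbit/s)."""
    return raw_bits_per_s / (target_mbit_per_s * 1_000_000)


# Resolutions and frame rate taken from the text above.
SD_BITS = raw_bitrate(640, 480, 30)
HD_BITS = raw_bitrate(1920, 1080, 30)

if __name__ == "__main__":
    for name, bits in (("SD 640x480", SD_BITS), ("HD 1920x1080", HD_BITS)):
        print(f"{name}: {bits:,} bit/s "
              f"({bits / 1e6:.0f} Mbit/s decimal, "
              f"{bits / 1_024_000:.0f} Mbit/s at 1,024,000 bits per Mbit)")
    # Reduction needed to fit HD video into the 2-8 Mbit/s range cited for H.264.
    for target in (8, 2):
        print(f"HD to {target} Mbit/s: about {compression_ratio(HD_BITS, target):.0f}:1")
```

Under these assumptions, fitting raw HD video into 2 Mbit/s requires a reduction of several hundred to one, which is why the choice of codec dominates the design.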

We had originally developed a system that used proprietary encoding/decoding, but with the emergence on the market of affordable, good performance video conferencing systems based on H.264, the current IMES promotes the use of commercially available products.

However, international courses connect up to about 12 parties around the world simultaneously. This did not justify purchasing the high-cost MCU mentioned previously, yet it exceeded what a video conferencing system with built-in multipoint functionality (up to about six locations) could handle. We therefore developed a method for multi-site communication among up to 20 locations by connecting relatively affordable units, each able to relay up to six locations, and using a specialized PC to control them externally.

4 Presentation Material Sharing System

Presentation materials such as PowerPoint slides are commonplace in university course environments. Especially in science-related courses, they frequently contain descriptions written in small characters, formulae with subscripts, or detailed drawings and photographs. Therefore, when conducting courses across multiple locations, it is necessary to continually deliver and share high quality presentation images from the lecture site to all student sites. However, sending presentation images containing large amounts of data places a heavy burden on networks. For example, the XGA resolution typically used when showing a PowerPoint presentation on a screen with a projector is 1,024 by 768 pixels. If the color information for each pixel requires three bytes, then the entire screen requires about 19 Mbits. While this is less than for video, it is still impossible to continually send presentation screens over the Internet while turning pages smoothly and playing animations (an animation requires at least 10 frames/s). The even higher SXGA resolution is growing in popularity, and it requires roughly 31 Mbits for one screen. Compression technologies can be used, but visual artifacts are more obvious when compressing presentations than videos, preventing the use of high compression rates.
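As a rough check on these per-screen figures, the following sketch computes the uncompressed size of one XGA and one SXGA screen and the time needed to send a single screen over a slow link. The helper names and the choice of decimal megabits are assumptions made for illustration; SXGA is taken as 1,280 by 1,024 pixels.

```python
def frame_bits(width, height, bytes_per_pixel=3):
    """Size of one uncompressed screen image in bits."""
    return width * height * bytes_per_pixel * 8


def seconds_per_frame(bits, link_mbit_per_s):
    """Time to send one screen over a link of the given speed (decimal Mbit/s)."""
    return bits / (link_mbit_per_s * 1_000_000)


XGA_BITS = frame_bits(1024, 768)     # 18,874,368 bits, about 19 Mbit
SXGA_BITS = frame_bits(1280, 1024)   # 31,457,280 bits, about 31 Mbit
ANIMATION_FPS = 10                   # minimum animation rate cited in the text

if __name__ == "__main__":
    print(f"XGA screen:  {XGA_BITS / 1e6:.1f} Mbit")
    print(f"SXGA screen: {SXGA_BITS / 1e6:.1f} Mbit")
    print(f"XGA at {ANIMATION_FPS} frames/s: {XGA_BITS * ANIMATION_FPS / 1e6:.0f} Mbit/s")
    for link in (1.0, 3.0):
        print(f"One XGA screen over {link:.0f} Mbit/s: "
              f"{seconds_per_frame(XGA_BITS, link):.1f} s")
```

On a 1 Mbit/s link a single uncompressed XGA screen would take close to 19 s to arrive, so sending screens live at animation rates is out of reach without heavy compression.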

To resolve this issue, the method that we developed for IMES sends the shared presentation materials before the course begins. The only data sent in real time during the course are page turns and laser pointer traces, which amount to far less data. However, the presentation images for the entire course must still be delivered effectively to all locations joining that course. We solved this problem by setting up a simple server called a Presentation Multipoint Control Unit (PMCU). Course presentation materials are uploaded as files to this server in advance, and each student site can download them at any time before the course. The current PMCU is a simple piece of software that can run on a laptop computer, and its performance is sufficient to support more than 20 delivery sites.

We call this presentation sharing method “Mode-B” presentation.

A conceptual diagram is shown in Fig. 22.2.

Fig. 22.2 The conceptual diagram of Mode-B presentation

In Mode-B, presentations are converted to data files and sent in advance in one of two ways. Either the output of the instructor’s PC is imported to the control PC through an image capture device, or, if the source is a PowerPoint file or another supported file format, the original data are converted directly to a data file. The former method supports any presentation software and can even convert animations into frame sequences, enabling simple sharing of animated presentations with all sites.

Mode-B not only prevents degraded screen quality of presentation materials, but also frees network bandwidth for camera video and audio because there is no load on the network for sending presentation images during remote learning, improving overall quality.

Mode-B is an extremely efficient and reliable communication method, but sometimes during specialized courses or discussions it becomes necessary to also transmit text written on a whiteboard or images presented on a physical projector. Instructors sometimes also need to urgently share small presentations created on the spot for announcements or the like. To address these requirements, IMES also provides “Mode-C,” which directly acquires images from cameras, physical projectors, or instructor computers and distributes them “slowly” to each connected site so as not to put a heavy load on the network.

5 Screen/Pointer Action Sharing System

Remote learning or specialized field discussions frequently use electronic media or presentation software (PowerPoint, etc.), requiring interaction through actions such as directly pointing at or writing on part of the screen. Video conferencing systems are designed mainly for environments where (printed) presentation materials can be shared in advance, and do not tend to emphasize interaction through the screen. However, most instructors are only able to prepare materials for that day or the previous day at most, and interaction is extremely important for classes or courses where they point at the screen that day, sometimes asking questions or involving students in a discussion (university professors especially prefer to use laser pointers to explain things while looking at students).

It is desirable to share a single screen across the network with multiple sites, allow any site to mark the screen using laser pointers, and enable all sites to see this at the same time. However, it is surprisingly difficult to share presentation screens over the network using personal computers. While it would be simple to just point a camera at the PC screen and transmit that video, the image becomes blurred and distorted when compressed and decompressed, and is not suitable for showing a PC screen in fine and clear detail. For this reason, modern video conferencing systems import and send video signals directly from PCs (using H.264 compression tuned for presentations, or the H.239 standard for carrying the presentation as an additional video channel). However, it is difficult to share marks in such a case. In other words, when the instructor is using a laser pointer or pointing stick, sites other than the instructor site cannot see where the instructor is pointing on the screen. Other methods exist, such as moving a mouse cursor on the screen or making marks using a tablet device, but these limit the ability of instructors to move about and are generally not preferred.

To address this problem, a “multipoint laser marking system” was developed for IMES. The concept is shown in Fig. 22.3.

Fig. 22.3 The concept of multipoint laser marking system

The actual process is as follows. The most significant point is that an infrared (IR) laser pointer is used, whose light is not visible to the human eye.

  1. The instructor uses a special IR laser pointer to point at the screen.

  2. An IR camera detects the laser spot location on the screen.

  3. A control PC reads the laser spot and creates a marking overlay screen (a transparent screen on which the pointer track or other marks are drawn), based on the IR spot coordinates.

  4. The control PC imports the instructor’s PC screen and overlays the transparent marking screen from (3) above.

  5. For Mode-B, the currently displayed presentation screen’s ID number and marking information (pointer location, color, track length) are transmitted to each student site. For Mode-C, the presentation screen and marking information are transmitted.

  6. By the same process as (4) above, pictures are created at student sites from overlaid presentation images and marking images, and marking screens from each other site are also overlaid and displayed on all site screens.
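To illustrate how little data the Mode-B marking information in step (5) needs, the following sketch encodes one pointer update as a fixed-size binary message. The wire layout, field widths, and the names `MarkingEvent`, `pack`, `unpack`, and `to_pixels` are hypothetical and chosen only for illustration; the actual IMES message format is not described here.

```python
import struct
from dataclasses import dataclass
from typing import Tuple

# Hypothetical layout (network byte order, no padding):
# site id (1 byte), slide id (2), x (2), y (2), RGB color (3), track length (2).
_FORMAT = "!BHHH3BH"
_SCALE = 65535  # pointer coordinates are sent as fractions of the screen size


@dataclass(frozen=True)
class MarkingEvent:
    site_id: int                 # which site produced the mark
    slide_id: int                # ID number of the displayed presentation screen
    x: float                     # horizontal pointer position, 0.0 to 1.0
    y: float                     # vertical pointer position, 0.0 to 1.0
    color: Tuple[int, int, int]  # RGB color of the pointer track
    track_length: int            # how many recent points to keep drawn

    def pack(self) -> bytes:
        """Serialize the event for transmission."""
        return struct.pack(_FORMAT, self.site_id, self.slide_id,
                           round(self.x * _SCALE), round(self.y * _SCALE),
                           *self.color, self.track_length)

    @classmethod
    def unpack(cls, data: bytes) -> "MarkingEvent":
        """Rebuild an event from received bytes."""
        site, slide, xi, yi, r, g, b, length = struct.unpack(_FORMAT, data)
        return cls(site, slide, xi / _SCALE, yi / _SCALE, (r, g, b), length)

    def to_pixels(self, width: int, height: int) -> Tuple[int, int]:
        """Map the normalized position onto a local screen of any resolution."""
        return round(self.x * (width - 1)), round(self.y * (height - 1))


if __name__ == "__main__":
    event = MarkingEvent(site_id=2, slide_id=17, x=0.5, y=0.25,
                         color=(255, 0, 0), track_length=30)
    payload = event.pack()
    print(len(payload), "bytes per pointer update")
    print(MarkingEvent.unpack(payload).to_pixels(1024, 768))
```

In this sketch each update occupies 12 bytes, compared with roughly 2.4 million bytes for one uncompressed XGA screen. Sending normalized coordinates rather than pixels is a design choice made here so that each receiving site can draw the overlay at its own display resolution.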

This creates an environment in which the presentation screen of one site is presented at the same time to multiple student sites, and any site can also mark on the screen. Figure 22.4 shows a screen example for one-on-one two-way marking. This system enables instructors and students, even when remote, to engage in discussion and share markings as if they were looking at a single screen. We call this Shared Screen, and are actively promoting its use in remote courses. It is regarded as extremely effective by instructors and students alike.

Fig. 22.4 A screen example for one-on-one two-way marking

6 Offline Features Including Recording and Editing

As described above, IMES is especially effective when one course or class is shared among multiple locations. However, there are cases where students or instructors want to record lectures and distribute them later. (The international courses we observed included participants from nine countries in regions such as Asia and Africa, with as much as 6 h of time zone difference. They managed by starting class at 4 pm, but this will become impossible if any participants join from North or South America.)

Therefore, we also developed an external device for IMES that records and stores lectures in high resolution and allows them to be edited, as needed, for standalone transmission at a later time. The difficulty is that IMES sends and receives two screens: video of the instructor/students and the presentation screen. When these are recorded as two separate video streams, both timelines must be kept carefully in sync, or editing becomes extremely difficult once they are imported into editing software (also called non-linear editing). The presentation screen stream also carries no camera video or audio to serve as a reference, so it is difficult to shift one stream in time within the editing software and align it with the other. To address these problems, we developed a simultaneous video recording tool.

This simultaneous video recording tool saves the two video types as separate files on a single hard disk, and keeps their timelines in sync when they are imported into editing software as content for editing. During the editing process, they are typically combined into a single widescreen image (the 16:9 aspect ratio used in high definition broadcasts). The right side of the screen shows the original presentation screen (4:3 aspect ratio), with laser pointer markings embedded as images. Instructor video or other video is placed in the space left over at the left side of the screen. Of course, other editing is possible as well, such as swapping the two screens depending on the course/class, or using wipe or other transition effects. An example of an edited screen is shown in Fig. 22.5.
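The geometry of this combined view follows directly from the two aspect ratios. The sketch below assumes a 1,920 × 1,080 output frame and a full-height presentation area; these values and the function name `combined_layout` are illustrative assumptions, not settings taken from the recording tool.

```python
def combined_layout(frame_w=1920, frame_h=1080, pres_aspect=(4, 3)):
    """Split a widescreen frame into a camera area (left) and a
    full-height presentation area (right).

    Returns rectangles as (x, y, width, height) in pixels.
    """
    pres_w = frame_h * pres_aspect[0] // pres_aspect[1]
    if pres_w > frame_w:
        raise ValueError("presentation area does not fit in the frame")
    side_w = frame_w - pres_w
    return {
        "camera": (0, 0, side_w, frame_h),
        "presentation": (side_w, 0, pres_w, frame_h),
    }


if __name__ == "__main__":
    for name, rect in combined_layout().items():
        print(name, rect)
```

With these assumptions the 4:3 presentation occupies 1,440 × 1,080 pixels and leaves a 480-pixel-wide column for instructor or other video. Swapping the two screens amounts to exchanging the two rectangles.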

Fig. 22.5 An example of an edited screen

This combined screen view is popular with users, so as an added feature IMES currently makes it available at all times as a web view (at a lower resolution) through a “low-bitrate A/V distribution system.” It is convenient when the signal is lost during a class, or for monitoring class conditions.

7 Conclusion and Topics for Future Discussion

As described above, we developed and continually improved a system under the concept of multi-point, real time and bidirectional courses, actually using the system throughout the year in a course held once a week, by placing equipment in eight overseas countries. The IMES system we built was comprised of commercially available video conferencing systems and a control PC with the remote presentation system software described above, using one of three configurations from rack mount to laptop PC, depending on the time of configuration and size of the site (Fig. 22.6).

Fig. 22.6 External appearance of IMES: (a) rack mount type; (b) desktop PC type; (c) laptop PC type

IMES proved extremely effective through a series of international course experiences. Field work is extremely important in environmental studies, as was the case here, and the system showed its value both for discussions before field work, such as sharing opinions or going over detailed schedules and procedures, and for summarizing and presenting data after field work. Thanks to IMES, the field work of an international team and the discussions before and after it went very smoothly.

IMES is currently being evaluated in classes other than environmental studies as described above, such as at the medical faculty of a nearby university (not our own university), being used to hold seminars with remote graduate students in one professor’s laboratory, as well as being used for non-recurring events such as symposiums.

Several class scenes are shown in Fig. 22.7.

Fig. 22.7 Several class scenes using IMES

However, the skillful use of IMES depends not just on students not feeling that they are simply watching a “video course,” but on how skillfully the instructor can bring the people on the remote side closer to his or her side. Instructors should call on many students and invite them to speak even when they don’t volunteer, and have them announce the results of group work. This shows how important strong course coordination is in making the course more interactive.

Only pointer tracks can be shared at this time, but many features remain to be developed, such as sharing via an electronic board instead of whiteboards, simultaneous control of various software packages at every location, and presentation of even more types of materials.

In any case, there is no doubt that communication tools such as IMES that break down barriers of space and time will grow even more essential in the global society of tomorrow.