Database (Lecture) Streams on the Cloud

This is an experience report on teaching the undergrad lecture Big Data Engineering at Saarland University in summer term 2020 online. We describe our teaching philosophy, the tools used, what worked and what did not work. As we received extremely positive feedback from the students, we will continue to use the same teaching model for other lectures in the future.


Concept of this course: Learning by Application
Planned structure for every two weeks of lectures: 1. Concrete application: XY. 2. What are the underlying data management and analysis issues? We would explain how certain techniques from the database world solve exactly that problem. Then, we would spend quite some time on making the transfer: how does that technology help in this application? This basic structure is summarized in Figure 1.
Table 1 shows the applications discussed and their mapping to topics. As Saarland University decided to shorten summer term 2020 by starting in May rather than in April, we also had to shorten the material. In addition, all courses had to be designed such that all students could start in May. If we provided material before May, we had to make sure to go through it again in May. Given these additional constraints, and as we planned to use Jupyter notebooks to explain certain concepts in the actual lecture, we decided to offer introductory material for students who did not know Python yet. That Python introductory material was then repeated in two lectures in May.
Moreover, throughout the lecture we also recommended old videos from 2013/14 to students in case they wanted alternative explanations. For all material that we created, we made sure that our notation was consistent with [Kemper and Eickler(2015)] so that students could easily look up yet another explanation in that book.
Also note that we left out considerable material that we feel no longer reflects the reality of modern databases, e.g. normal forms, whose importance can be debated in the light of modern non-scalar SQL types 3.
All videos, including the older ones from 2013/2014, are publicly available on our YouTube channel 4. The PDFs of our slides are available through our website 5. If you are a lecturer and want to have access to the sources, send us an email.
We also created Jupyter notebooks 6, which were shown in the actual lecture and used for the exercises.


Technology
In this section, we describe the hardware and software we used for our lecture.

Hardware
In terms of hardware, we experimented a lot until converging on the following setup. We used an existing 2016 MacBook Pro, a 32-inch monitor, a Thunderbolt dock (Elgato Thunderbolt 3 Pro Dock), a dynamic microphone with internal pop filter (Rode Procaster), a microphone preamp (TritonAudio FetHead), an audio interface (Focusrite Scarlett Solo 3rd Gen), and two speakers (Yamaha HS7 Standard). That's it! An existing iPad Pro was not used in the lecture 7. The office used for streaming was a standard office room without any extra soundproofing (one exception: see Section 4.3).
With this setup, the audio quality is excellent and makes a huge difference compared to any setup we used before. In particular, previous issues with background noise are gone. The same holds for the camera setup, which outperforms any dedicated webcam we tried.

Internet Connection
An important and surprising issue we ran into early on was how to connect to the Internet. Initially, we had quite some issues when connecting the computer via WiFi. After experimenting with several setups, it became clear that WiFi is simply not stable enough for live streaming purposes. This is not so much a problem of bandwidth but of channel conflicts and electromagnetic interference from other devices, as well as latency issues. Although we carefully debugged the WiFi connection, we eventually concluded that a wired Ethernet connection should be preferred over WiFi whenever possible. The same holds for audio and video conferences.
As LAN was not available in some home offices, we used a powerline adapter, i.e. LAN over the power line (AVM FritzPowerline 510E Set). Though powerline may lower the available bandwidth, it is much more stable than WiFi. In terms of Internet bandwidth, note that even a relatively weak but stable uplink of 2 Mb/sec may be enough to stream a lecture that only shows slides in the video stream. If you want to stream camera input, you need a higher bandwidth though.
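The bandwidth argument above can be made concrete with a small back-of-the-envelope calculation. The following is only a minimal sketch: the bitrates and the 30% safety headroom are hypothetical values chosen for illustration, not measurements from our setup, and the function names are our own.

```python
def required_uplink_mbps(video_kbps, audio_kbps, headroom=0.3):
    """Return the uplink (in Mb/s) a stream needs, including a safety headroom.

    The default headroom of 30% is an assumption, not a recommendation
    taken from any streaming platform.
    """
    if video_kbps < 0 or audio_kbps < 0 or headroom < 0:
        raise ValueError("bitrates and headroom must be non-negative")
    total_kbps = (video_kbps + audio_kbps) * (1.0 + headroom)
    return total_kbps / 1000.0


def uplink_is_sufficient(uplink_mbps, video_kbps, audio_kbps, headroom=0.3):
    """Check whether a stable uplink covers the stream plus headroom."""
    return uplink_mbps >= required_uplink_mbps(video_kbps, audio_kbps, headroom)


# Hypothetical encoder settings, for illustration only.
SLIDES_ONLY = {"video_kbps": 1200, "audio_kbps": 128}
WITH_CAMERA = {"video_kbps": 4500, "audio_kbps": 128}

print(uplink_is_sufficient(2.0, **SLIDES_ONLY))  # slides-only stream on a 2 Mb/s uplink
print(uplink_is_sufficient(2.0, **WITH_CAMERA))  # camera stream on the same uplink
```

Under these assumed bitrates, the slides-only stream fits into a 2 Mb/sec uplink while the camera stream does not. The stability of the connection is not captured by this calculation at all.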

Zoom, Teams or YouTube Streaming?
Initially, we planned to use Zoom for the actual lecture and additionally stream the Zoom lecture to YouTube. This can be done directly from any Zoom meeting. However, shortly before the semester started, we decided against using Zoom for several reasons: (1) ongoing discussions on privacy and data protection issues with Zoom, (2) relatively poor audio/video quality due to heavy lossy compression, (3) the impossibility of making the YouTube stream publicly available if at any time clear-text names or webcam videos of students can be seen in the stream.
Similar concerns are true for MS Teams. After some brief investigation, it turned out that the privacy and security issues with MS Teams were at least comparable to those of Zoom or even worse, e.g. SSO with the cleartext university password on the microsoft.com website. In addition, we found the video and audio quality of MS Teams to be simply not acceptable: we observed very heavy compression artifacts as well as picture and sound drifting out of sync. We also found the UI of MS Teams to be extremely confusing.
For the lecture, we decided to use a solution that fixes all of the above issues: YouTube's livestream feature. In order to send a stream from your computer, you have to install streaming software locally. There aren't many options. We decided to use Open Broadcaster Software (OBS) 8, see Figure 3. Commercial options are also around, e.g. Wirecast. OBS worked well for our purposes. The only problem we ran into was that OBS is not really optimized for macOS. This results in a relatively high CPU load, which in turn leads to notable fan noise. We had to fix this by positioning the MacBook as far away from the microphone as possible and improvising some soundproofing between the MacBook and the microphone (a chair covered by a thick blanket).
By default, the head of the YouTube stream is cached to be able to cope with network failures. We observed a latency of about 10 seconds. We configured this to a shorter value to allow for more interaction with students (see Section 4.4) 9.
We also disabled the livestream chat to avoid spam; video comments were allowed though.
This setup has many advantages: we obtained great audio and video quality (1080p). We could stream audio and video without any notable compression artifacts or distorted audio. As the stream is automatically archived while being streamed, students can also pause the video, watch it time-shifted, or watch it at any later point in time. In addition, all video format issues for different devices are automatically handled by YouTube. Moreover, we do not put any extra load on the university network (if a video is watched from outside the network), which would have to be paid for by our university (if billed per volume, which is true for Saarland University).

Audience Response System
To allow for interaction with the students, we required an audience response system. We wanted something simple: a moderated chat where students could type in questions, a moderator (the tutor in chief) would select suitable questions and forward them to the lecturer, who would then drag a question into the stream and answer it live during the lecture.
Many different tools are available in this space, and after some investigation we decided to use frag.jetzt 10. It worked well for us. With this tool, only the lecturer has to log in, which ensures that students can use the same URL for each lecture. See Figure 4 for an example.
The only problem we encountered with this tool is that the lecturer cannot ask a question back to the students within the tool.

Content Management System (CMS)
As the lecture system, i.e. the system where students register, where links to all materials are available, and where homework is handed in, we used an existing internal system. This system has been in use at the department for more than ten years for physical courses as well. It is in many dimensions superior to other systems like Moodle (which we used for three years). The biggest advantage is simplicity. In contrast, Moodle's UI is too complex and confusing. It is sometimes hard to find things, and it was often heavily criticized in student evaluations.

Communication Tools
Throughout the course, we used additional tools when needed. In the following, we summarize three tools and provide use-cases for each of them.
We used Discord for office hours, tutorials, and mini office hours during the lecture break and right after each lecture. Discord is an audio conference tool widely used in the gaming community. The idea is to define virtual rooms that users, depending on their access rights, are allowed to join anytime. Once you enter a room, anyone in that room can hear what you say and you can immediately hear what others in that room are saying. Though, at first, this sounds a bit like a standard video conference tool, it in fact feels more like walking through different (physical) rooms. Discord also supports screen sharing and video cameras. Discord was popular among students, as many of them knew it from gaming. To join our server, students needed an invitation link that was only visible to students registered in our CMS.
We 'boosted' 11 our Discord server to improve screen-sharing capabilities with a better resolution. In addition, we installed a paid third-party Discord bot called VoiceMaster+. It allows students to create their own temporary voice channels. Students used this to watch the lecture stream 'together' to have a bit of a lecture hall feeling, to create breakout sub-rooms in tutorials, and for their learning groups. Moreover, we also set up multiple-choice questions for text channels (Pollbot). This was heavily used in the tutorials.
We also considered using Zoom or MS Teams for tutorials (cf. Section 4.3). We announced initially that we would only consider these tools as a backup. In the end, there was no need for this, and the tutors and most students were happy with Discord. Sometimes we observed stability issues with Discord. In addition, screen sharing and video are still a bit buggy and far behind Zoom's abilities. Still, Discord is a great and very helpful tool that will surely improve further in future versions.
We also used our CMS to allow students to give anonymous feedback. This was used about ten times in total over the entire semester.
We provided a textual forum. For this we used Discourse 12. We witnessed several interesting discussions and witty student comments. We made sure to answer questions quickly, and the lecturer himself took part in answering the hard questions. Due to Covid-19, the forum was used more frequently than in previous years.
In the tutorials, our students also used a tool for shared scribbling: A Web Whiteboard (AWW) 13.
Obviously, this sounds a bit like we used a zoo of tools, which is actually true. Even though this is only a very small zoo, it still is a zoo. To mitigate that problem, we tried to integrate the tools wherever possible and make everything easily accessible. In the end, all the student-facing tools were web-based anyway, so the divide among the different tools was not as severe as it would have been for different desktop applications.

Tutorials, Assignments, and Office Hours
In addition to the lecture, we wanted to give the students the opportunity to deepen their understanding of the course material and ask questions. Therefore, we offered tutorials, office hours, and assignments. In the following, we first discuss our requirements for each of the offers and then explain how we implemented these requirements with the tools discussed in Section 4.

Tutorials
In the past, we offered tutorials for up to 30 students in seminar rooms at our university. Students could vote for their favorite time slots in our CMS and were then assigned accordingly. This year, tutorials were held on our Discord server with a similar number of participating students. For some time slots, two tutorials were offered simultaneously. Therefore, we created two categories, one for each virtual seminar room. Each category was structured as follows:
Plenum voice/video channel: Participants and tutor meet here at the beginning of the tutorial. The tutor shares the screen, gives instructions, and moderates the discussion.
Temporary voice/video channels: Students form groups of up to five to discuss exercises.
Plenum text channel: Tutors can send files and give instructions in case students are not in the plenum voice channel. Students can ask the tutor to join their temporary channel for questions.
In previous iterations of this lecture, tutors presented the solutions to the assignment handed in most recently. However, we observed that students tended to be very passive and simply wrote down the solutions in these types of tutorials. In fact, we could have simply provided a prerecorded video explaining the solutions, but we wanted the students to participate in the tutorial actively.
Therefore, we decided against presenting solutions and instead prepared additional smaller exercises that could be solved during the tutorials. These exercises were explained by the tutors in a presentation-style screencast. Depending on the type of exercise, they were either discussed in the plenum or students were given 10 minutes to form small breakout groups and solve the exercises. We prepared exercises of the following three types:
1. Multiple-choice exercises: These exercises were usually discussed in the plenum. Tutors would use our Pollbot to post a poll on the text channel and students would vote by reacting to the post. These exercises were especially helpful to gain insight into which topics were already well understood.
2. Written exercises: These exercises were solved in smaller groups using temporary channels. Students had to apply concepts from the lecture to small problems. They often used AWW to work together on a solution.
3. Discussion exercises: We often asked open questions that encouraged the students to participate in a discussion, either in the plenum or in small groups.
We learned that providing exercises to be solved in smaller groups significantly helps students to actively participate in the tutorials. We plan to also keep this approach once tutorials are held physically at the university again.
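To illustrate how votes cast as reactions in the multiple-choice exercises described above can be turned into a result, here is a minimal sketch in plain Python. It does not show how Pollbot works internally and does not use the Discord API; the function name, the answer labels, and the rule that only a student's latest valid reaction counts are all assumptions made for this example.

```python
from collections import Counter


def tally_poll(options, reactions):
    """Count one vote per student from a list of (student, label) reactions.

    options maps a reaction label to the answer text it stands for.
    Assumption: if a student reacts several times, the latest valid
    reaction wins; reactions with unknown labels are ignored.
    """
    latest = {}
    for student, label in reactions:
        if label in options:
            latest[student] = label
    # Start every answer at zero so that unpopular answers still show up.
    counts = Counter({answer: 0 for answer in options.values()})
    counts.update(options[label] for label in latest.values())
    return counts


# Hypothetical poll and reactions, for illustration only.
options = {"A": "B-tree", "B": "hash index"}
reactions = [("s1", "A"), ("s2", "B"), ("s1", "B"), ("s3", "X")]
result = tally_poll(options, reactions)
print(result["B-tree"], result["hash index"])
```

A tally like this shows at a glance which answer most students chose, which is the kind of insight the tutors were after.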

Assignments
From previous iterations of the lecture, we learned that encouraging the students to continuously recap and apply the concepts presented in the lecture leads to successful participation in the course. Therefore, we made correctly solving the majority of weekly assignments a prerequisite for admission to the exam.
Assignments should not only verify that students understood the concepts but also challenge good students to think further. Thus, assignments were divided into three parts:
1. Application tasks: Students had to directly apply concepts from the lecture to familiar problems.
2. Transfer task: Students had to apply concepts to new problems and argue about implications.
3. Programming task: Students had to implement algorithms in empty cells in the Jupyter Notebook from the lectures. We provided basic unit tests for them to check their implementations.
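A programming task of this kind might look roughly like the following notebook cells. This is only a minimal sketch under stated assumptions: the task (a hash join over lists of dictionaries), the function name `hash_join`, and the test data are hypothetical and not taken from the actual assignments. In an assignment, the function body would be the empty cell to be filled in; it is filled in here so that the example runs.

```python
import unittest


def hash_join(left, right, left_key, right_key):
    """Join two lists of dicts on equal key values using a hash table."""
    # Build phase: index the rows of the left input by their join key.
    index = {}
    for row in left:
        index.setdefault(row[left_key], []).append(row)
    # Probe phase: look up each row of the right input in the index.
    result = []
    for row in right:
        for match in index.get(row[right_key], []):
            result.append({**match, **row})
    return result


class HashJoinTest(unittest.TestCase):
    """Basic checks students could run against their implementation."""

    def test_matching_rows_are_combined(self):
        students = [{"sid": 1, "name": "Ada"}, {"sid": 2, "name": "Bob"}]
        grades = [{"student": 1, "grade": 1.3}, {"student": 3, "grade": 2.0}]
        self.assertEqual(
            hash_join(students, grades, "sid", "student"),
            [{"sid": 1, "name": "Ada", "student": 1, "grade": 1.3}],
        )

    def test_empty_input_gives_empty_result(self):
        self.assertEqual(hash_join([], [{"student": 1}], "sid", "student"), [])


def run_basic_tests():
    """Run the tests in place; this also works inside a notebook cell."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HashJoinTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_basic_tests()` in a cell reports failing tests without terminating the notebook kernel. Basic tests like these only cover simple cases and do not replace grading.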
Assignments were published and handed in through our internal CMS, which was also used to grade them after each hand-in. Since we chose a different approach for our tutorials this year, all solutions to the assignments were also published on our CMS. As the lecture is aimed at undergrad students, we wanted to support them in practicing teamwork. We decided that assignments could be handed in by groups of up to three members. Since meeting physically to work on the assignments was not an option, we created a workspace category on our Discord server where students could create temporary channels for video calls, sharing their screen, and discussing the assignments.

Office Hours
Two years ago, we introduced an office hour. In contrast to the tutorials that have a fixed structure, we wanted to provide the students with a more open offer to stop by, discuss individual problems, and get help with technical issues. We held office hours once a week for two hours in a seminar room. However, most of the time, participating students just wanted to solve the assignment on-site and have a contact person in case of difficulties.
Therefore, we decided to offer office hours of one hour each. Office hours were held by two tutors on our Discord server. Participating students would enter a waiting area (i.e. a public voice channel) and tutors would then accompany them to a private table (i.e. a temporary voice channel). The temporary channels were configured such that other students could not join them, to avoid distractions and keep conversations private. If a student wanted to join the conversation, the tutor had to actively move the student to the channel. Students mainly used office hours to receive help with technical difficulties, mostly setting up our environment for the Jupyter Notebooks. They shared their screen in the temporary channel, where tutors then guided them towards a solution. Having shorter virtual office hours also resulted in fewer students with concrete questions and problems taking part.

Advantages and Issues of the Model
In summary, we are very happy with our teaching model. We also received tremendously positive feedback from our students in two different course evaluations.

Future Plans
In future, though we are not planning to change much, we aim to further improve the concept and fix the remaining issues. Here are some thoughts.
1. How to fix the missing visual backchannel problem? It would help the lecturer a lot to be able to see at least the faces of a subset of the students 14. One solution could be to invite some students to additionally join a video conference call (Zoom or Discord) where they must switch their camera on. However, it is unclear what the incentive for students could be. In particular, that incentive should be something that does not penalize students who are not willing to join that video call.
2. In online teaching, there is a long-running debate on whether the lecturer should switch the video camera on. What is the added value of this? In this lecture, we did not use the camera. However, we believe that a camera may be more helpful for new students than for older students. Therefore, we are considering this for an upcoming first-semester lecture.
3. Due to many requests on YouTube, in the long run, we want to make all material available in English.

Figure 1: Slide explaining the teaching philosophy of the course to students (translated)

Figure 4: Dragging a question from the audience asked in frag.jetzt into the livestream

Table 1: Course agenda and learning objectives. In 2019, we used one additional application: data journalism. There we explained graph databases and security issues like SQL injection.