1 Introduction

Tech United EindhovenFootnote 1 (established 2005) is the RoboCup student team of Eindhoven University of TechnologyFootnote 2 (TU/e), which joined the ambitious @Home League in 2011. The RoboCup @Home competition aims to develop service robots that can perform everyday tasks in dynamic and cluttered ‘home’ environments. In previous years, the team obtained multiple world vice-champion titles in the Open Platform League (OPL) of the RoboCup @Home competition, and this year, whilst competing in the Domestic Standard Platform League (DSPL) for the first time, it finally claimed the world championship title. In the DSPL, all teams compete with the same hardware: a Toyota Human Support Robot (HSR) and the same external devices. Any differences between the teams therefore lie solely in the software they implement and use.

Tech United Eindhoven consists of (former) PhD and MSc students and staff members from different departments within the TU/e. This year, these team members successfully migrated the software from our TU/e-built robots, AMIGO and SERGIO, to HERO, our Toyota HSR. This software base is developed to be robot independent, which means that the years of development on AMIGO and SERGIO now directly benefit HERO. Thus, a large part of the developments discussed in this paper have been optimized over many years, whilst the DSPL competition has only existed since 2017Footnote 3. All the software discussed in this paper is available open source on GitHubFootnote 4, together with various tutorials to assist with implementation. The main developments that resulted in the large lead at RoboCup 2019, and eventually the championship, are our central world model, discussed in Sect. 2, the generalized people recognition, discussed in Sect. 4, and the head display, discussed in Sect. 5.3.

2 Environment Descriptor (ED)

The TU/e Environment Descriptor (ED) is a Robot Operating System (ROS) based 3D geometric, object-based world representation system for robots. ED is a database system that structures multi-modal sensor information and represents it such that it can be utilized for robot localisation, navigation, manipulation and interaction. Figure 1 shows a schematic overview of ED.

ED has been used on our robots in the OPL since 2012 and was also used this year in the DSPL. Previous developments have focused on making ED platform independent; as a result, ED has been used on the PR2, Turtlebot and Dr. Robot systems (X80), as well as on multiple other @Home robots.

Fig. 1.

Schematic overview of the TU/e Environment Descriptor. Double-sided arrows indicate that information is shared both ways; one-sided arrows indicate that information is shared in one direction only.

ED is a single re-usable environment description that can be used for a multitude of desired functionalities, such as object detection, navigation and human-machine interaction. Improvements in ED are directly reflected in the performance of the separate robot skills, as these skills are closely integrated with ED. This single world model keeps all data current and consistent without requiring the updating and synchronization of multiple world models. Currently, different ED plug-ins exist that enable the robots to localize themselves, update positions of known objects based on recent sensor data, segment and store newly encountered objects, and visualize all of this in RViz and through a web-based GUI, as illustrated in Fig. 9. ED allows all the different subsystems that are required to perform challenges to work together robustly. These subsystems are shown in Fig. 2 and are individually elaborated upon in this paper.
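To illustrate the idea of a single shared world model, the sketch below shows a hypothetical Python executive querying one entity description and reusing it for two different skills. The WorldModelClient wrapper and its methods are illustrative assumptions, not ED's actual API.

```python
# Hypothetical sketch of skills sharing one world model. The WorldModelClient
# wrapper is an assumption for illustration only; it shows that every skill
# queries the same, always up-to-date entity, so nothing needs synchronizing.

class WorldModelClient(object):
    """Toy stand-in for a query interface to the ED world model."""
    def __init__(self):
        self._entities = {'cabinet': {'pose': (3.1, 1.4, 0.0), 'type': 'furniture'}}

    def get_entity(self, entity_id):
        return self._entities[entity_id]

world_model = WorldModelClient()

# Navigation and manipulation both use the same entity description.
cabinet = world_model.get_entity('cabinet')
navigation_goal = cabinet['pose'][:2]   # drive towards the cabinet
grasp_reference = cabinet['pose']       # later: grasp/place w.r.t. the same entity
print(navigation_goal, grasp_reference)
```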

Fig. 2.

A view of the data interaction with robot skills that ED is responsible for.

2.1 Localization, Navigation and Exploration

The ed_localizationFootnote 5 plugin implements AMCL based on a 2D render of the central world model. With the ed_navigation pluginFootnote 6, an occupancy grid is derived from the world model and published. With the cb_base_navigation packageFootnote 7, the robots are able to deal with end-goal constraints. The ed_navigation plugin allows such a constraint to be constructed w.r.t. a world model entity in ED. This enables the robot to navigate not only to areas or entities in the scene, but also to waypoints. Figure 3 shows the navigation to an area. Modified versions of the local and global ROS planners available within move_base are used.
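As an illustration of an end-goal constraint defined w.r.t. a world model entity, the sketch below builds a simple radius constraint around an entity's position. The constraint string format, helper name and coordinates are assumptions for illustration; the actual ed_navigation and cb_base_navigation interfaces may differ.

```python
# Minimal sketch, assuming a goal region can be expressed as an inequality on
# (x, y) around an ED entity. Not the actual cb_base_navigation interface.

def radius_constraint(entity_xy, radius=0.7):
    """Constrain the end goal to a disc around an ED entity (e.g. 'cabinet')."""
    x, y = entity_xy
    return "(x-{:.2f})^2 + (y-{:.2f})^2 < {:.2f}^2".format(x, y, radius)

cabinet_xy = (3.1, 1.4)                      # would normally be queried from ED
constraint = radius_constraint(cabinet_xy)   # handed to the global planner as a goal region
print(constraint)
```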

Fig. 3.

A view of the world model created with ED. The figure shows the occupancy grid as well as classified objects recognized on top of the cabinet.

2.2 Detection and Segmentation

ED enables integrating sensors through the use of the plugins present in the ed_sensor_integration package. Two different plugins exist:

  1. laser_plugin: Enables tracking of 2D laser clusters. This plugin can be used to track dynamic obstacles such as humans.

  2. kinect_plugin: Enables world model updates using data from an RGBD camera. This plugin exposes several ROS services that realize different functionalities:

     (a) Segment: A service that segments sensor data that is not associated with existing world model entities. Segmentation areas can be specified per entity in the scene, which allows segmenting objects ‘on-top-of’ or ‘in’ a cabinet. All points outside the segmented area are ignored during segmentation.

     (b) FitModel: A service that fits a specified model to the sensor data of an RGBD camera. This allows updating semi-static obstacles such as tables and chairs.

The ed_sensor_integration plugins enable updating and creating entities. However, new entities are initially classified as unknown. Classification is done in the ed_perception packageFootnote 8.
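The sketch below shows how a Python executive might call the kinect_plugin's Segment service. The service path, request field and response field are assumptions for illustration; consult the ed_sensor_integration package for the actual service definition.

```python
#!/usr/bin/env python
# Minimal sketch of calling the Segment service from a Python executive.
# Service name, request and response fields are assumptions, not the real .srv.
import rospy
from ed_sensor_integration.srv import Segment  # assumed service definition

rospy.init_node('segment_example')
rospy.wait_for_service('/ed/kinect/segment')
segment = rospy.ServiceProxy('/ed/kinect/segment', Segment)

# Segment everything 'on-top-of' the cabinet; unassociated clusters become new entities
response = segment(area_description='on_top_of cabinet')   # field name assumed
rospy.loginfo('Segmented %d new entities', len(response.entity_ids))  # field name assumed
```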

2.3 Object Grasping, Moving and Placing

The system architecture developed for object manipulation is focused on grasping. Its input is a specific target entity in ED, selected by a Python executive, and its output is the grasp-motion joint trajectory. Figure 4 shows the grasping pipeline.

Fig. 4.

Custom grasping pipeline based on ED, MoveIt and a separate grasp point determination and approach vector node.

MoveIt! is used to produce joint trajectories over time, given the current configuration, robot model, ED world model (for collision avoidance) and the final configuration.

The grasp pose determination uses the information about the position and shape of the object in ED to determine the best grasping pose. The grasping pose is a vector relative to the robot. An example of a determined grasping pose is shown in Fig. 5. Placing an object is approached in a similar manner to grasping, except that, when placing, ED is queried to find an empty placement pose.
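A minimal sketch of the idea behind grasp pose determination is given below: the gripper approaches the object horizontally from the robot's side (never from behind), matching the preference illustrated in Fig. 5. The frames, standoff distance and function are illustrative assumptions, not the exact ED/MoveIt interfaces.

```python
import numpy as np

def grasp_pose(object_xy, robot_xy, standoff=0.05):
    """Return a grasp position and yaw (map frame) just in front of the object."""
    obj = np.asarray(object_xy, dtype=float)
    robot = np.asarray(robot_xy, dtype=float)
    approach = obj - robot
    approach /= np.linalg.norm(approach)        # unit vector robot -> object
    grasp_xy = obj - standoff * approach        # stop just short of the object surface
    yaw = np.arctan2(approach[1], approach[0])  # gripper points along the approach vector
    return grasp_xy, yaw

print(grasp_pose(object_xy=(2.0, 0.5), robot_xy=(1.0, 0.0)))
```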

Fig. 5.

Grasping pose determination result for a cylindrical object with the TU/e-built robot AMIGO. Grasping the object from behind is not preferred.

3 Image Recognition

The image_recognition packages apply state-of-the-art image classification techniques based on Convolutional Neural Networks (CNNs).

  1. Object recognition: TensorFlow™ with a retrained top layer of an Inception V3 neural network, as illustrated in Fig. 6.

  2. Face recognition: OpenFaceFootnote 9, based on Torch.

  3. Pose detection: OpenPoseFootnote 10.

Fig. 6.

Illustration of the Convolutional Neural Network (CNN) used in our object recognition nodes with TensorFlow.

Our image recognition ROS packages are available on GitHubFootnote 11 and as Debian packages: ros-kinetic-image-recognition.
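For object recognition with a retrained Inception V3 top layer, inference can be sketched as below in the style of the TensorFlow 1.x retraining tutorial. The graph and label file paths and the tensor names ('final_result:0', 'DecodeJpeg/contents:0') are assumptions borrowed from that tutorial, not the image_recognition node itself.

```python
# Sketch of classifying an image with a retrained Inception V3 top layer
# (TensorFlow 1.x). Paths and tensor names are assumptions for illustration.
import tensorflow as tf

graph_path, labels_path, image_path = 'output_graph.pb', 'output_labels.txt', 'apple.jpg'
labels = [line.strip() for line in open(labels_path)]

with tf.gfile.GFile(graph_path, 'rb') as f:           # load the retrained graph
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    image_data = tf.gfile.GFile(image_path, 'rb').read()
    probs = sess.run('final_result:0', {'DecodeJpeg/contents:0': image_data})[0]
    best = probs.argmax()
    print(labels[best], probs[best])                   # most likely class and its score
```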

4 People Recognition

As our robots need to operate and interact with people in a dynamic environment, their people detection skills have been upgraded to a generalized system capable of recognizing people in 3D. In the people recognition stack, an RGB-D camera is used to capture the scene information. A recognition sequence is completed in four steps. First, people are detected in the scene using OpenPose; if their faces are recognized as one of the learned faces in the robots’ database, they are labeled with the known name using OpenFace. The detections from OpenPose are associated with the recognitions from OpenFace by maximizing the IoUs of the face ROIs. Then, for each of the recognized people, additional properties such as age, gender and shirt color are identified. Furthermore, the pose keypoints of these recognitions are coupled with the depth information of the scene to re-project the recognized people to 3D as skeletons. Finally, information about the posture of each 3D skeleton is calculated using geometrical heuristics. This allows for the addition of properties such as “pointing pose” and flags such as ‘is_waving’, ‘is_sitting’, etc.
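The association step can be sketched as follows: each OpenPose detection is labeled with the recognized face whose ROI overlaps it most, measured by IoU. The ROI format, dictionary keys and the greedy matching variant are simplifying assumptions for illustration.

```python
# Sketch of associating OpenPose detections with OpenFace recognitions by
# maximizing the IoU of their face ROIs. ROIs are (x, y, width, height) tuples.

def iou(a, b):
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def associate(pose_rois, face_recognitions, min_iou=0.3):
    """Label each pose detection with the best-overlapping recognized face."""
    labels = {}
    for i, roi in enumerate(pose_rois):
        scores = [(iou(roi, rec['roi']), rec['name']) for rec in face_recognitions]
        best_score, best_name = max(scores, default=(0.0, None))
        labels[i] = best_name if best_score >= min_iou else None
    return labels

print(associate([(10, 10, 40, 40)], [{'roi': (12, 8, 38, 42), 'name': 'john'}]))
```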

4.1 Pointing Detection

This year’s tournament challenges involved various non-verbal user interactions, such as detecting which object the user is pointing at. Our approach to people recognition, explained in the previous section, includes information about the posture of each 3D skeleton. Once the people information is inserted into the world model, additional properties can be added to the persons that also take other entities in the world model into account, e.g. “is_pointing_at_entity”. This information is used by the top-level state machines to implement challenges such as ‘Hand Me That’, the description of which can be found in the 2019 RulebookFootnote 12. Additionally, a check is inserted to ensure that the correct operator is found. This check is based on a spatial query, which makes it possible to filter out people based on their location. Finally, to determine at which entity the operator is pointing, ray-tracing is implemented. Figure 7 shows an example of the ray-tracing.
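The ray-tracing step can be sketched as below: cast a ray through the operator's 3D arm keypoints and pick the world model entity whose centre lies closest to that ray. The choice of keypoints (elbow and wrist), the distance threshold and the entity representation are assumptions for illustration.

```python
import numpy as np

# Sketch of pointing ray-tracing over 3D skeleton keypoints and entity centres.

def pointed_entity(elbow, wrist, entities, max_dist=0.4):
    elbow, wrist = np.asarray(elbow, float), np.asarray(wrist, float)
    direction = wrist - elbow
    direction /= np.linalg.norm(direction)            # unit pointing direction
    best_id, best_dist = None, max_dist
    for entity_id, centre in entities.items():
        v = np.asarray(centre, float) - wrist
        t = np.dot(v, direction)
        if t <= 0:                                    # entity lies behind the pointing hand
            continue
        dist = np.linalg.norm(v - t * direction)      # perpendicular distance to the ray
        if dist < best_dist:
            best_id, best_dist = entity_id, dist
    return best_id

entities = {'apple': (2.0, 0.3, 0.9), 'milk': (2.0, -0.6, 0.9)}
print(pointed_entity(elbow=(0.4, 0.1, 1.2), wrist=(0.7, 0.1, 1.1), entities=entities))
```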

Fig. 7.

Ray-tracing based on pose detection with AMIGO.

5 Human-Robot Interface

We provide multiple ways of interacting with the robot in an intuitive manner: the WebGUI (Subsect. 5.1) and the Telegram™ interface (Subsect. 5.2), which uses our conversation_engine (also described in Subsect. 5.2).

5.1 Web GUI

In order to interact with the robot, apart from speech, we have designed a web-based Graphical User Interface (GUI). This interface uses HTML5Footnote 13 with a Robot API written in JavaScript, and is hosted on the robot itself.

Fig. 8.

Overview of the WebGUI architecture. A webserver that is hosting the GUI connects this Robot API to a graphical interface that is offered to multiple clients on different platforms.

Fig. 9.

Illustration of the 3D scene of the WebGUI with AMIGO. Users can long-press objects to open a menu from which actions on the object can be triggered.

Figure 8 gives an overview of the connections between these components and Fig. 9 represents an instance of the various interactions that are possible with the Robot API.

5.2 Telegram™

The Telegram interfaceFootnote 14 to our robots is a ROS wrapper around the python-telegram-bot library. The software exposes four topics: images and text, respectively, from and to the robot. The interface allows only one master of the robot at a time. The interface itself does not contain any reasoning; this is all done by the conversation_engine, which is described in the following subsection.
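A minimal sketch of such a wrapper is shown below, covering only the two text topics: incoming chat messages are republished on a ROS topic, and text published by the robot is forwarded to the single active chat. The topic names, parameter name and single-master bookkeeping are simplified assumptions, not the actual node; the python-telegram-bot calls follow its standard Updater/MessageHandler API.

```python
#!/usr/bin/env python
# Sketch of a ROS wrapper around python-telegram-bot (text topics only).
import rospy
from std_msgs.msg import String
from telegram.ext import Updater, MessageHandler, Filters

rospy.init_node('telegram_bridge')
from_user_pub = rospy.Publisher('message_from_user', String, queue_size=10)
state = {'chat_id': None}   # only one 'master' of the robot at a time

def on_text(update, context):
    if state['chat_id'] is None:
        state['chat_id'] = update.message.chat_id          # first user claims the robot
    if update.message.chat_id == state['chat_id']:
        from_user_pub.publish(String(data=update.message.text))

def on_robot_text(msg):
    if state['chat_id'] is not None:
        updater.bot.send_message(chat_id=state['chat_id'], text=msg.data)

updater = Updater(token=rospy.get_param('~token'), use_context=True)
updater.dispatcher.add_handler(MessageHandler(Filters.text, on_text))
rospy.Subscriber('message_to_user', String, on_robot_text)
updater.start_polling()
rospy.spin()
```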

Conversation Engine

The conversation_engineFootnote 15 bridges the gap between text input and an action planner (called action_server). Text can be received either from speech-to-text or from a chat interface such as Telegram™. The text is parsed according to a (feature) context-free grammar, resulting in an action description in the form of a nested mapping, in which (sub)actions and their parameters are filled in. This may include references such as “it”.

Based on the action description, the action_server tries to devise a sequence of actions and parameterize these with concrete object IDs. To fill in missing information, the conversation_engine engages with the user; when the user supplies more information, the additional input is parsed in the context of what information is missing. Lastly, it keeps the user “informed” whilst actions are being performed by reporting on the current subtask.
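The example below illustrates what such a nested action description might look like and how missing information could trigger a question back to the user. The exact keys and the question logic are assumptions for illustration, not the real conversation_engine or action_server data structures.

```python
# Illustrative action description for "bring me the apple from the cabinet";
# keys and structure are assumptions, showing nested (sub)actions, parameters
# and references such as "it" that are resolved later.

action_description = {
    'actions': [
        {'action': 'navigate-to', 'target-location': {'id': 'cabinet'}},
        {'action': 'grab', 'object': {'type': 'apple'}},
        # "it" refers back to the apple and is resolved by the action planner
        {'action': 'hand-over', 'object': {'reference': 'it'},
         'target-location': {'id': 'operator'}},
    ]
}

def question_for_missing_info(description):
    """If a required parameter is missing, ask the user instead of failing."""
    for action in description['actions']:
        if action.get('action') == 'grab' and not action.get('object'):
            return "What would you like me to bring?"
    return None

print(question_for_missing_info(action_description))  # -> None, nothing missing here
```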

Custom Keyboard, Telegram HMI

The user interface modality explained above has been extended to reduce the room for operator error by presenting the user with only a limited number of buttons in the Telegram app. This has been realized through Telegram's custom_keyboardsFootnote 16 feature. This feature is especially useful if there are only a few options, such as when selecting from a predetermined selection of drinks, as was shown in our finals during RoboCup 2019.

Since the competition, this feature has been employed to compose commands word-for-word. After the user has already entered, via text or previous buttons, for example “Bring me the ...”, the user is presented with only those words that may follow that text according to the grammar, e.g. “apple”, “orange”, etc. This process iterates until a full command has been composed. This feature is called hmi_telegramFootnote 17.
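The sketch below shows how a limited set of options can be offered as buttons with python-telegram-bot's ReplyKeyboardMarkup. The drink list, command name and the idea that the options come from the grammar are assumptions for illustration, not the hmi_telegram implementation.

```python
# Sketch of a Telegram custom keyboard offering a few predefined options.
from telegram import ReplyKeyboardMarkup
from telegram.ext import Updater, CommandHandler

def ask_drink(update, context):
    options = [['coke', 'fanta'], ['water', 'beer']]   # e.g. next words allowed by the grammar
    update.message.reply_text(
        'Which drink would you like?',
        reply_markup=ReplyKeyboardMarkup(options, one_time_keyboard=True))

updater = Updater(token='TELEGRAM_BOT_TOKEN', use_context=True)   # placeholder token
updater.dispatcher.add_handler(CommandHandler('drink', ask_drink))
updater.start_polling()
updater.idle()
```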

5.3 Head Display

For most people, especially those who do not deal with robots in their day-to-day life, interaction with robots is not as easy as one would like it to be. It is often difficult to hear what the robot is saying, and it is not always intuitive to know when to talk to the robot. To remedy this, the head display of HERO is used. On this display, which is integrated in the Toyota HSR's ‘head’, a lot of useful information can be shown. Through the hero_displayFootnote 18, a few different functionalities are integrated. By default, our Tech United @Home logo with a dynamic background is shown on the screen, as depicted in Fig. 10. When the robot is speaking, the spoken text is displayed; when the robot is listening, a spinner is shown along with an image of a microphone; and it is also possible to display images.
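A minimal sketch of the "show what is being spoken" idea is given below: the node renders the spoken text onto an image and publishes it for the screen. The topic names and the image-based display mechanism are assumptions; the actual hero_display package may work differently (e.g. as a web page shown on the HSR's head screen).

```python
#!/usr/bin/env python
# Sketch: render spoken text as an image for a head display (assumed topics).
import cv2
import numpy as np
import rospy
from cv_bridge import CvBridge
from sensor_msgs.msg import Image
from std_msgs.msg import String

rospy.init_node('head_display_sketch')
bridge = CvBridge()
image_pub = rospy.Publisher('display_image', Image, queue_size=1)

def on_speech(msg):
    canvas = np.zeros((600, 1024, 3), dtype=np.uint8)            # black background
    cv2.putText(canvas, msg.data, (40, 300), cv2.FONT_HERSHEY_SIMPLEX,
                1.2, (255, 255, 255), 2)                         # spoken text in white
    image_pub.publish(bridge.cv2_to_imgmsg(canvas, encoding='bgr8'))

rospy.Subscriber('text_being_spoken', String, on_speech)
rospy.spin()
```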

Fig. 10.

The default status of HERO’s head display.

6 Re-usability of the System for Other Research Groups

Tech United takes great pride in creating and maintaining open-source software and hardware to accelerate innovation. Tech United initiated the Robotic Open Platform websiteFootnote 19 to share hardware designs. All our software is available on GitHubFootnote 20, and all packages include documentation and tutorials. Tech United and its scientific staff (15+ people) have the capacity to co-develop, maintain and assist in resolving questions.

7 Community Outreach and Media

Tech United has organised three tournaments: the Dutch Open 2012, RoboCup 2013 and the European Open 2016. Our team member Loy van Beek was a member of the Technical Committee from 2014 to 2017. We also carry out many promotional activities to introduce children to technology and innovation. Tech United often visits primary and secondary schools, public events and trade fairs, and makes regular TV appearances. Each year, around 50 demos are given and some 25,000 people are reached through live interaction. Tech United also has a very active websiteFootnote 21 and is active on social media, including FacebookFootnote 22, InstagramFootnote 23, YouTubeFootnote 24, TwitterFootnote 25 and FlickrFootnote 26. Our robotics videos are often shared on the IEEE Video Friday website.