The introduction of the black and white ball is the major new challenge in the Standard Platform League in 2016. Until RoboCup 2015, the ball was orange and rather easy to detect; in particular, it was the only orange object on the field. The new ball is mainly white with a regular pattern of black patches, like a miniature version of a regular soccer ball. The main problem is that the field lines, the goals, and the NAO robots are also white. The latter even have several round plastic parts as well as grey ones. Since the ball is often in the vicinity of the NAOs during a game, it is quite challenging to avoid a large number of false positives.
Playing with a normal soccer ball has also been addressed in the Middle Size League (MSL), e. g. [3, 5]. However, the robots in the MSL are typically not white and are equipped with more computing power than the NAO is; e. g., Martins et al. presented a ball detection that requires 25 ms at a resolution of \(640\times 480\) pixels on an Intel Core 2 Duo running at 2 GHz. In contrast, the solution presented here is on average more than ten times faster running on an Intel Atom at 1.6 GHz, which allows our robots to process all images that their cameras take.
We use a multi-step approach for the detection of the ball. First, the vertical scan lines on which our vision system is mainly based are searched for ball candidates. Then, a contour detector fits ball contours around the candidates’ locations. Afterwards, the fitted ball candidates are filtered using some general heuristics. Finally, the surface pattern inside each remaining candidate is checked. Furthermore, the ball state estimation has been extended by some additional checks to exclude false positives that cannot be avoided during image processing.
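The overall flow of these steps can be sketched as follows. This is a hypothetical outline only: all function and parameter names are placeholders for illustration, not B-Human’s actual API; the individual stages are passed in as callables so the structure stands on its own.

```python
def detect_ball(image, candidate_finder, contour_fitter, candidate_filter, pattern_checker):
    """Hypothetical sketch of the multi-step ball detection pipeline.

    The stages (Sects. 3.1-3.4) are supplied as callables; names are
    illustrative, not B-Human's real interfaces.
    """
    candidates = candidate_finder(image)                         # Sect. 3.1
    fitted = [contour_fitter(image, c) for c in candidates]      # Sect. 3.2
    plausible = candidate_filter(fitted)                         # Sect. 3.3
    # Candidates are processed in descending order of their fit response
    # (cf. Sect. 3.3); the first one passing the pattern check is accepted.
    for cand in sorted(plausible, key=lambda c: c["response"], reverse=True):
        if pattern_checker(image, cand):                         # Sect. 3.4
            return cand
    return None
```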
3.1 Searching for Ball Candidates
Our vision system scans the image vertically using scan lines of different density based on the size that objects, in particular the ball, would have in a certain position of the image. To determine ball candidates, these scan lines are searched for sufficiently large gaps in the green that also have a sufficiently large horizontal extension and contain enough white (cf. Fig. 2a). Candidates that are significantly inside of a detected robot are discarded. In addition, the number of candidates is reduced by only accepting ones that are sufficiently far away from other candidates.
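The gap search on a single scan line can be illustrated with the following minimal sketch. It assumes pixels have already been classified into color classes ('green', 'white', 'other'); the function name and both thresholds are hypothetical, chosen for illustration.

```python
def find_gaps_in_green(scan_line, min_gap, min_white_fraction):
    """Search a vertical scan line (a list of color labels) for sufficiently
    large non-green gaps that contain enough white pixels.

    A simplified sketch: real scan lines would also feed into the horizontal
    extension check and the robot/neighbor-candidate filtering described above.
    Returns gaps as half-open (start, end) index pairs.
    """
    candidates = []
    start = None
    for i, color in enumerate(scan_line + ['green']):  # sentinel closes a trailing gap
        if color != 'green':
            if start is None:
                start = i  # a gap in the green begins
        else:
            if start is not None:
                gap = scan_line[start:i]
                if (len(gap) >= min_gap
                        and gap.count('white') / len(gap) >= min_white_fraction):
                    candidates.append((start, i))
                start = None
    return candidates
```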
3.2 Fitting Ball Contours
As the position of a ball candidate is not necessarily in the center of an actual ball, the area around such a position is searched for the contour of the ball as it would appear in this part of the image given the intrinsic parameters of the camera and its pose relative to the field plane. The approach is very similar to the detection of objects in 3-D space using a stereo camera system as described by Müller et al., but we only use a single image instead. Thus, instead of searching a 3-D space for an object appearing in matching positions in two images at the same time, only the 2-D plane of the field is searched for the ball appearing in the expected size in a single camera image. For each ball candidate, a contrast-normalized Sobel (CNS) image of the surrounding area is computed (cf. Fig. 2b). This contrast image is then searched for the best match with the expected ball contour (cf. Fig. 2c). The best match is then refined by adapting its hypothetical 3-D coordinates (cf. Fig. 2d).
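The key geometric idea, that the expected contour size at any image position follows from the camera model and the assumption that the ball lies on the field plane, reduces in its simplest form to a pinhole projection. The following sketch shows only that size prediction, under assumed parameters (the official SPL ball has a radius of about 0.05 m; the focal length value in the usage example is made up), not the actual CNS contour matching.

```python
def expected_ball_radius_px(distance_m, focal_length_px, ball_radius_m=0.05):
    """Expected image radius (in pixels) of a ball at a given distance under
    a pinhole camera model.

    A minimal sketch of why the contour size can be predicted for each
    candidate position; the full approach additionally uses the camera's
    pose relative to the field plane.
    """
    return focal_length_px * ball_radius_m / distance_m
```

For example, with an assumed focal length of 600 px, a ball 1 m away would appear with a radius of about 30 px, and one 3 m away with about 10 px, which is why candidates whose fitted radius deviates strongly from this prediction can be rejected later on.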
3.3 Filtering Ball Candidates
The fitting process results in a measure, the response, for how well the image matches the contour expected at the candidate’s location. If this value is below a threshold, the ball candidate is dropped. The threshold is dynamically determined from the amount of green that surrounds the ball candidate: the less green there is around the candidate, the higher the response must be, in order to reduce the number of false positives inside robots. However, if a ball candidate is completely surrounded by green pixels and the response is high enough to exclude the possibility of it being a penalty mark, the candidate is accepted right away, skipping the final step described below, which might fail if the ball is rolling quickly. All candidates that fit well enough are processed in descending order of their response. As a result, the candidate with the highest response that also passes all other checks is accepted. These other checks include that the radius of the ball found must be similar to the radius that would be expected at that position in the image.
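The dynamic thresholding rule could look like the following sketch. All numeric values here are illustrative placeholders, not B-Human’s tuned parameters, and the function name is hypothetical.

```python
def accept_candidate(response, green_fraction,
                     base_threshold=0.5, penalty=0.3, accept_now=0.9):
    """Hypothetical filtering rule for a fitted ball candidate.

    The required response rises as less green surrounds the candidate.
    A candidate fully surrounded by green whose response also rules out a
    penalty mark is accepted immediately, skipping the surface-pattern check.
    Returns 'reject', 'accept', or 'accept_immediately'.
    """
    threshold = base_threshold + penalty * (1.0 - green_fraction)
    if response < threshold:
        return 'reject'
    if green_fraction >= 1.0 and response >= accept_now:
        return 'accept_immediately'  # skip the pattern check (Sect. 3.4)
    return 'accept'
```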
3.4 Checking the Surface Pattern
For checking the black and white surface pattern, a fixed set of 3-D points on the surface of the ball candidate is projected into the image (cf. Fig. 2d). For each of these pixels, the brightness of the image at its location is determined. Since the ball usually shows a strong gradient in the image from its bright top to a much darker bottom half, the pixels are artificially brightened depending on their position inside the ball. Then, Otsu’s method is used to determine the optimal threshold between the black and the white parts of the ball for the pixels sampled. If the average brightnesses of both classes are sufficiently different, all pixels sampled are classified as being either black or white. Then, this pattern is looked up in a pre-computed table to determine whether it is a valid combination for the official ball. The table was computed from a 2-D texture of the ball surface, considering all possible rotations of the ball around all three axes as well as some variations close to the transitions between the black and the white parts of the ball.
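The thresholding step can be sketched as follows: Otsu’s method picks the threshold that maximizes the between-class variance of the sampled brightnesses, and the classification is only trusted if the two class means are far enough apart. The minimum class distance used here is an illustrative value, and `classify_samples` is a hypothetical helper, not B-Human’s code.

```python
def otsu_threshold(values, bins=256):
    """Otsu's method on brightness values in [0, 255]: choose the threshold
    that maximizes the between-class variance. A minimal sketch."""
    hist = [0] * bins
    for v in values:
        hist[int(v)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0          # weight and brightness sum of the dark class
    for t in range(bins):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def classify_samples(brightnesses, min_class_distance=50):
    """Classify sampled ball-surface pixels as black/white (True = white),
    but only if the class means differ enough; otherwise return None."""
    t = otsu_threshold(brightnesses)
    black = [b for b in brightnesses if b <= t]
    white = [b for b in brightnesses if b > t]
    if not black or not white:
        return None
    if sum(white) / len(white) - sum(black) / len(black) < min_class_distance:
        return None  # contrast too low to trust the pattern
    return [b > t for b in brightnesses]
```

The resulting boolean pattern is what would then be looked up in the pre-computed table of valid black/white combinations.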
3.5 Removing False Positives Before Ball State Estimation
The major parts of B-Human’s ball state estimation, i. e. of position and velocity, have remained unchanged for many years and consist of a set of Kalman filters. However, the introduction of the new black and white ball required the addition of a few more checks. In previous years, the number of false positive ball perceptions was zero in most games. Hence, the ball tracking was implemented to be as reactive as possible, i. e. every perception was considered. Although the new ball perception is quite robust in general, several false positives per game cannot be avoided due to the similarity between the ball’s shape and surface and some robot parts. Therefore, there must now be multiple ball perceptions within a certain area and within a maximum time frame before a perception is considered for the state estimation process. This slightly reduces the module’s reactivity, but it is still fast enough to allow the execution of ball blocking moves in a timely manner. Furthermore, a common problem is the detection of balls inside robots that are located at the image’s border and are thus not perceived by our software. A part of these perceptions, i. e. those resulting from our teammates, is excluded by checking against the communicated teammate positions.
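The pre-filter requiring multiple nearby perceptions within a time window could be implemented along the following lines. The class name and all parameter values are illustrative assumptions, not B-Human’s actual module or tuning.

```python
from collections import deque

class BallPerceptionBuffer:
    """Hypothetical pre-filter: a perception is only forwarded to the state
    estimation once enough perceptions have occurred close to each other
    within a maximum time frame. Parameters are illustrative placeholders."""

    def __init__(self, min_count=3, max_age_s=1.0, max_dist_m=0.3):
        self.min_count = min_count
        self.max_age_s = max_age_s
        self.max_dist_m = max_dist_m
        self.history = deque()  # (timestamp, x, y) of recent perceptions

    def add(self, t, x, y):
        """Record a perception at time t (s) and field position (x, y) (m).
        Returns True if it should be used for state estimation."""
        # Drop perceptions that are too old.
        while self.history and t - self.history[0][0] > self.max_age_s:
            self.history.popleft()
        self.history.append((t, x, y))
        # Count recent perceptions within max_dist_m of the new one.
        close = sum(1 for (_, px, py) in self.history
                    if (px - x) ** 2 + (py - y) ** 2 <= self.max_dist_m ** 2)
        return close >= self.min_count
```

Requiring, e.g., three consistent perceptions within one second trades a small amount of reactivity for robustness against isolated false positives.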
The approach allows our robots to detect the ball at distances of up to five meters with only a few false positive detections. Figure 3 shows statistics of how well the ball was seen by different teams in terms of how long ago the ball was seen by the team, i. e. by the robot that saw it most recently. The statistics were created from some of the log files recorded by the TeamCommunicationMonitor at RoboCup 2016 that were made available on the website of the league. Since the teams analyzed were some of the best in the competition, it is assumed that the number of false positive ball detections, which would also result in low numbers, is negligible. Although the chart in Fig. 3 suggests that the ball detection worked better indoors, it actually benefited from the good lighting conditions in the outdoor competition. However, since our robots were only walking slowly and fell down quite often there, the average distance to the ball was a lot higher, which impeded the perception rate. In a rather dark environment, such as on Field A in the indoor competition, balls with lower responses had to be accepted in order to detect the ball at all. This resulted in more false positive detections, in particular in the feet of other robots, because these are also largely surrounded by green.
The runtime is determined by the number of ball candidates that are found. For instance, the log file of player number 2 from the second half of the final shows that the search for ball candidates took 0.135 ms on average and never longer than 0.816 ms. Computing the CNS image for the candidates took 0.285 ms on average and reached a maximum of 5.394 ms. Checking these candidates took 0.951 ms on average, but sometimes took significantly longer; the maximum duration reached was 10.604 ms. As it rarely happens that the processing of images from the upper and the lower camera takes long in subsequent frames, the frame rate was basically 60 Hz all the time.