1 Introduction

Deteriorating air quality caused by air pollution is a serious problem in many cities and can lead to an array of health problems [15, 46]. Various studies, such as those by Soret et al. [60] and Li et al. [32], have shown that electric vehicles have the potential to reduce air pollution in cities. In addition to electric cars that are charged by cable, there are several upcoming approaches realizing wireless charging of electric vehicles [13]. Wireless power transfer in the context of stationary wireless charging requires a transmitter coil embedded into the ground and a receiver coil attached to the underbody of the vehicle [49].

Efficient wireless charging requires an accurate alignment of the charging components within a given tolerance range [16]. Since the charging components are outside the driver’s field of vision, reaching a minimal deviation of the coils is a challenging task. Accordingly, Birrell et al. [7] found in their studies that only 5% of the vehicles achieved a position accurate enough for efficient wireless charging. There are several techniques to reduce the misalignment of charging components, such as mechanical [26], RFID [38], or wireless sensor-based methods [47]. Furthermore, there are camera-based positioning systems that rely on a camera integrated into the vehicle [28] or statically attached to the charging station [55].

To make wireless charging suitable for everyday use, it is essential to consider various safety aspects. In particular, neither living beings nor objects should be exposed to any harm in the context of wireless charging. The presence of foreign objects during the charging process, such as metal objects, can pose threats such as high magnetic field exposure or fire [22]. To avoid hazardous situations, several foreign object detection methods exist.

Motivated by the goal of increasing the safety of wireless charging stations, we present a supplementary approach that augments existing foreign object detection techniques. Since there are charging stations that come with a camera-based positioning system, we utilize the given positioning camera to automatically analyze the state of the charging surface. To reuse the existing hardware of the positioning system, we propose a resource-friendly approach that can operate on an embedded device.

To examine the capability of detecting foreign objects, we evaluate our approach by conducting experiments with images of known and unknown object types. To this end, we provide a dataset that contains images and labels of various foreign objects placed on a charging surface and recorded by a positioning camera.

To summarize the main contributions of our research, we present:

  • An approach that can successfully detect foreign objects while reusing the existing hardware of the positioning system

  • A dataset containing labeled images recorded at a wireless charging station operating in an outdoor environment

  • Experiments to examine the capability of successfully detecting known and unknown object types. Results show that our approach achieves up to 18% higher precision and 46% higher detection success for unknown objects.

2 Application context

Within the scope of the TALAKO project [19], we enabled wireless charging of electric vehicles in public urban environments. The systems are specifically designed for taxi drivers, who can charge their vehicles while waiting for customers. To gain insights into the application context, our first prototype was deployed on the property of a taxi company. Subsequently, we established a publicly accessible pilot system next to the Central Station of Cologne in Germany.

Since wireless charging requires a precise alignment of the transmitter and receiver coil, we designed a camera-based driver guidance system that assists the driver in precisely maneuvering the vehicle to the charging zone. The driver guidance system, which we describe in more detail in another article [55], is composed of mobile and stationary components. Figure 1 illustrates the charging station and camera-based positioning system.

Fig. 1 Illustration of a charging station and camera-based positioning system

In order to navigate the driver, we provide a smartphone application that renders guiding visualizations [56]. When the smartphone receives a signal from the charging station, the application automatically triggers the start of the visual driver guidance. Thus, the driver is not required to manually interact with the smartphone while driving. To this end, the camera-based positioning system periodically emits BLE advertisements, which contain information about the vehicle’s current pose.
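
For illustration, the following sketch shows one conceivable encoding of such an advertisement payload in Python; the field names, units, and packing scheme are hypothetical assumptions, not the deployed protocol of the charging station:

```python
# Hypothetical BLE advertisement payload carrying the vehicle pose; the
# fields and their encoding are illustrative assumptions only.
import struct

def encode_pose(x_cm: int, y_cm: int, heading_deg: int) -> bytes:
    # pack the lateral/longitudinal offset to the charging point and the
    # heading as signed 16-bit integers (little-endian)
    return struct.pack("<hhh", x_cm, y_cm, heading_deg)

payload = encode_pose(x_cm=-12, y_cm=85, heading_deg=3)  # 6-byte payload
```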

For seamless integration into the urban application environments, we encapsulate the electronic components of the charging station inside a compact control cabinet with limited capacity. The control cabinet shields the interior from environmental conditions such as rain or snow. However, at the same time, it provides limited ventilation due to the waterproof enclosure. Accordingly, we embedded a low-power computer for positioning into the interior arrangement of the control cabinet. Hence, additional functionality is limited to the existing hardware resources.

3 Related work

In this section, we give an overview of foreign object detection techniques in the context of wireless charging, which typically rely on field characteristics [20], sensors [59], or system parameters [30]. Cameras are less expensive than many other sensor types and are also used to position electric vehicles before initiating the charging process [18, 55]. Thus, charging systems could increase their safety by incorporating the existing cameras for foreign object detection. Accordingly, we also summarize computer vision techniques for foreign object detection in various contexts.

3.1 Wireless charging

Foreign objects can pose significant safety hazards in wireless charging. Accordingly, several approaches exist to detect them. According to Zhang et al. [69], there are system parameter-, field-, and wave-based foreign object detection methods in inductive power transfer systems, whereas Cheng et al. [12] refer to the latter two in the context of wireless charging of electric vehicles. For system parameter-based methods, various system parameters, such as temperature [27] or power loss [30], may indicate the presence of a foreign object. Field-based detection methods observe field characteristics such as capacitance [20] or inductance [21] variations to detect foreign objects. Wave-based methods utilize sensors such as radars [51], thermal [59] or hyperspectral cameras [63], as well as sensor combinations including video cameras [6, 17]. Depending on the hardware, wave-based methods can provide high precision; however, they can also be costly [12, 69].

3.2 Computer vision

Cameras are less expensive than many other sensor types and can be used for various tasks, such as image classification [39] and object detection [70], using computer vision algorithms. Positioning systems can include a single camera that observes the transmitting coil as well as the surrounding area [55]. Charging stations incorporating this type of positioning system could cost-effectively increase their safety by utilizing the existing positioning camera for foreign object detection instead of integrating additional sensors. Hence, computer vision algorithms could detect various foreign objects before a vehicle approaches the transmitting coil by employing the positioning system’s camera. Furthermore, existing systems based on, e.g., system parameter- or field-based foreign object detection methods could be supplemented with information from such an approach.

3.2.1 Indoor

Indoor applications often provide suitable conditions for detecting foreign objects. For example, regions of interest are constantly illuminated and protected from external influences. The approach presented by Lu et al. [40] detects and classifies falling objects in liquids inside transfusion bottles as foreign objects. For detection, the approach employs background subtraction and an adapted mean-shift tracker. Furthermore, Al-Sarayreh et al. [2] utilize a neural network on footage recorded in a hyperspectral imaging setup to detect contamination of meat products by foreign objects. Moreover, X-ray images of lungs may contain foreign objects such as buttons of worn gowns. Based on the round shape of the buttons, Xue et al. [67] detect circular objects by utilizing the Circle Hough Transform [4] and the Viola-Jones algorithm [64].

3.2.2 Outdoor

Outdoor application setups are often exposed to unknown events and circumstances caused by weather and changing illumination conditions. To increase safety and reduce the risk of accidents, various outdoor scenarios demand the detection of foreign objects. In the context of airports, it is crucial to detect and remove various types of objects from runways. Qunyu et al. [52] detect foreign objects on runways by preprocessing an image, subtracting the background, postprocessing, and labeling the connected components of the foreign object regions. A preliminary experiment demonstrated that all foreign objects in a test image were detected. However, several state-of-the-art systems in this context utilize neural networks for foreign object detection, which are trained beforehand to detect known objects. Examples include the approach proposed in [50], which utilizes Microsoft Azure Custom Vision [44], the approach from [48] using YOLOv4 [8], as well as [33, 45] applying YOLOv3 [53] or SSD [37]. Energy infrastructure is another application field that demands a high level of safety and utilizes neural networks for foreign object detection. For example, FODN4PS, proposed by Xu et al. [66], detects intrusions of foreign bodies in power substations in order to prevent potential power supply failures. Furthermore, RCNN4SPTL [68] inspects power transmission lines and can detect entangled foreign objects like balloons or kites.

Across various contexts, like high-voltage transmission lines [65], aviation [3, 14] and conveyor belts [34, 42], recent research utilizes YOLOv8 [24] for foreign object detection. YOLOv8 is a state-of-the-art neural network and the successor of previous YOLO models, which are designed to provide speed and accuracy [23, 62].

4 Analysis

All of the aforementioned approaches that operate in an indoor environment provide a suitable solution for the application scenarios for which they are designed and tested. However, wireless charging stations often operate in an outdoor environment with challenging conditions. Typically, weather and light conditions vary, and the set of potentially occurring foreign objects cannot be completely defined. Thus, the foreign object detection system must be robust to environmental conditions and unspecified foreign objects.

Given the unpredictable appearance of foreign objects, we conclude that techniques based on shapes such as circles [4], rectangles [25], or lines [1] would limit the system. In contrast, background subtraction techniques as used by Qunyu et al. [52] provide higher flexibility with respect to the potential object shape. However, the generated masks contain considerable noise when environmental conditions vary, which can produce false positives.

Many modern systems utilize neural networks, which can be an effective tool for detecting foreign objects in various contexts, especially when the set of potentially occurring object classes is known in advance. However, although neural networks provide highly accurate results, many of them require extensive resources that are not available in our application context.

Accordingly, we conclude that the foreign object detection system should fulfill two fundamental requirements. While many approaches are designed to provide detection of objects with known characteristics, the system should be able to detect unknown object types as well. Furthermore, due to the space and heat constraints of the existing control cabinet, our positioning system runs on an embedded system with limited resources. Thus, the foreign object detection system should be suitable for embedded devices to coexist with the positioning system.

5 Approach

Fig. 2 Illustration of the image components and their dimensions

Detection of foreign objects can be facilitated by knowing the potentially occurring object types and their possible states. This enables the application of various approaches like neural networks that can be trained with object-specific data. Since the wireless charging stations operate in an outdoor environment, we cannot specify the set of potential foreign objects. Thus, we present a procedure that enables object type-independent foreign object detection without considering any depicted object features.

5.1 Training stage

Once the charging components are integrated into the ground, we assume the charging surface will be maintained regularly and will not be replaced by the operator. Thus, the appearance of the charging surface will not change significantly and will be affected mainly by environmental conditions.

Hence, our approach is based on features extracted from the charging surface, which we obtain during a training stage to fit a model for anomaly detection. Anomaly detection is a helpful tool for identifying significant deviations in given data [11]. Accordingly, our approach classifies charging surface regions as normal and diverging occurrences as anomalies.

We define F as the set of all training frames. To reduce the computational effort in the following steps, each frame \(f \in F\) with a size of \(w_f \times h_f\) is preprocessed by transforming it into grayscale format. In the first step of feature extraction, the granularity parameter \(\gamma \in \mathbb {N}\), where \(\gamma \le \min \left( w_f,h_f\right) \), needs to be defined. During the training stage, each frame f is divided into a set of patches \(\textrm{P}\) with a width \(w_p\) and height \(h_p\):

$$\begin{aligned} w_p=\dfrac{w_f}{\gamma } \qquad h_p=\dfrac{h_f}{\gamma } \end{aligned}$$
(1)

Higher values of \(\gamma \) enable the algorithm to detect the contour of foreign objects more precisely. In contrast, lower values of \(\gamma \) decrease the processing time. Hence, \(\gamma \) affects the granularity of the shape of detected objects as well as the performance. Each patch p that collides with a bounding box of a foreign object is removed from the set \(\textrm{P}\), resulting in the subset \(\mathrm {P'} \subseteq \textrm{P}\). Each patch \(p \in \mathrm {P'}\) is composed of multiple rows r and columns c of length \(w_p\) and \(h_p\), respectively, each of which can be defined as an array \(\textrm{PI}\) containing pixels \(\textrm{pi} \in \textrm{PI}\). Figure 2 gives an overview of the described components. As defined in Eq. 2, we determine the arithmetic mean \(\overline{\textrm{PI}}\) and the variance \(s^2\), resulting in \(n=\left( w_p + h_p\right) \cdot |\mathrm {P'}|\) tuples of \(\left( \overline{\textrm{PI}},s^2\right) \).

$$\begin{aligned} \overline{\textrm{PI}}=\dfrac{\sum _{i=1}^{\left| \textrm{PI}\right| }\textrm{pi}_{i}}{\left| \textrm{PI}\right| } \qquad s^2=\dfrac{\sum _{i=1}^{\left| \textrm{PI}\right| }(\textrm{pi}_i-\overline{\textrm{PI}})^2}{\left| \textrm{PI}\right| -1} \end{aligned}$$
(2)
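
The following Python sketch illustrates this feature extraction under the assumption of evenly divisible frame dimensions; the function and variable names are ours, not part of a published implementation:

```python
# Sketch of the feature extraction (Eqs. 1 and 2): divide a grayscale frame
# into gamma x gamma patches and compute one (mean, variance) tuple per
# patch row and patch column.
import numpy as np

def extract_features(frame_gray: np.ndarray, gamma: int) -> np.ndarray:
    h_f, w_f = frame_gray.shape
    w_p, h_p = w_f // gamma, h_f // gamma          # patch dimensions (Eq. 1)
    features = []
    for py in range(gamma):
        for px in range(gamma):
            patch = frame_gray[py * h_p:(py + 1) * h_p, px * w_p:(px + 1) * w_p]
            # one tuple per row and per column of the patch (Eq. 2)
            for line in list(patch) + list(patch.T):
                features.append((line.mean(), line.var(ddof=1)))  # sample variance
    return np.asarray(features)
```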

According to Liu et al. [36], Isolation Forests [35] provide linear time complexity with a low constant, robustness, and a small memory requirement, and outperform other approaches, including Local Outlier Factor [9], Random Forests [58], and ORCA [5]. Thus, all extracted feature tuples \(\left( \overline{\textrm{PI}},s^2\right) \) are used to fit a model \(\mathcal {M}\) for anomaly detection using an Isolation Forest:

$$\begin{aligned} \mathcal {M}=\textrm{fit}\left( \left\{ \left( \overline{\textrm{PI}}, s^2\right) _1, \ldots , \left( \overline{\textrm{PI}}, s^2\right) _{n}\right\} \right) \end{aligned}$$
(3)
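
A minimal training-stage sketch using scikit-learn’s IsolationForest is given below; it builds on the extract_features sketch above and, for brevity, omits the removal of patches overlapping labeled foreign objects (the subset \(\mathrm {P'}\)):

```python
# Fitting the anomaly detection model M (Eq. 3) with scikit-learn.
import cv2
import numpy as np
from sklearn.ensemble import IsolationForest

def fit_model(training_image_paths, gamma=30, n_trees=100):
    tuples = []
    for path in training_image_paths:
        frame = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # grayscale preprocessing
        # in the full procedure, tuples from patches colliding with labeled
        # foreign objects would be excluded here (subset P')
        tuples.append(extract_features(frame, gamma))
    return IsolationForest(n_estimators=n_trees).fit(np.concatenate(tuples))
```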

5.2 Detection stage

After the training stage, the gathered information can be used to identify foreign objects on target frames. Figure 3 gives an overview of the steps of the detection process.

As in the training stage, the detection procedure begins with the extraction of features from the input image. Based on the granularity parameter \(\gamma \), the input frame \(f \in \textrm{F}\) is divided into patches \(p \in \textrm{P}\). For each pixel array \(\textrm{PI}\) given by the rows r and columns c of an individual patch p, the arithmetic mean \(\overline{\textrm{PI}}\) and variance \(s^2\) are calculated to obtain tuples of \((\overline{\textrm{PI}},s^2)\).

For each given patch \(\textrm{p}\), all \((w_p + h_p)\) tuples of \((\overline{\textrm{PI}},s^2)\) are classified by the trained model \(\mathcal {M}\). This results in a set \(\mathrm {C_p}\) containing all classifications for a patch p. Classification returns a Boolean value \(b \in \mathbb {B}:=\{\textrm{True}, \textrm{False}\}\), where \(\textrm{False}\) indicates the absence and \(\textrm{True}\) the presence of a foreign object in the region of a given row r or column c. In order to reduce the risk of misclassifying a specific region of the charging surface, a patch-based majority vote is conducted, considering the results of the rows and columns. Based on the set of all \(\textrm{True}\) classifications \(\textrm{C}_\textrm{True}\) and a majority vote threshold \(\theta \), Eq. 4 defines the function \(\textrm{FO}_\textrm{p}(\textrm{C}_\textrm{True})\) that determines whether a patch p contains a foreign object. Thus, a patch p is classified as a foreign object if the fraction of \(\textrm{True}\) classifications is at least \(\theta \).

$$\begin{aligned} \textrm{FO}_\textrm{p}(\textrm{C}_\textrm{True})= {\left\{ \begin{array}{ll} \textrm{True}, &{} \text {if} \dfrac{\left| \textrm{C}_\textrm{True}\right| }{w_p + h_p}\ge \theta \\ \textrm{False}, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(4)
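
The detection stage can be sketched as follows; it reuses the patch layout from the training sketch above and relies on the scikit-learn convention that IsolationForest.predict returns -1 for anomalies:

```python
# Detection-stage sketch: classify the row/column tuples of every patch with
# the fitted model and apply the majority vote of Eq. 4.
import numpy as np

def detect(frame_gray, model, gamma=30, theta=0.4):
    h_f, w_f = frame_gray.shape
    w_p, h_p = w_f // gamma, h_f // gamma
    mask = np.zeros((gamma, gamma), dtype=bool)    # one cell per patch
    for py in range(gamma):
        for px in range(gamma):
            patch = frame_gray[py * h_p:(py + 1) * h_p, px * w_p:(px + 1) * w_p]
            lines = list(patch) + list(patch.T)
            X = np.array([(line.mean(), line.var(ddof=1)) for line in lines])
            anomalies = model.predict(X) == -1     # -1 marks an anomaly
            mask[py, px] = anomalies.sum() / (w_p + h_p) >= theta  # Eq. 4
    return mask
```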

Based on the information gathered by classifying all patches, a binary mask can be generated. The binary mask indicates the presence as well as the location of foreign objects, while the resolution of the object contours can be modulated by the granularity parameter \(\gamma \). Optionally, the object regions of the binary mask can be aggregated into bounding boxes.
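
One way to realize this optional aggregation is OpenCV’s connected-component analysis, as sketched below; the scaling back to pixel coordinates assumes the patch-grid mask produced by the detection sketch above:

```python
# Aggregating the binary patch mask into pixel-space bounding boxes.
import cv2
import numpy as np

def mask_to_boxes(mask: np.ndarray, w_p: int, h_p: int):
    _, _, stats, _ = cv2.connectedComponentsWithStats(mask.astype(np.uint8))
    boxes = []
    for x, y, w, h, _area in stats[1:]:            # index 0 is the background
        boxes.append((x * w_p, y * h_p, w * w_p, h * h_p))  # (x, y, width, height)
    return boxes
```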

5.3 Reporting

While the proposed approach enables the system to gain knowledge about occurring foreign objects, this information must be utilized to prevent hazardous situations. As part of our prior research [56], we introduced user interfaces for a driver guidance system. Within this scope, we presented a smartphone application that assists the driver with navigating visualizations to reach a position that enables efficient wireless charging. Alongside driver guidance, the mobile application can inform users about the status of the charging station, which can be affected by the presence of foreign objects. Drivers can therefore try to remove the foreign objects before continuing to approach the charging station. If drivers are unable to remove the foreign objects by themselves, the operator of the charging station can be notified.

Fig. 3 Overview of the detection process

6 Dataset

Aiming to expand contributions in this field of research, we previously introduced the Foreign Objects in Wireless Charging (FOWC) dataset [57]. The dataset contains 3652 images recorded at an operating wireless charging station. The charging station was constructed as part of the TALAKO project [19] and is frequently accessed by vehicles that are equipped with a wireless charging interface. A transmitter coil was integrated into the ground and embedded into a robust concrete casing. The positioning system utilizes a D-Link DCS-4602EV wide-angle camera, which is installed at a height of approximately 2 m and focuses on the positioning area, including the charging point. The camera records frames with a resolution of \(1920\times 1080\) pixels.

The operation of the charging station was temporarily interrupted to be able to capture the images using the positioning camera. Figure 4 illustrates an example frame from the positioning camera’s perspective. Foreign object detection is limited to the area of the transmitting coil. Hence, the dataset focuses on the region of interest (ROI). We crop the surrounding environment and perform a homography transformation of the ROI to obtain a bird’s eye view perspective as shown in Fig. 5. The resulting images have a resolution of \(394\times 189\) pixels.
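
The cropping and perspective correction can be reproduced with OpenCV as sketched below; the four source corner coordinates are placeholders, not the actual calibration of our charging station:

```python
# Homography transformation of the ROI to a 394x189 bird's-eye view.
import cv2
import numpy as np

src = np.float32([[710, 480], [1240, 485], [1330, 690], [640, 700]])  # ROI corners (hypothetical)
dst = np.float32([[0, 0], [394, 0], [394, 189], [0, 189]])            # target view

H = cv2.getPerspectiveTransform(src, dst)
frame = cv2.imread("camera_frame.png")             # 1920x1080 positioning frame
roi = cv2.warpPerspective(frame, H, (394, 189))    # bird's-eye view of the coil
```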

There are three different categories of images in the dataset. The first category includes images that do not contain any foreign objects, as depicted in Fig. 5. The second category includes images that contain a single foreign object from a set of predefined object types, namely can, coin, glasses, hairpin, key, ring, and wrench. Table 1 gives an overview of the object categories and information about their size in pixels.

The objects are systematically placed at seven predefined positions, as illustrated with blue crosses in Fig. 6. Figure 7 shows an example image from the third category that contains an extension of the predefined object types and places them at random locations and quantities.

The dataset is split into a test set and a training set containing 1035 and 2617 images, respectively. For all 7 object types, there are 15 images for each of the 7 positions illustrated in Fig. 6. In addition, there are 200 images in the random object category, as well as 100 images without objects. The training set contains the remaining recorded images, mixed across all categories. All foreign objects depicted in the dataset were manually labeled in the Darknet format as used by YOLO [53]. The images and labels are publicly available at https://www.nes.uni-due.de/research/data/.
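
In the Darknet format, each label file contains one line per object: the class index followed by the bounding box center, width, and height, all normalized to the image dimensions. The values in the following example line are invented for illustration:

```
2 0.4821 0.3517 0.0412 0.0873
```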

Fig. 4 Camera perspective

Fig. 5 Region of interest

Table 1 Overview of the approximate dimensions of the systematically placed objects in the dataset
Fig. 6 Single object

Fig. 7 Random objects

7 Evaluation

In order to increase the safety of an operating wireless charging station, the foreign object detection system should be able to report the presence of foreign objects on the charging surface. Since wireless charging stations operate in an outdoor environment, knowledge about potentially occurring objects is limited. Accordingly, we conduct experiments to examine whether the system successfully detects both known and unknown object types while being trained with a limited amount of data. In addition to the detection success, we also analyze the execution time on lightweight devices.

As a benchmark, we employ YOLOv8 [24], a state-of-the-art neural network that is utilized in recent research to detect foreign objects in various applications and environments [3, 14, 29, 34, 41,42,43, 65]. Depending on the model size and resources, YOLOv8 can be executed on embedded devices with low latency [31]. Thus, we examine two different model sizes: the YOLOv8m model on the one hand, and the lightweight YOLOv8n model on the other. Correspondingly, for our approach, we set a standard configuration of 100 trees for the Isolation Forest (Our 100). However, we also explore the effect of utilizing 5 trees (Our 5), which reduces the required amount of resources.
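
For reference, a minimal sketch of how such baselines can be trained with the ultralytics package is shown below; the dataset YAML path and the epoch count are assumptions, not our exact training configuration:

```python
# Training the two YOLOv8 baseline sizes with the ultralytics package.
from ultralytics import YOLO

for checkpoint in ("yolov8n.pt", "yolov8m.pt"):    # nano and medium models
    model = YOLO(checkpoint)                       # load pretrained weights
    model.train(data="fowc.yaml", epochs=100)      # YAML references the dataset (hypothetical path)
    model.val()                                    # evaluate on the validation split
```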

7.1 Experimental setup

In the first experiment, we train our approach with training images of the presented dataset. The dataset depicts large objects, such as a wrench, as well as small objects, such as a coin, in the wireless charging scenario. Our goal is to simulate the limited knowledge about occurring foreign objects.

For each object category, we randomly sample 10 images for a training set as well as 100 images for a test set. In addition, we create a condition called “All” with objects from all categories. Accordingly, the training set of “All” contains an image of each object type and three images with no objects, whereas the test set contains 100 images of all object categories.

For each analyzed image, our approach generates a binary mask, which semantically depicts the regions of foreign objects and regions of the charging surface in contrast. However, our dataset provides bounding box labels, which many object detectors [24, 37, 54] require for training. In order to be able to compare the resulting binary mask to the ground truth data of our dataset, we expand the detected foreign object regions to the shape of a corresponding rectangular bounding box. To measure the precision of the detection, we compute the intersecting area of the predicted bounding boxes \(B_{p}\) and ground truth bounding boxes \(B_{gt}\) in relation to the area of the union of both:

$$\begin{aligned} \textrm{IoU} = \frac{\textrm{area}\left( B_{p} \cap B_{gt}\right) }{\textrm{area}\left( B_{p} \cup B_{gt}\right) } \end{aligned}$$
(5)

While the intersection over union (IoU) measures the precision of a detector based on the reconstruction of the ground truth shape, the focus of our application is to signal the presence of foreign objects. Thus, we examine the detection success (DS), which indicates whether the detector correctly reports the presence of foreign objects (FO) on the charging surface:

$$\begin{aligned} \textrm{DS}= {\left\{ \begin{array}{ll} 1, &{} \textrm{FO}_{\textrm{present}} = \textrm{FO}_{\textrm{detected}}\\ 0, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(6)
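
Both metrics can be computed with a few lines of Python, as in the following sketch; boxes are assumed to be (x, y, width, height) tuples in pixels:

```python
# Sketch of the evaluation metrics IoU (Eq. 5) and DS (Eq. 6).
def iou(box_p, box_gt):
    ax, ay, aw, ah = box_p
    bx, by, bw, bh = box_gt
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))   # intersection width
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))   # intersection height
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def detection_success(fo_present: bool, fo_detected: bool) -> int:
    return int(fo_present == fo_detected)              # 1 if the report is correct
```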

Since unknown object types might reduce the quality of the detection, we analyze the capability of detecting unknown objects in the second experiment. We reuse the models that have been trained for each object category and examine their performance when analyzing images of untrained object types.

In addition to the detection, we explore the time and the resources required to execute the studied techniques. For comparison, we examine the average image processing time for each algorithm while running on a powerful as well as on a lightweight embedded device. Accordingly, the algorithms are executed on an Intel NUC5CPYH with an Intel Celeron N3060 CPU (2.48 GHz) and on a Raspberry Pi 3 B+ with a Broadcom BCM2837B0 CPU (1.4 GHz).

7.2 Detection

In this section, we present the results of the experiments. To gain deeper insight, we first explore the effects of the granularity parameter \(\gamma \) and the majority vote threshold \(\theta \) as well as the time required for training. While we train YOLOv8 with a default configuration on the training data of each individual object category, we optimize our approach for the following experiments by identifying the best configuration of \(\gamma \) and \(\theta \). Accordingly, we examine multiple combinations of \(\gamma \) and \(\theta \), where for \(\gamma \) we define the range [5, 90] with steps of 5 and for \(\theta \) the range [0.1, 1] with steps of 0.1. For each category, we select the parameter combination that achieves the highest IoU.
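
This parameter selection amounts to a grid search, sketched below; train_and_evaluate is a hypothetical helper wrapping the training and detection sketches from Sect. 5 together with the IoU metric:

```python
# Grid search over gamma in [5, 90] (steps of 5) and theta in [0.1, 1.0]
# (steps of 0.1), selecting the combination with the highest mean IoU.
best_iou, best_params = -1.0, None
for gamma in range(5, 95, 5):
    for theta in [t / 10 for t in range(1, 11)]:
        mean_iou = train_and_evaluate(gamma=gamma, theta=theta)  # hypothetical helper
        if mean_iou > best_iou:
            best_iou, best_params = mean_iou, (gamma, theta)
```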

Based on the training, we illustrate the effects of \(\gamma \) and \(\theta \) in Figs. 8 and 9, respectively. The figures summarize the IoU and DS across all object categories by showing the arithmetic mean at each data point. In general, we observe that IoU and DS increase strongly up to \(\gamma =25\). For \(\gamma >25\), DS grows more slowly while IoU remains at the same level. For the majority vote threshold, we observe a peak of the IoU at \(\theta =0.5\), while DS increases with decreasing \(\theta \).

Figure 10 illustrates an overview of the training time required for different \(\gamma \) configurations in the 10 training images scenario. The training is conducted on a system with an Intel Core i7 10700k (3.8 GHz) CPU. For both tree configurations, we observe that training time increases linearly with rising \(\gamma \). While the training time for Our 100 takes up to around 8 s, Our 5 requires less than 2 s. At the same time, both YOLOv8m and YOLOv8n train significantly longer, requiring more than 1100 and 240 s, respectively.

Fig. 8 Observed effect of the granularity parameter \(\gamma \) during training of our approach

Fig. 9 Observed effect of the majority vote threshold \(\theta \) during training of our approach

Fig. 10 Comparison of the training time required for YOLOv8 and our approach with different \(\gamma \) configurations on a computer with an Intel Core i7 10700k CPU

7.2.1 Known objects

Table 2 Granularity parameter \(\gamma \) and majority vote threshold \(\theta \) selected during optimization for each object category
Fig. 11 Performance of our approach and YOLOv8 in terms of the intersection over union (IoU) in the known object condition

Fig. 12 Performance of our approach and YOLOv8 in terms of the detection success (DS) in the known object condition

For each object category in the known object condition, the detectors have been optimized with 10 training images and evaluated with 100 test images of the same object category. Figures 11 and 12 depict the arithmetic mean of the IoU and DS results in the known object condition, whereas Table 2 presents the selected parameter combinations.

In terms of precision (IoU), YOLOv8m outperforms the other detectors in most categories. However, YOLOv8n is not able to detect any objects but wrenches, which are among the largest objects in the dataset. In contrast, both configurations of our approach perform in a comparable manner. While our approach performs less precisely than YOLOv8m in categories with a single object type, in the more generic case “All” our approach exceeds the precision of YOLOv8m. With regard to detection success (DS), our approach outperforms YOLOv8m across all categories.

7.2.2 Unknown objects

Fig. 13 Performance of our approach and YOLOv8 in terms of the intersection over union (IoU) in the unknown object condition

Fig. 14 Performance of our approach and YOLOv8 in terms of the detection success (DS) in the unknown object condition

In the unknown object condition, we simulate the presence of unknown object types on the charging surface. To this end, we reuse the detectors of the known object condition, which have been optimized to detect objects of a specific type, and use each detector to detect objects of all other categories. Figures 13 and 14 visualize the average IoU and DS achieved by the detectors that have not been trained with data of the corresponding object category.

As in the known object condition, YOLOv8n can only detect wrenches, while YOLOv8m detects objects of all types. Both configurations of our approach provide similar results in most categories and outperform YOLOv8m in terms of IoU as well as DS. Across all categories, YOLOv8m achieves a DS lower than 60%. At the same time, both configurations of our approach score a DS of around 80% or higher in most categories. In general, we observe that the precision is lower for unknown object types than for known ones. However, in the generic category, which contains all object types, our approach scores results comparable to the known object condition.

7.3 Execution time

Fig. 15 Comparison of the average image processing time when executing on an Intel NUC. The x-axis shows the \(\gamma \) configurations in the interval [5, 90] in steps of 5

Fig. 16 Comparison of the average image processing time when executing on a Raspberry Pi 3 B+. The x-axis shows the \(\gamma \) configurations in the interval [5, 90] in steps of 5

Figures 15 and 16 illustrate the average image processing time of the examined detectors. We observe that the results are comparable across both devices. Independently of the tree configuration, the processing time grows linearly with increasing \(\gamma \). However, as long as \(\gamma \) is lower than 65, our approach with 100 trees is faster than YOLOv8m. At the same time, our approach with 5 trees is significantly faster than both our approach with 100 trees and YOLOv8m, and in some configurations even faster than YOLOv8n.

7.4 Benefits and limitations

This section discusses benefits and limitations based on the insights gained from the evaluation and operation of the camera system.

7.4.1 Performance

Table 3 Difference between the detection scores of our approach and YOLOv8m in the experiments with known and unknown object conditions

Table 3 presents an overview of the difference between the detection scores of our approach and YOLOv8m. On average, across all object categories in the known condition, YOLOv8m scores the highest precision and surpasses our approach by up to 15% IoU. However, in all other cases presented in Table 3, our approach outperforms YOLOv8m by up to 18% IoU.

In terms of detection success, our approach achieves significantly higher scores than YOLOv8m. For the generic category “All”, we find that the improvement over YOLOv8m ranges between 37 and 40%. Considering the average across all object categories, we observe an improvement of up to 46% DS for Our 5.

When foreign object detection is based on a limited amount of data, YOLOv8n can only detect wrenches, which are among the largest objects examined. However, with approximately 0.6 s on the Intel NUC and around 1.5 s on the Raspberry Pi 3 B+, YOLOv8n provides a substantially faster image processing time than YOLOv8m, which requires more than 4 s and close to 8 s, respectively. At the same time, Our 5 with the configuration for the generic case “All”, with \(\gamma =30\) and \(\theta =0.4\), provides an execution time comparable to that of YOLOv8n.

7.4.2 Operation

The charging stations are located in areas that provide permanent illumination. While the light makes the taxi more accessible for passengers during the night, it also enables the vehicle tracking of the camera-based positioning system. Since the evaluation images of the dataset have been recorded by the camera of a positioning system, they do not depict significant deviations in brightness. We acknowledge that the presented approach may perform less robustly in a different application scenario. However, as an integral part of the camera system, appropriate illumination is continuously available during operation.

To summarize, YOLOv8n provides a fast execution time. However, it can only detect wrenches in our dataset. In contrast, YOLOv8m is significantly slower but detects known objects with the highest precision. When it comes to unknown objects, YOLOv8m performs considerably less effectively than our approach, which learns the features of the charging surface instead of specific object characteristics. While our approach detects unknown objects significantly more successfully than YOLOv8m, we observe that fewer trees lead to similar results but accelerate image analysis, which makes our approach suitable for the embedded device scenario.

8 Conclusion

Wireless charging stations should not endanger persons or objects in the environment. To avoid hazardous situations, there are various systems that detect foreign objects that may potentially pose a risk. Motivated by the goal of increasing the safety of wireless charging stations, we propose an approach to augment existing foreign object detection mechanisms. As there are charging stations with a camera-based positioning system, our approach utilizes the existing positioning camera to automatically analyze the state of the charging surface. We aim to provide an approach that can run on an embedded device so as not to extend the hardware of the existing positioning system. Thus, the system should be resource-friendly while effectively reporting the presence of foreign objects.

Before the proposed technique can be utilized, a training stage is required. During training, target frames depicting the charging surface are divided into patches based on a granularity parameter. In order to fit an Isolation Forest for anomaly detection, features are extracted from all rows and columns of patches not containing a foreign object. After the training stage, a target image can be analyzed by dividing it into patches according to the defined granularity parameter. Then, all features of the rows and columns of each patch are analyzed with the fitted Isolation Forest to detect anomalies. Finally, a majority vote determines whether a foreign object exists in a certain patch by checking whether at least a defined fraction of the rows and columns indicate a foreign object.

To evaluate the capability of detecting foreign objects on the charging surface, we conduct experiments with known and unknown objects. We compare our approach with YOLOv8, which is a state-of-the-art neural network utilized in various application scenarios to detect foreign objects. While our approach detects foreign objects more successfully than YOLOv8m, its image processing time is comparable to that of YOLOv8n.

In addition to foreign objects, other hazards can arise, e.g., people inserting their hands between the charging components. Accordingly, there are several approaches that can detect living objects [10, 20, 51, 61] in the context of wireless charging. To augment existing solutions, future work will focus on designing a camera-based approach that utilizes the available hardware resources of the positioning system.