1 Background and Purpose

Under the SOLAS convention (the International Convention for the Safety of Life at Sea), AIS (Automatic Identification System) [2] must be installed on all passenger ships, on vessels of 300 gross tonnage or more engaged in international voyages, and on vessels of 500 gross tonnage or more even when not engaged in international voyages. Installation was carried out between 2002 and 2008. AIS makes it possible to acquire navigation information such as the positions and tracks of other ships, and thus to predict their future courses. Research has been conducted on superimposed displays that integrate navigation information on other vessels obtained from visual observation, radar, ARPA (Automatic Radar Plotting Aid, an automatic collision avoidance aid), AIS and other sources.

INT-NAV was proposed by Hayuma et al. in 2003 [3, 4]. INT-NAV obtains information from multiple sources and overlays radar information, AIS information and the Obstacle Zone by Target (OZT) on the image captured by a camera installed on the ship, in order to shorten the time needed to acquire information on an arbitrary ship. In performance evaluations on a ship maneuvering simulator in 2007 [5] and in a real sea area in 2008 [6], INT-NAV was rated the same as or higher than the conventional ship maneuvering method in the ease of obtaining various information and in support of lookout work. However, the display range of INT-NAV depends on the angle of view of the video camera, so it is difficult to obtain navigation information for the full circumference of the ship.

As a method of providing visual information and radar ARPA information simultaneously, visual recognition support equipment was developed by Hikida et al. in 2009 [7, 8]. The equipment places a HUD (head-up display) in front of the compass and supports visual lookout by superimposing radar echoes and ARPA information on the HUD. Because of the HUD's angle of view, the field of view of the equipment covers only the area ahead, and it is difficult to obtain information on ships astern, so a normal radar display was attached [9]. Furthermore, a mechanism for rotating the HUD around the compass was added to overcome the limited field angle. The equipment was evaluated with a ship maneuvering simulator and in actual ship experiments [10]. The results confirmed that the workload and the information acquisition time are reduced compared with using the radar, while work precision remains equivalent. However, when the ship's body shakes, the direction information on the display deviates from the actual ship being viewed.

In contrast to large vessels, which carry a variety of navigational aids, many small vessels such as fishing vessels are rarely equipped with AIS and can be recognized only visually or by video camera, so it is difficult to predict their future courses. In Japan, small vessels accounted for 75% of maritime accidents in 2014 [1]. For this reason, studies on identifying vessels from images have been conducted in recent years [11,12,13,14]. These studies aim at identifying and tracking vessels; identification of other obstacles at sea, such as buoys, has not been addressed.

This research aims to develop a system for small vessels that detects and displays obstacles around a ship from maritime navigation images. Obstacles such as other vessels and buoys are extracted from maritime environment images captured by the system and classified into three categories. The classifier is trained with Faster R-CNN [15], an image recognition method, and tuned by changing the parameters of Faster R-CNN. We then evaluate the detection rates of the classifiers obtained with different parameter values. Finally, using the best classifier and maritime navigation images, we display obstacles and examine the accuracy of obstacle detection.

A system that displays information such as AIS has already been built; obstacles are displayed as shown in Fig. 1. Our system additionally detects and displays obstacles not equipped with AIS, such as small fishing vessels and buoys, which until now could be recognized only by visual lookout, so that many obstacles at sea can be recognized. We think that showing the presence of obstacles to the operator with our system will be useful for ship navigation and for reducing marine accidents.

Fig. 1. Display example

2 AIS Information and Camera Image

The server installed in the SHIOJIMARU laboratory collects data from various navigation instruments on SHIOJIMARU, a ship of Tokyo University of Marine Science and Technology (total length: 49.63 [m], width: 10.0 [m]). In this research, these data are received by a small server installed on the ship and transferred to a remote site. AIS information is distributed in the signal format [16] defined by NMEA (National Marine Electronics Association). Decoding the AIS information yields various information about ships; an example is shown in Fig. 2.

Fig. 2. AIS information

Information included in AIS is classified into four types: static information, dynamic information, navigation related information, and navigation safety related information. Static information includes the ship identification number MMSI (Maritime Mobile Service Identity), ship name, call sign, total length and width, ship type and so on. Dynamic information includes ship position, navigational condition, course over ground, speed over ground, rate of turn, navigation status and so on. Navigation related information includes draft, destination, cargo and so on. Navigation safety related information is a text message containing voyage or weather warnings that each ship can create freely as needed. A list of information included in AIS is shown in Table 1.

Table 1. Classification of AIS
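As an illustration of what decoding an AIS sentence involves, the following minimal sketch validates the NMEA checksum and unpacks the 6-bit payload far enough to read two header fields that every AIS message carries: the message type and the MMSI. The function names and the sample sentence are illustrative assumptions and are not taken from the system described here; a full decoder would go on to extract the remaining static and dynamic fields.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR of every character between the leading '!' and the '*', as two hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def payload_to_bits(payload: str) -> str:
    """Expand the 6-bit ASCII armoring of an AIS payload into a string of '0'/'1'."""
    bits = []
    for ch in payload:
        value = ord(ch) - 48
        if value > 40:
            value -= 8
        bits.append(f"{value:06b}")
    return "".join(bits)


def decode_header(sentence: str) -> dict:
    """Check the checksum, then read message type (bits 0-5) and MMSI (bits 8-37)."""
    body, _, checksum = sentence.strip().lstrip("!").partition("*")
    if checksum.upper() != nmea_checksum(body):
        raise ValueError("checksum mismatch")
    fields = body.split(",")          # talker, fragment info, channel, payload, fill bits
    bits = payload_to_bits(fields[5])
    return {"message_type": int(bits[0:6], 2), "mmsi": int(bits[8:38], 2)}


# Sample single-fragment sentence; the checksum is computed here rather than hard-coded.
body = "AIVDM,1,1,,B,177KQJ5000G?tO`K>RA1wUbN0TKH,0"
sentence = f"!{body}*{nmea_checksum(body)}"
print(decode_header(sentence))  # {'message_type': 1, 'mmsi': 477553000}
```

Message type 1 is a position report, so the rest of this payload would hold dynamic information such as position, speed over ground and course over ground.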

A camera is also installed in the upper part of the mast of the ship and captures the state of the surrounding ships as seen from SHIOJIMARU. In this research, the serial signal output by the AIS transceiver installed on the ship is converted and transmitted to the onboard server via TCP (Transmission Control Protocol). From there, the AIS information is transmitted via UDP (User Datagram Protocol) to the information display server installed at a remote site. Images captured by the camera are also transmitted to the information display server via the inboard LAN. The transmitted camera images and AIS information are superimposed on a large screen installed on the Etchujima campus of Tokyo University of Marine Science and Technology.

3 Maritime Environment Images

On September 21 and 22 and October 6, 2017, we used SHIOJIMARU to photograph maritime environment images for deep learning. We captured obstacles at sea from various angles and from various positions on board using three digital cameras. The resolutions of the three digital cameras are 3,216 × 2,136 [pixel], 2,592 × 1,728 [pixel] and 3,216 × 2,136 [pixel]. In addition, on October 5, 2017, we captured obstacles at sea near the Tokyo Bay URAGA Waterway Route from land, using another digital camera with a resolution of 3,216 × 2,136 [pixel]. In total, 7,553 images were acquired and used for the experiments.

First, we extract the areas of obstacles from the maritime environment images. Second, we classify the extracted regions into three categories: large ships, small boats and buoys. Examples of maritime environment images and extracted obstacle areas are shown in Fig. 3.

Fig. 3. Image example

4 Machine Learning

We employ Faster R-CNN, an image recognition method and a fast object detection algorithm based on deep learning. For classification and object detection with Faster R-CNN we use Caffe [17], a framework provided by the Berkeley Vision and Learning Center (BVLC). First, features are extracted from the input maritime environment images by a six-layer convolutional neural network to create a feature map. A window then scans the feature map and, by fitting several preset detection windows called anchors, outputs object candidate regions judged to have a high possibility of containing obstacles. The object candidate regions are then processed by a further convolution step and classified into the three categories. By repeating this learning, a classifier capable of detecting obstacles from maritime environment images is created. The flow of learning and detection in Faster R-CNN is shown in Fig. 4.

Fig. 4. Faster R-CNN
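To make the role of anchors concrete, the sketch below generates anchor shapes as combinations of scales and aspect ratios, which is how the original Faster R-CNN formulation obtains 9 anchors (3 scales × 3 aspect ratios). The function name, the base size of 16 pixels, and the particular scale and ratio values are illustrative assumptions; the text does not state which values were used in this research.

```python
import itertools
import math


def make_anchors(base_size, scales, ratios):
    """Return one (width, height) pair per scale/ratio combination.

    ratio is height / width; each anchor keeps the area (base_size * scale) ** 2.
    """
    anchors = []
    for scale, ratio in itertools.product(scales, ratios):
        area = (base_size * scale) ** 2
        width = math.sqrt(area / ratio)
        height = width * ratio
        anchors.append((round(width), round(height)))
    return anchors


# Assumed values: 3 scales x 3 aspect ratios = 9 anchor types.
nine = make_anchors(16, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0))
print(len(nine))  # 9

# Adding scales is one possible way to reach a larger anchor set (5 x 3 = 15).
fifteen = make_anchors(16, scales=(2, 4, 8, 16, 32), ratios=(0.5, 1.0, 2.0))
print(len(fifteen))  # 15
```

Smaller scales add anchors that fit distant obstacles occupying only a few pixels, which is one reason the anchor set is worth tuning for maritime images.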

5 Maritime Navigation Images

Maritime navigation images for evaluation were taken on February 7, 2017 by a video camera installed on SHIOJIMARU. They capture the surroundings of SHIOJIMARU while it navigates at sea. The video is 13 min 3 s long at about 30 frames per second, giving a total of 23,490 frames with a frame size of 1,920 × 1,080 [pixel]. The obstacles in the maritime navigation images are three large ships, one small boat and one buoy. We extracted image frames at 1 [s] intervals and performed obstacle detection on them with Faster R-CNN. An example of the maritime navigation images is shown in Fig. 5.

Fig. 5. Maritime navigation images example
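The frame count and the 1 [s] sampling can be checked with a few lines of arithmetic. The sketch below assumes a constant rate of exactly 30 frames per second and keeps the first frame of every second; both the function name and this sampling rule are illustrative assumptions, not the extraction procedure actually used.

```python
def sample_frame_indices(duration_s: int, fps: int, interval_s: int = 1):
    """Return the total frame count and the frame indices kept every interval_s seconds."""
    total_frames = duration_s * fps
    step = fps * interval_s
    return total_frames, list(range(0, total_frames, step))


duration_s = 13 * 60 + 3  # 13 min 3 s
total_frames, kept = sample_frame_indices(duration_s, fps=30)
print(total_frames)  # 23490
print(len(kept))     # 783
```

Because the frame rate is stated only as about 30 frames per second, the number of frames actually extracted can differ slightly from this idealized count.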

6 Obstacle Detection Experiment

Large ships, small boats and buoys were detected by Faster R-CNN in the 793 extracted image frames. The detection rate is defined by Eq. (1).

$$ \text{Detection Rate}\,[\%] = \frac{\text{Number of frames in which the obstacle is detected}}{\text{Number of frames in which the obstacle is captured}} \times 100 $$
(1)

In this research, the experiment was conducted by changing the parameters of Faster R-CNN. When the input image is large, Faster R-CNN can cause memory errors on the computer. Faster R-CNN also resizes input images before learning and object detection in order to unify the sizes of different input images. We determine the best classifier by changing the image size, the anchor size and the number of learning iterations. The image size was set to 1,920 × 1,080, 2,000 × 1,200, 1,500 × 900 or 1,000 × 600 [pixel]. The anchor size was set to 9, 15, 18 or 30 types. The number of learning iterations was 400 thousand. The detection rate was then obtained for each setting.
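The resizing step can be illustrated as follows. Reference implementations of Faster R-CNN commonly scale an image so that its shorter side reaches a target length while capping the longer side. The sketch assumes that rule, with a target of 600 and a cap of 1,000 pixels to mirror the 1,000 × 600 [pixel] setting; the function and parameter names are hypothetical, and the exact resizing code used in this research is not given in the text.

```python
def resize_scale(width, height, target_short=600, max_long=1000):
    """Return the scale factor and the resized (width, height) under the assumed rule."""
    short_side, long_side = min(width, height), max(width, height)
    scale = target_short / short_side
    # If the longer side would exceed the cap, shrink further so it just fits.
    if scale * long_side > max_long:
        scale = max_long / long_side
    return scale, (round(width * scale), round(height * scale))


# A 3,216 x 2,136 training photograph: the shorter side becomes 600.
print(resize_scale(3216, 2136)[1])  # (903, 600)

# A 1,920 x 1,080 video frame: the cap on the longer side applies instead.
scale, size = resize_scale(1920, 1080)
print(size[0])  # 1000
```

Under this rule a smaller setting shrinks every obstacle in the frame by the same factor, which is why the image size setting interacts with the detection of distant, small obstacles.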

The relationship between the anchor size and the detection rate is shown in Table 2. The image size and the number of learning iterations were fixed at 1,000 × 600 [pixel] and 300 thousand. The settings with 9 and 18 anchor types gave better detection rates than the others. The subsequent experiments use 9 anchor types.

Table 2. Anchor size and detection rate

The relationship between the image size and the detection rate is shown in Table 3. The anchor size and the number of learning iterations were fixed at 9 types and 300 thousand. The detection rate was good at 1,920 × 1,080 [pixel]. Since the maritime navigation images used in the experiment are also 1,920 × 1,080 [pixel], the image size is fixed at 1,920 × 1,080 [pixel] in the subsequent experiments.

Table 3. Image size and detection rate

Finally, the relationship between the number of learning iterations and the detection rate is shown in Table 4. The image size and the anchor size were fixed at 1,920 × 1,080 [pixel] and 9 types.

Table 4. Number of learning iterations and detection rate

The detection rate did not change beyond 250 thousand iterations. At 250 thousand iterations, the highest detection rate was 55.98 [%], for buoys. The detection rate was low overall, which we attribute to the distance from our ship and to the encounter situation. At sea, obstacles appear in the image in various shapes and sizes, which degrades the detection rate. Even for the same obstacle, the detection rate is considered to differ depending on whether its front, side or back appears in the image. Obstacles far from our ship are also difficult to distinguish because they appear small in the image. Misclassification is one reason why the detection rate of large ships is lower than that of buoys and small boats: large ships were often misclassified as small boats. The results are shown in Table 5. They were obtained with the following Faster R-CNN parameters: an image size of 1,920 × 1,080 [pixel], 9 anchor types and 250 thousand learning iterations.

Table 5. Detection Rate shown in Confusion Matrix
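A confusion matrix of this kind can be built from per-frame pairs of actual and predicted category. The sketch below normalizes each row to percentages and adds an "undetected" column for frames in which the obstacle was missed entirely. The function name, the extra column and the sample counts are hypothetical and do not reproduce the values of Table 5.

```python
from collections import Counter

CLASSES = ("large ship", "small boat", "buoy")
UNDETECTED = "undetected"


def confusion_matrix_percent(pairs, classes=CLASSES):
    """pairs: iterable of (actual, predicted). Returns row-normalized percentages."""
    columns = tuple(classes) + (UNDETECTED,)
    counts = Counter(pairs)
    matrix = {}
    for actual in classes:
        row_total = sum(counts[(actual, predicted)] for predicted in columns)
        matrix[actual] = {
            predicted: (100.0 * counts[(actual, predicted)] / row_total if row_total else 0.0)
            for predicted in columns
        }
    return matrix


# Hypothetical frames of one large ship: 3 correct, 4 taken for a small boat, 3 missed.
sample = (
    [("large ship", "large ship")] * 3
    + [("large ship", "small boat")] * 4
    + [("large ship", UNDETECTED)] * 3
)
row = confusion_matrix_percent(sample)["large ship"]
print(row["small boat"])  # 40.0
```

The diagonal entry of each row is the detection rate of Eq. (1) for that category, and the off-diagonal entries show where the remaining frames went.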

The rate at which large ships were misclassified as small boats was about 40 [%] or more in all three classes of ships. In maritime environment images, large ships appear large when they are close and small when they are far away. It is therefore very hard to distinguish images of distant large ships from images of small boats. An example of obstacle detection is shown in Fig. 6.

Fig. 6. Example of obstacle detection

7 Display to the Operator

A navigation system must inform the operator of the existence of obstacles at a distance that still allows the operator to perform collision avoidance maneuvers. The distance is particularly important for small ships and buoys not equipped with AIS. In the experiment, three large ships, one small boat and one buoy were captured. Because they were in the heading direction of our ship, the distance to them decreased with the passage of time. The distance from our ship and the detection rate are summarized in Table 6.

Table 6. Detection rate by distance

The detection rate of large ships was 21 [%] within 4 nautical miles [nm] and 33 [%] within 3 [nm]. The detection rate of the small boat was 68 [%] within 4 [nm] and 86 [%] within 3 [nm], so the small boat can be recognized in most frames once it is within 3 [nm]. The detection rate of the buoy was 56 [%] within 3 [nm] and reached 100 [%] within 2 [nm].
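Detection rates "within" a given distance can be computed by applying Eq. (1) only to the frames in which the obstacle is no farther away than the threshold. The sketch below assumes each frame is recorded as a pair of distance in nautical miles and a detected flag; the record layout, the function name and the sample values are hypothetical and are not the data behind Table 6.

```python
def rate_within(records, max_distance_nm):
    """Detection rate [%] over frames whose obstacle distance is <= max_distance_nm.

    records: iterable of (distance_nm, detected) pairs, one per frame.
    Returns None when no frame falls within the threshold.
    """
    flags = [detected for distance, detected in records if distance <= max_distance_nm]
    if not flags:
        return None
    return 100.0 * sum(flags) / len(flags)


# Hypothetical frames of one approaching obstacle.
frames = [(3.8, False), (3.5, True), (2.9, True), (2.5, True), (1.5, True)]
print(rate_within(frames, 4.0))  # 80.0
print(rate_within(frames, 3.0))  # 100.0
```

Since the thresholds are cumulative, the rate within 4 [nm] also includes every frame counted within 3 [nm], so a rising rate at shorter thresholds indicates that the misses are concentrated at long range.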

This experiment confirmed that the system can detect and display obstacles not equipped with AIS at a distance of about 2 [nm] from the ship, in addition to displaying AIS-equipped ships. The navigation support display can therefore show the existence of obstacles, which are the main cause of accidents.

In the future, experiments are needed in a variety of situations, such as backlight and bad weather. This research is also intended for maneuvering a vessel from a remote site. With the proposed method, obstacles that are difficult to make out in the images can be displayed. Images displayed at a remote site are affected by the communication conditions and may be interrupted. Even if images are temporarily interrupted, it may be possible to navigate while avoiding obstacles by sending the positions and classifications of all obstacles.
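One way to send the positions and classifications of obstacles independently of the video is a compact message per frame. The sketch below is only an illustration under stated assumptions: the field names (category, bearing, distance), the JSON encoding and the function names are hypothetical, and the text does not specify a message format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Obstacle:
    category: str        # "large ship", "small boat" or "buoy"
    bearing_deg: float   # assumed: bearing relative to own ship's heading
    distance_nm: float   # assumed: estimated distance in nautical miles


def encode_obstacles(obstacles, timestamp):
    """Serialize the obstacle list into a compact UTF-8 JSON payload."""
    message = {"timestamp": timestamp, "obstacles": [asdict(o) for o in obstacles]}
    return json.dumps(message, separators=(",", ":")).encode("utf-8")


def decode_obstacles(payload):
    """Inverse of encode_obstacles: returns (timestamp, list of Obstacle)."""
    message = json.loads(payload.decode("utf-8"))
    return message["timestamp"], [Obstacle(**item) for item in message["obstacles"]]


detected = [Obstacle("small boat", 12.5, 2.8), Obstacle("buoy", -4.0, 1.9)]
payload = encode_obstacles(detected, timestamp=1518000000)
print(len(payload) < 1000)  # True
```

A payload of this size is far smaller than a video frame, so it could still be delivered when the link is too degraded for images.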

8 Conclusion

In this research, we made a classifier that classifies obstacles in captured maritime environment images into three categories using Faster R-CNN. Experiments were carried out on maritime navigation images. The kinds and positions of the detected obstacles are superimposed on the display system as shown in Fig. 1. In addition to AIS information, the operator can acquire information on obstacles not equipped with AIS.

The results are summarized as follows:

(1) We created a data set of three categories of obstacles in maritime environment images that can be used for machine learning.

(2) We detected obstacles using Faster R-CNN and examined the detection rate while changing the image size, the anchor size and the number of learning iterations, which are parameters of Faster R-CNN.

(3) We carried out the obstacle detection experiment using maritime navigation images. The detection rate was about 55 [%] for buoys.

(4) Large ships were sometimes mistaken for small boats. We will consider how to distinguish them.

(5) The detection rate of small boats at a distance of about 3 [nm] from the ship is 86 [%], and the detection rate of buoys at a distance of about 2 [nm] from the ship is 100 [%].

Future work is to detect targets and movements that affect our own ship, to study methods for raising prediction accuracy, and to notify the operator of them effectively. The parameters of Faster R-CNN and the category classification of obstacles both affect the detection accuracy. For the same kind of obstacle, we are considering dividing the categories by the size of the obstacle in the image. We will also verify how the detection rate relates to the amount of data used for learning and to the shooting locations. We will collect more maritime environment images and maritime navigation images and proceed with these verifications.