1 Introduction

Developing an efficient and environment-friendly logistics system is vital to modern urban life. Increased online shopping has led to an increased freight demand, and a logistics system that supports this change is required. In addition, labor shortages and long working hours have become prominent in the Japanese logistics industry. An efficient logistics system can reduce the environmental emissions from trucks and improve the working conditions of drivers.

Examining empty truck behavior can be useful in developing a more efficient and environmentally friendly logistics system. Unloaded truck movements are inevitable in actual operations; however, their reduction is important for a better logistics system. Joint deliveries by multiple logistics companies are a possible approach to reduce unloaded truck movements. Joint deliveries can be more efficient than individually optimized deliveries. To design a joint delivery, it is essential to understand the existing empty truck behavior characteristics.

Recently, new data have become available for analyzing empty truck behavior. Traditional methods for recording the behavior of commercial vehicles include daily driving reports written by drivers and analog loggers installed in vehicles; however, their data volume is limited. Consequently, the variations in empty truck behavior (e.g., variations across the week, time of day, and month) remain unreported. Probe vehicle data are now widespread and can overcome the limitations of traditional methods. While several studies have analyzed route choices using such data [1, 2], the current study focuses on empty truck behavior.

To better understand empty truck behavior and discuss the potential policies that can be implemented to reduce the frequency of empty trips, this study addresses the following research questions:

  1. How does the percentage of empty trucks vary across days of the week, times of day, and months?

  2. How does the trip distribution differ between empty and loaded trucks?

  3. How can trucks be classified according to their loaded and unloaded trip patterns?

  4. How can potential targets (e.g., vehicle groups and times) be detected to reduce the number of empty trips?

This study used probe vehicle data from April 2019 to March 2020 in Kyushu, Japan, to explore the research questions.

A few studies have examined empty truck behavior. Rodrigues and Santos [3] developed a combinatorial optimization problem to reduce the number of empty trucks in Brazil, and the results revealed a decrease in empty trucks and environmental emissions. Samchuk et al. [4] developed a regression model to determine the empty truck rate based on trip distance, truck size, and freight volume. A mathematical model was developed to reduce the number of port-related empty trucks [5, 6]. A UK survey-based study also explored the potential for reducing empty trucks [5]. Some studies classified commercial vehicles using GPS data [7]. However, to the best of our knowledge, no studies have comprehensively examined the trends in empty vehicle behavior by days of the week, times of day, and months. In addition, the clustering of trucks based on loaded and unloaded patterns has not been reported. This study attempts to fill this research gap in the literature.

Note that this study is not classified as hypothesis-testing research but exploratory or hypothesis-generating research. The lack of existing studies on the variation in empty truck behavior has prevented us from providing hypotheses. The large-scale one-year probe vehicle data enabled us to conduct a data-driven study.

2 Data and Method

2.1 Data

The probe vehicle data provided by Transtron Inc. were used in this study. These probe data are collected from digital tachographs, which accurately record real-time operation data such as speed, travel time, distance traveled, and location during driving. To reduce the number of traffic accidents involving commercial freight vehicles, installing tachographs is mandatory in Japan for vehicles with a maximum loading capacity of 4 tons or more and a gross vehicle weight of 7 tons or more and for newly purchased vehicles.

The data include the origin–destination (OD) patterns of commercial vehicles arriving and departing from Kyushu, Japan. Zoning was based on a 1-km mesh. The period covered was from April 1, 2019, to March 31, 2020. Note that the data regarding the trips departing on the following dates were missing and were excluded from the analysis: July 5–8, 2019 and February 18–19, 2020. Table 1 summarizes the data used in this study. Figure 1 shows the trip generation using a 1-km mesh for the period under study.

Table 1 Summary of data
Fig. 1 Target area and annual trip generation (trip/year)

This dataset is advantageous for several reasons. First, it comprises a large sample size, covering approximately a quarter of the commercial trucks in Japan. This feature enabled us to conduct analyses over a long time range and wide area coverage. Second, the dataset structure enabled us to construct trip chains using vehicle IDs such that we could assess the daily behavioral patterns of individual vehicles. Third, each trip carries an empty or occupied label, which allows empty truck movements to be examined directly.

2.2 Trip-Chain Determination

The vehicle IDs in the data enable us to track individual vehicles continuously and determine trip chains. The details of the trip-chain determination have been reported elsewhere [8] and are briefly described here.

First, the base mesh for each vehicle was determined as the mesh with the most frequent departures and arrivals. A trip chain is defined as the sequence of trips from the departure from the base mesh to the return to it. Within a chain, the arrival mesh of one trip should be the departure mesh of the next trip. Data that violated this sequence were presumed to contain measurement or other errors and were excluded from the analysis. Trip chains with a travel time of more than 24 h were also excluded because an excessively long travel time implied a failure to determine the base mesh.
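The trip-chain rules described above can be illustrated with a short Python sketch. It is not the implementation reported in [8]: the record layout (`origin`, `dest`, `depart`, `arrive`, with times in hours) and the function names are assumptions made for illustration, and the trips of one vehicle are assumed to be sorted by departure time.

```python
from collections import Counter

MAX_CHAIN_HOURS = 24.0  # chains with a longer travel time are discarded


def base_mesh(trips):
    """Return the mesh with the most departures and arrivals for one vehicle."""
    counts = Counter()
    for t in trips:
        counts[t["origin"]] += 1
        counts[t["dest"]] += 1
    return counts.most_common(1)[0][0]


def trip_chains(trips):
    """Split one vehicle's time-ordered trips into valid base-to-base chains."""
    base = base_mesh(trips)
    chains, current = [], []
    for t in trips:
        if not current:
            if t["origin"] != base:
                continue  # wait for a departure from the base mesh
        elif t["origin"] != current[-1]["dest"]:
            # Arrival and next departure do not match: drop the partial chain.
            current = []
            if t["origin"] != base:
                continue
        current.append(t)
        if t["dest"] == base:
            duration = current[-1]["arrive"] - current[0]["depart"]
            if duration <= MAX_CHAIN_HOURS:
                chains.append(current)
            current = []
    return chains
```

In this sketch, a chain that is interrupted by a mismatched departure mesh is dropped as a whole, and a new chain starts only at the next departure from the base mesh.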

2.3 Empty Ratio Determination

The trip data are classified as empty, occupied, or undefined. Drivers select the corresponding label by pressing a button on a digital tachograph. Empty indicates a trip with no load, while occupied indicates a loaded trip. Undefined denotes a trip in which the driver does not press any button (incomplete data). Table 2 lists the total number of trips and the number of OD pairs for empty, occupied, and undefined trips. This study used the following empty ratio as an evaluation indicator:

Table 2 Total number of trips and OD pairs
$$\mathrm{Empty}\;\mathrm{ratio}=\frac{\mathrm{Total}\;\mathrm{distance}\;\mathrm{of}\;\mathrm{empty}\;\mathrm{trips}}{\mathrm{Total}\;\mathrm{distance}\;\mathrm{of}\;\mathrm{empty}\;\mathrm{and}\;\mathrm{occupied}\;\mathrm{trips}}$$

A higher empty ratio indicates that a larger share of the total travel distance is covered without a load. Similarly, we use the percentage of empty trips, which is based on the number of trips rather than the distance, as another indicator; it is defined as follows:

$$\mathrm{The}\;\mathrm{percentage}\;\mathrm{of}\;\mathrm{empty}\;\mathrm{trips}=\frac{\mathrm{The}\;\mathrm{number}\;\mathrm{of}\;\mathrm{empty}\;\mathrm{trips}}{\mathrm{Total}\;\mathrm{number}\;\mathrm{of}\;\mathrm{empty}\;\mathrm{and}\;\mathrm{occupied}\;\mathrm{trips}}$$
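As a minimal sketch of how the two indicators differ, the following Python function computes both from a list of labeled trips. The `(label, distance_km)` tuple layout and the function name are assumptions for illustration; undefined trips are left out of both denominators, as in the definitions above, and the values are returned as fractions rather than percentages.

```python
def empty_indicators(trips):
    """Return (empty ratio, share of empty trips) for labeled trips.

    Each trip is a (label, distance_km) pair. Trips labeled neither
    'empty' nor 'occupied' (i.e., undefined) are ignored.
    """
    empty_dist = occupied_dist = 0.0
    empty_n = occupied_n = 0
    for label, dist in trips:
        if label == "empty":
            empty_dist += dist
            empty_n += 1
        elif label == "occupied":
            occupied_dist += dist
            occupied_n += 1
    total_dist = empty_dist + occupied_dist
    total_n = empty_n + occupied_n
    if total_n == 0 or total_dist == 0:
        return None, None  # indicators are undefined without labeled trips
    return empty_dist / total_dist, empty_n / total_n
```

For example, one 30-km empty trip together with occupied trips of 10 km and 20 km gives an empty ratio of 0.5 but a share of empty trips of one third, because the distance-based indicator weights long empty trips more heavily.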

2.4 Analysis Procedure

First, we explored the empty ratio variations with respect to the day of the week, time of day, and month, along with several descriptive analyses. Then, we used nonhierarchical cluster analysis to classify vehicles according to their loaded and unloaded trip patterns. This analysis is aimed at identifying the improvement margin for the empty ratio. The procedure is described in detail in Sect. 4.1.

3 Descriptive Analysis

3.1 Number of Empty/Occupied Trips

Figure 2 shows box-and-whisker plots of the percentage of daily trips by the day of the week and month. In the figure, each daily trip count is first standardized as a percentage of the total annual count; subsequently, box-and-whisker plots demonstrate the variations across days of the week and months. The number of daily trips tends to be lower on weekends and holidays than on weekdays. The interquartile range was large in May, August, and January, indicating a large daily variation. The result suggests that the behavioral characteristics of August and January are affected by long holidays such as Obon (Japanese summer holiday) and New Year’s.

Fig. 2 Box-and-whisker plots of the percentage of daily trips by the day of the week (upper) and month (lower)

Figure 3 shows the percentage of empty trips for trip chains with two to five trips. The percentage of empty trips tends to be higher for later trips in a chain than for earlier ones. This can be understood by considering delivery vehicles that are loaded for the early trips and unload the goods on subsequent trips. However, for chains containing three to five trips, the percentage of empty trips was lower for the second trip than for the first. This implies that some vehicles may have left the base empty and were loaded at the arrival point of the first trip.

Fig. 3 Percentage of empty trips by trip order for trip chains with two to five trips
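The tabulation by trip order can be sketched in Python as follows. The input layout (each chain given as a list of labels in travel order) and the function name are assumptions for illustration, as is the choice to leave undefined trips out of the denominator.

```python
from collections import defaultdict


def empty_share_by_order(chains, min_len=2, max_len=5):
    """Map (chain length, trip order) -> share of empty trips.

    Each chain is a list of labels ('empty', 'occupied', or 'undefined')
    in travel order. Undefined trips are left out of the denominator.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [empty, empty + occupied]
    for chain in chains:
        n = len(chain)
        if not (min_len <= n <= max_len):
            continue  # only chains with two to five trips are tabulated
        for order, label in enumerate(chain, start=1):
            if label == "undefined":
                continue
            counts[(n, order)][1] += 1
            if label == "empty":
                counts[(n, order)][0] += 1
    return {key: e / total for key, (e, total) in counts.items()}
```

The result is keyed by chain length and trip order, so the value for key `(3, 2)` corresponds to the second trip of three-trip chains.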

3.2 Trip Distance

Figure 4 shows the trip distance distribution. The distance traveled by empty vehicle trips tends to be longer than that of occupied vehicle trips. Figure 5 shows the average trip distance based on trip departure time. For all trips, the average trip distance tends to be longer for trips departing at night than for those departing during the daytime. In addition, the average trip distance tends to be longer for trips departing at approximately 5:00 and 17:00–21:00.

Fig. 4 Trip distance distribution

Fig. 5 Average trip distance by trip departure time

3.3 Empty Ratio

Figure 6 shows the daily empty ratio, and Fig. 7 shows box-and-whisker plots of the empty ratio by day of the week and month. The empty ratio tends to increase slightly on Mondays. The interquartile range was larger on Sundays and holidays, indicating that the empty ratio on these days varies considerably from week to week. The monthly difference shows that the empty ratio tends to increase in July, with the highest ratio occurring on July 15, which is a national holiday.

Fig. 6 Daily empty ratio variation

Fig. 7 Box-and-whisker plots of empty ratio by day of the week (upper) and month (lower)

Figure 8 shows a box-and-whisker plot of the empty ratio by departure time. The ratio differed between late night and early morning, whereas it varied little between 11:00 and 15:00.

Fig. 8 Box-and-whisker plots of empty ratio by departure time

4 Nonhierarchical Cluster Analysis

4.1 Procedure

This study used nonhierarchical cluster analysis (i.e., the k-means method) to classify the vehicles with loaded and unloaded trip patterns. This analysis aims to identify vehicle groups and schedules that can potentially improve the empty ratio. The squared Euclidean distance was used for the distance measurement.

We used the hourly travel times of occupied and empty trips as explanatory variables in the nonhierarchical cluster analysis. For each vehicle (N = 19,956), the travel times of occupied and empty trips were recorded for each of the 24 h of each of the 366 days. Each value ranges from 0 min (no trips recorded in that hour) to a maximum of 60 min. This yields 24 (hours) \(\times\) 366 (days) \(\times\) 2 (empty or occupied) = 17,568 variables per vehicle. The number of clusters was set to five.
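One way to construct such a 17,568-element vector for a single vehicle is sketched below. The trip layout `(label, start_min, end_min)`, the ordering of the occupied and empty blocks, and the rule that a trip spanning several hours is split across the corresponding hourly bins are all assumptions for illustration; the source states only that the variables are hourly travel times with a maximum of 60 min.

```python
HOURS, DAYS = 24, 366
N_FEATURES = HOURS * DAYS * 2  # 17,568: occupied block followed by empty block


def feature_vector(trips):
    """Build one vehicle's vector of hourly travel times in minutes.

    Each trip is (label, start_min, end_min), with times counted in minutes
    from 0:00 on the first day of the study period.
    """
    vec = [0.0] * N_FEATURES
    offset = {"occupied": 0, "empty": HOURS * DAYS}
    for label, start, end in trips:
        if label not in offset:
            continue  # undefined trips are not used
        end = min(end, HOURS * DAYS * 60)  # clip to the study period
        t = max(start, 0)
        while t < end:
            hour_index = int(t // 60)
            hour_end = (hour_index + 1) * 60
            step = min(end, hour_end) - t  # minutes spent in this hourly bin
            vec[offset[label] + hour_index] += step
            t += step
    return vec
```

Provided that the trips of a vehicle do not overlap in time, no element exceeds 60 min, which is consistent with the stated maximum.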

We also tried minute-based binary (“run or not”) explanatory variables, but the results were poor; most samples were assigned to a single cluster. Thus, we employed the 60-min analysis unit with hourly travel time as the explanatory variables.
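For reference, the k-means method with the squared Euclidean distance can be sketched in plain Python as below. This is a generic textbook version (Lloyd's algorithm with randomly sampled initial centers), not the software or settings used in this study; the function names, the `seed` and `n_iter` parameters, and the initialization scheme are assumptions for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

In the setting of this study, `points` would hold one 17,568-element vector per vehicle and `k` would be five. Because the outcome of k-means depends on the initial centers, repeated runs with different seeds are a common precaution.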

4.2 Results

Figure 9 compares the cumulative distribution of the OD travel time for each cluster. The dashed vertical line indicates the average travel time. The travel time for empty trips tends to be longer than that for occupied trips in all clusters. For empty and occupied trips, the vehicles in cluster 2 had the longest average OD travel time, whereas cluster 5 had the shortest average OD travel time. The average travel time differences between empty and occupied trips tended to be smaller in clusters 1 and 3 than those observed in the other clusters.

Fig. 9 OD travel time distribution of each cluster for empty (upper) and occupied (lower) trips

Figure 10 shows the average OD travel time by time of day. The vehicles in cluster 1 tend to have longer OD travel times during the evening and night. The average OD travel time of vehicles in cluster 2 tends to be longer during the daytime for empty trips. Cluster 3 vehicles tend to have longer OD travel times during the nighttime (19:00–22:00) for empty trips and in the early morning (3:00–5:00) for occupied trips. Clusters 4 and 5 vehicles tend to have shorter OD travel times in all periods.

Fig. 10 Average OD travel time distribution per departure time of each cluster for empty (upper) and occupied (lower) trips

Table 3 lists the number of vehicles, total trips, average trips per vehicle, average trips per day, and average trips per trip chain by cluster. Cluster 5 had the largest number of vehicles. Clusters 1 and 3 tend to have more trips per day, whereas clusters 2 and 5 have fewer trips per day.

Table 3 Summary of data per cluster type

The average number of trips per trip chain was highest in cluster 1 and lowest in cluster 2. Clusters 1 and 3 have a higher percentage of occupied trips, whereas cluster 5 tends to have a higher percentage of undefined trips.

Table 4 shows the percentage of empty trips by cluster for trip chains with two to five trips. Clusters 4 and 5 tend to have a high percentage of empty trips, regardless of the number of trips in the chain and the trip order. The percentage of empty trips tends to be higher for the last trip than for the first one, suggesting that many vehicles carry no cargo on the last trip and are returning to their bases.

Table 4 Percentage of empty trips per trip chain (%)

Figure 11 shows box-and-whisker plots of the daily percentages of the total trips by day of the week and month for each cluster. The total number of trips for the vehicles in cluster 2 tends to increase significantly in March. Because March is the moving season, this cluster may include commercial vehicles associated with moving companies. The vehicles in cluster 2 also have a larger interquartile range for each day of the week, indicating larger differences by day of the week.

Fig. 11 Box-and-whisker plots of the percentage of daily trips by day of the week (upper) and month (lower)

Figure 12 shows the percentage of the total trips by departure time. Compared to the other clusters, the total number of trips for cluster 1 tends to increase more during nighttime than daytime. Figure 13 shows that this trend is more evident for occupied trips. This suggests that, unlike the other clusters, the vehicles in cluster 1 operate more efficiently at night than during the daytime.

Fig. 12 Departure time distribution of each cluster

Fig. 13 Departure time distribution of empty (upper) and occupied (lower) trips of each cluster

Figure 14 shows box-and-whisker plots of the empty ratio for each cluster by month and day of the week. The empty ratio is low for vehicles in clusters 1 and 3 and high for those in clusters 4 and 5. In cluster 2, the interquartile range is large for the day of the week and month, indicating that the empty ratio fluctuates significantly.

Fig. 14 Box-and-whisker plots of empty ratio of each cluster by day of the week (upper) and month (lower)

Figure 15 shows the trends in the empty ratio by departure period. The vehicles in cluster 5 tend to have a higher empty ratio than those in the other clusters. The vehicles in cluster 1 tend to have a lower empty ratio at night. The vehicles in cluster 3 have a lower empty ratio than those in the other clusters between 3:00 and 16:00.

Fig. 15 Empty ratio distribution per departure time of each cluster

4.3 Discussion

We summarize the features of the vehicles in each cluster and discuss whether they can improve the empty ratio. The vehicles in cluster 1 tended to make more trips at night (Fig. 12). Moreover, the travel times of occupied trips at night tended to be longer (Fig. 10). The empty ratio at night is low (Fig. 15). These results indicate that the night trips in cluster 1 are already efficient; accordingly, the empty ratio in this cluster can be improved during the daytime.

The vehicles in cluster 2 tended to make trips with longer travel times; this is especially true for empty trips (Fig. 9). The empty ratio in this cluster fluctuates widely (Fig. 14); therefore, we must carefully examine the movement of individual vehicles in cluster 2 to identify whether they can improve the empty ratio.

The vehicles in cluster 3 tended to have longer travel times from 19:00 to 23:00 for empty trips and from 3:00 to 6:00 for occupied trips (Fig. 10). The empty ratio is higher at night (Fig. 15); accordingly, there is greater potential for improving the empty ratio during nighttime.

The vehicles in cluster 4 had relatively short travel times (Fig. 9) and tended to have a high percentage of empty trips during the last trip in the chain (Table 4). The overall empty ratio was relatively high (Fig. 15, Table 4); thus, the potential for improving the empty ratio was also high.

The vehicles in cluster 5 had the shortest travel times (Fig. 9) with the highest empty ratio (Fig. 15, Table 4). Thus, the vehicles in clusters 4 and 5 have a higher potential for improving the empty ratio. Joint deliveries using the vehicles in clusters 4 and 5 are worth investigating.

5 Conclusions

In this study, we analyzed the empty vehicle behavior of commercial vehicles in Kyushu, Japan using probe vehicle data. The findings are summarized as follows:

  1. The empty ratio tended to increase slightly in July and fluctuated significantly on Sundays and holidays. The ratio tended to be higher during the daytime.

  2. The travel distance and duration of empty trips tended to be longer than those of occupied trips.

  3. Within a trip chain, the percentage of empty trips tended to be higher for the last trip than for the first.

  4. A nonhierarchical cluster analysis can classify vehicles according to their loaded and unloaded trip patterns. The analysis identified one vehicle group that can potentially improve the empty ratio during the daytime and another that can do so at night.

These findings can be potentially used to reduce the number of empty trucks and improve the efficiency of logistics systems. For example, a joint delivery system operated by multiple logistics companies will effectively reduce the number of empty trucks; such a system can benefit from the different time-of-day empty ratio pattern results demonstrated in the cluster analysis. In the design, the location information, which is neglected in the current study, will be important.

In addition, observing the data over a longer period of time will enable us to analyze the impact of certain policy changes. For example, the impact of restricting drivers’ working hours on vehicle operations through workstyle reform laws in Japan can be explored. Such a study will be useful for evaluating similar policy reforms.

One limitation of the currently used dataset is that company, driver, and goods information are unavailable for privacy reasons. Logistics operations differ by company or sector, but the differences in the empty truck ratio cannot be examined. Another limitation is that the current OD data do not record expressway usage. The effect of expressway prices on long travel-time trips is vital and should be examined in future studies.