Truck operations

Trucks typically bring containers to the terminal and exchange them for other containers. Full containers may arrive from local exporters, empties may arrive from local consumers, and both fulls and empties may arrive from other, nearby terminals to be transshipped. Similarly, full containers may leave the terminal to go to local importers, empties may go to local exporters to be loaded, and both fulls and empties may go to other, nearby terminals to be transshipped.

Inside the terminal, each truck proceeds through a series of services that depend on the type of container dropped off, and another series of services depending on the type of container picked up. Exactly what these services are depends to some extent on the terminal. For example, in many terminals, a truck will first pass through a gate and then proceed to an area where terminal personnel check the container status, verify the seal, and give authorization for the truck to proceed. There may be a separate area where the truck will park while the driver files documentation, such as customs clearances, and pays terminal charges. Some areas may be devoted to receiving or loading a particular class of container, such as “reefers” (refrigerated containers), which require access to electrical power. Similarly, there may be separate areas for transshipment containers, or import containers that have been cleared by customs.

In general, each service is associated with one or more parking areas within the terminal that we call “yards.” For now, assume that each yard provides at most one service (this need not hold strictly, as discussed below). Furthermore, we assume that this assignment of services to yards does not change during the study period.

Let the sequence of services experienced by a truck be called a path. All told, there may be 10–40 distinct paths that a truck might typically follow. But because some services may be provided by more than one yard, two trucks following the same sequence of services may in fact visit different yards and so different areas inside the terminal.

The present paper shows how, if a truck is equipped with GPS, we can programmatically track its movement, identify the yards that were visited, infer the types of containers dropped off or picked up, and measure the time spent in each yard as well as the time spent driving.

At first glance, this may seem easy, but the inaccuracy of GPS makes interpretation a challenge. GPS-enabled smartphones have a mean accuracy of 4.9 m under ideal conditions (Navigation National Coordination Office for Space-Based Positioning and Timing 2017; William J. Hughes Technical Center NSTB/WAAS T&E Team 2014; van Diggelen and Enge 2015). But in a terminal, conditions are highly unfavorable. Containers, which may be stacked eight high (21 m), might reflect, attenuate, or block GPS signals, so that the GPS may report a location 20 m from the actual position. This is a real problem because different yards of the same terminal may be separated by only the width of a two-lane road, so that it is often impossible to know whether a truck has stopped in one yard, or the other, or in the road in between.

Another challenge is that GPS in trucks is typically configured for safety monitoring, and not for precise location. In other words, the management simply wants to know that the truck did not speed, make an unauthorized stop, or deviate en route to the terminal. Consequently, GPS in trucks typically reports at a relatively low frequency (once every 10 or 20 s is typical) to reduce communication costs, and events of shorter duration may be unobservable.

This lack of spatial and temporal resolution makes the path of a truck, as seen by GPS, somewhat ambiguous. This paper shows how, in most cases, the inherent ambiguity in a path can be resolved. That is, one can programmatically identify the likely sequence of services followed by each truck and estimate the time spent in each service as well as the time spent driving. Furthermore, having identified the path allows inference of the container type brought into the terminal or taken from it. Thus, we can answer such questions as how long does it take to pass through customs with a refrigerated import container? Or how long does it take to drop off an empty dry container?

Measuring terminal performance

Most of the literature on port performance is devoted to computing an aggregate measure of performance based on the (economics) concept of technical efficiency. Typical approaches are data envelopment analysis or stochastic frontier analysis, which measure how efficiently the port transforms certain inputs into desired outputs. A comprehensive review may be found in Liu (2010) and recent applications of data envelopment analysis may be found, among others, in Lu et al. (2015), Schøyen and Odeck (2013), Wu and Goh (2010), and Zahran et al. (2015).

Our paper takes a more granular approach and attempts to measure performance of just a portion of what goes on in a container terminal, albeit an essential portion: the exchange of containers with the hinterland and specifically those arriving or departing by truck. The importance of this activity is illustrated by the fact that trucks move on average 11,000 containers each day in such ports as the Port of Los Angeles (Port of Los Angeles 2016).

A stream of research has examined the details of terminal operations with the goal of improving them by intelligent control of key equipment, such as gantry cranes or automated guided vehicles (see, for example, Daganzo 1989; Dowd and Leschine 1990; Mennis et al. 2008; and citations in the latter). Our work complements this by demonstrating a way to measure actual performance in the delivery and retrieval of containers by truck.

Up to now, the standard way of measuring performance of this activity is by reporting “truck turn-times”; that is, the difference in times at which each truck enters and leaves the terminal (for example, Lubulwa et al. 2011). However, as a statistic, the truck turn-time concept has certain deficiencies. First, it aggregates all service times with all queueing times and all travel times. Furthermore, Lam et al. (2007) note that the metric may not be reliable because “many terminal operators’ daily turn time reports deduct break time and trouble time to arrive at a net turn time.”

Truck turn-time leaves unexamined the performance of individual services, which depend on the type of container being dropped off/picked up. Typically, an empty dry container would be dropped off at a different location than would, say, a full refrigerated import container, which might also require customs inspection. The time of each service depends on many factors, including the location and layout of the yard, the terminal resources deployed to service the containers, available space, heights to which the containers are stacked, the number of containers in the yard, and the number of trucks visiting.

One way to get a closer look would be to require each service at the terminal to record the start and end times of service provided to each truck. However, this requires extensive record-keeping in a challenging work environment, which may explain why this information is only infrequently available (The Tioga Group, Inc. 2010). Even then, the result does not include the time the truck spends queuing or traveling within the terminal.

The closest to our work is that of Lam et al. (2007), who estimated service times by means of cameras mounted at five key locations of the terminal, with students stationed at each camera. At the end of each day, the authors matched time-stamped photos of trucks to estimate service times. This is clever, but poses challenges of where to locate and how to maintain the cameras, as well as how to collect the digital photographs.

Our approach requires no record-keeping by the terminal, yet provides more detail than would typically be available from an EDI system. The idea is to use each truck as an unobtrusive “probe” that reveals, by GPS records, the time spent in each service. This has the advantage of being entirely remote and requiring no special equipment beyond that already on most trucks. Furthermore, it is of no cost to the terminal. Indeed, our method can be conducted even without awareness of the terminal.

Our methodology is adapted from that of map-matching, which is the process of aligning a sequence of GPS readings with the road network on a digital map (Alt et al. 2003; Newson and Krumm 2009). In a similar manner, we align the GPS readings with each of the likely sequences of service in a terminal to identify the best match.

While the underlying technology—dynamic programming—is the same, models differ in some fundamental ways. In map-matching, GPS readings are matched to road segments while we are matching a sequence of stops inferred from GPS readings to polygons (terminal yards). In map-matching, the plausible candidates for next road segment are determined by the topology of the road network, while in our case, the candidates for next yard are determined by the type of container (which determines the sequence of services).

Our approach adds an additional twist: After finding the most plausible sequence of services from the GPS trail of a truck, we invoke machine learning to decide whether the result is believable.

Finding the most likely sequence of services

Step 1: find the stops

A GPS typically reports time, latitude, and longitude. It may also report speed and heading, but these can be computed from the other three data elements.
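As a minimal sketch of how speed could be derived when the device does not report it, the following computes the great-circle (haversine) distance between consecutive readings and divides by the elapsed time. The function names, the tuple layout `(t_seconds, lat, lon)`, and the choice to give the first reading the same speed as the second are assumptions made for illustration only.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, a common approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speeds_kmh(readings):
    """readings: list of (t_seconds, lat, lon) tuples sorted by time.

    Returns one speed (km/h) per reading. Each speed is measured over the
    interval ending at that reading; the first reading reuses the speed of
    the second (an assumption, since no earlier reading exists).
    """
    if len(readings) < 2:
        return [0.0] * len(readings)
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(readings, readings[1:]):
        dt = t1 - t0
        dist = haversine_m(la0, lo0, la1, lo1)
        speeds.append(3.6 * dist / dt if dt > 0 else 0.0)
    return [speeds[0]] + speeds
```

Heading could be obtained in a similar way from the bearing between consecutive readings, but only speed is needed to identify stops.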

  • Scan the sequence of GPS readings to identify stops, which we take to be any maximal sub-sequence of readings for which the speed is sufficiently small (because of inaccuracy, a GPS may report a speed greater than zero even though stopped; to account for this, we consider any speed less than 5 km/h to be, for all practical purposes, a stop).

  • Convert each such maximal sub-sequence into a single stop, at a location that is the average latitude and longitude of the constituent GPS readings. And if the GPS recorded every Δ seconds, assume the stop began Δ/2 s before the first reading and concluded at Δ/2 s after the last reading.
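The two operations above can be sketched as follows. This is an illustrative sketch rather than the code used in the study: the dictionary keys (`t`, `lat`, `lon`, `speed_kmh`) and the function name `find_stops` are assumptions, and readings are assumed to be sorted by time and to arrive at a fixed interval of `delta_s` seconds.

```python
from itertools import groupby

SPEED_THRESHOLD_KMH = 5.0  # speeds below this are treated as "stopped"


def find_stops(readings, delta_s):
    """Collapse each maximal run of slow readings into a single stop.

    readings: list of dicts with keys 't' (seconds), 'lat', 'lon', 'speed_kmh'.
    delta_s:  reporting interval of the GPS device, in seconds.
    """
    stops = []
    # groupby yields maximal runs of consecutive readings with the same key
    for is_slow, group in groupby(readings, key=lambda r: r["speed_kmh"] < SPEED_THRESHOLD_KMH):
        if not is_slow:
            continue
        run = list(group)
        stops.append({
            # the stop is located at the average position of its readings
            "lat": sum(r["lat"] for r in run) / len(run),
            "lon": sum(r["lon"] for r in run) / len(run),
            # extend half an interval before the first and after the last reading
            "start": run[0]["t"] - delta_s / 2,
            "end": run[-1]["t"] + delta_s / 2,
        })
    return stops
```

With this convention a stop observed in a single reading is assigned a duration of Δ seconds, and a stop observed in k consecutive readings is assigned kΔ seconds.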

Step 2: find the path that best matches the sequence of stops

We require a description of the terminal, such as that shown in Fig. 1, in which each yard is circumscribed by a geofence. In addition, we require a list of possible sequences of services that might be required by a truck. Typically, this can be enumerated by listing each type of container that might be dropped off at the terminal and each type that might be picked up and the services required by each. It is also helpful, but not necessary, to know a lower bound on the time required for each service, so we can more easily recognize the corresponding stop.

Fig. 1

Layout of a container terminal with yards numbered by the service they provide. Note: Each yard is devoted to a service, such as processing documentation or providing power to refrigerated containers, and a service may be associated with multiple yards. Red dots are a series of stops by a truck. In some places, the yards are so close that it is hard to tell from the GPS in which yard, if any, the truck actually stopped (for example, the fifth stop might be within one of the yards labeled 1, within yard 2, or in neither).

To find the most plausible path, we first construct, for each candidate path, the most plausible association of the sequence of GPS stops \(i = 1, \ldots , m\) to the sequence of services \(j = 1, \ldots , n\).

Assume we have associated stops \(1, \ldots , i - 1\) with services \(1, \ldots , j - 1\), and we are now considering GPS stop i and service j. There are four possible interpretations of the relation of stop i to service j:

  1. GPS stop i represents arrival to a yard providing service j.

  2. GPS stop i represents an additional stop in a yard providing service j. (This might happen if, for example, the truck were creeping forward in a queue and so generating a sequence of stops.)

  3. GPS stop i should be ignored because it does not plausibly correspond to any yard providing service j.

  4. Service j has been visited but the GPS failed to record the stop and so it appeared to skip it.

We model the plausibility of each interpretation as a cost, where higher cost means less plausible. These costs follow some basic rules that reflect what is known about GPS accuracy and typical times required for each service:

In general, the farther a stop is from a particular yard, the less likely the stop was for a service provided by that yard, and so the higher the cost. Similarly, the less the duration of a stop resembles that expected for a particular service, the less likely the stop was for that service, and so the higher the cost.

There are many reasonable ways to model the costs, consistent with the guidance above. We chose to model the cost of associating stop i with service j as

$${\text{match}}\left( {i,j} \right) = \left( {\frac{{{\text{meters from}}\; i\;{\text{to}}\;j}}{20}} \right)^{2},$$

where we define the distance from GPS stop i to service j to be the shortest distance to the perimeter of any yard providing service j. If the GPS reading is in the interior of such a yard, the distance (and therefore the cost) is defined to be negative. This cost function assesses small penalties for small distances, but increases quickly as distances exceed 20 m, a distance that we believe represents a typical outer limit to GPS error. This severely penalizes implausible matches and strongly rewards matches when the GPS stop lies well within the yard. It is less assertive when the GPS stop is close to the perimeter.
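A minimal sketch of this cost is given below. It assumes that stop and yard coordinates have already been projected to a planar system in meters, and that each yard is a simple polygon given as a list of `(x, y)` vertices. Because squaring discards the sign of the distance, the sketch also assumes that the sign is carried over to the cost, so that interior stops receive a negative cost as described above; that reading, the helper names, and the rule of taking the smallest signed distance over all yards of a service are assumptions for illustration.

```python
import math

GPS_ERROR_SCALE_M = 20.0  # the 20 m scale used in the match cost


def _dist_to_segment(p, a, b):
    """Distance from point p to the line segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def _inside(p, polygon):
    """Ray-casting test: True if p lies in the interior of the polygon."""
    px, py = p
    inside = False
    n = len(polygon)
    for k in range(n):
        (ax, ay), (bx, by) = polygon[k], polygon[(k + 1) % n]
        if (ay > py) != (by > py):  # edge straddles the horizontal ray
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if px < x_cross:
                inside = not inside
    return inside


def signed_distance(p, polygon):
    """Distance to the perimeter, negative when p is inside the polygon."""
    n = len(polygon)
    d = min(_dist_to_segment(p, polygon[k], polygon[(k + 1) % n]) for k in range(n))
    return -d if _inside(p, polygon) else d


def match_cost(stop_xy, yards):
    """Cost of associating a stop with a service offered by the given yards."""
    d = min(signed_distance(stop_xy, yard) for yard in yards)
    # Assumption: keep the sign of d so that interior stops are rewarded.
    return math.copysign((d / GPS_ERROR_SCALE_M) ** 2, d)
```

For a square yard 100 m on a side, a stop 20 m outside the perimeter costs 1.0, a stop 10 m outside costs 0.25, and a stop at the center (50 m inside) costs -6.25 under this sign convention.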

We modeled the cost of failing to find a stop at service j, and so appearing to skip that service, as a linear function that increases with the expected duration of that service. This reflects the fact that, with GPS readings every 10 s, it would be easy to miss a service that required only 10 s or less; but if the service is expected to require 10 min, then one would expect GPS readings to show a stop comparably long. It is unlikely, and therefore of high cost, to assume the GPS entirely missed such a stop.

Finally, we modeled the cost of ignoring GPS stop i as the product of a location cost and a duration cost. The cost of ignoring the location varies inversely with distance to service j and increases directly with the duration of the stop. This assumes that a long stop close to or inside a yard providing service j is more likely to be related to that service. Conversely, a short stop far from service j is unlikely to be for that service.
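The text gives the shape of these two costs but not their coefficients. The sketch below shows one possible parameterization that respects the stated shapes; every constant (`base`, `per_second`, `scale_m`) and the specific form of the location factor are hypothetical values chosen for illustration and would need to be calibrated.

```python
def skip_cost(expected_duration_s, base=0.5, per_second=0.01):
    """Cost of assuming a service was visited but left no GPS stop.

    Linear and increasing in the expected duration of the service, so that
    missing a long service is less plausible than missing a short one.
    The coefficients are hypothetical.
    """
    return base + per_second * expected_duration_s


def ignore_cost(distance_m, stop_duration_s, scale_m=20.0, per_second=0.01):
    """Cost of treating a GPS stop as unrelated to a service.

    Product of a location factor, which shrinks as the stop gets farther
    from the service, and a duration factor, which grows with the length
    of the stop. Both forms and all constants are hypothetical.
    """
    # Equals 1 at or inside the perimeter and decays toward 0 far away
    location = scale_m / (scale_m + max(distance_m, 0.0))
    duration = per_second * stop_duration_s
    return location * duration
```

Under these assumed forms, a 10 min stop at the perimeter of a yard is expensive to ignore, while a 10 s stop 200 m away can be ignored at almost no cost.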

We can now write the least cost (most plausible) association of GPS stops to services for a particular path (sequence of services) as a dynamic programming recursion. Let \(C(i,j)\) be the cost of the most plausible association of stops 1 through i to services 1 through j. Then

$$C\left( {i,j} \right) = \hbox{min} \left\{ {\begin{array}{*{20}l} {{\text{match}}\left( {i,j} \right) + C(i - 1,\;j - 1)} \hfill \\ {{\text{match}}\left( {i,\;j} \right) + C(i - 1,\;j)} \hfill \\ {{\text{skip}}\left( j \right) + C(i,\;j - 1)} \hfill \\ {{\text{ignore}}\left( i \right) + C(i - 1,\;j)} \hfill \\ \end{array} } \right.$$

and the minimum cost association of stops to services is that which corresponds to \(C\left( {m,n} \right)\).
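The recursion can be filled in as a table, as in the following sketch. The boundary conditions are not stated in the text, so the sketch assumes \(C(0,0) = 0\), that stops preceding any service are ignored, and that services with no stops at all are skipped. The function names and the convention of passing the three cost models as callables with 1-based indices are likewise assumptions.

```python
def align(m, n, match, skip, ignore):
    """Return C(m, n), the least cost of associating m stops with n services.

    match(i, j), skip(j) and ignore(i) are caller-supplied cost functions
    using 1-based indices, mirroring the recursion in the text.
    """
    inf = float("inf")
    C = [[inf] * (n + 1) for _ in range(m + 1)]
    C[0][0] = 0.0                                # assumed boundary condition
    for j in range(1, n + 1):
        C[0][j] = C[0][j - 1] + skip(j)          # no stops: every service skipped
    for i in range(1, m + 1):
        C[i][0] = C[i - 1][0] + ignore(i)        # no services: every stop ignored
        for j in range(1, n + 1):
            C[i][j] = min(
                match(i, j) + C[i - 1][j - 1],   # stop i is the arrival at service j
                match(i, j) + C[i - 1][j],       # stop i is an additional stop at service j
                skip(j) + C[i][j - 1],           # service j left no recorded stop
                ignore(i) + C[i - 1][j],         # stop i is spurious
            )
    return C[m][n]


def best_path(paths, cost_of_path):
    """Return the candidate path with the smallest alignment cost."""
    return min(paths, key=cost_of_path)
```

Each candidate path requires filling a table of size (m + 1) by (n + 1), so scoring a few dozen candidate paths for a trip with a handful of stops is inexpensive.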

We use this recursion to evaluate the plausibility of each of the standard paths: each path receives a score giving its total cost, and that path with minimum total cost is the best interpretation of the trip as recorded by GPS. For example, among the sequences of expected services, the stops of Fig. 1 best match the sequence 6-8-4-1-0-9-2-6, which corresponds to delivery of an empty container, followed by pick-up of an import container with customs clearance.

Step 3: remove implausible matches

Our GPS dataset included some trips by trucks dispatched for purposes other than swapping containers. Such trips did not follow one of the expected paths; indeed, they may not have visited the terminal at all. Nevertheless, some path will have been identified as the best match, even though it is a very poor match. Such trips should be identified and purged from the study of service times. Unless the trucking company distinguishes between types of trips, purging must be done programmatically. The costs of the best match provide clues to plausibility, but it is not always obvious how to interpret them. For example, a small total cost may be due to a good match or to a bad match to a path with few services.

The problem of separating the implausible best matches from the plausible ones may be viewed as a problem of statistical binary classification, one of the classic problems of machine learning. We chose to recognize implausible matches by means of a classification tree, for which we used the ID3 algorithm (Quinlan 1986). To build the classification tree, we plotted the GPS trails of hundreds of trips and compared the maps with the computed best matches. Each best match was labeled as plausible or implausible (the best match was judged implausible for about 20% of the trips). The tree was then constructed on the following statistics, the first seven of which are the predictor variables generated by the matching, and the last of which was the judgement of an expert human.

  • Number of GPS stops within the terminal.

  • Number of services required by the path.

  • Number of services unmatched.

  • Fraction of services unmatched.

  • Total cost of matching GPS stops with service.

  • Total cost of ignoring some stops as spurious.

  • Total cost of seeming to skip a service required by this path.

  • Was the best match plausible or implausible?
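ID3 grows a tree by repeatedly choosing the split with the greatest information gain, that is, the greatest reduction in the entropy of the labels. The sketch below computes that quantity for a binary split of one numeric predictor at a threshold. Splitting numeric predictors at a threshold is an assumption made here for illustration (the text does not say how the continuous statistics were handled), and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on values <= threshold vs. > threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing gains nothing
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))
```

A split that separates the plausible trips perfectly from the implausible ones, in a sample with equal numbers of each, has a gain of 1 bit, which is the largest possible for two classes.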

The resultant classification tree revealed a simple test that correctly identified more than half of the implausible best matches, with very few false positives. This test was: If the total cost of ignoring GPS stops is sufficiently large and the total cost of matches is sufficiently small, then the best match is implausible. This makes sense because spurious trips may have entered and left through the main gate of the terminal, which matched some GPS stops, but the other GPS stops of the trip might not have visited any service areas and so were ignored in the best match. In other words, this test filtered out many of the trips that were for reasons other than to swap containers.
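Expressed as code, the test is a conjunction of two threshold comparisons. The thresholds are left as parameters in the sketch below because their values are not reported here; the function name and the use of inclusive comparisons are assumptions.

```python
def is_implausible(total_ignore_cost, total_match_cost,
                   ignore_threshold, match_threshold):
    """Flag a best match as implausible.

    A trip is flagged when much of it had to be ignored (large total ignore
    cost) while the part that was matched fit well (small total match cost).
    Both thresholds are hypothetical and must be supplied by the caller.
    """
    return (total_ignore_cost >= ignore_threshold
            and total_match_cost <= match_threshold)
```

A trip that stops only at the gate and then wanders outside the service areas would have a small match cost and a large ignore cost, and so would be flagged.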

For the final study, we removed all trips for which the best match was labeled implausible. The remaining trips were those for which we had confidence that the best-matched paths correctly described the sequence of services for each trip. From these, we derived the distribution of times spent in each service, as well as times spent queueing and times spent driving.


Many commercial trucks are now equipped with GPS devices so that their locations can be monitored at a coarse level, such as verifying that the truck went to the correct terminal. We can take advantage of this to measure the performance of truck operations within a container terminal at a finer level of detail than previously. In addition to measuring time in each service, this analysis can also identify the likely container type—dry or refrigerated, import or export or transshipment, etc.—so in many cases we can report processing times for each combination of service and container type. Furthermore, this measurement is unobtrusive: It requires no interruption of operations and no extra record-keeping by the terminal; the only data required are access to GPS records and identification on a map of the various yards within the terminal. The algorithms described above compensate for inaccuracy of GPS by using a Hidden Markov Model to match the GPS trail to plausible patterns of movement within the terminal.

There are some circumstances in which our approach will not work well. To reduce communications costs, the GPS may record only once a minute or even less often, so that some services of shorter duration, such as gate entry/exit when there is no queue of trucks, might be undetectable in the GPS record.

Another possible problem is that some terminals may dynamically assign services to yards when short of space. For example, if the usual yard is full, a truck may park elsewhere while the driver submits documents to customs. If this happens with too many services, then services can no longer be reliably distinguished by location and so the interpretations of paths become unreliable.

There are several ways to improve the accuracy of our method. One way would be to filter the GPS readings to remove any trips that were not to swap containers. If this could be done, there would be no need to use machine learning.

Another way to improve the accuracy of our method would be to increase the frequency at which the GPS device records its location. Once per second would provide sufficient detail for our purposes, but this may increase communications costs if readings are transmitted in real time. Alternatively, extra readings could be stored locally on the GPS device and downloaded later, but with the cost of extra handling. It may also be possible to use higher quality GPS devices, but this would increase equipment costs.