Abstract
With the development of mobile positioning technology, a large amount of mobile trajectory data has been generated. Therefore, to store, process and mine trajectory data in a better way, trajectory data simplification is imperative. Current trajectory data simplification methods are either based on spatiotemporal features or semantic features, such as road network structure, but they do not consider semantic features of a trajectory stop. To overcome this limitation, this study presents a trajectory segmentation simplification method based on stop features. The proposed method first extracts the stop feature of a trajectory, then divides the trajectory into move segments and stop segments based on the stop features, and finally simplifies the obtained segments. The proposed method is verified by experiments on personal trajectory data and taxi trajectory data. Compared with the classic spatiotemporal simplification method, the proposed method has higher spatiotemporal and semantic accuracy under different simplification scales. The proposed method is especially suitable for trajectory data with more stop features.
Introduction
With the development of mobile positioning technologies, such as the Global Positioning System (GPS), Global System for Mobile Communications (GSM), and Radio Frequency Identification (RFID), a large number of mobile positioning devices with high positioning accuracy and low price have become available, including mobile phones, GPS collectors, and personal digital assistants (PDAs). As a result, a large amount of movement trajectory data has been generated, which brings difficulties in data storage and processing. For instance, in the T-Drive data set, there are 10,357 taxis, the sampling interval is 5 s, and each record occupies 40 b (Yuan et al. 2010), so the trajectory data of all taxis in Beijing can reach 4 GB per day. Storing and indexing such massive data incurs high economic costs and low time efficiency, and it is challenging to process massive trajectory data, mine hidden features, and extract spatiotemporal patterns in the data. Therefore, it is necessary to compress and simplify trajectory data.
Most trajectory data simplification methods are offline or online simplification methods that use compression ratio and geometric feature preservation, including spatial features, spatiotemporal features, and velocity features, as a compression target.
Because trajectory data are commonly collected on the road network, a trajectory simplification method constrained by the road network and a trajectory data simplification method after map matching have been proposed (Kellaris et al. 2009, 2013; Popa et al. 2015). In this way, the trajectory reduction result is more in line with the real situation.
The common disadvantage of these two types of methods is that when the compression ratio is high, data simplification results may lose semantic features of the original data. To overcome this problem, a semantic trajectory simplification method has been proposed (Schmid et al. 2009; Richter and Schmid 2012). This method first extracts stops of a trajectory in geographical context and then abstractly expresses mobile trajectory to achieve the purpose of compression. Although this method has a high compression ratio, it reconstructs the trajectory through stops, so all the movement information between stops in the trajectory is lost.
To address the abovementioned limitations, this study proposes a semantics-based trajectory segmentation simplification method (STSS). In this method, the stop features of a trajectory are extracted first, then the trajectory is divided into stop segments and move segments based on the stop features, and finally the stop segments and move segments are each simplified by their own method. The proposed method retains more spatiotemporal and semantic information of the data while achieving a high compression ratio.
The rest of the article is organized as follows. Section 2 reviews the related work on trajectory data simplification. Section 3 describes the proposed semantics-based trajectory segmentation simplification method. Section 4 verifies the proposed method by experimental tests and compares the simplification results at different compression ratios. Finally, Sect. 5 draws conclusions about the applicability of the method.
Related Works
Trajectory Simplification Based on Spatiotemporal Features
The trajectory simplification method based on spatiotemporal features improves the general curve simplification method by constructing homomorphic spatial distance (Meratnia and de By 2003, 2004), spatiotemporal three-dimensional space (Trajcevski et al. 2006; Cao et al. 2006), or velocity features to improve the accuracy of simplification (Gudmundsson et al. 2009). These methods are aimed at maintaining high geometric accuracy and controlling the trajectory error (Muckell et al. 2014). They determine whether to retain or delete trajectory points according to the preset distance (position), angle (direction), and velocity (time) thresholds. They can be roughly divided into offline trajectory simplification methods and online trajectory simplification methods (Lee and Krumm 2011).
The main purpose of the offline trajectory simplification method is to compress trajectory data. The main idea is to retain more spatiotemporal information of the data while reducing the amount of trajectory data. Meratnia and de By (2004) introduced the classic line feature simplification method, the Douglas-Peucker (DP) method, into trajectory data compression for the first time. Their method improves the DP method by constructing the homomorphic space distance, yielding a DP method with homomorphic distance named the top-down time-ratio (TDDR) method. Since then, a variety of methods have been developed on the basis of the TDDR method and applied to various types of trajectory data simplification tasks (Zhao and Shi 2018).
Because of the dynamic and real-time characteristics of trajectory data, online trajectory simplification has become the focus of current trajectory compression research. The simplification method based on deduced positioning (Trajcevski et al. 2006; Long et al. 2014) and the simplification method based on region filtering (Potamias et al. 2006; Gudmundsson et al. 2009) have been proposed. Both methods are local optimization methods based on the trajectory data stream. Their main advantage is high efficiency, but their disadvantage is that the simplification accuracy cannot be guaranteed (Muckell et al. 2014). An online trajectory compression method with controllable accuracy and compression ratio was proposed by Muckell et al. (2011). In this method, a queue is formed from the current series of trajectory points, and the points with the minimum feature value are gradually deleted until the error threshold is exceeded or the compression ratio is reached. In addition, a new online trajectory simplification algorithm based on a directed acyclic graph (OLTS), which is an approximately optimal compression algorithm, was proposed for online services (Wu et al. 2017).
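As a rough illustration of the top-down time-ratio idea, the sketch below runs a Douglas-Peucker style recursion in which the distance of a point is measured to the position the object would occupy at the same timestamp if it moved uniformly between the two anchor points. Treating the "homomorphic distance" as this time-proportional distance is an assumption made here for illustration, and the names `time_ratio_distance` and `td_tr` are hypothetical; points are assumed to be `(x, y, t)` tuples in planar coordinates.

```python
import math

def time_ratio_distance(p, a, b):
    """Distance from p to the time-proportional position on segment a-b."""
    (xa, ya, ta), (xb, yb, tb), (x, y, t) = a, b, p
    ratio = 0.0 if tb == ta else (t - ta) / (tb - ta)
    return math.hypot(x - (xa + ratio * (xb - xa)), y - (ya + ratio * (yb - ya)))

def td_tr(points, threshold):
    """Top-down simplification: split at the worst point until all errors fit."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    k, dmax = max(
        ((i, time_ratio_distance(points[i], a, b)) for i in range(1, len(points) - 1)),
        key=lambda item: item[1],
    )
    if dmax <= threshold:
        return [a, b]
    left = td_tr(points[: k + 1], threshold)
    right = td_tr(points[k:], threshold)
    return left[:-1] + right  # the split point appears only once

# Spatially collinear track with a 10 s pause at x = 10.
pts = [(0.0, 0.0, 0.0), (10.0, 0.0, 10.0), (10.0, 0.0, 20.0), (20.0, 0.0, 30.0)]
print(td_tr(pts, 1.0))  # the pause is kept
print(td_tr(pts, 5.0))  # only the end points remain
```

A purely spatial Douglas-Peucker pass would discard both interior points of this track because they lie on the straight line; the time-aware distance is what keeps the pause visible at small thresholds.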
Trajectory Simplification Based on Road Network
Considering that the trajectory is constrained by a road network, the road network space is used instead of a two-dimensional space, and the trajectory is simplified by structural characteristics of the road network (Li et al. 2008; Wu et al. 2015; Zhang et al. 2018), or it is simplified after map matching (Kellaris et al. 2013; Liu et al. 2014; Song et al. 2014). Among them, Li et al. (2008) extracted the characteristic trajectory points by combining the speed and direction characteristic information with the road network characteristic information through logical operations so as to simplify the trajectory data; Zhang et al. (2018) proposed an improved spatial–temporal trajectory compression method constrained by the structural features of a road network. The advantage of these methods is that the compressed trajectory can retain the characteristics of the road network, but the compressed data are not matched to the road network. Thus, a variety of trajectory compression strategies considering the road network constraints were proposed by Kellaris et al. (2009, 2013), including map matching and different combinations of map matching and compression. However, the map matching accuracy after compression is low. Therefore, most studies match the trajectory to the road network first and then compress it. The biggest disadvantage of this type of method is the low efficiency of the map matching algorithm.
Semantic Trajectory Simplification
The concept of semantic trajectory compression was introduced by Schmid et al. (2009), wherein a semantic representation of a trajectory that consists of semantic locations associated with the trajectory stop features replaces the original trajectory points. Although the compression ratio of this method is very high, it only retains the trajectory points expressing the stops and deletes all the trajectory points in the moving state, so its simplification accuracy cannot be guaranteed. An enhanced semantic trajectory compression method was proposed by Feng et al. (2013), wherein the semantics of a trajectory were represented by the speed change. Moreover, Yang et al. (2019) added semantic information to a trajectory through velocity clustering and then combined it with the trajectory space–time simplification method so as to effectively maintain the spatiotemporal characteristics and velocity characteristics of the trajectory. Their method hierarchically clusters all points on a single trajectory line based on velocity, so that each point becomes part of the clustering result, and then simplifies the trajectory according to the results of the hierarchical clustering. In addition, Andrienko and Andrienko (2010) proposed a spatial generalization and aggregation method of massive movement data for visualization. Their method can greatly compress data and extract features from data; however, their generalization and aggregation is not based on the trajectory line, but on the feature points after the transformation of all trajectory lines. In contrast, the method proposed in this paper takes a trajectory line as the unit of simplification. It extracts the stop features of the trajectory by clustering, and then the whole trajectory line is divided into stop segments and move segments for “divide and conquer” simplification.
Proposed Method
General Idea
As shown in Fig. 1, the general idea of the proposed method is as follows. Firstly, the multi-level stop features of the trajectory are extracted by improving the OPTICS method (Ankerst et al. 1999). Secondly, the trajectory is divided into stop segments and move segments according to the stop features. Thirdly, the stop segments and move segments of the trajectory are each simplified by their own method. Finally, the simplified stop segment trajectories and move segment trajectories are merged into the whole trajectory.
Stop Feature Extraction
Stop feature extraction is based on a clustering method for the trajectory point string, which represents an improvement of the OPTICS method. Similar to the OPTICS method, the trajectory point clustering algorithm includes two steps: cluster-ordering of the trajectory points and clustering structure generation from the cluster-ordering.
Cluster-Ordering of Trajectory Points
In the trajectory point string, the distance between two points is no longer a straightline distance between them but a sum of lengths of straightline segments composed of a series of points between the two points.
Definition 1: Distance between trajectory points
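Assume \(P\) is a set of trajectory points, and \({p}_{i}\) and \({p}_{j}\) are the trajectory points at positions \(i\) and \(j\) in \(P\) (\(i<j\)), respectively; then, the distance between trajectory points \({p}_{i}\) and \({p}_{j}\) can be calculated as follows:
\[d\left({p}_{i},{p}_{j}\right)=\sum_{k=i}^{j-1}\Vert {p}_{k}{p}_{k+1}\Vert \qquad (1)\]
where \(\Vert {p}_{k}{p}_{k+1}\Vert\) is the straight-line distance between the adjacent points \({p}_{k}\) and \({p}_{k+1}\).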
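Definition 1 can be illustrated with a few lines of code. This is a minimal sketch that assumes points are `(x, y)` tuples in a planar (projected) coordinate system; the function name `trajectory_distance` is hypothetical.

```python
import math

def trajectory_distance(points, i, j):
    """Path length along the trajectory between points[i] and points[j]."""
    lo, hi = min(i, j), max(i, j)
    return sum(math.dist(points[k], points[k + 1]) for k in range(lo, hi))

pts = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
print(trajectory_distance(pts, 0, 2))  # path length: 5 + 6
print(math.dist(pts[0], pts[2]))       # straight-line distance, which is shorter
```

Because the distance follows the recorded path, a slow wander inside a small area accumulates a long trajectory distance even though its end points are close together.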
In the OPTICS algorithm, cluster-ordering requires searching for the \(\varepsilon\)-neighborhood of the core point and calculating and sorting the reachability-distances of all points in the \(\varepsilon\)-neighborhood in every iteration. In this algorithm, because the trajectory point string is an ordered set, cluster-ordering of trajectory points does not require sorting the reachability-distances of all points in the \(\varepsilon\)-neighborhood of the core point but can directly use the original ordering of the trajectory points.
Moreover, according to the calculation formula of the distance between trajectory points, the shortest distance between a point and the other points in the set is one of the distances between this point and its two adjacent points. Therefore, the reachability-distance of a point needs to be calculated only once. In addition, the \(\varepsilon\)-neighborhood of the core point can be searched sequentially rather than by searching all trajectory points.
The clusterordering process of the trajectory points is shown in Algorithm 1.
The proposed method traverses the trajectory point set \(P\) and calculates the reachability-distance of each point in the set \(P\).
First, the algorithm computes the core-distance \(c\left({p}_{i}\right)\) of the current point \({p}_{i}\) by running the function \(CalculateCoreDistance\left({p}_{i},P,\varepsilon ,MinPts\right)\), which first searches the point set \(PN\) of the \(\varepsilon\)-neighborhood of \({p}_{i}\) and then compares the number of points in \(PN\) with \(MinPts\); if the number of points in \(PN\) is less than \(MinPts\), \(c\left({p}_{i}\right)\) is infinity; otherwise, \(c\left({p}_{i}\right)\) is the maximum distance between \({p}_{i}\) and the points in \(PN\).
Second, the reachability-distance of the next point \({p}_{j}\) (\(j=i+1\)) is calculated by running the function \(CalculateReachabilityDistance\left({p}_{i},c\left({p}_{i}\right)\right)\), which depends on the core-distance \(c\left({p}_{i}\right)\) of the current point and on whether \({p}_{j}\) is in the \(\varepsilon\)-neighborhood of \({p}_{i}\). If \(c\left({p}_{i}\right)\) is not equal to infinity and \({p}_{j}\) is in the \(\varepsilon\)-neighborhood of \({p}_{i}\), then \(r\left({p}_{j}\right)\) is the maximum of \(c\left({p}_{i}\right)\) and the linear distance between \({p}_{i}\) and \({p}_{j}\); otherwise, \(r\left({p}_{j}\right)\) is infinity.
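The two steps described above might be sketched as follows. This is not a transcription of Algorithm 1: it assumes planar `(x, y)` points, uses cumulative path length to evaluate the trajectory distance, and counts neighbors without the point itself when comparing against `MinPts`. The function name `cluster_ordering` is hypothetical.

```python
import math

def cluster_ordering(points, eps, min_pts):
    """Return (core, reach): core-distance and reachability-distance per point."""
    n = len(points)
    # cum[k] is the path length from points[0] to points[k].
    cum = [0.0] * n
    for k in range(1, n):
        cum[k] = cum[k - 1] + math.dist(points[k - 1], points[k])

    core = [math.inf] * n
    reach = [math.inf] * n
    for i in range(n):
        # Sequential search of the eps-neighborhood in both directions.
        lo = i
        while lo > 0 and cum[i] - cum[lo - 1] <= eps:
            lo -= 1
        hi = i
        while hi < n - 1 and cum[hi + 1] - cum[i] <= eps:
            hi += 1
        # Assumption: the neighbor count excludes p_i itself.
        if hi - lo >= min_pts:
            core[i] = max(cum[i] - cum[lo], cum[hi] - cum[i])
        # The next point gets a finite value only if p_i is a core point
        # and the next point lies inside its eps-neighborhood.
        if i + 1 < n and core[i] < math.inf and hi > i:
            reach[i + 1] = max(core[i], cum[i + 1] - cum[i])
    return core, reach

# Five closely spaced points followed by two distant ones.
pts = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (100, 0), (200, 0)]
core, reach = cluster_ordering(pts, eps=5, min_pts=3)
print(reach)
```

Low reachability values mark the dense run of points, and the values jump to infinity where the object moves away, which is the valley shape that the cluster-structure step looks for.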
Cluster-Structure Generation of Trajectory Points
The generation method of trajectory clustering structure is the same as the OPTICS method. In this method, the steepness point is first determined based on the steepness threshold, then the steepness area is extracted, and finally, the clustering structure is generated by matching the steep downward area and steep upward area that meet the clustering conditions. More detailed information on this method can be found in (Ankerst et al. 1999).
Multi-Level Stop Feature Extraction of Trajectory
In the trajectory clustering structure, clusters are not completely independent of each other but can contain each other. There are two types of inclusion relationships between clusters: (1) A cluster contains only one cluster, and the two clusters belong to the same stop feature, so one cluster can be deleted. As shown in Fig. 2a, \({C}_{3}\) and \({C}_{2}\) are clusters, where \({C}_{3}\) contains \({C}_{2}\) and they represent the same stop feature, so \({C}_{2}\) is deleted. (2) A cluster contains more than one cluster, and they belong to different stop features. As shown in Fig. 2a, \({C}_{4}\), \({C}_{1}\), and \({C}_{3}\) are clusters, where \({C}_{4}\) includes \({C}_{1}\) and \({C}_{3}\), and \({C}_{1}\) and \({C}_{3}\) are different stop features, so they should be retained. Thus, the hierarchical relationship between them can be represented by the tree structure in Fig. 2b.
The multi-level stop feature extraction algorithm of the trajectory is shown in Algorithm 2. The algorithm input is a trajectory clustering set \(C\); \(c\) is a cluster in set \(C\), which is represented by \((P, s, e)\), where \(P\) denotes the trajectory point string of the cluster, and \(s\) and \(e\) are the positions of the start and end points of the cluster in the original trajectory point string, respectively. The algorithm output is the trajectory stop segment tree set \(N\); \(n\) is a stop segment node in the set \(N\), and it is a tree node represented by \((c, childNodes)\), where \(c\) denotes a cluster and \(childNodes\) stands for all child nodes of node \(n\).
The algorithm first initializes the global clustering range \(\left(gs,ge\right)\) as empty and then traverses cluster set \(C\). Next, it is determined whether the intersection of the global range and the current clustering range and the intersection of the current clustering range and the subsequent clustering range are empty. If both of them are empty, then the current cluster is added to set \(N\) as a tree node, and the global range is the current cluster range. If the former is not empty, but the latter is empty, then the current cluster is added to set \(N\) as a tree node, and the child nodes of the node are found in set \(N\), then the global range is set to the current cluster range; otherwise, it will not be processed.
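The two inclusion rules can be illustrated with a small tree builder. The sketch below is not Algorithm 2 itself: it assumes that clusters are given only by their `(s, e)` index ranges, that ranges are either nested or disjoint, and that a sole contained cluster is dropped while its own children are adopted by the containing cluster. The class `StopNode`, the function `build_stop_tree`, and the index values are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class StopNode:
    s: int                      # start position in the original point string
    e: int                      # end position in the original point string
    children: list = field(default_factory=list)

def build_stop_tree(ranges):
    """Build stop-segment trees from nested or disjoint (s, e) ranges."""
    roots = []
    # Shorter ranges first, so children are seen before their containers.
    for s, e in sorted(ranges, key=lambda r: (r[1] - r[0], r[0])):
        inside = [n for n in roots if s <= n.s and n.e <= e]
        roots = [n for n in roots if not (s <= n.s and n.e <= e)]
        if len(inside) == 1:
            # Rule 1: a sole contained cluster is the same stop, so drop it.
            inside = inside[0].children
        # Rule 2: several contained clusters are distinct stops and are kept.
        inside.sort(key=lambda n: n.s)
        roots.append(StopNode(s, e, inside))
    return sorted(roots, key=lambda n: n.s)

# Made-up index ranges mimicking Fig. 2a: C4 contains C1 and C3; C3 contains C2.
clusters = {"C1": (10, 20), "C2": (32, 38), "C3": (30, 40), "C4": (5, 45)}
tree = build_stop_tree(clusters.values())
print([(n.s, n.e, [(c.s, c.e) for c in n.children]) for n in tree])
```

With these ranges the result is a single root for C4 whose children are C1 and C3, while C2 has been removed, matching the tree described for Fig. 2b.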
Three Thresholds Setting
In the proposed method, there are three thresholds: the distance threshold \(\varepsilon\), the number of points threshold \(MinPts\), and the steepness threshold \(\xi\). The distance threshold determines the minimum density of a cluster, the number of points threshold determines the minimum number of points in a cluster, and the steepness threshold determines the minimum difference in density between a cluster and its surrounding scattered points. Therefore, the first two thresholds affect the cluster-ordering of a trajectory, whereas the third threshold affects cluster-structure generation. Ankerst et al. (1999) suggested that similar results can be obtained using different ranges of \(\varepsilon\) and \(MinPts\), as long as the values of the two thresholds are not too small.
In the proposed method, the distance threshold \(\varepsilon\) represents the minimum moving range of a trajectory stop segment. The trajectory stop feature does not necessarily mean that the moving object stops; it can still move but at a slow speed. Therefore, the distance threshold is expressed as the product of the residence time and the moving speed in the trajectory stop feature.
The point number threshold \(MinPts\) denotes the minimum number of points in the stop segment of a trajectory. The point number threshold of a trajectory can be expressed as the ratio of the residence time to the sampling interval of the trajectory points.
The steepness threshold parameter \(\xi\) represents a difference between the density of the stop segment and the density of the move segment of a trajectory. It is affected by the moving mode of a moving object. Generally, a person’s moving mode includes walking, riding, and traveling by car, train, or other means of transportation, so steepness should be set according to the specific mode of transportation.
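The two computable thresholds follow directly from these definitions. The sketch below is illustrative only; the function name `stop_thresholds` is hypothetical, and the input values are those quoted in the experiments (a 5 min or 1 min minimum stop, a 1 m/s maximum stop speed, and a 5 s sampling interval).

```python
def stop_thresholds(min_stop_seconds, max_stop_speed, sampling_interval):
    """Return (eps lower bound in metres, MinPts) for stop feature extraction."""
    eps = min_stop_seconds * max_stop_speed               # residence time x speed
    min_pts = round(min_stop_seconds / sampling_interval)  # residence time / interval
    return eps, min_pts

print(stop_thresholds(300, 1.0, 5))  # personal trajectories
print(stop_thresholds(60, 1.0, 5))   # taxi trajectory
```

The first value is only a lower bound for \(\varepsilon\); a larger distance threshold may still be chosen in practice. The steepness threshold \(\xi\) has no such formula and is set empirically.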
Simplification of Trajectory Stop Segments
Simplification Method of Single Stop Segment of Trajectory
Since the trajectory stop segment is the stop feature of a location, it can be expressed using a point. For this simplified point, two factors need to be considered. First, the point should be as close as possible to the center of the trajectory stop segment, and second, the point should be the original point in the points set of the trajectory stop segment. Therefore, the point is calculated by the following method. First, the center of the point series in the trajectory stop segment is calculated, and then the distances from the point series in the trajectory stop segment to the center point are compared; the point with the smallest distance is taken as a simplified point.
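A minimal sketch of this selection is given below. It assumes planar `(x, y)` coordinates and takes the "center" to be the arithmetic mean of the points, which is an interpretation rather than something the text specifies; the name `representative_point` is hypothetical.

```python
import math

def representative_point(segment):
    """Return the original point of a stop segment closest to its centroid."""
    cx = sum(p[0] for p in segment) / len(segment)
    cy = sum(p[1] for p in segment) / len(segment)
    return min(segment, key=lambda p: math.dist(p, (cx, cy)))

stop_segment = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.2, 1.1)]
print(representative_point(stop_segment))
```

Returning an original point rather than the computed center keeps the simplified trajectory a subset of the recorded points, so its timestamp and any other attributes remain valid.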
Simplification Method of Multiple Stop Segments of Trajectory
Owing to the hierarchical relationship between stop segments, it is necessary to merge multiple stop segments when the degree of simplification is increased. The key to merging stop segments is to find the stop segments that need to be merged. The multi-level stop segments established using the method described in Sect. 3.2.3 form a tree structure. Based on this hierarchical tree structure, the relationship between the reachable distance threshold and the stop segments can be established. Therefore, once a certain reachable distance is given, the stop segments under the current distance threshold can be obtained.
Simplification of Trajectory Move Segment
The move segments are simplified with the road-network-constrained moving trajectory simplification method (Zhang et al. 2018), which constructs a binary line generalization (BLG) tree and sorts all trajectory points according to the spatial–temporal characteristics of the trajectory and the structural characteristics of the road network. The method preserves both the spatiotemporal characteristics and the road structure characteristics of the original trajectory.
Trajectory Simplification
Since the trajectory stop segment simplification and trajectory move segment simplification use their simplification thresholds to quantify the simplification scale, it is necessary to establish a quantitative relationship between the stop segment simplification threshold (semantic threshold) and the move segment simplification threshold (spatiotemporal threshold) based on the same simplification scale.
In this study, the function fitting method is used to establish the relationship between the two thresholds. First, the scatter diagram between the two thresholds is constructed based on the simplification scale, and then the polynomial function model is used for fitting.
The scatter plots of the two thresholds for dataset 1 and dataset 2 are presented in Fig. 3, where it can be seen that there is a linear relationship between the two thresholds. Therefore, the linear function model is used to fit the relationship between the two thresholds. Let \(y\) be the semantic threshold and \(x\) be the spatiotemporal threshold; then, the function model fitted on dataset 1 is \(y=23x+26.5\), and the function model fitted on dataset 2 is \(y=37.3x+38.8\).
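The fitting step can be sketched with an ordinary least-squares line. The original threshold pairs behind Fig. 3 are not reproduced here, so the example uses synthetic points generated from the reported dataset 1 model purely to show the mechanics; `fit_line` and `semantic_threshold` are hypothetical names.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def semantic_threshold(x, slope, intercept):
    """Map a spatiotemporal threshold x to the matching semantic threshold."""
    return slope * x + intercept

# Synthetic points lying exactly on the reported dataset 1 model (not real data).
xs = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
ys = [23 * x + 26.5 for x in xs]
slope, intercept = fit_line(xs, ys)
print(round(slope, 6), round(intercept, 6))
print(semantic_threshold(2.0, 37.3, 38.8))  # dataset 2 model at x = 2 m
```

Once the line is fitted, a single spatiotemporal threshold is enough to drive both simplifiers, because the semantic threshold is derived from it.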
Evaluation Method
The quality evaluation indexes of trajectory simplification include spatial–temporal accuracy and semantic accuracy.
Spatial–Temporal Accuracy Evaluation
Since a trajectory is usually distributed on a road network, spatial–temporal accuracy is evaluated by network homomorphic distance error (Zhang et al. 2018). The network homomorphic distance error is calculated by Eq. (2) and illustrated in Fig. 4.
In Eq. (2), \({tra}_{o}\) denotes the original trajectory, \({tra}_{s}\) denotes the simplified trajectory, n is the number of points in \({tra}_{s}\), and \({nhd}_{i}\) represents a distance between trajectory point \({p}_{i}\) and its homomorphic point in the road network.
Semantic Accuracy Evaluation
The semantic accuracy evaluation extracts the stop features of a simplified trajectory and compares them with those of the original trajectory. The semantic accuracy is calculated as the ratio of the two stop feature counts:
\[\frac{NS\left({tra}_{s}\right)}{NS\left({tra}_{o}\right)}\]
where \(NS\left({tra}_{o}\right)\) and \(NS\left({tra}_{s}\right)\) are the numbers of stop features extracted from the original trajectory and from the simplified trajectory, respectively.
Experiments and Results
Experiments on Personal Trajectory Data
Experimental Data
The experimental data were the data of two personal GPS trajectories in the city of Nanjing (Fig. 5), which can be downloaded from the shared database (https://figshare.com/s/6582b3f6b4906ddc5564). The details of the data are shown in Table 1. The sampling interval of dataset 1 was 5 s, and the total duration was approximately 17 h; this dataset included 8606 trajectory points with a length of 66,404 m. The sampling interval of dataset 2 was also 5 s, but the total duration was approximately 7 h; the dataset included 3602 trajectory points with a length of 26,553 m.
Stop features must be extracted from the trajectory data before the proposed method can simplify them. For the experimental data, the three thresholds were set as follows. Considering that the minimum duration of a stop should not be shorter than 5 min and the trajectory sampling interval was 5 s, \(MinPts\) was set to 60. Since it is generally believed that the speed in the stop state should not exceed 1 m/s, the distance threshold \(\varepsilon\) should be greater than 300 m. Although a large distance threshold could provide better clustering results, an excessive threshold might impose a computational burden on the algorithm, so the distance threshold was set to 1000 m based on practical experience. Finally, according to the experiment, the steepness threshold \(\xi\) was set to 0.02.
The tree structure of stop feature extracted from the experimental data is shown in Fig. 6. As shown in Fig. 6, 41 stop features were extracted from experimental dataset 1, and they were divided into 4 levels; 15 stop features were extracted from experimental dataset 2, and they were divided into 5 levels.
Experimental Design
In the experiment, the accuracies of the proposed method and the TDDR method (Meratnia and de By 2004) were compared. The analysis was performed over a series of simplification scales, and the results were compared from two perspectives: the simplification threshold and the compression ratio.
The simplification threshold is composed of the spatiotemporal threshold and the semantic threshold. Since a linear relationship between the spatiotemporal threshold and the semantic threshold has been established, the simplification threshold is represented by the spatiotemporal threshold. In the simplification threshold-based analysis, it was necessary to select an appropriate series of simplification thresholds. Seven spatiotemporal threshold values of 0.25 m, 0.5 m, 1 m, 2 m, 4 m, 8 m, and 16 m were selected and used in multiple experiments.
In the compression ratio-based analysis, to obtain the same compression ratio in the two methods, the processing method of “intelligent damping oscillation” was adopted (Liu et al. 2016). The basic idea of this method is to intelligently adjust the threshold value through a variable step size until the simplification result is consistent with the preset compression ratio.
Assume \(dis\) is an initial threshold, \(step\) is an initial step of threshold adjustment, \(rat\_o\) is a target compression rate, \(rat\_c\) is a current compression rate, and \(tol\) is a tolerance of the target compression rate.
The procedure is as follows: the trajectory is simplified with the threshold \(dis\), \(rat\_c\) is calculated, and \(diff = rat\_c - rat\_o\) is computed. If \(\left|diff\right| \le tol\), the adjustment ends; otherwise, \(dis\) is modified and the trajectory is simplified again: if \(diff < 0\) and the last \(diff < 0\), then \(dis = dis + step\); if \(diff < 0\) and the last \(diff > 0\), then \(step=step/2\) and \(dis = dis + step\); if \(diff > 0\) and the last \(diff > 0\), then \(dis = dis - step\); and if \(diff > 0\) and the last \(diff < 0\), then \(step=step/2\) and \(dis = dis - step\).
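The adjustment loop can be sketched as below. This is an illustrative reading of the procedure, not the implementation of Liu et al. (2016): the callable `compression_ratio` is a hypothetical stand-in for "simplify with threshold `dis` and measure the resulting compression rate", the rate is assumed to grow with the threshold, and the `max_iter` guard is an added safety limit.

```python
def tune_threshold(compression_ratio, rat_o, dis, step, tol, max_iter=100):
    """Adjust dis with a shrinking step until the rate is within tol of rat_o."""
    last_diff = None
    for _ in range(max_iter):
        diff = compression_ratio(dis) - rat_o
        if abs(diff) <= tol:
            return dis
        # Halve the step whenever the sign of diff flips (the target was crossed).
        if last_diff is not None and (diff < 0) != (last_diff < 0):
            step /= 2
        # Rate too low: raise the threshold; rate too high: lower it.
        dis = dis + step if diff < 0 else dis - step
        last_diff = diff
    raise RuntimeError("target compression rate not reached")

# Toy monotone model of the rate as a function of the threshold (an assumption).
def toy_rate(dis):
    return dis / (dis + 10.0)

result = tune_threshold(toy_rate, rat_o=0.5, dis=1.0, step=4.0, tol=0.01)
print(result, toy_rate(result))
```

Halving the step only on a sign change is what produces the damped oscillation: the threshold overshoots the target by less each time until the rate falls inside the tolerance.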
Visual Analysis Result
The visual analysis compares the results of the TDDR method and the STSS method at a compression ratio of 50%. As shown in Fig. 7, the STSS method directly simplifies multiple trajectory points with high density, namely the stop segment trajectories (e.g., s1, s2, s3, s4, s5), to a single point, while the TDDR method retains more of these trajectory points; however, for trajectory points with low density, namely the move segment trajectories (e.g., m1), the STSS method retains more trajectory points than the TDDR method. Some stop segments look like move segments (e.g., s4, s5). This is because the moving object moves very slowly there, so these segments are identified as stop features. Therefore, compared with the TDDR method, the STSS method compresses a large number of feature points in the stop segment trajectories and retains more feature points in the move segment trajectories.
Spatial–Temporal Accuracy Analysis Result
The results of the spatial–temporal accuracy comparison of the two methods based on the simplification threshold are shown in Fig. 8. On both datasets, the accuracy of the TDDR method was higher than that of the proposed method under the same threshold, and the accuracy difference increased with the threshold. When the threshold was small, the accuracy difference between the two methods was very small, but once the threshold reached a certain value (e.g., 4 m for dataset 1 and 2 m for dataset 2), the accuracy difference expanded rapidly.
The above comparison is a precision comparison based on the simplification threshold, which does not necessarily mean that the TDDR method performs better than the proposed method. Although the thresholds of the two methods were the same, their simplification scales differed. The simplification scales (compression ratios) of the two methods on the experimental datasets under different simplification threshold values are presented in Fig. 9, where it can be seen that the compression ratio of the proposed method was significantly higher than that of the TDDR method under the same threshold value. The compression ratio of the proposed method was nearly twice that of the TDDR method when the simplification threshold value was 0.5 m.
The spatial–temporal accuracy of the two methods was analyzed under different compression ratios, and the results are shown in Fig. 10. On the whole, the accuracy of the proposed method was higher than that of the TDDR method. When the compression ratio was small, the accuracy difference between the two methods was also small. However, when the compression ratio was high (e.g., 0.9 for dataset 1, and 0.78 for dataset 2), the proposed method had a smaller error and higher accuracy than the TDDR method.
Semantic Accuracy Analysis Result
The semantic accuracy comparison results of the two methods under different spatiotemporal threshold values are shown in Fig. 11, where it can be seen that compared with the TDDR method, the proposed method extracted more stop features and achieved better semantic accuracy under different thresholds. The accuracy gap between the two methods first increased and then decreased as the spatiotemporal threshold increased. In addition, since the proposed method had a higher compression ratio than the TDDR method at the same threshold, the gap between the two methods was large at the same compression ratio.
Experiments on Taxi Trajectory Data
Since most of the stop features in the taxi trajectory data are caused by passengers getting on and off or by waiting at traffic lights, the number of stop features in the taxi trajectory data is smaller than in the personal trajectory data, and the stay time of each stop feature is shorter.
Experimental Data
The experimental data is one taxi trajectory data, which is selected from the taxi GPS trajectories dataset during the period of 2–8 February 2008 within Beijing (Yuan et al. 2010), as shown in Fig. 12. The sampling interval of the taxi trajectory dataset is 5 s, and the total number of the trajectory points are 30,156.
Similarly, the three thresholds in the stop feature extraction need to be set. Most taxi stops are caused by getting on and off passengers or waiting for traffic lights, so the minimum length of stop should not be shorter than 1 min, and the trajectory sampling frequency is 5 s; \(MinPts\) is set to 12. Since it is generally believed that the speed of the stop state should not exceed 1 m/s, the distance threshold \(\upvarepsilon\) should be greater than 60 m. Finally, according to the experiment, the slope threshold ξ was set to 0.02. Therefore, 233 stop features are extracted from this dataset, which are divided into two layers only; among them, there are 217 stop features of leaf nodes.
The STSS method and the TDDR method were compared at multiple compression ratios, following a process similar to that of the previous experiment. First, eight spatiotemporal threshold values of 0.5 m, 1 m, 2 m, 5 m, 10 m, 25 m, 50 m, and 100 m were selected. Second, the trajectory was simplified by the STSS method at these thresholds, and the corresponding eight compression ratios were obtained. Finally, the trajectory was simplified by the TDDR method at the same eight compression ratios.
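The last step, simplifying with a baseline so that it reaches a prescribed compression ratio, can be sketched as a bisection over the baseline's threshold. The code below is a minimal illustration under stated assumptions and is not the authors' implementation: the baseline is a generic top-down split using the synchronized Euclidean distance (a stand-in, since the internals of TDDR are not given in this section), the compression ratio is assumed to be 1 minus the fraction of retained points, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, t): metres, metres, seconds


def sed(p: Point, a: Point, b: Point) -> float:
    """Synchronized Euclidean distance of p from the segment a-b."""
    r = 0.0 if b[2] == a[2] else (p[2] - a[2]) / (b[2] - a[2])
    x = a[0] + r * (b[0] - a[0])  # position on a-b at time p.t
    y = a[1] + r * (b[1] - a[1])
    return ((p[0] - x) ** 2 + (p[1] - y) ** 2) ** 0.5


def top_down(points: Sequence[Point], eps: float) -> List[Point]:
    """Stand-in baseline: split at the farthest point while its SED exceeds eps."""
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        i, j = stack.pop()
        if j - i < 2:
            continue  # no interior point to test
        k, d = max(
            ((m, sed(points[m], points[i], points[j])) for m in range(i + 1, j)),
            key=lambda md: md[1],
        )
        if d > eps:
            keep[k] = True
            stack.extend([(i, k), (k, j)])
    return [p for p, flag in zip(points, keep) if flag]


def compression_ratio(original: Sequence[Point], simplified: Sequence[Point]) -> float:
    """Assumed definition: share of points removed by simplification."""
    return 1.0 - len(simplified) / len(original)


def match_ratio(points, target, simplify=top_down, lo=0.0, hi=1000.0, iters=40):
    """Bisect the threshold until the baseline reaches at least `target`.

    Assumes simplify(points, hi) already reaches the target ratio.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        if compression_ratio(points, simplify(points, mid)) < target:
            lo = mid  # too many points kept: raise the threshold
        else:
            hi = mid  # target reached: try a smaller threshold
    return hi, simplify(points, hi)
```

Bisection works here because the top-down split retains a nested set of points as the threshold grows, so the compression ratio never decreases with the threshold. In the procedure described above, `target` would be each of the eight compression ratios obtained with the STSS method.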
Visual Analysis Result
The visual analysis compares the trajectory simplified by the TDDR method with a 10 m simplification threshold against the trajectory simplified by the STSS method with a 2 m spatiotemporal threshold and a 50 m semantic threshold. As shown in Fig. 13, and similar to the previous experiment, the STSS method removes a large number of feature points in the stop segments and retains more feature points in the move segments than the TDDR method does. In addition, as shown in this figure, most of the stop features of the trajectory are located at road intersections and are caused by vehicles stopping for traffic signals.
Spatial–Temporal Accuracy Analysis Result
The spatial–temporal accuracy of the two methods was analyzed under different compression ratios, and the results are shown in Fig. 14. When the compression ratio was small, the accuracy difference between the two methods was also small. When the compression ratio was high, however, the STSS method was more accurate than the TDDR method, and the accuracy gap widened as the compression ratio increased.
Semantic Accuracy Analysis Result
The semantic accuracy of the two methods under different compression ratios is compared in Fig. 15. The semantic accuracy of both methods decreases as the compression ratio increases, but that of the TDDR method decreases faster. When the compression ratio is less than 0.61, the TDDR method is better than the STSS method; otherwise, the STSS method is better. The main reason is that, in the STSS method, the stop features are simplified even when the simplification threshold is very small, so fewer semantic features are extracted after simplification.
Conclusion
This study proposes a semantics-based trajectory segmentation simplification method, which extracts stop features first and then performs segmentation simplification. The proposed method is verified by experiments and compared with the classic spatiotemporal simplification method, the TDDR method. Based on the comparison results, the following conclusions can be drawn:

(1) The relationship between the semantic threshold and the spatiotemporal threshold under the same simplification scale is linear. The parameter values of the linear model are determined by the experimental data.

(2) The compression ratio of the STSS method is clearly higher than that of the TDDR method under the same simplification threshold, and the difference first increases and then decreases as the threshold increases.

(3) The spatiotemporal accuracy of the STSS method is slightly lower than that of the TDDR method under the same simplification threshold. However, the STSS method has a smaller error and higher spatiotemporal accuracy than the TDDR method under the same compression ratio, especially at a large simplification scale.

(4) Compared with the TDDR method, the proposed STSS method retains more stop features and has higher semantic accuracy. The performance difference between the two methods is large under the same compression ratio.

(5) According to the experimental analysis of personal trajectory data and taxi trajectory data, the proposed method can be applied to different types of trajectory data, but it performs better on trajectory data with more stop features (e.g., travel trajectories).
In the future, research on the compression and simplification of trajectories could be conducted from the perspective of trajectory semantics mining. It should be noted that the purpose of trajectory simplification is not only to reduce the amount of data but also to extract trajectory characteristics at different scales to suit different application scenarios.
References
Andrienko N, Andrienko G (2010) Spatial generalization and aggregation of massive movement data. IEEE Trans Vis Comput Graph 17(2):205–219
Ankerst M, Breunig MM, Kriegel HP, Sander J (1999) OPTICS: ordering points to identify the clustering structure. ACM SIGMOD Rec 28(2):49–60
Cao H, Wolfson O, Trajcevski G (2006) Spatiotemporal data reduction with deterministic error bounds. VLDB J 15(3):211–228
Feng S, Xu J, Xu M, Zheng N, Zhang X (2013) EHSTC: an enhanced method for semantic trajectory compression. ACM SIGSPATIAL International Workshop on GeoStreaming, ACM Press, New York, pp 43–49
Gudmundsson J, Katajainen J, Merrick D, Ong C, Wolle T (2009) Compressing spatiotemporal trajectories. Comput Geom 42(9):825–841
Kellaris G, Pelekis N, Theodoridis Y (2009) Trajectory compression under network constraints. In International Symposium on Spatial and Temporal Databases, Springer, Berlin, Heidelberg, pp 392–398
Kellaris G, Pelekis N, Theodoridis Y (2013) Map-matched trajectory compression. J Syst Softw 86(6):1566–1579
Lee WC, Krumm J (2011) Trajectory preprocessing. In: Zheng Y, Zhou X (eds) Computing with spatial trajectories. Springer, New York, pp 3–33
Li X, Lin H, Guo ZY (2008) Reducing vehicle tracking data volume through a network-based approach. Acta Geodaet Cartograph Sin 37(1):95–101
Liu K, Li Y, Dai J, Shang S, Zheng K (2014) Compressing large scale urban trajectory data. Proceedings of the Fourth International Workshop on Cloud Data and Platforms, ACM Press, New York, USA, pp 1–6
Liu MS, Long Y, Fei LF (2016) Line simplification of three-dimensional drainage considering topological consistency. Acta Geodaet Cartograph Sin 45(4):494–501
Long C, Wong RCW, Jagadish HV (2014) Trajectory simplification: on minimizing the direction-based error. Proc VLDB Endowment 8(1):49–60
Meratnia N, de By RA (2003) A new perspective on trajectory compression techniques. In Proc. ISPRS Commission II and IV, WG II/5, II/6, IV/1 and IV/2 Joint Workshop Spatial, Temporal and Multi-Dimensional Data Modelling and Analysis
Meratnia N, de By RA (2004) Spatiotemporal compression techniques for moving point objects. In Advances in Database Technology - EDBT 2004, Springer, Berlin, Heidelberg, pp 765–782
Muckell J, Hwang JH, Patil V, Lawson CT, Ping F, Ravi SS (2011) SQUISH: an online approach for GPS trajectory compression. In Proceedings of the 2nd international conference on computing for geospatial research & applications, com.geo’11, Washington DC, USA, pp 1–8
Muckell J, Olsen PW, Hwang JH, Lawson CT, Ravi SS (2014) Compression of trajectory data: a comprehensive evaluation and new approach. GeoInformatica 18(3):435–460
Popa IS, Zeitouni K, Oria V, Kharrat A (2015) Spatiotemporal compression of trajectories in road networks. GeoInformatica 19(1):117–145
Potamias M, Patroumpas K, Sellis T (2006) Sampling trajectory streams with spatiotemporal criteria. In 18th International Conference on Scientific and Statistical Database Management (SSDBM’06), IEEE Computer Society, Vienna, Austria, pp 275–284
Richter KF, Schmid F, Laube P (2012) Semantic trajectory compression: representing urban movement in a nutshell. J Spat Inf Sci 4:3–30
Schmid F, Richter KF, Laube P (2009) Semantic trajectory compression. In International Symposium on Spatial and Temporal Databases, Springer, Berlin, Heidelberg, pp 411–416
Song R, Sun W, Zheng B, Zheng Y (2014) PRESS: a novel framework of trajectory compression in road networks. Proc VLDB Endowment 7(9):661–672
Trajcevski G, Cao H, Scheuermann P, Wolfson O, Vaccaro D (2006) Online data reduction and the quality of history in moving objects databases. In Proceedings of the 5th ACM international workshop on Data engineering for wireless and mobile access, ACM, Chicago, Illinois, USA, pp 19–26
Wu P, Tan YA, Zheng J, Zhang Q, Li Y, Cheng Z (2015) A hybrid compression framework for large scale trajectory data in road networks. Chin J Electron 24(4):730–739
Wu F, Fu K, Wang Y, Xiao Z (2017) A graph-based min-# and error-optimal trajectory simplification algorithm and its extension towards online services. ISPRS Int J Geo Inf 6(1):19
Yang M, Yan X, Zhang X, Li X (2019) Constrained trajectory simplification with speed preservation. Cartogr Geogr Inf Sci 47(2):110–124
Yuan J, Zheng Y, Zhang C, Xie W, Xie X, Sun G, Huang Y (2010) T-drive: driving directions based on taxi trajectories. In Proceedings of the 18th SIGSPATIAL International conference on advances in geographic information systems, ACM, New York, NY, USA, pp 99–108
Zhao L, Shi G (2018) A method for simplifying ship trajectory based on improved Douglas-Peucker algorithm. Ocean Eng 166(10):37–46
Zhang L, Liu M, Long Y (2018) Trajectory compression with constraints of road networks. Int Arch Photogramm Remote Sens Spat Inf Sci 42:4
Acknowledgements
The authors would like to thank Dr. Zhang Ling for his help with the language of the article.
Funding
This research was funded by the National Natural Science Foundation of China, grant number 41601499, and Chuzhou University Research Startup Foundation, grant number 2020qd48. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; and in the decision to publish the results.
Author information
Contributions
Conceptualization, YL and ML. Methodology, ML and GH. Formal analysis, ML. Writing—original draft preparation, ML. Writing—review and editing, ML and GH. Supervision, YL.
Ethics declarations
Compliance with Ethical Standards
The authors certify that this manuscript has never been published. No data have been fabricated or manipulated (including images) to support our conclusions.
Ethical Approval
The experiments in this manuscript do not involve any research with human participants or animals.
Informed Consent
This manuscript has been submitted with the consent of all authors.
Conflicts of Interest
The authors declare no conflict of interest.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Liu, M., He, G. & Long, Y. A Semantics-Based Trajectory Segmentation Simplification Method. J geovis spat anal 5, 19 (2021). https://doi.org/10.1007/s41651-021-00088-5