
MasterMovelets: discovering heterogeneous movelets for multiple aspect trajectory classification

Data Mining and Knowledge Discovery

Abstract

In the last few years, trajectory classification has been applied to many real problems, mostly considering the dimensions of space and time or attributes inferred from these dimensions. However, with the explosion of social media data and the advances in the semantic enrichment of mobility data, a new type of trajectory data has emerged, in which the spatio-temporal trajectory points now have multiple and heterogeneous semantic dimensions. By semantic dimensions we mean any type of information that is neither spatial nor temporal. As a consequence, new classification methods are needed to deal with this new type of data. The main challenge is how to automatically select and combine the data dimensions and to discover the subtrajectories that best discriminate the class. In this paper we propose MasterMovelets, a new parameter-free method for trajectory classification that finds the best trajectory partition and dimension combination for robust high-dimensional trajectory classification. Experimental results show that our approach outperforms state-of-the-art methods, reducing the classification error by up to \(63\%\), which indicates that our proposal is very promising for multidimensional sequence data classification.


Notes

  1. http://weather.unisys.com/hurricanes.

  2. There are many other strategies in the literature to find exact and approximate solutions for this specific problem. More details can be found in Kung et al. (1975); Veldhuizen and Lamont (2000); Marler and Arora (2004).

  3. https://developer.foursquare.com/.

  4. https://www.wunderground.com/weather/api/.

  5. A classifier presents the best F-measure performance for a class if no other classifier has a better F-measure score and at least one classifier has a lower score. In addition, the sum of the bars in the bar plot exceeds the number of classes because of ties.

References

  • Berndt DJ, Clifford J (1994) Using dynamic time warping to find patterns in time series. In: KDD workshop, vol 10, pp 359–370. AAAI Press, Seattle

  • Chen L, Özsu MT, Oria V (2005) Robust and fast similarity search for moving object trajectories. In: Proceedings of the ACM international conference on management of data (SIGMOD). ACM, New York, pp 491–502

  • Cho E, Myers SA, Leskovec J (2011) Friendship and mobility: user movement in location-based social networks. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 1082–1090

  • Dodge S, Weibel R, Forootan E (2009) Revealing the physics of movement: comparing the similarity of movement characteristics of different types of moving objects. Comput Environ Urban Syst 33(6):419–434


  • Etemad M, Soares Júnior A, Matwin S (2018) Predicting transportation modes of GPS trajectories using feature engineering and noise removal. In: Advances in artificial intelligence: 31st Canadian conference on artificial intelligence, Canadian AI 2018, Toronto, ON, Canada, May 8–11, 2018, proceedings 31. Springer, pp 259–264

  • Ferrero CA (2019) MasterMovelets code. https://github.com/anfer86/dmkd_masterMovelets_results. Accessed 16 July 2019

  • Ferrero CA, Alvares LO, Bogorny V (2016) Multiple aspect trajectory data analysis: research challenges and opportunities. In: XVII Brazilian symposium on geoinformatics, GEOINFO, Campos do Jordão, SP, Brazil, GEOINFO ’16, pp 1–12

  • Ferrero CA, Alvares LO, Zalewsky W, Bogorny V (2018) Movelets: exploring relevant subtrajectories for robust trajectory classification. In: Proceedings of the 33rd ACM SAC, ACM, Pau, France, pp 1–8

  • Frentzos E, Gratsias K, Pelekis N, Theodoridis Y (2005) Nearest neighbor search on moving object trajectories. In: Proceedings of the international symposium on spatial and temporal databases. Springer, pp 328–345

  • Furtado AS, Kopanaki D, Alvares LO, Bogorny V (2015) Multidimensional similarity measuring for semantic trajectories. Trans GIS 20:280–298


  • Gao Q, Zhou F, Zhang K, Trajcevski G, Luo X, Zhang F (2017) Identifying human mobility via trajectory embeddings. In: Proceedings of the 26th international joint conference on artificial intelligence (IJCAI). AAAI Press, Melbourne, pp 1689–1695

  • Kung HT, Luccio F, Preparata FP (1975) On finding the maxima of a set of vectors. J ACM 22(4):469–476


  • Lee JG, Han J, Li X, Gonzalez H (2008) TraClass: trajectory classification using hierarchical region-based and trajectory-based clustering. Proc VLDB Endow 1(1):1081–1094. https://doi.org/10.14778/1453856.1453972


  • Lines J, Bagnall A (2012) Alternative quality measures for time series shapelets. In: Proceedings of the 13th international conference on intelligent data engineering and automated learning. Springer, Berlin, pp 475–483

  • Marler RT, Arora JS (2004) Survey of multi-objective optimization methods for engineering. Struct Multidiscip Optim 26(6):369–395


  • Mello RdS, Bogorny V, Alvares LO, Santana LHZ, Ferrero CA, Frozza AA, Schreiner GA, Renso C (2019) Master: a multiple aspect view on trajectories. Trans GIS. https://doi.org/10.1111/tgis.12526


  • Patel D, Sheng C, Hsu W, Lee ML (2012) Incorporating duration information for trajectory classification. In: Proceedings of the 28th ICDE. IEEE, Washington, DC, pp 1132–1143. https://doi.org/10.1109/ICDE.2012.72

  • Rowland MM, Bryant LD, Johnson BK, Noyes JH, Wisdom MJ, Thomas JW (1997) Starkey project: history, facilities, and data collection methods for ungulate research. Technical report, US Department of Agriculture, Forest Service, Pacific Northwest Research Station, Portland

  • Shokoohi-Yekta M, Hu B, Jin H, Wang J, Keogh E (2017) Generalizing DTW to the multi-dimensional case requires an adaptive approach. Data Min Knowl Discov 31(1):1–31


  • Tan PN, Steinbach M, Kumar V (2005) Introduction to data mining, 1st edn. Addison-Wesley Longman Publishing Co., Inc., Boston


  • ten Holt GA, Reinders MJ, Hendriks E (2007) Multi-dimensional dynamic time warping for gesture recognition. In: Proceedings of the 13th annual conference of the advanced school for computing and imaging, vol 300, p 1

  • Veldhuizen DAV, Lamont GB (2000) Multiobjective evolutionary algorithms: analyzing the state-of-the-art. Evol Comput 8(2):125–147


  • Vlachos M, Kollios G, Gunopulos D (2002) Discovering similar multidimensional trajectories. In: Proceedings of the 18th international conference on data engineering. IEEE, San Jose, pp 673–684

  • Xiao Z, Wang Y, Fu K, Wu F (2017) Identifying different transportation modes from trajectory data using tree-based ensemble classifiers. ISPRS Int J Geo-Inf 6(2):57


  • Yang D, Zhang D, Zheng VW, Yu Z (2015) Modeling user activity preference by leveraging user spatial temporal characteristics in LBSNs. IEEE Trans Syst Man Cybern Syst 45(1):129–142


  • Ye L, Keogh EJ (2011) Time series shapelets: a novel technique that allows accurate, interpretable and fast classification. Data Min Knowl Discov 22(1–2):149–182


  • Zheng Y, Chen Y, Li Q, Xie X, Ma WY (2010) Understanding transportation modes based on GPS data for web applications. ACM Trans Web 4(1):1–36



Acknowledgements

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001 and through the research project Big Data Analytics: Lançando Luz dos Genes ao Cosmos (CAPES/PRINT process number 88887.310782/2018-00). This work was also supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and Fundação de Amparo a Pesquisa e Inovação do Estado de Santa Catarina (FAPESC) - Project Match - co-financing of H2020 Projects - Grant 2018TR 1266.

Author information


Correspondence to Carlos Andres Ferrero or Vania Bogorny.

Additional information

Responsible editor: Panagiotis Papapetrou.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Computing element distance vectors

In this appendix we detail the function \(ComputeElementDistanceVectors()\), introduced in Algorithm 1, which computes element distance vectors. The function computes, in every dimension, the distance between each trajectory element in \(T\) and each element of the trajectories in \(\mathbf{T}\), and is detailed in Algorithm 4. The algorithm takes as input the multidimensional trajectory \(T\) and the set of multidimensional trajectories \(\mathbf{T}\), and outputs a 4-dimensional array \(A_1\) containing all element distance values.

Algorithm 4: \(ComputeElementDistanceVectors()\) (pseudocode figure not reproduced here)

For a trajectory \(T\), Algorithm 4 computes the distance between every multidimensional trajectory element in \(T\) and every trajectory element in \(\mathbf{T}\), and stores the distance values in a 4-dimensional array called \(A_1\) (lines 2 to 12). In this loop, which visits each trajectory \(T_i\) in \(\mathbf{T}\), the algorithm explores each trajectory point in \(T\) at position \(j\) (lines 4 to 11), and for the \(j\)-th trajectory point in \(T\) it explores, for each dimension \(d\), each trajectory point in \(T_i\) at position \(k\) (lines 5 to 10). The innermost loop computes the distance between the \(j\)-th trajectory point in \(T\) and the \(k\)-th trajectory point in \(T_i\), represented by \(T[j]\) and \(T_i[k]\), respectively, at dimension \(d\) (line 7). The distance function \(ElementDistance()\) is specific to each dimension. After computing the distance value, the algorithm squares it and stores the result in \(A_1\). The 4-dimensional array \(A_1\) is indexed by \(i\), \(j\), \(d\), and \(k\), so that for the \(i\)-th trajectory it stores the distance values between all pairs \(T[j]\) and \(T_i[k]\) at dimension \(d\).
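The loop structure described for Algorithm 4 can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function name `element_distance_vectors`, the representation of a trajectory point as a tuple with one value per dimension, and the two per-dimension distance functions (absolute difference and 0/1 mismatch) are assumptions made for the example, and indices are 0-based.

```python
def element_distance_vectors(T, trajectories, element_distance):
    """Return A1 with A1[i][j][d][k] = squared distance between
    T[j] and trajectories[i][k] in dimension d (0-based indices)."""
    A1 = []
    for T_i in trajectories:                      # i-th trajectory of the set
        per_trajectory = []
        for point in T:                           # j-th point of T
            per_point = []
            for d, dist in enumerate(element_distance):   # dimension d
                # distances to every k-th point of T_i, squared before storing
                per_point.append([dist(point[d], other[d]) ** 2 for other in T_i])
            per_trajectory.append(per_point)
        A1.append(per_trajectory)
    return A1


# Hypothetical per-dimension distances (assumptions for this sketch):
# dimension 0 is numeric, dimension 1 is categorical.
def numeric_distance(a, b):
    return abs(a - b)


def mismatch_distance(a, b):
    return 0.0 if a == b else 1.0


T = [(1.0, "home"), (3.0, "work")]
trajectories = [[(1.0, "home"), (2.0, "work"), (5.0, "work")]]
A1 = element_distance_vectors(T, trajectories, [numeric_distance, mismatch_distance])
print(A1[0][0][0])  # squared numeric distances from T[0] to each point of T_0
```

Passing one distance function per dimension mirrors the statement that \(ElementDistance()\) is specific to each dimension; any other per-dimension distance could be substituted.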

Appendix B: Computing subtrajectory distance vectors

In this appendix we detail the function \(ComputeSubtrajectoryDistanceVectors()\), introduced in Algorithm 1, which computes subtrajectory distance vectors. The function computes the distance between all subtrajectories of length \(w\) in \(T\) and all subtrajectories of the same length in the set \(\mathbf{T}\). Computing these distances efficiently across multiple dimensions requires a dynamic programming strategy that avoids repeating distance computations: the function uses the subtrajectory distance values calculated for length \((w-1)\) and the element distance values in \(A_1\) to compute the subtrajectory distance values for length \(w\). The function is detailed in Algorithm 5, which takes as input the multidimensional trajectory \(T\), the set of multidimensional trajectories \(\mathbf{T}\), the arrays \(A_{w-1}\) and \(A_1\), which hold the subtrajectory distance values calculated for length \((w-1)\) and the element distance values, respectively, and \(w\), the subtrajectory length for which distances are to be calculated. The output of the algorithm is a new array \(A_{w}\) containing the distance values for subtrajectories of length \(w\).

Algorithm 5: \(ComputeSubtrajectoryDistanceVectors()\) (pseudocode figure not reproduced here)

As in Algorithm 4, the four nested loops of Algorithm 5 compute the distances between all subtrajectories of length \(w\) in \(T\) and all subtrajectories of length \(w\) in \(\mathbf{T}\) in all dimensions. Each value is obtained with a simple sum: the distance between the two subtrajectories of length \((w-1)\) is added to the distance between the next elements of those two subtrajectories, and the result is stored in the array \(A_w\) (lines 7 to 10).
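The recurrence described for Algorithm 5 can be sketched in Python as below. This is a hedged illustration rather than the authors' code: the function name `subtrajectory_distance_vectors` is hypothetical, indices are 0-based, and the sketch derives the trajectory lengths from the shape of \(A_1\) instead of taking \(T\) and \(\mathbf{T}\) as explicit inputs. It assumes \(A_1\) is laid out as `A1[i][j][d][k]`, and the small single-dimension \(A_1\) used in the example is made up for illustration.

```python
def subtrajectory_distance_vectors(A_prev, A1, w):
    """Return A_w with
    A_w[i][j][d][k] = A_prev[i][j][d][k] + A1[i][j + w - 1][d][k + w - 1],
    where A_prev holds the distances for length w - 1 (A1 itself when w == 2)."""
    A_w = []
    for i in range(len(A1)):                              # i-th trajectory of the set
        per_trajectory = []
        starts_in_T = max(len(A1[i]) - w + 1, 0)          # valid start positions j in T
        for j in range(starts_in_T):
            per_start = []
            for d in range(len(A1[i][j])):                # dimension d
                starts_in_Ti = max(len(A1[i][j][d]) - w + 1, 0)
                per_start.append([
                    # reuse the length-(w-1) distance, add the newly covered element pair
                    A_prev[i][j][d][k] + A1[i][j + w - 1][d][k + w - 1]
                    for k in range(starts_in_Ti)
                ])
            per_trajectory.append(per_start)
        A_w.append(per_trajectory)
    return A_w


# Made-up example: T = [1, 2, 4], T_0 = [1, 3, 4], one dimension,
# squared differences as element distances.
A1 = [[[[0, 4, 9]], [[1, 1, 4]], [[9, 1, 0]]]]
A2 = subtrajectory_distance_vectors(A1, A1, 2)
A3 = subtrajectory_distance_vectors(A2, A1, 3)
print(A2, A3)
```

Each entry of \(A_w\) costs a single addition because the length-\((w-1)\) sums are reused, which is the saving the dynamic programming strategy is meant to provide.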


Cite this article

Ferrero, C.A., Petry, L.M., Alvares, L.O. et al. MasterMovelets: discovering heterogeneous movelets for multiple aspect trajectory classification. Data Min Knowl Disc 34, 652–680 (2020). https://doi.org/10.1007/s10618-020-00676-x
