Abstract
Releasing and sharing mobility data, and specifically trajectories, is necessary for many applications, from infrastructure planning to epidemiology. Yet, trajectories are highly sensitive data, because the points visited by an individual can be identifying and also confidential. Hence, trajectories must be anonymized before releasing or sharing them. While most contributions to the trajectory anonymization literature take statistical approaches, deep learning is increasingly being used. We observe that natural language sentences and trajectories share a sequential nature that can be exploited in similar ways. In this paper, we present preliminary work on generating synthetic trajectories using machine learning models typically used for natural language processing. Our empirical results attest to the quality of the generated synthetic trajectories. Furthermore, our methods allow discovering natural neighborhoods based on trajectories.
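As a rough illustration of the analogy between sentences and trajectories, the following sketch discretizes location points into grid-cell "words", fits a language model on the resulting "sentences", and samples synthetic trajectories from it. It is not the method of this paper: a simple bigram (first-order Markov) model stands in for the neural language models mentioned above, and the grid size, token format, and all function names are hypothetical choices made for the example.

```python
import random
from collections import Counter, defaultdict

# Hypothetical sentence-boundary markers, as used in language modeling.
START, END = "<s>", "</s>"


def to_tokens(points, cell=0.01):
    """Map (lat, lon) points to grid-cell tokens, dropping consecutive repeats.

    The cell size (in degrees) is an assumed parameter, not taken from the paper.
    """
    tokens = []
    for lat, lon in points:
        tok = f"{int(lat // cell)}_{int(lon // cell)}"
        if not tokens or tokens[-1] != tok:
            tokens.append(tok)
    return tokens


def fit_bigram(sentences):
    """Count token-to-next-token transitions over all token sequences."""
    counts = defaultdict(Counter)
    for sent in sentences:
        padded = [START] + sent + [END]
        for prev, nxt in zip(padded, padded[1:]):
            counts[prev][nxt] += 1
    return counts


def sample(counts, rng, max_len=50):
    """Sample one synthetic token sequence from the fitted bigram counts.

    Assumes the model was fitted on at least one trajectory.
    """
    out, prev = [], START
    for _ in range(max_len):
        options = counts[prev]
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return out


# Toy usage: two made-up trajectories on a coarse 1-degree grid.
real = [
    [(0.5, 0.5), (0.6, 0.5), (1.5, 0.5), (1.5, 1.5)],
    [(0.5, 0.5), (0.5, 1.5), (1.5, 1.5)],
]
model = fit_bigram([to_tokens(t, cell=1.0) for t in real])
print(sample(model, random.Random(0)))
```

Every transition in a sampled sequence is one that was observed in the training trajectories, so a sample can recombine segments of different real trajectories. A neural language model would replace `fit_bigram` and `sample` while keeping the same tokenization step.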
Notes
1. A SF local’s guide to the neighborhoods of San Francisco. https://sfgal.com/sf-locals-guide-to-neighborhoods-of-san-francisco/
Acknowledgements
This research was funded by the European Commission (projects H2020-871042 “SoBigData++” and H2020-101006879 “MobiDataLab”), and the Government of Catalonia (ICREA Acadèmia Prize to J. Domingo-Ferrer, FI grant to N. Jebreel). The authors are with the UNESCO Chair in Data Privacy, but the views in this paper are their own and are not necessarily shared by UNESCO.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Blanco-Justicia, A., Jebreel, N.M., Manjón, J.A., Domingo-Ferrer, J. (2022). Generation of Synthetic Trajectory Microdata from Language Models. In: Domingo-Ferrer, J., Laurent, M. (eds) Privacy in Statistical Databases. PSD 2022. Lecture Notes in Computer Science, vol 13463. Springer, Cham. https://doi.org/10.1007/978-3-031-13945-1_13
DOI: https://doi.org/10.1007/978-3-031-13945-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-13944-4
Online ISBN: 978-3-031-13945-1
eBook Packages: Computer Science (R0)