A Privacy-Preserving Semantic Annotation Framework Using Online Social Media

  • Shuo Wang
  • Richard Sinnott
  • Surya Nepal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10966)


Semantic annotation frameworks, which enrich locations or trajectories with semantic abstractions of raw spatiotemporal data, help in understanding the semantic behavior of moving objects. Existing semantic annotation approaches mainly analyze specific parts of a trajectory, e.g. stops, in association with data from third-party geographic sources, e.g. points of interest (POI) and road networks. However, these semantic resources are static and thus miss important dynamic event information. Location-based social networks have recently become a dynamic and prevalent source of human activity data that can serve as a semantic resource for annotation. However, using large-scale spatiotemporal data from online social media gives rise to privacy concerns. This paper therefore presents a privacy-preserving semantic annotation framework, P-SAFE, that (i) identifies dynamic regions of interest (DRI) from large-scale data provided by location-based social networks and labels each DRI with an appropriate category derived from the spatial and temporal features of geotags, (ii) aligns trajectories to a set of DRI and enriches them with semantic annotations derived from the aligned DRI via a THMM model, and (iii) embeds robust privacy-preserving mechanisms under differential privacy in each stage that accesses raw data. P-SAFE tackles the privacy and utility trade-offs of meaningful geographic region identification and labeling, as well as trajectory semantic annotation, under differential privacy, while combining them into a single task. We demonstrate the effectiveness of P-SAFE on a dataset of large-scale geotagged tweets and a benchmark trajectory dataset, used for DRI construction and for evaluating trajectory semantic annotation. The experimental results show that P-SAFE not only provides robust privacy guarantees but also retains approximately 45–56% accuracy for meaningful geographic region labelling and 62–76% accuracy for trajectory semantic annotation.
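To make the idea of applying differential privacy at a stage that touches raw geotags more concrete, the following is a minimal sketch of the standard Laplace mechanism applied to per-cell geotag counts. It is not the P-SAFE mechanism itself, which the abstract does not specify. The grid discretisation, the function names (grid_cell, laplace_noise, private_cell_counts), the cell size, and the rule that each user contributes at most one geotag (so that adding or removing a user changes one count by at most 1) are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to a coarse grid cell (hypothetical discretisation)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise.

    The difference of two independent Exp(1) draws follows Laplace(0, 1).
    """
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_cell_counts(geotags, cells, epsilon, cell_deg=0.01, seed=None):
    """Return noisy geotag counts for a fixed, public list of grid cells.

    geotags: iterable of (user_id, lat, lon) tuples.
    cells:   the cells to report on, chosen independently of the data.

    Assumption: only the first geotag of each user is counted, so the
    histogram has sensitivity 1 and Laplace noise of scale 1/epsilon
    yields epsilon-differential privacy at the user level.
    """
    rng = random.Random(seed)
    seen_users = set()
    counts = Counter()
    for user_id, lat, lon in geotags:
        if user_id in seen_users:
            continue  # enforce the one-contribution-per-user assumption
        seen_users.add(user_id)
        counts[grid_cell(lat, lon, cell_deg)] += 1

    scale = 1.0 / epsilon
    # Noise is added to every reported cell, including empty ones, so that
    # the set of reported cells itself reveals nothing about the data.
    return {cell: counts[cell] + laplace_noise(scale, rng) for cell in cells}
```

A smaller epsilon gives stronger privacy and noisier counts, which is the privacy and utility trade-off the abstract refers to. A density-based region detector could then run on the noisy counts rather than on the raw geotags.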



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Monash University, Melbourne, Australia
  2. University of Melbourne, Melbourne, Australia
  3. CSIRO, Sydney, Australia
