Abstract
A feature that often distinguishes one location from another is its social character, that is, how well its visitors know each other. In this paper, we address the question of how to infer the social connectedness of a location by observing the presence of mobile entities in it. We study a large number of mobility features that can be extracted from visits to a location. We use these features to predict the social tie strengths of the device owners present at the location at a given moment in time, and output an aggregate score of social connectedness for that location. We evaluate this method on a real-world dataset. Using a synthetically modified version of this dataset, we further evaluate its robustness against factors that normally degrade the quality of such ubiquitously collected data (e.g. noise, sampling frequency). In each case, we found that the proposed method was considerably more accurate than a state-of-the-art baseline methodology.
Notes
1. This anonymization approach is taken by recent mobile phone operating systems.
References
Seeman, T.E.: Social ties and health: the benefits of social integration. Ann. Epidemiol. 6(5), 442–451 (1996)
Kawachi, I., Berkman, L.F.: Social ties and mental health. J. Urban Health 78(3), 458–467 (2001)
Jylhä, M., Aro, S.: Social ties and survival among the elderly in Tampere, Finland. Int. J. Epidemiol. 18(1), 158–164 (1989)
Baratchi, M., Heijenk, G., van Steen, M.: Spaceprint: a mobility-based fingerprinting scheme for public spaces. arXiv preprint arXiv:1703.09962 (2017)
Petre, A.C., Chilipirea, C., Baratchi, M., Dobre, C., van Steen, M.: WiFi tracking of pedestrian behavior. In: Smart Sensors Networks: Communication Technologies and Intelligent Applications. Elsevier (2017)
Cunche, M., Kaafar, M.A., Boreli, R.: Linking wireless devices using information contained in Wi-Fi probe requests. Pervasive Mob. Comput. 11, 56–69 (2014)
Barbera, M.V., Epasto, A., Mei, A., Perta, V.C.: Signals from the crowd: uncovering social relationships through smartphone probes. In: Proceedings of the 2013 Conference on Internet Measurement Conference, pp. 265–276. ACM (2013)
McPherson, M., Smith-Lovin, L., Cook, J.M.: Birds of a feather: homophily in social networks. Ann. Rev. Sociol. 27, 415–444 (2001)
Mashhadi, A., Vanderhulst, G., Acer, U.G., Kawsar, F.: An autonomous reputation framework for physical locations based on WiFi signals. In: Proceedings of the 2nd Workshop on Physical Analytics, pp. 43–46. ACM (2015)
Cheng, N., Mohapatra, P., Cunche, M., Kaafar, M.A., Boreli, R.: Inferring user relationship from hidden information in WLANs. In: 2012 IEEE Military Communications Conference, MILCOM 2012, pp. 1–6. IEEE (2012)
Scellato, S., Noulas, A., Mascolo, C.: Exploiting place features in link prediction on location-based social networks. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1046–1054. ACM (2011)
Eagle, N., Pentland, A.S., Lazer, D.: Inferring friendship network structure by using mobile phone data. Proc. Natl. Acad. Sci. 106(36), 15274–15278 (2009)
Baratchi, M., Meratnia, N., Havinga, P.J.M.: On the use of mobility data for discovery and description of social ties. In: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013, NY, USA, pp. 1229–1236 (2013). http://doi.acm.org/10.1145/2492517.2500263
Di Luzio, A., Mei, A., Stefa, J.: Mind your probes: de-anonymization of large crowds through smartphone WiFi probe requests. In: The 35th Annual IEEE International Conference on Computer Communications, IEEE INFOCOM 2016, pp. 1–9. IEEE (2016)
Blum, A.L., Langley, P.: Selection of relevant features and examples in machine learning. Artif. Intell. 97(1), 245–271 (1997)
Misra, B.: iOS8 MAC randomization analyzed! (2014). http://blog.mojonetworks.com/ios8-mac-randomization-analyzed/. Accessed 21 Nov 2016
Appendix
This appendix provides algorithmic details of the initialization and utilization phases presented in Sects. 4.4 and 4.5.
Initialization phase: Algorithm 1 shows the pseudo code for the first part of the initialization phase, in which the samples are generated. The inputs of this algorithm are the mobility trace (a collection of detections of the form \(\langle d, t\rangle \), where \(d\) is a device and \(t\) is a timestamp) and a collection of pairwise social tie strengths between some of the devices in the mobility trace. Its output is a collection of samples, each of which is a tuple \(\langle st, \mathbf {MF}\rangle \), where \(st\) is the pairwise social tie strength of a pair of devices and \(\mathbf {MF}\) is the set of values of all the mobility features for the same device pair, calculated in lines 4–7.
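The sample generation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's Algorithm 1: the two feature functions (`co_detections`, `first_seen_gap`), the 60-second window, and the dictionary-based data layout are hypothetical stand-ins, since the actual mobility features are defined elsewhere in the paper.

```python
from collections import defaultdict

def group_by_device(trace):
    """Map each device to the sorted list of its detection timestamps."""
    by_device = defaultdict(list)
    for device, t in trace:
        by_device[device].append(t)
    return {d: sorted(ts) for d, ts in by_device.items()}

# Hypothetical mobility features (assumptions for illustration only).
def co_detections(ts_a, ts_b, window=60):
    """Count detections of a that lie within `window` seconds of a detection of b."""
    return sum(any(abs(ta - tb) <= window for tb in ts_b) for ta in ts_a)

def first_seen_gap(ts_a, ts_b):
    """Absolute difference between the first detections of the two devices."""
    return abs(ts_a[0] - ts_b[0])

FEATURES = {"co_detections": co_detections, "first_seen_gap": first_seen_gap}

def generate_samples(trace, tie_strengths):
    """Build one (st, MF) sample per device pair with a known tie strength."""
    by_device = group_by_device(trace)
    samples = []
    for (a, b), st in tie_strengths.items():
        if a not in by_device or b not in by_device:
            continue  # pair never observed in the trace
        mf = {name: f(by_device[a], by_device[b]) for name, f in FEATURES.items()}
        samples.append((st, mf))
    return samples

# Toy trace of (device, timestamp) detections and known tie strengths.
trace = [("A", 0), ("B", 30), ("A", 100), ("B", 500), ("C", 1000)]
ties = {("A", "B"): 0.8, ("A", "C"): 0.1}
samples = generate_samples(trace, ties)
```

Each sample pairs a known tie strength with a dictionary of feature values, so that a later feature selection step can pick any subset of the features by name.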
Algorithm 2 shows the pseudo code for feature selection and for learning a regressor. The input of this algorithm is the collection of samples generated by Algorithm 1, and its output is a regressor trained on the combination of features chosen during feature selection. After computing the mobility features for each pair of devices, we need to determine which features should be supplied to the regressor. Feature selection proceeds as follows [15]. Initially, the set of selected features is empty. The algorithm moves through the search space greedily by evaluating candidate features (lines 10–21), and it halts when no new feature improves the regression performance (line 18). The performance of the regressor is evaluated using 10-fold cross-validation, averaging the mean squared errors over the folds (line 15). Once the features are selected, we proceed to learning the regressor (line 22).
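A minimal sketch of greedy forward selection with 10-fold cross-validation is given below. It assumes samples of the form (st, MF) with MF stored as a dictionary of feature values. The regressor type is not fixed by this appendix, so a small k-nearest-neighbour regressor (`knn_fit`) is used purely as a stand-in; all function names here are hypothetical.

```python
import random

def knn_fit(X, y, k=3):
    """Stand-in regressor (assumption): mean target of the k nearest training rows."""
    def predict(row):
        order = sorted(range(len(X)),
                       key=lambda i: sum((p - q) ** 2 for p, q in zip(X[i], row)))
        nearest = order[:k]
        return sum(y[i] for i in nearest) / len(nearest)
    return predict

def cv_mse(samples, features, fit=knn_fit, folds=10, seed=0):
    """Average of the per-fold mean squared errors for a given feature subset."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    fold_errors = []
    for f in range(folds):
        test = idx[f::folds]
        if not test:
            continue  # fewer samples than folds
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        X = [[samples[i][1][n] for n in features] for i in train]
        y = [samples[i][0] for i in train]
        predict = fit(X, y)
        errs = [(predict([samples[i][1][n] for n in features]) - samples[i][0]) ** 2
                for i in test]
        fold_errors.append(sum(errs) / len(errs))
    return sum(fold_errors) / len(fold_errors)

def forward_select(samples, candidates, fit=knn_fit, folds=10):
    """Greedily add the feature that lowers the CV error most; stop when none does."""
    selected, best = [], float("inf")
    while True:
        scored = [(cv_mse(samples, selected + [c], fit, folds), c)
                  for c in candidates if c not in selected]
        if not scored:
            break
        score, feature = min(scored)
        if score >= best:
            break  # no remaining feature improves the regression performance
        selected.append(feature)
        best = score
    return selected, best

def learn_regressor(samples, candidates, fit=knn_fit, folds=10):
    """Select features, then train the final regressor on all samples."""
    selected, _ = forward_select(samples, candidates, fit, folds)
    X = [[mf[n] for n in selected] for _, mf in samples]
    y = [st for st, _ in samples]
    return selected, fit(X, y)
```

Passing the regressor as a `fit` callable keeps the selection loop independent of the model, so another learner can be swapped in without changing the search.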
Utilization phase: Algorithm 3 shows the pseudo code for the utilization phase. The algorithm takes the regressor generated by Algorithm 2, the mobility trace, and a timestamp, and outputs the aggregate social connectedness score for this mobility trace at that timestamp. The algorithm first determines which devices were present at the location at that moment in time (line 2). As in the initialization phase, it uses the detection timestamps of each device to determine when visits start and end. It then computes, for each pair of devices present, the values of the mobility features chosen by the feature selection algorithm (line 5). These feature values are supplied to the regressor, which predicts the tie strength for each pair (line 6). Finally, the tie strengths are averaged to obtain a score of aggregate social connectedness (line 8).
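The utilization phase can be sketched as follows. This is an illustration under assumptions rather than the paper's Algorithm 3: visits are taken to be maximal runs of detections separated by at most a hypothetical `max_gap` seconds, and `predict_tie` is a hypothetical callable that wraps feature computation and the trained regressor for a device pair.

```python
from itertools import combinations

def visits(timestamps, max_gap=300):
    """Split sorted timestamps into (start, end) visits (assumed gap rule)."""
    result = []
    start = prev = timestamps[0]
    for t in timestamps[1:]:
        if t - prev > max_gap:
            result.append((start, prev))  # gap too long: close the current visit
            start = t
        prev = t
    result.append((start, prev))
    return result

def present_devices(trace, t, max_gap=300):
    """Devices that have a visit covering time t."""
    by_device = {}
    for device, ts in trace:
        by_device.setdefault(device, []).append(ts)
    return sorted(d for d, ts in by_device.items()
                  if any(s <= t <= e for s, e in visits(sorted(ts), max_gap)))

def social_connectedness(trace, t, predict_tie, max_gap=300):
    """Average predicted tie strength over all pairs of devices present at t."""
    devices = present_devices(trace, t, max_gap)
    ties = [predict_tie(a, b) for a, b in combinations(devices, 2)]
    return sum(ties) / len(ties) if ties else None  # undefined for fewer than two devices

# Toy usage with a lookup table standing in for the trained regressor.
trace = [("A", 0), ("A", 100), ("B", 50), ("B", 150), ("C", 60), ("C", 90)]
known = {("A", "B"): 0.9, ("A", "C"): 0.3, ("B", "C"): 0.3}
score = social_connectedness(trace, 75, lambda a, b: known[(a, b)])
```

Returning `None` when fewer than two devices are present is a design choice of this sketch; the source does not state how that case is handled.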
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Brugman, T., Baratchi, M., Heijenk, G., van Steen, M. (2017). Inferring the Social-Connectedness of Locations from Mobility Data. In: Ciampaglia, G., Mashhadi, A., Yasseri, T. (eds) Social Informatics. SocInfo 2017. Lecture Notes in Computer Science, vol 10540. Springer, Cham. https://doi.org/10.1007/978-3-319-67256-4_35
DOI: https://doi.org/10.1007/978-3-319-67256-4_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67255-7
Online ISBN: 978-3-319-67256-4
eBook Packages: Computer Science, Computer Science (R0)