Abstract
The cellular network is now a nearly ubiquitous, real-time sensor, with coverage anywhere and anytime for any device. Mobile network data is a rich source for official statistics, such as human mobility. However, unlike GPS tracks, this data describes each mobile device without precise knowledge of its location. Furthermore, it carries no information about the device’s mobility status (i.e., whether it is moving or not) or its speed, both of which are important for behavioral analysis. Common mobility and speed estimation methods rely on precise location and do not consider the risk of privacy leakage. In this work, we propose two probabilistic approaches that estimate, respectively, a device’s mobility and a device’s speed from cellular data and connection likelihood maps for each network cell. Every estimation is computed in a short time and with a short history of data (for speed and for mobility). This constraint may help comply with the most stringent legal frameworks for mobile operators, including the combination of the ePrivacy Directive and the General Data Protection Regulation (GDPR) in Europe. To our knowledge, the proposed approaches are the first to allow both mobility and speed estimation in this context. We experimented on two datasets, obtained from a mobile network operator’s signaling data and the associated GPS tracks of many consenting users. Our speed estimations are over 20% more accurate than common ones based on mobile sites, and we provide confidence intervals for each estimation. Mainly due to mobile network uncertainty, our approach for speed estimation is relatively inaccurate at low speeds, where movement detection can remain unclear. However, our approach for mobility estimation fills this gap.










Notes
Regarding the reproducibility of the solution, the users of the 1st dataset have agreed to let us use their mobile data to test our methods. It is possible to contact the authors to discuss potential access to the raw data. Concerning the scalability of the solution, the users of the 2nd dataset have accepted that we use their mobile data to test our methods. We can only publish the results, but not the raw data, as this purpose is not included in the consent signed by the users.
References
Attar, A.E.: Estimation robuste des modèles de mélange sur des données distribuées (2012). https://api.semanticscholar.org/CorpusID:40602371
Bayes, T.: LII. An essay towards solving a problem in the doctrine of chances. By the late Rev. Mr. Bayes, F.R.S. communicated by Mr. Price, in a letter to John Canton, A.M.F.R.S. Philos. Trans. Roy. Soc. Lond. 53, 370–418 (1763)
Bhattacharyya, A.: On a measure of divergence between two multinomial populations. Sankhya Indian J. Stat. (1933–1960) 7(4), 401–406 (1946)
Blondel, V.D., Decuyper, A., Krings, G.: A survey of results on mobile phone datasets analysis. EPJ Data Sci. 4, 1–55 (2015)
Bonnetain, L.: Unlocking the potential of mobile phone data for large scale urban mobility estimation. PhD thesis, Université de Lyon (2022)
Bufort, A., Lebocq, L., Cathabard, S.: Data-driven radio propagation modeling using graph neural networks. TechRxiv (2023)
Chambreuil, P., Jeon, J.Y., Barba, T.: The value of network data confirmed by the covid-19 epidemic and its expanded usages. Data Policy 4, e4 (2022)
Chao, P., Xu, Y., Hua, W., et al.: A survey on map-matching algorithms. In: Databases Theory and Applications: 31st Australasian Database Conference, ADC 2020, Melbourne, VIC, Australia, February 3–7, 2020, Proceedings 31, pp 121–133. Springer (2020)
Chen, C.H.: A cell probe-based method for vehicle speed estimation. IEICE Trans. Fund. Electron. Commun. Comput. Sci. 103, 265–267 (2020). https://doi.org/10.1587/transfun.2019TSL0001
Chung, J., Kannappan, P., Ng, C., et al.: Measures of distance between probability distributions. J. Math. Anal. Appl. 138, 280–292 (1989). https://doi.org/10.1016/0022-247X(89)90335-1
De Montjoye, Y.A., Hidalgo, C.A., Verleysen, M., et al.: Unique in the crowd: the privacy bounds of human mobility. Sci. Rep. 3(1), 1–5 (2013)
del Peral-Rosado, J.A., Raulefs, R., López-Salcedo, J.A., et al.: Survey of cellular mobile radio localization methods: from 1g to 5g. IEEE Commun. Surv. Tutor. 20(2), 1124–1148 (2018). https://doi.org/10.1109/COMST.2017.2785181
Deville, P., Linard, C., Martin, S., et al.: Dynamic population mapping using mobile phone data. Proc. Natl. Acad. Sci. 111(45), 15888–15893 (2014)
Dong, H., Man, J., Jia, L., et al.: Traffic speed estimation using mobile phone location data based on longest common subsequence. In: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp 2819–2824. IEEE (2018)
Fiore, M., Katsikouli, P., Zavou, E., et al.: Privacy in trajectory micro-data publishing: a survey. Trans. Data Privacy 13, 91–149 (2020)
Garnier, J., Méléard, S., Touzi, N.: Aléatoire. Dpt de Mathématiques Appliquées, Ecole polytechnique (2019)
Gonzalez, M.C., Hidalgo, C.A., Barabasi, A.L.: Understanding individual human mobility patterns. Nature 453(7196), 779–782 (2008)
Graells-Garrido, E., Peredo, O., García, J.: Sensing urban patterns with antenna mappings: the case of Santiago, Chile. Sensors 16(7), 1098 (2016). https://doi.org/10.3390/s16071098
Hellinger, E.: Die orthogonalinvarianten quadratischer formen von unendlichvielen variabelen. W. Fr Kaestner (1907)
Järv, O., Tenkanen, H., Toivonen, T.: Enhancing spatial accuracy of mobile phone data using multi-temporal dasymetric interpolation. Int. J. Geogr. Inf. Sci. 31(8), 1630–1651 (2017)
Ji, Q., Jin, B., Cui, Y., et al.: Using mobile signaling data to classify vehicles on highways in real time. In: 2017 18th IEEE International Conference on Mobile Data Management (MDM), pp 174–179. IEEE (2017)
Katsikouli, P., Fiore, M., Furno, A., et al.: Characterizing and removing oscillations in mobile phone location data. In: 2019 IEEE 20th International Symposium on “A World of Wireless, Mobile and Multimedia Networks” (WoWMoM), pp 1–9. IEEE (2019)
Kiefer, S.: On computing the total variation distance of hidden markov models (2018). arXiv preprint arXiv:1804.06170
Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
Lai, W.K., Kuo, T.H.: Vehicle positioning and speed estimation based on cellular network signals for urban roads. ISPRS Int. J. Geo Inf. 5(10), 181 (2016)
Lindsay, B.G.: Mixture models: theory, geometry and applications. In: NSF-CBMS Regional Conference Series in Probability and Statistics, vol. 5, pp. i–163 (1995). http://www.jstor.org/stable/4153184
Luo, A., Chen, S., Xv, B.: Enhanced map-matching algorithm with a hidden Markov model for mobile phone positioning. ISPRS Int. J. Geo Inf. 6(11), 327 (2017). https://doi.org/10.3390/ijgi6110327
Meersman, F.D., Seynaeve, G., Debusschere, M., et al.: Assessing the quality of mobile phone data as a source of statistics. In: Statistics, Belgium (2016)
Mohamed, R., Aly, H., Youssef, M.: Accurate real-time map matching for challenging environments. IEEE Trans. Intell. Transp. Syst. 18(4), 847–857 (2016)
Newson, P., Krumm, J.: Hidden markov map matching through noise and sparseness. In: Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 336–343 (2009)
Obradovic, D., Lenz, H., Schupfner, M.: Fusion of map and sensor data in a modern car navigation system. VLSI Signal Process. 45, 111–122 (2006). https://doi.org/10.1007/s11265-006-9775-4
Ogulenko, A., Benenson, I., Omer, I., et al.: Probabilistic positioning in mobile phone network and its consequences for the privacy of mobility data. Comput. Environ. Urban Syst. 85, 101550 (2021). https://doi.org/10.1016/j.compenvurbsys.2020.101550
Pullano, G., Valdano, E., Scarpa, N., et al.: Evaluating the effect of demographic factors, socioeconomic factors, and risk aversion on mobility during the covid-19 epidemic in France under lockdown: a population-based study. Lancet Digit. Health 2, e638–e649 (2020). https://doi.org/10.1016/S2589-7500(20)30243-0
Pyo, J.S., Shin, D.H., Sung, T.K.: Development of a map matching method using the multiple hypothesis technique. In: ITSC 2001. 2001 IEEE Intelligent Transportation Systems. Proceedings (Cat. No. 01TH8585), pp 23–27. IEEE (2001)
Qi, Y., Yu, C., Suh, Y.J., et al.: Gps tethering for energy conservation. In: 2015 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1320–1325. IEEE (2015)
Ricciato, F., Widhalm, P., Pantisano, F., et al.: Beyond the “single-operator, CDR-only” paradigm: an interoperable framework for mobile phone network data analyses and population density estimation. Pervasive Mob. Comput. (2016). https://doi.org/10.1016/j.pmcj.2016.04.009
Ricciato, F., Lanzieri, G., Wirthmann, A., et al.: Towards a methodological framework for estimating present population density from mobile network operator data. Pervasive Mob. Comput. 68, 101263 (2020). https://doi.org/10.1016/j.pmcj.2020.101263
Tennekes, M., Gootzen, Y.A.: A bayesian approach to location estimation of mobile devices from mobile network operator data (2021). arXiv preprint arXiv:2110.00439
Wang, F., Chen, C.: On data processing required to derive mobility patterns from passively-generated mobile phone data. Transport. Res. Part C Emerg. Technol. 87, 58–74 (2018). https://doi.org/10.1016/j.trc.2017.12.003
Wasserman, L.: All of Statistics: A Concise Course in Statistical Inference, vol. 26. Springer (2004)
Wu, W., Wang, Y., Gomes, J.B., et al.: Oscillation resolution for mobile phone cellular tower data to enable mobility modelling. In: 2014 IEEE 15th International Conference on Mobile Data Management, pp. 321–328. IEEE (2014)
Yamartino, R.J.: A comparison of several “single-pass” estimators of the standard deviation of wind direction. J. Appl. Meteorol. Climatol. 23(9), 1362–1366 (1984)
Zheng, Y.: Trajectory data mining: an overview. ACM Trans. Intell. Syst. Technol. 6(3), 1–41 (2015)
Acknowledgements
This work is a part of a research project carried out at Orange Innovation in collaboration with the Internet Physics Chair (Mines Paris - PSL University). This work is (partially) supported by the EIPHI Graduate School (contract ANR-17-EURE-0002). Some of the aspects discussed in this article are the subject of two Orange patent protection applications (that can be found on Espacenet (https://worldwide.espacenet.com/?locale=fr_EP) under the name of the corresponding author).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Proof of Eq. 3
To prove Eq. (3), we closely follow Garnier et al. (2019, pp. 123–127).
Let us consider \(\begin{array}{lrcl} g_0 : &{} {\mathbb {R}}^4 &{} \longrightarrow &{} {\mathbb {R}}_+ \\ &{} (x_1,y_1, x_2, y_2) &{} \longmapsto &{} \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2} \\ \end{array}\)
which is measurable.
Therefore, \(g_0(X_1, Y_1, X_2, Y_2)\), the distance between the two locations, is a random variable.
We introduce now \(\begin{array}{lrcl} g' : &{} {\mathbb {R}}^4 &{} \longrightarrow &{} {\mathbb {R}}^3 \times {\mathbb {R}}_+ \\ &{} (x_1,y_1, x_2, y_2) &{} \longmapsto &{} (x_1,y_1, x_2, \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2}) \\ \end{array}\)
The function \(g'\) is clearly not bijective. However, we can partition \({\mathbb {R}}^4\) such that
\[{\mathbb {R}}^4 = \{(x_1,y_1,x_2,y_2) : y_2 - y_1 > 0\} \cup \{(x_1,y_1,x_2,y_2) : y_2 - y_1 < 0\} \cup \{(x_1,y_1,x_2,y_2) : y_2 = y_1\}\]
and denote these three sets by \(E_1\), \(E_2\), and \(E_3\), respectively, so that \({\mathbb {R}}^4 = E_1 \cup E_2 \cup E_3\). Note that \(E_1\) and \(E_2\) are open sets of \({\mathbb {R}}^4\) for the Euclidean norm: \(E_1\) (resp. \(E_2\)) is the preimage of the open interval \(]0, +\infty [\) (resp. \(]-\infty , 0[\)) under the continuous map \(\begin{array}{lrcl} k : &{} {\mathbb {R}}^4 &{} \longrightarrow &{} {\mathbb {R}} \\ &{} (x_1,y_1, x_2, y_2) &{} \longmapsto &{} y_2-y_1 \\ \end{array}\)
We can also notice that \(E_3\) is a linear subspace of the normed space \({\mathbb {R}}^4\). Since \(E_3\) is not equal to \({\mathbb {R}}^4\), its interior is empty, and as a hyperplane its Lebesgue measure is zero.
Let h denote a bounded and continuous function from \({\mathbb {R}}^4\) to \({\mathbb {R}}\). Therefore,
\[{\mathbb {E}}\left[ h\left( g'(X_1, Y_1, X_2, Y_2)\right) \right] = \int _{E_1} + \int _{E_2} + \int _{E_3}\]
thanks to the assumption of independence of \((X_1, Y_1)\) and \((X_2, Y_2)\) and omitting the integrands in each integral. The third integral (over \(E_3\)) equals 0 because \(E_3\) has Lebesgue measure zero, and we want to apply the change of variable \(d = \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2}\) in the other two.
Then, we consider \(\begin{array}{lrcl} g'_{1} : &{} E_1 &{} \longrightarrow &{} F \\ &{} (x_1,y_1, x_2, y_2) &{} \longmapsto &{} (x_1,y_1, x_2, \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2}) \\ \end{array}\)
with \(F = \{(x_1, y_1, x_2, d) \in {\mathbb {R}}^3 \times {\mathbb {R}}^*_+ : d > |x_2-x_1|\}\).
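The function \(g'_{1}\) is a continuously differentiable bijection from \(E_1\) onto \(F\), so its inverse can be defined as \(\begin{array}{lrcl} g'^{-1}_{1} : &{} F &{} \longrightarrow &{} E_1 \\ &{} (x_1,y_1, x_2, d) &{} \longmapsto &{} (x_1,y_1, x_2, y_1 + \sqrt{d^2 - (x_2-x_1)^2}) \\ \end{array}\)
where the sign in front of the square root is positive because \(y_2 - y_1 > 0\) on \(E_1\). Writing \(s = \sqrt{d^2 - (x_2-x_1)^2}\), the associated Jacobian matrix is therefore
\[J = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \frac{x_2-x_1}{s} & 1 & -\frac{x_2-x_1}{s} & \frac{d}{s} \end{pmatrix}\]
and the absolute value of its determinant is
\[|\det J| = \frac{d}{\sqrt{d^2 - (x_2-x_1)^2}}.\]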
We can do the same with \(E_2\) and define the bijective continuously differentiable function \(\begin{array}{lrcl} g'_{2} : &{} E_2 &{} \longrightarrow &{} F \\ &{} (x_1,y_1, x_2, y_2) &{} \longmapsto &{} (x_1,y_1, x_2, \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2}) \\ \end{array}\)
and therefore obtain
Thus, we can show that
with
and \({\varvec{1}}\) being the indicator function.
Finally, we can obtain
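As a sketch of where this change of variables leads, assume (as above) that \((X_1, Y_1)\) and \((X_2, Y_2)\) are independent with densities \(f_1\) and \(f_2\). The names \(D\) for the distance \(g_0(X_1, Y_1, X_2, Y_2)\) and \(s\) for the square root below are shorthand introduced here for illustration only, and the expression should be checked against Eq. (3) in the main text. Summing the contributions of \(E_1\) and \(E_2\) and integrating out \((x_1, y_1, x_2)\) would give, for \(d > 0\),
\[f_D(d) = \int _{{\mathbb {R}}^3} f_1(x_1, y_1) \left[ f_2(x_2, y_1 + s) + f_2(x_2, y_1 - s) \right] \frac{d}{s} \, {\varvec{1}}_{\{d > |x_2 - x_1|\}} \, dx_1 \, dy_1 \, dx_2, \qquad s = \sqrt{d^2 - (x_2-x_1)^2}.\]
Here the factor \(d/s\) is the absolute value of the Jacobian determinant of the inverse maps on \(E_1\) and \(E_2\), and the indicator restricts the integration to \(F\).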
Appendix B: Direction estimation model
We consider a device making two events as in Sect. Main assumptions, and we define,
the random variable representing the direction of movement between \(e_1\) and \(e_2\). This definition relies on the third assumption made in Sect. Main assumptions, and on the fact that \((X_i, Y_i)\) are random variables giving Lambert II coordinates (plane projection). We keep denoting by \(f_i\) the spatial probability density of presence related to \((X_i, Y_i)\) and \(C_i\) at time \(t_i\), which can be obtained as in Sect. Cell coverage modelling. The second assumption made in Sect. Main assumptions (independence) is needed here to show, in the same way as in Appendix A, that \(\Theta _{12}\) has density \(f_{\Theta _{12}}\), and for all \(\theta \in [0, 2\pi [\),
with
and \({\varvec{1}}\) being the indicator function.
Integrating the density (28) over a direction interval \([\theta _a, \theta _b]\) included in
and noting
(i.e., all the location pairs corresponding to an angle of more than \(\theta _a\) and less than \(\theta _b\)), one obtains, after inversion and with the change of variable \(y_2 = y_1 + (x_2-x_1)\tan (\theta )\), a much more convenient form for numerical calculation:
with
At this point, we can make the same remarks as in Sect. Speed density: two network events, but considering direction (angle) instead of distance or speed. Then we can follow the method defined in Sect. Oscillation phenomenon mitigation and Sect. Speed density: multiple network events, with only slight changes, in order to deduce a direction probability density (of a variable \(\Theta\)).
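To make the idea of a direction density between two events concrete, the following minimal sketch works on a discretized version of the problem. It assumes each spatial density \(f_i\) has been reduced to a finite list of candidate locations with probabilities, and that the direction is the angle of the displacement vector measured from the x-axis in \([0, 2\pi [\). The function name `direction_histogram` and the input layout are hypothetical, and this brute-force enumeration is a swapped-in approximation, not the integral form used in this appendix.

```python
import math


def direction_histogram(points1, points2, n_bins=36):
    """Approximate direction distribution between two independent locations.

    points1, points2: lists of (x, y, probability) in planar coordinates.
    Returns n_bins probabilities over [0, 2*pi), one per angular bin.
    """
    width = 2.0 * math.pi / n_bins
    hist = [0.0] * n_bins
    for x1, y1, p1 in points1:
        for x2, y2, p2 in points2:
            dx, dy = x2 - x1, y2 - y1
            if dx == 0 and dy == 0:
                continue  # direction is undefined when both locations coincide
            theta = math.atan2(dy, dx) % (2.0 * math.pi)
            # Clamp guards against theta rounding up to exactly 2*pi.
            k = min(int(theta / width), n_bins - 1)
            hist[k] += p1 * p2  # independence: joint weight is the product
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# Toy example: one sure start point, two equally likely end points.
start = [(0.0, 0.0, 1.0)]
end = [(1.0, 1.0, 0.5), (-1.0, 1.0, 0.5)]
print(direction_histogram(start, end, n_bins=4))
```

In the toy example the two displacements point at 45 and 135 degrees, so the mass is split between the first two quadrant bins.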
Then, computing a mean direction and a confidence interval requires a bit of caution. A direction probability density p is a circular distribution and its mean direction is \({\mathbb {E}}(\Theta ) = \arg (m)\) with \(m = \int _{0}^{2\pi } p(\theta )e^{i\theta }d\theta\), i being the imaginary unit.
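The caution mentioned above can be illustrated numerically. The sketch below assumes the direction density \(p\) has been discretized into angles with weights (the function name `mean_direction` is hypothetical) and approximates \(m\) by a weighted sum of unit complex numbers. For two equally likely directions of 350 and 10 degrees, the arithmetic mean would be 180 degrees, whereas the circular mean is 0 degrees.

```python
import cmath
import math


def mean_direction(thetas, weights):
    """Circular mean of weighted angles (radians).

    Returns (mean angle in [0, 2*pi), first trigonometric moment m).
    """
    total = sum(weights)
    # Discrete counterpart of m = integral of p(theta) * exp(i * theta).
    m = sum(w * cmath.exp(1j * t) for t, w in zip(thetas, weights)) / total
    return cmath.phase(m) % (2.0 * math.pi), m


angles = [math.radians(350.0), math.radians(10.0)]
mean, m = mean_direction(angles, [0.5, 0.5])
# Circular distance between the estimated mean and the direction 0.
print(min(mean, 2.0 * math.pi - mean), abs(m))
```

The modulus of \(m\) is close to 1 when the density is concentrated and close to 0 when it is spread out, in which case the mean direction is poorly defined.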
To the best of our knowledge, there is no computation of confidence intervals for circular distributions, but it is possible to estimate standard deviations. One possible estimation is the one proposed in Yamartino (1984), \(\sigma _{\Theta } = \arcsin (\epsilon )(1+(\frac{2}{\sqrt{3}} - 1)\epsilon ^{3})\) with \(\epsilon = \sqrt{1-(Re(m)^2 + Im(m)^2)}\), where Re(m) and Im(m) are the real and imaginary parts of m.
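A minimal sketch of this estimator, assuming again a discretized direction density given as angles with weights (the function name `yamartino_std` is hypothetical), could look as follows. Two limiting cases serve as sanity checks: a single sure direction gives a standard deviation of about 0, and a uniform density gives \(\pi /\sqrt{3}\), which is the standard deviation of a uniform law on an interval of length \(2\pi\).

```python
import cmath
import math


def yamartino_std(thetas, weights):
    """Yamartino (1984) estimate of the circular standard deviation (radians)."""
    total = sum(weights)
    m = sum(w * cmath.exp(1j * t) for t, w in zip(thetas, weights)) / total
    # Clamp to [0, 1] to absorb floating-point rounding before sqrt and asin.
    eps_sq = min(1.0, max(0.0, 1.0 - (m.real ** 2 + m.imag ** 2)))
    eps = math.sqrt(eps_sq)
    return math.asin(eps) * (1.0 + (2.0 / math.sqrt(3.0) - 1.0) * eps ** 3)


uniform = [2.0 * math.pi * k / 360 for k in range(360)]
print(yamartino_std([1.0], [1.0]))             # concentrated: close to 0
print(yamartino_std(uniform, [1.0] * 360))     # uniform: close to pi / sqrt(3)
```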
However, this variant for computing a direction of movement does not give results as satisfactory as those of the proposed speed estimation model. The estimated direction looks accurate qualitatively (i.e., when inspecting some examples on a map), but it is not much more accurate than naive approaches (i.e., based on mobile sites and a rough direction estimation). Therefore, we leave this as future work and present in this paper only the experiments on speed and mobility estimation (see Sect. Experimentation).
It is also possible to propose the same kind of approach for mobility estimation, but based on direction (angle) densities computed as in this appendix. The intuition is that when a movement occurs, the direction density of a mobile device becomes narrow, whereas it is wide when the mobile device is static (mainly because of the oscillation phenomenon and the relatively wide covered areas). We intend to investigate this approach further in future work.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Scholler, R., Alaoui-Ismaïli, O., Renaud, D. et al. In-stream mobility and speed estimation of mobile devices from mobile network data. Transportation (2024). https://doi.org/10.1007/s11116-024-10494-5
DOI: https://doi.org/10.1007/s11116-024-10494-5

