Probabilistic Analysis of a Large-Scale Urban Traffic Sensor Data Set
Real-world sensor time series are often significantly noisier and more difficult to work with than the relatively clean data sets used as the basis for experiments in many research papers. In this paper we report on a large case study involving statistical data mining of over 100 million measurements from 1700 freeway traffic sensors over a period of seven months in Southern California. We discuss the challenges posed by the wide variety of sensor failures and anomalies present in the data. The volume and complexity of the data preclude the use of manual visualization or simple thresholding techniques to identify these anomalies. We describe the application of probabilistic modeling and unsupervised learning techniques to this data set and illustrate how these approaches can successfully detect underlying systematic patterns even in the presence of substantial noise and missing data.
Keywords: Probabilistic modeling, MMPP, traffic, loop sensors, Poisson, Markov