Probabilistic Analysis of a Large-Scale Urban Traffic Sensor Data Set

  • Jon Hutchins
  • Alexander Ihler
  • Padhraic Smyth
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5840)


Real-world sensor time series are often significantly noisier and more difficult to work with than the relatively clean data sets that tend to be used as the basis for experiments in many research papers. In this paper we report on a large case study involving statistical data mining of over 100 million measurements from 1700 freeway traffic sensors over a period of seven months in Southern California. We discuss the challenges posed by the wide variety of sensor failures and anomalies present in the data. The volume and complexity of the data preclude the use of manual visualization or simple thresholding techniques to identify these anomalies. We describe the application of probabilistic modeling and unsupervised learning techniques to this data set and illustrate how these approaches can successfully detect underlying systematic patterns even in the presence of substantial noise and missing data.
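
As a rough illustration of why a probabilistic model can be preferable to a fixed threshold, the sketch below scores a single sensor count under a two-state Poisson mixture. It is not the model used in the paper: it ignores the Markov dynamics and time-varying rates that the MMPP keyword implies, and the function names, rates, and prior (`event_posterior`, `normal_rate`, `event_rate`, `prior_event`) are hypothetical values chosen only for illustration.

```python
import math


def poisson_logpmf(count, rate):
    """Log of the Poisson probability P(K = count | rate)."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def event_posterior(count, normal_rate, event_rate, prior_event=0.05):
    """Posterior probability that `count` was generated by the event state.

    Assumption (not from the paper): counts come from one of two Poisson
    distributions, 'normal' or 'event', with a fixed prior on the event state.
    """
    log_normal = math.log(1.0 - prior_event) + poisson_logpmf(count, normal_rate)
    log_event = math.log(prior_event) + poisson_logpmf(count, event_rate)
    # Subtract the larger log term before exponentiating for numerical stability.
    top = max(log_normal, log_event)
    w_normal = math.exp(log_normal - top)
    w_event = math.exp(log_event - top)
    return w_event / (w_normal + w_event)


if __name__ == "__main__":
    # Hypothetical rates: 20 vehicles per interval normally, 40 during an event.
    for observed in (20, 30, 45):
        p = event_posterior(observed, normal_rate=20.0, event_rate=40.0)
        print(f"count={observed:3d}  P(event | count)={p:.4f}")
```

Unlike a hard cutoff, the posterior varies smoothly with the observed count and depends on the expected rate, so the same count can be unremarkable at a busy sensor and anomalous at a quiet one.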


Keywords: Probabilistic modeling, MMPP, traffic, loop sensors, Poisson, Markov





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Jon Hutchins (1)
  • Alexander Ihler (1)
  • Padhraic Smyth (1)
  1. Dept. of Computer Science, University of California, Irvine, 96297-3435
