Abstract
Given the wide application of automatic fare collection systems in transit systems across the globe, smartcard data with boarding and/or alighting information has become a new source of data for understanding passenger flow patterns. This paper uses Nanjing, China as a case study and examines the possibility of using the data cube technique from data mining to understand the space–time travel patterns of Nanjing rail transit users. One month of smartcard data, from October 2013, was obtained from the Nanjing rail transit system, with a total of over 22 million transaction records. We define the original data cube for the smartcard data based on four dimensions (Space, Date, Time, and User), design a hierarchy for each dimension, and use the total number of transactions as the quantitative measure. We develop modules in the Python programming language and share them as open source on GitHub to enable peer production and advancement in the field. Visualizations of two-dimensional slices of the data cube reveal patterns such as different travel behaviors across user groups (e.g. students vs. elders) and irregular peak hours during the National Holiday (October 1st–7th), in contrast to the regular morning and afternoon peak hours of normal working weeks. Spatially, multidimensional visualizations show concentrations of various activity opportunities near metro rail stations and the corresponding changes in station popularity over time. These findings support the feasibility and efficiency of the data cube technique as a means of visual exploratory analysis for massive smartcard data, and can contribute to the evaluation and planning of public transit systems.












Acknowledgements
This study was sponsored by the International Cooperation and Exchange of the National Natural Science Foundation of China (No. 51561135003) and the Key project of National Natural Science Foundation of China (No. 51338003).
Appendix: Python modules to process and visualize massive smartcard data
The original transaction data recorded within the metro transit system are commonly archived as a set of binary files, each storing the transaction records for one day. We therefore developed scripts in the Python programming language to process the raw data, create the data cube, perform operations on the data cube, and visualize the operation results as two-dimensional plots. We uploaded the Python scripts to GitHub, an online development platform for hosting code, managing projects, and sharing work with others. Each module begins with a description of its functions, and the README file introduces the project. The repository is open source and available at: https://github.com/y7song/Research-Collaborations/tree/2016_UMN_YinglingFan_TransitCube. Figure 13 shows the flowchart for all modules in the repository along with their functions.
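As a rough illustration of the cube-building and cube-operation steps, the following minimal sketch counts transactions per (Space, Date, Time, User) cell and then applies a roll-up and a slice. It is not the code from the repository: the records are made-up in-memory tuples rather than the binary daily files, and the names `RECORDS`, `build_cube`, `roll_up`, and `slice_cube` are hypothetical. The hour of day is assumed here as the base level of the Time dimension.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records (station, timestamp, user type); not real data.
RECORDS = [
    ("Xinjiekou", "2013-10-01 08:15", "student"),
    ("Xinjiekou", "2013-10-01 08:40", "adult"),
    ("Xinjiekou", "2013-10-08 17:05", "adult"),
    ("Nanjing Station", "2013-10-01 10:20", "elder"),
    ("Nanjing Station", "2013-10-08 08:30", "adult"),
    ("Nanjing Station", "2013-10-08 08:55", "student"),
]

# Order of the coordinates inside each cube cell.
DIMS = ("space", "date", "time", "user")


def build_cube(records):
    """Count transactions per (station, date, hour, user type) cell."""
    cube = Counter()
    for station, stamp, user in records:
        t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        cube[(station, t.date().isoformat(), t.hour, user)] += 1
    return cube


def roll_up(cube, keep):
    """Sum the measure over every dimension not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = Counter()
    for cell, count in cube.items():
        out[tuple(cell[i] for i in idx)] += count
    return out


def slice_cube(cube, dim, value):
    """Keep only the cells whose `dim` coordinate equals `value`."""
    i = DIMS.index(dim)
    return Counter({c: n for c, n in cube.items() if c[i] == value})


cube = build_cube(RECORDS)
# Two-dimensional Space x Time view, aggregated over Date and User.
by_station_hour = roll_up(cube, ("space", "time"))
# Slice one holiday date, then summarize by user group.
holiday = slice_cube(cube, "date", "2013-10-01")
holiday_by_user = roll_up(holiday, ("user",))
```

A two-dimensional result such as `by_station_hour` is the kind of table that could then be passed to a plotting library to draw a station-by-hour heat map.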
Cite this article
Song, Y., Fan, Y., Li, X. et al. Multidimensional visualization of transit smartcard data using space–time plots and data cubes. Transportation 45, 311–333 (2018). https://doi.org/10.1007/s11116-017-9790-2
DOI: https://doi.org/10.1007/s11116-017-9790-2


