Abstract
Modern data collection and analysis methods support sound and effective data-driven decision making in transport planning and have significantly expanded the scope of potential data applications. However, when the goal is to use data in transport models for evaluation and prediction, data quality becomes crucial.
This research focuses on data pre-processing issues, namely data understanding, data exploration, and data transformation, as an important part of the data analysis life cycle, rather than on transformation models themselves. These phases involve many different tasks, and many data preparation activities are routine, tedious, and time-consuming.
To address this problem, a data preparation framework for the Markov-modulated linear regression model was developed, taking the model's limitations and assumptions into account. This kind of regression model can be used for public transport passenger flow prediction or other transport planning tasks; it assumes that the model parameters vary randomly in accordance with the external environment. The developed framework is applied to data on Riga tram route trip validations captured by the e-ticket system “E-talons” and to data from the Latvian Environment, Geology and Meteorology Centre database. R software is used in conjunction with a set of libraries.
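The idea that regression parameters vary with a random external environment can be illustrated with a small simulation. The sketch below is a deliberately simplified, discrete-time illustration and not the authors' formulation or their R code: it assumes a two-state environment that follows a Markov chain, treats the state as observed for every record, and uses hypothetical names and values throughout (`transition`, `betas`, `simulate_environment`, `fit_line`). Python is used here purely for illustration.

```python
import random


def simulate_environment(n_steps, transition, start=0, rng=random):
    """Sample a path of environment states from a discrete-time Markov chain."""
    states = [start]
    for _ in range(n_steps - 1):
        probs = transition[states[-1]]
        states.append(rng.choices(range(len(probs)), weights=probs)[0])
    return states


def simulate_responses(x_values, states, betas, sigma, rng=random):
    """Generate y = b0[s] + b1[s] * x + noise, with coefficients picked by state s."""
    ys = []
    for x, s in zip(x_values, states):
        b0, b1 = betas[s]
        ys.append(b0 + b1 * x + rng.gauss(0.0, sigma))
    return ys


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical setup: two environment states (for example, "dry" and "rainy"),
# each with its own intercept and slope. All values are illustrative assumptions.
rng = random.Random(42)
transition = [[0.9, 0.1], [0.2, 0.8]]   # row i: probabilities of moving from state i
betas = [(10.0, 2.0), (4.0, 0.5)]       # (intercept, slope) per state
n = 2000

xs = [rng.uniform(0, 10) for _ in range(n)]
states = simulate_environment(n, transition, rng=rng)
ys = simulate_responses(xs, states, betas, sigma=0.5, rng=rng)

# Fit a separate line per environment state (state assumed observed here).
fits = {}
for s in range(len(betas)):
    xs_s = [x for x, st in zip(xs, states) if st == s]
    ys_s = [y for y, st in zip(ys, states) if st == s]
    fits[s] = fit_line(xs_s, ys_s)

for s, (b0, b1) in fits.items():
    print(f"state {s}: intercept={b0:.2f}, slope={b1:.2f}")
```

Fitting a single line to all records would blend the two regimes, which is why a data preparation step that attaches the environment state to each observation matters for this class of model.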
Acknowledgement
This work was financially supported by the specific support objective activity 1.1.1.2. “Post-doctoral Research Aid” (Project id. N. 1.1.1.2/16/I/001) of the Republic of Latvia, funded by the European Regional Development Fund. Nadezda Spiridovska's research project No. 1.1.1.2/VIAA/1/16/075 “Non-traditional regression models in transport modelling”.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Yatskiv (Jackiva), I., Spiridovska, N. (2019). Data Preparation Framework Development for Markov-Modulated Linear Regression Analysis. In: Kabashkin, I., Yatskiv (Jackiva), I., Prentkovskis, O. (eds) Reliability and Statistics in Transportation and Communication. RelStat 2018. Lecture Notes in Networks and Systems, vol 68. Springer, Cham. https://doi.org/10.1007/978-3-030-12450-2_17
DOI: https://doi.org/10.1007/978-3-030-12450-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12449-6
Online ISBN: 978-3-030-12450-2
eBook Packages: Engineering, Engineering (R0)