Abstract
Big data-driven transportation engineering has the potential to improve utilization of road infrastructure, decrease traffic fatalities, reduce fuel consumption, and decrease construction worker injuries, among other gains. Despite this promise, research on big data-driven transportation engineering remains difficult today because of the computational expertise required to get started. This work proposes BoaT, a transportation-specific programming language, together with its big data infrastructure, both aimed at lowering this barrier to entry. Our evaluation, which uses over two dozen research questions from six categories, shows that research is easier to realize as a BoaT program, that the program runs an order of magnitude faster, and that storage requirements decrease by 12–14×.
Acknowledgements
This material is based upon work supported by the National Science Foundation under Grants CCF-15-18897 and CNS-15-13263. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Cite this article
Islam, M.J., Sharma, A. & Rajan, H. A Cyberinfrastructure for Big Data Transportation Engineering. J. Big Data Anal. Transp. 1, 83–94 (2019). https://doi.org/10.1007/s42421-019-00006-8