A Cyberinfrastructure for Big Data Transportation Engineering

  • Original Paper
Journal of Big Data Analytics in Transportation

Abstract

Big data-driven transportation engineering has the potential to improve utilization of road infrastructure, decrease traffic fatalities, reduce fuel consumption, and decrease construction worker injuries, among other benefits. Despite these benefits, research on big data-driven transportation engineering is difficult today because of the computational expertise required to get started. This work proposes BoaT, a transportation-specific programming language, and its big data infrastructure, which together aim to lower this barrier to entry. Our evaluation, which uses over two dozen research questions from six categories, shows that research is easier to realize as a BoaT program, that the program runs an order of magnitude faster, and that storage requirements decrease by 12–14×.

Acknowledgements

This material is based upon work supported by the National Science Foundation under Grants CCF-15-18897 and CNS-15-13263. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Author information

Corresponding author

Correspondence to Md Johirul Islam.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Islam, M.J., Sharma, A. & Rajan, H. A Cyberinfrastructure for Big Data Transportation Engineering. J. Big Data Anal. Transp. 1, 83–94 (2019). https://doi.org/10.1007/s42421-019-00006-8

  • DOI: https://doi.org/10.1007/s42421-019-00006-8
