Personal and Ubiquitous Computing, Volume 21, Issue 3, pp 443–455

Data fusion in automotive applications

Efficient big data stream computing approach
  • Amir Haroun
  • Ahmed Mostefaoui
  • François Dessables
Original Article


Connected vehicles collect data through their embedded sensors and transmit it in huge volumes at very high frequencies. Leveraging this data can be valuable for many parties: automobile manufacturers, vehicle owners, third parties, etc. Indeed, this “big data” can support a broad range of services, from road safety to aftermarket services (e.g., predictive and preventive maintenance). Nevertheless, processing and storing big data raises new scientific and technological challenges that traditional approaches cannot handle efficiently. In this paper, we address the online (i.e., near real-time) processing of automotive data. More precisely, we focus on the performance of data fusion when supporting several million connected vehicles. To meet this performance challenge, we propose novel approaches, based on spatial indexing, to speed up our automotive application. To validate our proposal, we implemented it and conducted real experiments on the big data streaming platform of PSA Group, the second-largest automobile manufacturer in Europe, with about 3 million vehicles sold in 2015. The experimental results demonstrate the efficiency of our spatial indexing and querying techniques.


Big data · Connected vehicles · Stream computing · Data fusion



Copyright information

© Springer-Verlag London 2017

Authors and Affiliations

  • Amir Haroun (1, 2)
  • Ahmed Mostefaoui (2)
  • François Dessables (1)
  1. PSA Group, Bessoncourt, France
  2. FEMTO-ST Institute/CNRS, Bourgogne-Franche-Comté University, Belfort, France
