Scalable Complex Event Processing on Top of MapReduce

  • Jiaxue Yang
  • Yu Gu
  • Yubin Bao
  • Ge Yu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7235)


In this paper, we propose a complex event processing framework built on top of MapReduce that is applicable to many fields, such as RFID monitoring and tracking and intrusion detection. In our framework, data collectors gather events and upload them asynchronously to a distributed file system. The MapReduce programming model is then used to detect and identify events in parallel. The framework also supports continuous queries over event streams through a cache mechanism. To reduce the delay in detecting and processing events, we replace the merge-sort phase of MapReduce tasks with a hybrid sort. In addition, a feedback mechanism returns results to users in real time. Experiments verify the feasibility and efficiency of the proposed framework.
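To make the parallel detection step concrete, the following is a minimal single-process sketch of how a sequence pattern could be matched in map/reduce style. It is not the authors' implementation: the event schema `(object_id, event_type, timestamp)`, the function names `map_phase`, `reduce_phase`, and `detect`, the time-window semantics, and the sample RFID-like data are all assumptions made for illustration. The map step groups events by object ID, and the reduce step orders each group by timestamp and looks for the pattern within a time window. Python's built-in sort stands in for the sort phase; the hybrid sort mentioned in the abstract is not modeled here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

# Hypothetical event schema: (object_id, event_type, timestamp).
Event = Tuple[str, str, int]
Match = Tuple[str, int, int]  # (object_id, start_timestamp, end_timestamp)


def map_phase(events: Iterable[Event]) -> Dict[str, List[Tuple[int, str]]]:
    """Map and shuffle: emit (object_id, (timestamp, type)) and group by key."""
    groups: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for object_id, event_type, ts in events:
        groups[object_id].append((ts, event_type))
    return groups


def reduce_phase(object_id: str,
                 records: List[Tuple[int, str]],
                 pattern: Sequence[str],
                 window: int) -> List[Match]:
    """Reduce: sort one key's events by time, then match the sequence pattern.

    A match starts at each occurrence of pattern[0] and takes the next
    occurrence of every later pattern element, skipping unrelated events,
    as long as all of them fall within `window` time units of the start.
    """
    ordered = sorted(records)  # order by timestamp
    matches: List[Match] = []
    for i, (start_ts, event_type) in enumerate(ordered):
        if event_type != pattern[0]:
            continue
        step, end_ts = 1, start_ts
        for ts, later_type in ordered[i + 1:]:
            if step == len(pattern) or ts - start_ts > window:
                break
            if later_type == pattern[step]:
                step += 1
                end_ts = ts
        if step == len(pattern):
            matches.append((object_id, start_ts, end_ts))
    return matches


def detect(events: Iterable[Event],
           pattern: Sequence[str],
           window: int) -> List[Match]:
    """Run the map step, then the reduce step once per key."""
    results: List[Match] = []
    for object_id, records in sorted(map_phase(events).items()):
        results.extend(reduce_phase(object_id, records, pattern, window))
    return results


if __name__ == "__main__":
    # Made-up sample readings, deliberately out of timestamp order.
    sample = [
        ("tag1", "EXIT", 5),
        ("tag2", "SHELF", 2),
        ("tag1", "SHELF", 1),
        ("tag2", "CHECKOUT", 3),
        ("tag2", "EXIT", 4),
        ("tag1", "SHELF", 20),
    ]
    for match in detect(sample, pattern=("SHELF", "EXIT"), window=10):
        print(match)
```

Because each key is reduced independently, the per-object matching could in principle run on separate reducers, which is the property a MapReduce-based design relies on for parallelism.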


Hash Function; Intrusion Detection; Master Node; Event Stream; Continuous Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jiaxue Yang (1)
  • Yu Gu (1)
  • Yubin Bao (1)
  • Ge Yu (1)
  1. Northeastern University, China
