High-Performance XML Message Brokering

Chapter
Part of the Data-Centric Systems and Applications book series (DCSA)

Abstract

For distributed environments including Web Services, data and application integration, and personalized content delivery, XML is becoming the common wire format for data. In this emerging distributed infrastructure, XML message brokers will play a key role as central exchange points for messages sent between applications and/or users. Users (equivalently, applications or organizations) subscribe to the message broker by providing profiles that express their data interests. Once they arrive at the message broker, these profiles become “standing queries,” which are executed on all incoming data. Data sources publish their data by pushing streams of XML messages to the broker. The broker delivers to each user the messages that match that user's data interests, presented in the format the user requires. We have developed YFilter, an XML filtering system aimed at providing efficient filtering for large numbers (e.g., tens or hundreds of thousands) of path queries. The key innovation in YFilter is a Nondeterministic Finite Automaton (NFA)-based representation of path expressions that combines all queries into a single machine. YFilter exploits commonality among path queries by merging the common prefixes of the paths so that they are processed at most once. The NFA-based implementation also provides additional benefits, including a relatively small machine size, flexibility in dealing with diverse characteristics of data and queries, incremental machine construction, and ease of maintenance.
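The prefix-sharing idea can be illustrated with a minimal sketch. It is not YFilter's actual implementation: the names `State`, `add_query`, and `match` are hypothetical, and the sketch assumes queries use only child steps (`/`), descendant steps (`//`), and the `*` wildcard, with no predicates. All queries are merged into one automaton whose shared prefixes map to shared states, and a stack of active state sets follows the nesting of the incoming XML message.

```python
import io
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based hashing so states can live in sets
class State:
    edges: dict = field(default_factory=dict)    # element name or "*" -> State
    descendant: "State | None" = None             # epsilon edge used for "//"
    self_loop: bool = False                       # True for the "//" helper state
    accepts: set = field(default_factory=set)     # ids of queries ending here


def add_query(root, query_id, path):
    """Insert one path query incrementally, reusing states of shared prefixes."""
    state = root
    for axis, name in re.findall(r"(//|/)([^/]+)", path):
        if axis == "//":
            if state.descendant is None:
                state.descendant = State(self_loop=True)
            state = state.descendant
        state = state.edges.setdefault(name, State())
    state.accepts.add(query_id)


def closure(states):
    """Add the targets of epsilon edges (one hop suffices in this construction)."""
    return set(states) | {s.descendant for s in states if s.descendant}


def step(active, name):
    """Advance every active state on one start-element event."""
    nxt = set()
    for s in active:
        for label in (name, "*"):
            if label in s.edges:
                nxt.add(s.edges[label])
        if s.self_loop:          # "//" may skip any number of levels
            nxt.add(s)
    return closure(nxt)


def match(root, xml_text):
    """Return the ids of all queries matched by one XML message."""
    matched = set()
    stack = [closure({root})]
    events = ET.iterparse(io.BytesIO(xml_text.encode()), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            active = step(stack[-1], elem.tag)
            for s in active:
                matched |= s.accepts
            stack.append(active)
        else:                    # end element: backtrack to the parent's states
            stack.pop()
    return matched


root = State()
for qid, path in [("q1", "/order/item/price"), ("q2", "/order/item/name"),
                  ("q3", "//price"), ("q4", "/order/*/sku"),
                  ("q5", "/invoice/total")]:
    add_query(root, qid, path)

message = "<order><item><name>pen</name><price>2</price></item></order>"
matched = match(root, message)
print(sorted(matched))  # ['q1', 'q2', 'q3']
```

In this sketch, `q1` and `q2` share the states for `/order/item`, so that prefix is traversed once per message regardless of how many queries contain it. New queries can be added at any time by calling `add_query` on the existing machine, which mirrors the incremental construction mentioned in the abstract.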

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Department of Computer Science, University of Massachusetts Amherst, Amherst, USA
  2. Computer Science Division, University of California, Berkeley, Berkeley, USA
