Abstract
XML is becoming the common wire format for data in distributed environments, including Web Services, data and application integration, and personalized content delivery. In this emerging distributed infrastructure, XML message brokers will play a key role as central exchange points for messages sent between applications and/or users. Users (equivalently, applications or organizations) subscribe to the message broker by providing profiles that express their data interests. Once registered at the message broker, these profiles become “standing queries,” which are executed on all incoming data. Data sources publish their data by pushing streams of XML messages to the broker. The broker delivers to each user the messages that match that user's data interests, presented in the format the user requires. We have developed “YFilter,” an XML filtering system aimed at providing efficient filtering for large numbers (e.g., tens or hundreds of thousands) of path queries. The key innovation in YFilter is a Nondeterministic Finite Automaton (NFA)-based representation of path expressions, which combines all queries into a single machine. YFilter exploits commonality among path queries by merging the common prefixes of the paths so that they are processed at most once. The NFA-based implementation also provides additional benefits, including a relatively small machine size, flexibility in dealing with diverse characteristics of data and queries, incremental machine construction, and ease of maintenance.
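The prefix-sharing idea in the abstract can be illustrated with a small sketch. The code below is not the YFilter implementation; it is a minimal Python illustration under stated assumptions: queries are limited to linear paths built from `/`, `//`, and `*` (no predicates or attributes), and the names `State`, `PathNFA`, `add_query`, and `match` are hypothetical. It builds one combined automaton in which queries with a common prefix share states, and it runs that automaton over the start and end element events of a message while keeping a stack of active state sets.

```python
import io
import re
import xml.etree.ElementTree as ET


class State:
    """One NFA state; all names here are illustrative, not YFilter's own."""

    def __init__(self, self_loop=False):
        self.children = {}          # element name or "*" -> next State
        self.descendant = None      # epsilon target used for a "//" step
        self.self_loop = self_loop  # True for "//" states: stay active on any element
        self.accepts = set()        # ids of queries that end in this state


class PathNFA:
    def __init__(self):
        self.root = State()

    def add_query(self, qid, path):
        """Insert one path incrementally, reusing states of any shared prefix."""
        state = self.root
        for axis, name in re.findall(r"(//|/)([^/]+)", path):
            if axis == "//":
                if state.descendant is None:
                    state.descendant = State(self_loop=True)
                state = state.descendant
            state = state.children.setdefault(name, State())
        state.accepts.add(qid)

    @staticmethod
    def _closure(states):
        """Add the epsilon-reachable "//" state of every state in the set."""
        return states | {s.descendant for s in states if s.descendant is not None}

    def match(self, xml_text):
        """Return the ids of all queries matched by one XML message."""
        matched = set()
        stack = [self._closure({self.root})]  # one active-state set per open element
        source = io.BytesIO(xml_text.encode("utf-8"))
        for event, elem in ET.iterparse(source, events=("start", "end")):
            if event == "end":
                stack.pop()  # backtrack to the parent's active states
                continue
            targets = set()
            for s in stack[-1]:
                for key in (elem.tag, "*"):
                    if key in s.children:
                        targets.add(s.children[key])
                if s.self_loop:
                    targets.add(s)  # a "//" state survives any element
            for t in targets:
                matched |= t.accepts
            stack.append(self._closure(targets))
        return matched


nfa = PathNFA()
for qid, path in [("q1", "/a/b"), ("q2", "/a/b/c"), ("q3", "//c"),
                  ("q4", "/a/*/c"), ("q5", "/a/d")]:
    nfa.add_query(qid, path)

print(sorted(nfa.match("<a><b><c/></b></a>")))  # ['q1', 'q2', 'q3', 'q4']
```

In this sketch, `q1`, `q2`, `q4`, and `q5` all pass through the same state for the step `/a`, so that step is evaluated once per element regardless of how many queries contain it. Adding a query only extends the existing machine, which is one way to read the abstract's remark about incremental construction.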
This work has been supported in part by the National Science Foundation under the ITR grants IIS0086057 and SI0122599 and by Boeing, IBM, Intel, Microsoft, Siemens, and the UC MICRO program.
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Diao, Y., Franklin, M.J. (2016). High-Performance XML Message Brokering. In: Garofalakis, M., Gehrke, J., Rastogi, R. (eds) Data Stream Management. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28608-0_22
DOI: https://doi.org/10.1007/978-3-540-28608-0_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28607-3
Online ISBN: 978-3-540-28608-0
eBook Packages: Computer Science, Computer Science (R0)