Abstract
A parallel scanning method using the concept of bitstream addition is introduced and studied in application to the problem of XML parsing and well-formedness checking. On processors supporting W-bit addition operations, the method can perform up to W finite state transitions per instruction. The method is based on the concept of parallel bitstream technology, in which parallel streams of bits are formed such that each stream comprises bits in one-to-one correspondence with the character code units of a source data stream. Parsing routines are initially prototyped in Python using its native support for unbounded integers to represent arbitrary-length bitstreams. A compiler then translates the Python code into low-level C-based implementations. These low-level implementations take advantage of the SIMD (single-instruction multiple-data) capabilities of commodity processors to yield a dramatic speed-up over traditional alternatives employing byte-at-a-time parsing.
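The core idea can be sketched in a few lines of Python, using unbounded integers as bitstreams in the way the abstract describes. This is a minimal illustration, not the authors' code. It assumes that bit i of each integer corresponds to character position i, so that carries from addition move forward through the text. The helper names (`char_class_stream`, `scan_thru`, `positions`) and the sample text are hypothetical and chosen only for this example.

```python
def char_class_stream(text, chars):
    """Build a bitstream in which bit i is set when text[i] is in `chars`."""
    stream = 0
    for i, ch in enumerate(text):
        if ch in chars:
            stream |= 1 << i
    return stream


def scan_thru(markers, char_class):
    """Advance every marker past the run of char_class bits it sits on.

    A single addition moves all markers at once: the carry ripples through
    each run of 1 bits and comes to rest on the first position after the run.
    Masking with ~char_class clears the bits left behind inside the runs.
    """
    return (markers + char_class) & ~char_class


def positions(stream):
    """Return the indices of the set bits, in increasing order."""
    result = []
    i = 0
    while stream:
        if stream & 1:
            result.append(i)
        stream >>= 1
        i += 1
    return result


# Illustrative input (an assumption, not taken from the paper).
text = "ab 123 c 45678 d"
digits = char_class_stream(text, "0123456789")

# Mark the first digit of each run: a digit whose predecessor is not a digit.
starts = digits & ~(digits << 1)

# One addition scans every marker to the end of its digit run.
ends = scan_thru(starts, digits)

print(positions(starts))  # [3, 9]
print(positions(ends))    # [6, 14]
```

Both digit runs are scanned by the same addition, regardless of their lengths. With fixed W-bit registers in place of Python integers, the same operation covers W character positions per instruction, provided carries are propagated between adjacent words.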
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cameron, R.D., Amiri, E., Herdy, K.S., Lin, D., Shermer, T.C., Popowich, F.P. (2011). Parallel Scanning with Bitstream Addition: An XML Case Study. In: Jeannot, E., Namyst, R., Roman, J. (eds) Euro-Par 2011 Parallel Processing. Euro-Par 2011. Lecture Notes in Computer Science, vol 6853. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23397-5_2
DOI: https://doi.org/10.1007/978-3-642-23397-5_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23396-8
Online ISBN: 978-3-642-23397-5
eBook Packages: Computer Science (R0)