BitCube: A Three-Dimensional Bitmap Indexing for XML Documents
XML is a new standard for exchanging and representing information on the Internet. Documents can be represented hierarchically by XML elements. In this paper, we propose that an XML document collection be represented and indexed using a bitmap indexing technique. We define similarity and popularity operations suitable for bitmap indexes. We also define two statistical measurements in the BitCube: center and radius. Based on these measurements, we describe a new clustering technique for XML documents that is based on bitmap indexing. The clustering techniques are motivated by the fact that the bitmap indexes are expected to be very sparse.
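To make the operations named above concrete, the following is a minimal sketch of a 2-dimensional bitmap index in which each document is one row of bits. The abstract does not give the paper's formal definitions, so everything here is an assumption made for illustration: columns are taken to stand for XML element paths, similarity is taken as the fraction of columns on which two rows agree, popularity as the fraction of documents with a given bit set, center as the column-wise majority vote, and radius as the largest Hamming distance from the center to any row. The function names are hypothetical.

```python
from typing import List

# Each document is one row of the bitmap index, stored as a Python int.
# Bit i is 1 if the document contains the i-th element path (assumed meaning).

def hamming(a: int, b: int) -> int:
    """Number of columns on which two rows differ (XOR, then count 1 bits)."""
    return bin(a ^ b).count("1")

def similarity(a: int, b: int, width: int) -> float:
    """Assumed definition: fraction of the `width` columns on which rows agree."""
    return 1.0 - hamming(a, b) / width

def popularity(rows: List[int], column: int) -> float:
    """Assumed definition: fraction of documents whose bit is set in `column`."""
    return sum((r >> column) & 1 for r in rows) / len(rows)

def center(rows: List[int], width: int) -> int:
    """Assumed definition: column-wise majority vote over all rows."""
    c = 0
    for col in range(width):
        ones = sum((r >> col) & 1 for r in rows)
        if 2 * ones > len(rows):
            c |= 1 << col
    return c

def radius(rows: List[int], width: int) -> int:
    """Assumed definition: largest Hamming distance from the center to a row."""
    c = center(rows, width)
    return max(hamming(c, r) for r in rows)

# Toy index: 3 documents, 4 element-path columns.
docs = [0b1011, 0b1001, 0b0011]
print(similarity(docs[0], docs[1], 4))  # 0.75
print(popularity(docs, 0))              # 1.0
print(bin(center(docs, 4)))             # 0b1011
print(radius(docs, 4))                  # 1
```

Because every operation reduces to XOR, shifts, and bit counts, a cluster can be summarized by a single center row and a radius, which is one plausible way such measurements could drive clustering of sparse rows.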
Furthermore, a 2-dimensional bitmap index is extended to a 3-dimensional bitmap index, called the BitCube. Sophisticated querying of XML document collections can be performed using primitive operations such as slice, project, and dice. Experiments show that the BitCube can be created efficiently and the primitive operations can be performed more efficiently with the BitCube than with other alternatives.
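The primitive operations can be illustrated with a small sparse cube. This is a sketch under stated assumptions, not the paper's implementation: the three axes are assumed to be (document, element path, word), set bits are stored as coordinate triples because the index is expected to be sparse, and slice, project, and dice are given their usual OLAP-style meanings (fix one coordinate, collapse one axis with OR, and restrict each axis to a subset). The class name `SparseBitCube` and its method signatures are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple

Cell = Tuple[int, int, int]  # (document, element path, word): assumed axes

class SparseBitCube:
    """Stores only the coordinates of the 1 bits of a 3-dimensional bitmap."""

    def __init__(self, cells: Iterable[Cell]) -> None:
        self.cells: Set[Cell] = set(cells)

    def slice(self, axis: int, value: int) -> Set[Tuple[int, ...]]:
        """Fix one coordinate; return the resulting 2-dimensional bitmap."""
        return {c[:axis] + c[axis + 1:] for c in self.cells if c[axis] == value}

    def project(self, axis: int) -> Set[Tuple[int, ...]]:
        """Collapse one axis: a 2-D bit is set if any bit along `axis` is set."""
        return {c[:axis] + c[axis + 1:] for c in self.cells}

    def dice(
        self,
        docs: Optional[Set[int]] = None,
        paths: Optional[Set[int]] = None,
        words: Optional[Set[int]] = None,
    ) -> "SparseBitCube":
        """Restrict each axis to a subset (None means no restriction)."""
        keep = (docs, paths, words)
        return SparseBitCube(
            c for c in self.cells
            if all(k is None or v in k for k, v in zip(keep, c))
        )

cube = SparseBitCube({(0, 0, 0), (0, 1, 2), (1, 0, 0), (2, 1, 1)})
print(sorted(cube.slice(axis=2, value=0)))               # [(0, 0), (1, 0)]
print(sorted(cube.project(axis=1)))                      # [(0, 0), (0, 2), (1, 0), (2, 1)]
print(sorted(cube.dice(docs={0, 1}, words={0}).cells))   # [(0, 0, 0), (1, 0, 0)]
```

A dense implementation would instead keep packed bit arrays and use bit-wise AND/OR across planes; the set-of-triples form is chosen here only because it keeps the example short.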
Journal of Intelligent Information Systems
Volume 17, Issue 2-3 , pp 241-254
- Kluwer Academic Publishers
- XML document retrieval
- document clustering
- bitmap indexing
- bit-wise operations
- Author Affiliations
- 1. Center for Advanced Computer Studies, University of Louisiana, Lafayette, LA, 70504-4330, USA
- 2. E-Center for E-Business and Department of Information and Software Engineering, George Mason University, Fairfax, VA, 22030-4444, USA