Supporting Efficient Parametric Search of E-Commerce Data: A Loosely-Coupled Solution
Electronic commerce is emerging as a major application area for database systems. A large number of e-commerce sites provide electronic product catalogs that allow users to search for products of interest. Because e-commerce data evolves constantly and is highly sparse, most commercial e-commerce systems store it using the so-called vertical schema. However, query processing over data stored in a vertical schema is extremely slow, because current RDBMSs, and especially their cost-based query optimizers, are designed to handle the traditional horizontal schema efficiently.
Most e-commerce systems would like to offer advanced parametric search capabilities to their users. However, most searches are expected to run online, which means that query execution must be very fast. RDBMSs require new capabilities and enhancements before they can meet these search performance requirements against a vertical schema. Such tightly-coupled enhancements and additions to a DBMS require a considerable amount of work and may take a long time to complete. In this paper, we describe an alternative approach called SAL, a Search Assistant Layer that can be implemented outside the database engine to meet the urgent need for efficient parametric search on e-commerce data. Our experimental results show that SAL dramatically improves the performance of search queries.
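To make the vertical schema concrete, the following is a minimal sketch of a parametric search over a three-column (object id, attribute, value) table. It is an illustration under stated assumptions, not the SAL implementation: the table name `catalog_vertical`, the column names, the sample rows, and the helper `parametric_search` are all hypothetical. It shows why such queries are costly for a conventional engine: every additional search criterion adds one more self-join on the same table.

```python
import sqlite3

# Hypothetical vertical-schema catalog: one row per (product, attribute) pair.
# Sparse products simply have fewer rows; no NULL-filled wide table is needed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE catalog_vertical (oid INTEGER, attr TEXT, val TEXT);
INSERT INTO catalog_vertical VALUES
  (1, 'category', 'laptop'), (1, 'ram_gb', '16'), (1, 'brand', 'Acme'),
  (2, 'category', 'laptop'), (2, 'ram_gb', '8'),
  (3, 'category', 'camera'), (3, 'brand', 'Acme');
""")


def parametric_search(conn, criteria):
    """Return sorted ids of products matching every attribute/value pair.

    Each criterion beyond the first adds one self-join on the vertical table.
    """
    if not criteria:
        raise ValueError("at least one search criterion is required")
    joins, conditions, params = [], [], []
    for i, (attr, val) in enumerate(criteria.items()):
        alias = f"t{i}"
        if i > 0:
            joins.append(
                f"JOIN catalog_vertical {alias} ON {alias}.oid = t0.oid"
            )
        conditions.append(f"{alias}.attr = ? AND {alias}.val = ?")
        params += [attr, val]
    sql = (
        "SELECT DISTINCT t0.oid FROM catalog_vertical t0 "
        + " ".join(joins)
        + " WHERE " + " AND ".join(conditions)
        + " ORDER BY t0.oid"
    )
    return [row[0] for row in conn.execute(sql, params)]


# Two criteria produce a two-way self-join; only product 1 satisfies both.
matches = parametric_search(conn, {"category": "laptop", "brand": "Acme"})
print(matches)
```

In a horizontal schema the same search would be a single-table selection with two predicates; in the vertical layout it becomes a k-way self-join for k criteria.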