Representing and Querying Semistructured Web Data Using Nested Tables with Structural Variants
This paper proposes an approach to representing and querying semistructured Web data. The proposed approach is based on nested tables, which may have internal nested structural variations to accommodate semistructured data. Our motivation is to reduce the complexity found in typical query languages for semistructured data and to provide users with an alternative for quickly querying data obtained from multiple-record Web pages. We show the feasibility of our proposal by developing a prototype for a graphical query interface called QSByE (Querying Semistructured data By Example). For QSByE, we define a particular variation of nested tables and propose a set of QBE-like operations that extends typical nested-relational-algebra operations to handle semistructured data. We show examples of how users can pose interesting queries using QSByE.
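As an illustration only, the idea of a nested table whose columns admit structural variants can be sketched in a few lines of Python. This is not the QSByE data model or its algebra, which the abstract does not specify. The `books` data, the `atoms` helper, and the `select` function are all hypothetical names chosen for this sketch. It assumes a column value is either atomic or itself a nested table, and that a column may be missing from a row.

```python
from typing import Any, Callable

# Hypothetical sample data (not taken from the paper). Each row is a dict;
# a column value is either atomic or a nested table (a list of row dicts).
books = [
    {"title": "Book A",
     "author": [{"name": "Silva"}, {"name": "Costa"}],  # nested-table variant
     "price": 50},
    {"title": "Book B", "author": "Smith", "price": 30},  # atomic variant
    {"title": "Book C", "author": [{"name": "Jones"}]},   # "price" is missing
]


def atoms(value: Any) -> list:
    """Return the atomic values under a column value, whichever variant it uses."""
    if isinstance(value, list):  # nested table: recurse into every cell
        return [a for row in value for cell in row.values() for a in atoms(cell)]
    return [value]  # atomic value


def select(table: list, column: str, predicate: Callable[[Any], bool]) -> list:
    """Keep rows in which some atomic value under `column` satisfies `predicate`.

    Rows that lack the column are skipped rather than treated as errors.
    """
    return [
        row for row in table
        if column in row and any(predicate(a) for a in atoms(row[column]))
    ]


by_smith = select(books, "author", lambda a: a == "Smith")
by_costa = select(books, "author", lambda a: a == "Costa")
expensive = select(books, "price", lambda p: p > 40)
```

In this sketch the same selection condition applies whether the author column holds a single value or a nested table, which is the kind of uniform treatment of structural variation that the approach aims to offer through a QBE-like interface.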