Representing and Querying Semistructured Web Data Using Nested Tables with Structural Variants

  • Altigran S. da Silva
  • Irna M. R. Evangelista Filha
  • Alberto H. F. Laender
  • David W. Embley
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2503)


This paper proposes an approach to representing and querying semistructured Web data. The proposed approach is based on nested tables, which may have internal nested structural variations to accommodate semistructured data. Our motivation is to reduce the complexity found in typical query languages for semistructured data and to provide users with an alternative for quickly querying data obtained from multiple-record Web pages. We show the feasibility of our proposal by developing a prototype for a graphical query interface called QSByE (Querying Semistructured data By Example). For QSByE, we define a particular variation of nested tables and propose a set of QBE-like operations that extends typical nested-relational-algebra operations to handle semistructured data. We show examples of how users can pose interesting queries using QSByE.
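As a rough illustration of what a nested table with structural variants might look like, the following Python sketch may help. It is not QSByE's actual data model or operation set, which the abstract does not detail. It assumes a hypothetical representation in which a table is a list of rows, a row is a dictionary, and rows of the same inner table may follow different structures. The sample data and the helper names `author_name`, `select`, and `unnest` are invented for this example.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Table = List[Row]

# Hypothetical records, as might be extracted from a multiple-record Web page.
# The inner "authors" table has two structural variants: some rows hold a
# single "name" string, others split the name into "first" and "last".
books: Table = [
    {"title": "Book A", "price": 30.0,
     "authors": [{"name": "Ann Lee"}]},
    {"title": "Book B", "price": 45.0,
     "authors": [{"first": "Bob", "last": "Ray"}, {"name": "Cy Dow"}]},
    {"title": "Book C",  # no "price": a missing attribute is tolerated
     "authors": [{"first": "Ann", "last": "Lee"}]},
]


def author_name(author: Row) -> str:
    """Read an author's name regardless of which variant the row follows."""
    if "name" in author:
        return author["name"]
    return f"{author['first']} {author['last']}"


def select(table: Table, predicate: Callable[[Row], bool]) -> Table:
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in table if predicate(row)]


def unnest(table: Table, attr: str) -> Table:
    """Unnest: emit one flat row per row of the inner table `attr`."""
    flat: Table = []
    for row in table:
        outer = {k: v for k, v in row.items() if k != attr}
        for inner in row.get(attr, []):
            flat.append({**outer, **inner})
    return flat


# Query: titles of books that have an author named "Ann Lee",
# whichever variant stores the name.
by_ann = select(
    books,
    lambda b: any(author_name(a) == "Ann Lee" for a in b["authors"]),
)
titles = [b["title"] for b in by_ann]

# Flatten the nested "authors" table into one row per (book, author) pair.
flat_authors = unnest(books, "authors")
```

In this sketch the query matches both "Book A" and "Book C" even though the two records store the author's name under different variants, which is the kind of structural irregularity the nested-table approach is meant to absorb.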





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Altigran S. da Silva (1)
  • Irna M. R. Evangelista Filha (1)
  • Alberto H. F. Laender (1)
  • David W. Embley (2)
  1. Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, MG, Brazil
  2. Department of Computer Science, Brigham Young University, Provo, USA
