Efficient Learning of Semi-structured Data from Queries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2225)

Abstract

This paper studies the polynomial-time learnability of the classes of ordered gapped tree patterns (OGT) and ordered gapped forests (OGF) under the into-matching semantics in the query learning model of Angluin. The class OGT is a model of semi-structured database query languages, and a generalization of both the class of ordered/unordered tree pattern languages and the class of non-erasing regular pattern languages. First, we present a polynomial-time learning algorithm for μ-OGT, the subclass of OGT without repeated tree variables, using equivalence queries and membership queries. By extending this algorithm, we present polynomial-time learning algorithms for the classes μ-OGF of forests without repeated variables and OGT of trees with repeated variables using equivalence queries and subset queries. We also give representation-independent hardness results which indicate that both equivalence and membership queries are necessary to learn μ-OGT.
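
As a rough illustration of one class that the abstract says OGT generalizes, the following Python sketch tests membership in a non-erasing regular pattern language, that is, a string pattern in which every variable occurs at most once and must be replaced by a non-empty string. It is not the paper's algorithm and says nothing about tree patterns; the names `VAR`, `member`, and `make_membership_oracle`, and the list-of-tokens representation, are assumptions introduced here for illustration only.

```python
import re

# Hypothetical marker for a variable. In a regular pattern every variable
# occurs at most once, so all variable occurrences can share one marker.
VAR = None


def member(pattern, word):
    """Return True if `word` belongs to the non-erasing language of `pattern`.

    `pattern` is a list whose items are either constant strings or VAR.
    Each VAR must be replaced by a non-empty string, hence `.+`.
    """
    regex = "".join(".+" if tok is VAR else re.escape(tok) for tok in pattern)
    return re.fullmatch(regex, word, flags=re.DOTALL) is not None


def make_membership_oracle(target):
    """Wrap a hidden target pattern as a membership-query oracle (assumed interface)."""
    return lambda word: member(target, word)


if __name__ == "__main__":
    oracle = make_membership_oracle(["ab", VAR, "c", VAR])
    for w in ["abxcy", "abccc", "abcc", "abc"]:
        print(w, oracle(w))
```

A learner in the query model would call such an oracle on strings of its own choosing; how the paper's learner chooses its queries for tree patterns is not shown here.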

References

  1. S. Abiteboul, D. Quass, J. McHugh, J. Widom, J. L. Wiener, The Lorel query language for semistructured data, Int’l. J. on Digital Libraries, 1(1), 68–88, 1997.

  2. T. R. Amoth, P. Cull, and P. Tadepalli, Exact learning of unordered tree patterns from queries, In Proc. COLT’99, ACM Press, 323–332, 1999.

  3. T. R. Amoth, P. Cull, and P. Tadepalli, Exact learning of tree patterns from queries and counterexamples, In Proc. COLT’98, ACM Press, 175–186, 1998.

  4. D. Angluin, Finding patterns common to a set of strings, JCSS, 21, 46–62, 1980.

  5. D. Angluin, Queries and concept learning, Machine Learning, 2(4), 319–342, 1988.

  6. H. Arimura, H. Ishizaka, T. Shinohara, S. Otsuki, A generalization of the least general generalization, Machine Intelligence, 13, 59–85, 1994.

  7. H. Arimura, H. Ishizaka, T. Shinohara, Learning unions of tree patterns using queries, Theoretical Computer Science, 185(1), 47–62, 1997.

  8. H. Arimura, T. Shinohara, S. Otsuki, Finding minimal generalizations for unions of pattern languages and its application to inductive inference from positive data, In Proc. STACS’94, LNCS 775, Springer-Verlag, 649–660, 1994.

  9. P. Buneman, M. F. Fernandez, D. Suciu, UnQL: A query language and algebra for semistructured data based on structural recursion, VLDB J., 9(1), 76–110, 2000.

  10. M. Frazier, L. Pitt, Classic learning, Machine Learning, 25(2–3), 151–193, 1996.

  11. R. Khardon, Learning function-free Horn expressions, Machine Learning, 35(1), 241–275, 1999.

  12. K-I. Ko, A. Marron, W.-G. Tzeng, Learning string patterns and tree patterns from examples, In Proc. 7th Internat. Conference on Machine Learning, 384–391, 1990.

  13. S. R. Kosaraju, Efficient tree pattern matching, In Proc. 30th FOCS, 178–183, 1989.

  14. N. Kushmerick, Wrapper induction: efficiency and expressiveness, Artificial Intelligence, 118, 15–68, 2000.

  15. S. Matsumoto and A. Shinohara, Learning Pattern Languages Using Queries, Proc. Euro COLT’97, LNAI, Springer-Verlag, 185–197, 1997.

  16. J. Nessel and S. Lange, Learning erasing pattern languages with queries, Proc. ALT2000, LNAI 1968, Springer-Verlag, 86–100, 2000.

  17. G. D. Plotkin, A note on inductive generalization, In Machine Intell., 5, Edinburgh Univ. Press, 153–163, 1970.

  18. H. Sakamoto, Y. Murakami, H. Arimura, S. Arikawa, Extracting Partial Structures from HTML Documents, In Proc. FLAIRS 2001, AAAI Press, 2001.

  19. Extensible Markup Language (XML) Version 1.0. W3C Recommendation 1998.

  20. XML-QL: A Query Language for XML W3C Note, Aug. 1998.

  21. L. Pitt, M. K. Warmuth, Prediction-preserving reducibility, J. Comput. System Sci., 41(3), 430–467, 1990.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Arimura, H., Sakamoto, H., Arikawa, S. (2001). Efficient Learning of Semi-structured Data from Queries. In: Abe, N., Khardon, R., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2001. Lecture Notes in Computer Science, vol 2225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45583-3_24

  • DOI: https://doi.org/10.1007/3-540-45583-3_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42875-6

  • Online ISBN: 978-3-540-45583-7

  • eBook Packages: Springer Book Archive
