Abstract
This paper studies the polynomial-time learnability of the classes of ordered gapped tree patterns (OGT) and ordered gapped forests (OGF) under the into-matching semantics in the query learning model of Angluin. The class OGT is a model of semi-structured database query languages, and a generalization of both the class of ordered/unordered tree pattern languages and the class of non-erasing regular pattern languages. First, we present a polynomial-time learning algorithm for μ-OGT, the subclass of OGT without repeated tree variables, using equivalence queries and membership queries. By extending this algorithm, we present polynomial-time learning algorithms for the classes μ-OGF of forests without repeated variables and OGT of trees with repeated variables using equivalence queries and subset queries. We also give representation-independent hardness results which indicate that both equivalence and membership queries are necessary to learn μ-OGT.
References
S. Abiteboul, D. Quass, J. McHugh, J. Widom, J. L. Wiener, The Lorel query language for semistructured data, Int’l. J. on Digital Libraries, 1(1), 68–88, 1997.
T. R. Amoth, P. Cull, and P. Tadepalli, Exact learning of unordered tree patterns from queries, In Proc. COLT’99, ACM Press, 323–332, 1999.
T. R. Amoth, P. Cull, and P. Tadepalli, Exact learning of tree patterns from queries and counterexamples, In Proc. COLT’98, ACM Press, 175–186, 1998.
D. Angluin, Finding patterns common to a set of strings, JCSS, 21, 46–62, 1980.
D. Angluin, Queries and concept learning, Machine Learning, 2(4), 319–342, 1988.
H. Arimura, H. Ishizaka, T. Shinohara, S. Otsuki, A generalization of the least general generalization, Machine Intelligence, 13, 59–85, 1994.
H. Arimura, H. Ishizaka, T. Shinohara, Learning unions of tree patterns using queries, Theoretical Computer Science, 185(1), 47–62, 1997.
H. Arimura, T. Shinohara, S. Otsuki, Finding minimal generalizations for unions of pattern languages and its application to inductive inference from positive data, In Proc. STACS’94, LNCS 775, Springer-Verlag, 649–660, 1994.
P. Buneman, M. F. Fernandez, D. Suciu, UnQL: A query language and algebra for semistructured data based on structural recursion, VLDB J., 9(1), 76–110, 2000.
M. Frazier, L. Pitt, Classic learning, Machine Learning, 25(2–3), 151–193, 1996.
R. Khardon, Learning function-free Horn expressions, Machine Learning, 35(1), 241–275, 1999.
K.-I. Ko, A. Marron, W.-G. Tzeng, Learning string patterns and tree patterns from examples, In Proc. 7th Internat. Conference on Machine Learning, 384–391, 1990.
S. R. Kosaraju, Efficient tree pattern matching, In Proc. 30th FOCS, 178–183, 1989.
N. Kushmerick, Wrapper induction: efficiency and expressiveness, Artificial Intelligence, 118, 15–68, 2000.
S. Matsumoto and A. Shinohara, Learning Pattern Languages Using Queries, In Proc. Euro COLT’97, LNAI, Springer-Verlag, 185–197, 1997.
J. Nessel and S. Lange, Learning erasing pattern languages with queries, In Proc. ALT2000, LNAI 1968, Springer-Verlag, 86–100, 2000.
G. D. Plotkin, A note on inductive generalization, In Machine Intell., 5, Edinburgh Univ. Press, 153–163, 1970.
H. Sakamoto, Y. Murakami, H. Arimura, S. Arikawa, Extracting Partial Structures from HTML Documents, In Proc. FLAIRS 2001, AAAI Press, 2001.
Extensible Markup Language (XML) Version 1.0, W3C Recommendation, 1998.
XML-QL: A Query Language for XML, W3C Note, Aug. 1998.
L. Pitt, M. K. Warmuth, Prediction-preserving reducibility, J. Comput. System Sci., 41(3), 430–467, 1990.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Arimura, H., Sakamoto, H., Arikawa, S. (2001). Efficient Learning of Semi-structured Data from Queries. In: Abe, N., Khardon, R., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2001. Lecture Notes in Computer Science, vol 2225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45583-3_24
DOI: https://doi.org/10.1007/3-540-45583-3_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42875-6
Online ISBN: 978-3-540-45583-7
eBook Packages: Springer Book Archive