ICGI 2008: Grammatical Inference: Algorithms and Applications, pp. 43-56
Learning Languages from Bounded Resources: The Case of the DFA and the Balls of Strings
Abstract
Comparing the standard language learning paradigms (identification in the limit, query learning, PAC learning) has always been a complex question. Moreover, when computational constraints are added to the question of converging to a target, the picture becomes even less clear: how much do queries or negative examples help? Can we find good algorithms that change their minds very little or that make very few errors? To approach these problems we concentrate here on two classes of languages, the topological balls of strings (for the edit distance) and the deterministic finite automata (DFA), and (re-)visit the different learning paradigms to sustain our claims.
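As an illustration of the first class, a ball of strings can be read as the set of all words within a given edit distance of a centre string. The sketch below assumes the unit-cost Levenshtein distance (insertion, deletion, substitution) and uses hypothetical helper names `edit_distance` and `in_ball`; it is not taken from the paper and only shows how membership in such a ball could be tested.

```python
def edit_distance(u: str, v: str) -> int:
    """Unit-cost Levenshtein distance via the standard dynamic program."""
    # prev[j] holds the distance between the processed prefix of u and v[:j].
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        curr = [i]  # distance from u[:i] to the empty prefix of v
        for j, b in enumerate(v, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from u
                curr[j - 1] + 1,         # insert b into u
                prev[j - 1] + (a != b),  # substitute a by b (free if equal)
            ))
        prev = curr
    return prev[-1]


def in_ball(word: str, centre: str, radius: int) -> bool:
    """Assumed definition: word is in the ball iff its distance to centre is at most radius."""
    return edit_distance(centre, word) <= radius
```

For instance, under this definition `in_ball("ab", "a", 1)` holds because a single insertion turns "a" into "ab", while "abc" lies outside the ball of radius 1 centred on "a".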
Keywords
Polynomial learnability, deterministic finite automata, balls of strings, edit distance