Abstract
Intuitively, a learning algorithm is robust if it can succeed despite adverse conditions. We examine the conditions under which learning algorithms for classes of formal languages can succeed when the data presentations are systematically incomplete; that is, when certain kinds of examples are systematically absent. One motivation comes from linguistics, where the phonotactic pattern of a language may be understood as the intersection of formal languages, each of which formalizes a distinct linguistic generalization. We ask when these generalizations can be learned if the only data available to a learner belongs to their intersection. In particular, we provide three formal definitions of robustness in the identification in the limit from positive data paradigm, and several theorems describing the kinds of classes of formal languages that are, and are not, robustly learnable in the relevant sense. We relate these results to classes relevant to natural language phonology.
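As a concrete illustration of the learning paradigm, the following is a minimal sketch of identification in the limit from positive data, using string extension learning (Heinz 2010) for strictly 2-local languages: the hypothesis is simply the set of bigrams attested so far, with `>` and `<` as illustrative word-edge markers (cf. note 3). The function names and marker symbols are choices made for this sketch, not notation from the paper.

```python
def bigrams(word):
    """Bigrams of a word, with '>' (start) and '<' (end) marking word edges."""
    marked = ">" + word + "<"
    return {marked[i:i + 2] for i in range(len(marked) - 1)}

def sl2_learner(text):
    """Process a positive-data presentation one word at a time,
    yielding the current hypothesis (a bigram grammar) after each datum."""
    grammar = set()
    for word in text:
        grammar |= bigrams(word)
        yield frozenset(grammar)

def generates(grammar, word):
    """A word is in the hypothesized language iff all its bigrams are licensed."""
    return bigrams(word) <= grammar

# Present words drawn from the language (ab)+; the hypothesis converges
# once every licit bigram has been observed, and never changes thereafter.
presentation = ["ab", "abab", "ab", "ababab"]
hypotheses = list(sl2_learner(presentation))
```

On any presentation of a strictly 2-local language, such a learner converges after finitely many data points, which is the sense of success the paradigm requires; the paper's question is whether such success survives when certain examples never appear in the presentation.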
Notes
1. The notion of robustness studied here is different from the one studied by Case et al. [3]. There, a class is “robustly learnable” if and only if its effective transformations are learnable too. As such, their primary interest is classes “outside the world of the recursively enumerable classes.” This paper uses the term “robustly learnable” to mean learnable despite the absence of some positive evidence.
2. Alexander Clark (personal communication) provides a counterexample. Let \(C = \{ L_\infty , L_1, \dots \}\) where \(L_n = \{ a^m : 0< m < n \} \cup \{b^{n+1}\}\) and \(L_\infty = a^+ \cup \{ b\}\). Let \(D = \{ a^*\}\). Both classes are ilpd-learnable but \(\{ L_C \cap L_D : L_C\in C, L_D\in D\}\) is not.
3. Technically, local classes need to be augmented with symbols marking word edges.
4. Note that this is a stronger guarantee than consistency.
References
Angluin, D.: Inductive inference of formal languages from positive data. Inf. Control 45(2), 117–135 (1980)
Blum, L., Blum, M.: Toward a mathematical theory of inductive inference. Inf. Control 28(2), 125–155 (1975)
Case, J., Jain, S., Stephan, F., Wiehagen, R.: Robust learning-rich and poor. J. Comput. Syst. Sci. 69(2), 123–165 (2004)
Clark, A., Lappin, S.: Linguistic Nativism and the Poverty of the Stimulus. Wiley-Blackwell (2011)
Eyraud, R., Heinz, J., Yoshinaka, R.: Efficiency in the identification in the limit learning paradigm. In: Heinz, J., Sempere, J.M. (eds.) Topics in Grammatical Inference, pp. 25–46. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-48395-4_2
Freivalds, R., Kinber, E., Wiehagen, R.: On the power of inductive inference from good examples. Theoret. Comput. Sci. 110(1), 131–144 (1993)
Fulk, M., Jain, S.: Learning in the presence of inaccurate information. Theoret. Comput. Sci. 161, 235–261 (1996)
Gold, E.M.: Language identification in the limit. Inf. Control 10(5), 447–474 (1967)
Haines, L.H.: On free monoids partially ordered by embedding. J. Combinatorial Theory 6(1), 94–98 (1969)
Heinz, J.: String extension learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 897–906. Association for Computational Linguistics, Uppsala, Sweden (July 2010)
Heinz, J.: The computational nature of phonological generalizations. In: Hyman, L., Plank, F. (eds.) Phonological Typology, Phonetics and Phonology, vol. 23, chap. 5, pp. 126–195. Mouton de Gruyter (2018)
Heinz, J., Kasprzik, A., Kötzing, T.: Learning in the limit with lattice-structured hypothesis spaces. Theoret. Comput. Sci. 457, 111–127 (2012)
Heinz, J., Rawal, C., Tanner, H.G.: Tier-based strictly local constraints for phonology. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Short Papers, vol. 2, pp. 58–64. Association for Computational Linguistics, Portland (2011)
Jain, S.: Program synthesis in the presence of infinite number of inaccuracies. J. Comput. Syst. Sci. 53, 583–591 (1996)
Jain, S., Lange, S., Nessel, J.: On the learnability of recursively enumerable languages from good examples. Theoret. Comput. Sci. 261, 3–29 (2001)
Jain, S., Osherson, D., Royer, J.S., Sharma, A.: Systems That Learn: An Introduction to Learning Theory, 2nd edn. The MIT Press (1999)
Lambert, D.: Grammar interpretations and learning TSL online. In: Proceedings of the Fifteenth International Conference on Grammatical Inference. Proceedings of Machine Learning Research, vol. 153, pp. 81–91. PMLR (2021)
Lambert, D.: Relativized adjacency. J. Log. Lang. Inf. (2023)
Lambert, D., Rawski, J., Heinz, J.: Typology emerges from simplicity in representations and learning. J. Lang. Modelling 9(1), 151–194 (2021)
McNaughton, R., Papert, S.A.: Counter-Free Automata. MIT Press (1971)
Osherson, D.N., Stob, M., Weinstein, S.: Systems That Learn. MIT Press, Cambridge (1986)
Pin, J.E.: Profinite methods in automata theory. In: 26th International Symposium on Theoretical Aspects of Computer Science (STACS 2009) (2009)
Rogers, J., et al.: On languages piecewise testable in the strict sense. In: Ebert, C., Jäger, G., Michaelis, J. (eds.) MOL 2007/2009. LNCS (LNAI), vol. 6149, pp. 255–265. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14322-9_19
Simon, I.: Piecewise testable events. In: Brakhage, H. (ed.) GI-Fachtagung 1975. LNCS, vol. 33, pp. 214–222. Springer, Heidelberg (1975). https://doi.org/10.1007/3-540-07407-4_23
Smetsers, R., Volpato, M., Vaandrager, F., Verwer, S.: Bigger is not always better: on the quality of hypotheses in active automata learning. In: Clark, A., Kanazawa, M., Yoshinaka, R. (eds.) The 12th International Conference on Grammatical Inference. Proceedings of Machine Learning Research, vol. 34, pp. 167–181. PMLR, Kyoto (2014)
Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)
Acknowledgements
We acknowledge support from the Data + Computing = Discovery summer REU program at the Institute for Advanced Computational Science at Stony Brook University, supported by the NSF under award 1950052.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Kaelbling, P., Lambert, D., Heinz, J. (2023). Robust Identification in the Limit from Incomplete Positive Data. In: Fernau, H., Jansen, K. (eds) Fundamentals of Computation Theory. FCT 2023. Lecture Notes in Computer Science, vol 14292. Springer, Cham. https://doi.org/10.1007/978-3-031-43587-4_20
Print ISBN: 978-3-031-43586-7
Online ISBN: 978-3-031-43587-4