
Robust Identification in the Limit from Incomplete Positive Data

  • Conference paper
Fundamentals of Computation Theory (FCT 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14292)


Abstract

Intuitively, a learning algorithm is robust if it can succeed despite adverse conditions. We examine conditions under which learning algorithms for classes of formal languages are able to succeed when the data presentations are systematically incomplete; that is, when certain kinds of examples are systematically absent. One motivation comes from linguistics, where the phonotactic pattern of a language may be understood as the intersection of formal languages, each of which formalizes a distinct linguistic generalization. We examine under what conditions these generalizations can be learned when the only data available to a learner belongs to their intersection. In particular, we provide three formal definitions of robustness in the identification in the limit from positive data paradigm, and several theorems which describe the kinds of classes of formal languages which are, and are not, robustly learnable in the relevant sense. We relate these results to classes relevant to natural language phonology.


Notes

  1. The notion of robustness studied here is different from the one studied by Case et al. [3]. There, a class is “robustly learnable” if and only if its effective transformations are learnable too. As such, their primary interest is classes “outside the world of the recursively enumerable classes.” This paper uses the term “robustly learnable” to mean learnable despite the absence of some positive evidence.

  2. Alexander Clark (personal communication) provides a counterexample. Let \(C = \{ L_\infty, L_1, \dots \}\) where \(L_n = \{ a^m : 0 < m < n \} \cup \{ b^{n+1} \}\) and \(L_\infty = a^+ \cup \{ b \}\), and let \(D = \{ a^* \}\). Both classes are ilpd-learnable, but \(\{ L_C \cap L_D : L_C \in C, L_D \in D \}\) is not; a sketch of this situation follows the notes.

  3. Technically, local classes need to be augmented with symbols marking word edges.

  4. Note that this is a stronger guarantee than consistency.

References

  1. Angluin, D.: Inductive inference of formal languages from positive data. Inf. Control 45(2), 117–135 (1980)


  2. Blum, L., Blum, M.: Toward a mathematical theory of inductive inference. Inf. Control 28(2), 125–155 (1975)


  3. Case, J., Jain, S., Stephan, F., Wiehagen, R.: Robust learning-rich and poor. J. Comput. Syst. Sci. 69(2), 123–165 (2004)


  4. Clark, A., Lappin, S.: Linguistic Nativism and the Poverty of the Stimulus. Wiley-Blackwell (2011)


  5. Eyraud, R., Heinz, J., Yoshinaka, R.: Efficiency in the identification in the limit learning paradigm. In: Heinz, J., Sempere, J.M. (eds.) Topics in Grammatical Inference, pp. 25–46. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-48395-4_2


  6. Freivalds, R., Kinber, E., Wiehagen, R.: On the power of inductive inference from good examples. Theoret. Comput. Sci. 110(1), 131–144 (1993)


  7. Fulk, M., Jain, S.: Learning in the presence of inaccurate information. Theoret. Comput. Sci. 161, 235–261 (1996)


  8. Gold, E.M.: Language identification in the limit. Inf. Control 10(5), 447–474 (1967)


  9. Haines, L.H.: On free monoids partially ordered by embedding. J. Combinatorial Theory 6(1), 94–98 (1969)


  10. Heinz, J.: String extension learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 897–906. Association for Computational Linguistics, Uppsala, Sweden (July 2010)


  11. Heinz, J.: The computational nature of phonological generalizations. In: Hyman, L., Plank, F. (eds.) Phonological Typology, Phonetics and Phonology, vol. 23, chap. 5, pp. 126–195. Mouton de Gruyter (2018)


  12. Heinz, J., Kasprzik, A., Kötzing, T.: Learning in the limit with lattice-structured hypothesis spaces. Theoret. Comput. Sci. 457, 111–127 (2012)


  13. Heinz, J., Rawal, C., Tanner, H.G.: Tier-based strictly local constraints for phonology. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Short Papers, vol. 2, pp. 58–64. Association for Computational Linguistics, Portland (2011)


  14. Jain, S.: Program synthesis in the presence of infinite number of inaccuracies. J. Comput. Syst. Sci. 53, 583–591 (1996)


  15. Jain, S., Lange, S., Nessel, J.: On the learnability of recursively enumerable languages from good examples. Theoret. Comput. Sci. 261, 3–29 (2001)


  16. Jain, S., Osherson, D., Royer, J.S., Sharma, A.: Systems That Learn: An Introduction to Learning Theory, 2nd edn. The MIT Press (1999)


  17. Lambert, D.: Grammar interpretations and learning TSL online. In: Proceedings of the Fifteenth International Conference on Grammatical Inference. Proceedings of Machine Learning Research, vol. 153, pp. 81–91, August 2021


  18. Lambert, D.: Relativized adjacency. Journal of Logic, Language and Information, May 2023


  19. Lambert, D., Rawski, J., Heinz, J.: Typology emerges from simplicity in representations and learning. J. Lang. Modelling 9(1), 151–194 (2021)


  20. McNaughton, R., Papert, S.A.: Counter-Free Automata. MIT Press (1971)


  21. Osherson, D.N., Stob, M., Weinstein, S.: Systems That Learn. MIT Press, Cambridge (1986)


  22. Pin, J.E.: Profinite methods in automata theory. In: 26th International Symposium on Theoretical Aspects of Computer Science STACS 2009, February 2009


  23. Rogers, J., et al.: On languages piecewise testable in the strict sense. In: Ebert, C., Jäger, G., Michaelis, J. (eds.) MOL 2007/2009. LNCS (LNAI), vol. 6149, pp. 255–265. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14322-9_19


  24. Simon, I.: Piecewise testable events. In: Brakhage, H. (ed.) GI-Fachtagung 1975. LNCS, vol. 33, pp. 214–222. Springer, Heidelberg (1975). https://doi.org/10.1007/3-540-07407-4_23


  25. Smetsers, R., Volpato, M., Vaandrager, F., Verwer, S.: Bigger is not always better: on the quality of hypotheses in active automata learning. In: Clark, A., Kanazawa, M., Yoshinaka, R. (eds.) The 12th International Conference on Grammatical Inference. Proceedings of Machine Learning Research, vol. 34, pp. 167–181. PMLR, Kyoto, Japan, 17–19 Sep 2014


  26. Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)



Acknowledgements

We acknowledge support from the Data + Computing = Discovery summer REU program at the Institute for Advanced Computational Science at Stony Brook University, supported by the NSF under award 1950052.

Author information


Corresponding author

Correspondence to Philip Kaelbling.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kaelbling, P., Lambert, D., Heinz, J. (2023). Robust Identification in the Limit from Incomplete Positive Data. In: Fernau, H., Jansen, K. (eds) Fundamentals of Computation Theory. FCT 2023. Lecture Notes in Computer Science, vol 14292. Springer, Cham. https://doi.org/10.1007/978-3-031-43587-4_20


  • DOI: https://doi.org/10.1007/978-3-031-43587-4_20


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43586-7

  • Online ISBN: 978-3-031-43587-4

