On the Learnability of Shuffle Ideals

  • Dana Angluin
  • James Aspnes
  • Aryeh Kontorovich
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7568)


Although PAC learning unrestricted regular languages has long been known to be a very difficult problem, one might suppose the existence (and even an abundance) of natural efficiently learnable sub-families. When our literature search for a natural efficiently learnable regular family came up empty, we proposed the shuffle ideals as a prime candidate. A shuffle ideal generated by a string u is simply the collection of all strings containing u as a (not necessarily contiguous) subsequence. This fundamental language family is of theoretical interest in its own right and also provides the building blocks for other important language families. Somewhat surprisingly, we discovered that even a class as simple as the shuffle ideals is not properly PAC learnable, unless RP=NP. In the positive direction, we give an efficient algorithm for properly learning shuffle ideals in the statistical query (and therefore also PAC) model under the uniform distribution.
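
To make the definition concrete, membership in the shuffle ideal generated by u can be decided with one left-to-right scan of the candidate string. The following is a minimal sketch for illustration only, not code from the paper; the function name in_shuffle_ideal and its argument names are hypothetical.

```python
def in_shuffle_ideal(u: str, x: str) -> bool:
    """Return True if x lies in the shuffle ideal generated by u,
    i.e. if u occurs in x as a (not necessarily contiguous) subsequence."""
    matched = 0  # number of leading symbols of u matched so far
    for symbol in x:
        # Greedily match the next unmatched symbol of u.
        if matched < len(u) and symbol == u[matched]:
            matched += 1
    return matched == len(u)


if __name__ == "__main__":
    print(in_shuffle_ideal("ab", "xaxbx"))  # u = "ab" appears in order
    print(in_shuffle_ideal("ab", "ba"))     # symbols present, wrong order
```

The greedy scan suffices because matching each symbol of u at its earliest possible position never rules out a later match. Note that the empty string generates the ideal of all strings under this definition.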







Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Dana Angluin (1)
  • James Aspnes (1)
  • Aryeh Kontorovich (2)
  1. Department of Computer Science, Yale University, New Haven, USA
  2. Department of Computer Science, Ben-Gurion University of the Negev, Beer Sheva, Israel
