STACS 2004, pp 117–128

# Local Limit Distributions in Pattern Statistics: Beyond the Markovian Models

• Alberto Bertoni
• Christian Choffrut
• Massimiliano Goldwurm
• Violetta Lonati
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2996)

## Abstract

Motivated by problems of pattern statistics, we study the limit distribution of the random variable counting the number of occurrences of the symbol a in a word of length n chosen at random in {a,b}*, according to a probability distribution defined via a finite automaton equipped with positive real weights. We determine the local limit distribution of such a quantity under the hypothesis that the transition matrix naturally associated with the finite automaton is primitive. Our probabilistic model extends the Markovian models traditionally used in the literature on pattern statistics.

This result is obtained by introducing a notion of symbol-periodicity for irreducible matrices whose entries are polynomials in one variable over an arbitrary positive semiring. This notion and the related results we prove are of interest in their own right, since they extend classical properties of the Perron–Frobenius Theory for non-negative real matrices.
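The model described in the abstract can be made concrete with a small computation. The following is a minimal sketch, not the authors' construction. It assumes the weighted automaton is given by an initial weight vector `xi`, a matrix `A` of weights for transitions labelled a, a matrix `B` for transitions labelled b, and a final weight vector `eta`. It also assumes that a word of length n is drawn with probability proportional to its total weight. All of these names, and the helper functions, are hypothetical. The primitivity test uses Wielandt's classical bound: a non-negative m×m matrix M is primitive exactly when M^(m²−2m+2) has all entries positive.

```python
from fractions import Fraction


def is_primitive(M):
    """Check primitivity of a non-negative square matrix via Wielandt's bound."""
    m = len(M)
    # Only the zero/non-zero pattern matters, so work with booleans.
    P = [[M[i][j] > 0 for j in range(m)] for i in range(m)]
    R = P
    for _ in range(m * m - 2 * m + 1):  # R becomes the pattern of M^(m^2 - 2m + 2)
        R = [[any(R[i][k] and P[k][j] for k in range(m)) for j in range(m)]
             for i in range(m)]
    return all(all(row) for row in R)


def occurrence_distribution(xi, A, B, eta, n):
    """Distribution of the number of a's in a random word of length n.

    Assumption (not stated in the source): a word has probability
    proportional to the sum of the weights of its accepting paths.
    Use int or Fraction weights to keep the arithmetic exact.
    """
    m = len(xi)
    # v[q][k] = total weight of paths read so far that end in q with k a's
    v = [[Fraction(xi[q])] + [Fraction(0)] * n for q in range(m)]
    for _ in range(n):
        w = [[Fraction(0)] * (n + 1) for _ in range(m)]
        for p in range(m):
            for q in range(m):
                for k in range(n + 1):
                    if v[p][k] == 0:
                        continue
                    w[q][k] += v[p][k] * B[p][q]          # read a 'b'
                    if k < n:
                        w[q][k + 1] += v[p][k] * A[p][q]  # read an 'a'
        v = w
    weights = [sum(v[q][k] * eta[q] for q in range(m)) for k in range(n + 1)]
    total = sum(weights)
    return [wk / total for wk in weights]


# Toy one-state automaton with unit weights: every word is equally likely,
# so the count of a's follows a binomial law with parameter 1/2.
dist = occurrence_distribution([1], [[1]], [[1]], [1], 2)
```

With the toy automaton, `dist` is the binomial distribution on {0, 1, 2}. Replacing `A` and `B` by larger matrices whose sum is primitive gives an instance of the setting in which the abstract's local limit result is stated.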

## Keywords

• Automata and Formal Languages
• Pattern statistics
• Local Limit Theorems
• Perron–Frobenius Theory

## Authors and Affiliations

• Alberto Bertoni (1)
• Christian Choffrut (2)
• Massimiliano Goldwurm (1)
• Violetta Lonati (1)

1. Dipartimento di Scienze dell’Informazione, Università degli Studi di Milano, Milano, Italy
2. L.I.A.F.A. (Laboratoire d’Informatique Algorithmique, Fondements et Applications), Université Paris VII, Paris, France