# Generating Function Methods for Run and Scan Statistics

• Yong Kong
Living reference work entry

## Abstract

Run and pattern statistics have found successful applications in various fields. Many classical results on the distributions of runs were obtained by combinatorial methods. As the patterns under study become more complicated, the combinatorial complexity involved can become challenging, especially for multistate or multiset systems. Several unified methods have been devised to overcome these combinatorial difficulties; one of them is the finite Markov chain imbedding approach. Here we use a systematic approach inspired by methods in statistical physics. In this approach the study of run and pattern distributions is decoupled into two easy, independent steps. In the first step, the elements of each object (usually represented by its generating function) are considered in isolation, without regard to the elements of the other objects. In the second step, formulas in matrix or explicit form combine the results from the first step into a whole multi-object system with potential nearest-neighbor interactions. Because only one kind of object is considered at a time in the first step, the complexity arising from simultaneous interactions among elements of multiple objects is avoided. In essence, the method builds a higher-level generating function for the whole system from the lower-level generating functions of the individual objects. Because it works with generating functions at each step, the method usually yields results that are more general than those obtained by other methods. Examples of varying complexity and flavor for run- and pattern-related distributions are used to illustrate the method.
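The two-step idea can be sketched on the simplest case, counting binary sequences by their total number of runs. This is only a minimal illustration under stated assumptions, not the entry's own formulas: the function names `block_counts` and `run_distribution` are hypothetical, the per-object generating function is taken to be (x/(1-x))^k for k nonempty blocks, and the combination step uses the elementary fact that blocks of the two objects must alternate.

```python
def block_counts(n, kmax):
    """Step 1 (one object in isolation): for k = 0..kmax, return the
    coefficient of x^n in (x/(1-x))^k, i.e. the number of ways to split
    n identical elements into k nonempty blocks."""
    base = [0] + [1] * n          # truncated series for x/(1-x)
    power = [1] + [0] * n         # the constant series 1, i.e. the k = 0 power
    counts = []
    for _ in range(kmax + 1):
        counts.append(power[n])
        # multiply by x/(1-x) and truncate at degree n
        power = [sum(power[j] * base[d - j] for j in range(d + 1))
                 for d in range(n + 1)]
    return counts


def run_distribution(n0, n1):
    """Step 2 (combine the objects): return {r: number of binary sequences
    with n0 zeros and n1 ones that have exactly r runs}."""
    kmax = max(n0, n1)
    zeros = block_counts(n0, kmax)
    ones = block_counts(n1, kmax)
    dist = {}
    for k0 in range(kmax + 1):
        for k1 in range(kmax + 1):
            if k0 + k1 == 0:
                continue
            # Blocks must alternate: two interleavings if the block counts
            # are equal, one if they differ by one, none otherwise.
            diff = abs(k0 - k1)
            ways = 2 if diff == 0 else 1 if diff == 1 else 0
            total = ways * zeros[k0] * ones[k1]
            if total:
                dist[k0 + k1] = dist.get(k0 + k1, 0) + total
    return dist


print(run_distribution(2, 2))  # {2: 2, 3: 2, 4: 2}
```

For two zeros and two ones, the six arrangements split evenly into two, three, and four runs, which can be checked by listing them. The point of the design is that `block_counts` never sees the other object; all interaction between objects is confined to the short combination loop.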

## Keywords

Combinatorial complexity; Distribution-free statistical test; Distributions of runs; Eulerian number and Simon Newcomb number; Generating function; Multivariate ligand binding; Randomness test; Rises, falls, and levels; Successions

© Springer Science+Business Media, LLC, part of Springer Nature 2019

## Authors and Affiliations

1. School of Public Health, Yale University, New Haven, USA

## Section editors and affiliations

• Joseph Glaz (1)
• Markos V. Koutras (2)

1. Department of Statistics, University of Connecticut, Storrs, USA
2. Dept. of Statistics and Insurance Science, University of Piraeus, Piraeus, Greece