The mth longest runs of multivariate random sequences

Abstract

The distributions of the mth longest runs of multivariate random sequences are considered. For random sequences made up of k kinds of letters, the lengths of the runs are sorted in two ways to give two definitions of run length ordering. In one definition, the lengths of the runs are sorted separately for each letter type. In the second definition, the lengths of all the runs are sorted together. Exact formulas are developed for the distributions of the mth longest runs for both definitions. The derivations are based on a two-step method that is applicable to various other runs-related distributions, such as joint distributions of several letter types and multiple run lengths of a single letter type.
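The two ordering definitions can be illustrated on a single observed sequence. The sketch below is not the paper's exact-distribution method; it only computes the statistic itself by direct enumeration of runs. The function names, and the convention of returning 0 when fewer than m runs exist, are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import groupby


def run_lengths(seq):
    """Return (letter, length) for each maximal run in seq, in order."""
    return [(letter, sum(1 for _ in group)) for letter, group in groupby(seq)]


def mth_longest_per_letter(seq, m):
    """First definition: sort run lengths separately for each letter type.

    Returns {letter: m-th longest run length}. A letter with fewer than
    m runs gets 0 (an assumed convention, not taken from the paper).
    """
    by_letter = defaultdict(list)
    for letter, length in run_lengths(seq):
        by_letter[letter].append(length)
    result = {}
    for letter, lengths in by_letter.items():
        lengths.sort(reverse=True)
        result[letter] = lengths[m - 1] if m <= len(lengths) else 0
    return result


def mth_longest_overall(seq, m):
    """Second definition: sort the lengths of all runs together."""
    lengths = sorted((length for _, length in run_lengths(seq)), reverse=True)
    return lengths[m - 1] if m <= len(lengths) else 0


if __name__ == "__main__":
    seq = "aaabbaccccab"  # runs: aaa, bb, a, cccc, a, b
    print(mth_longest_per_letter(seq, 2))  # {'a': 1, 'b': 1, 'c': 0}
    print(mth_longest_overall(seq, 2))     # 3
```

For the sequence aaabbaccccab with m = 2, the per-letter ordering gives 1 for a and for b (c has only one run), while the pooled ordering gives 3, since the run aaa is second only to cccc.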

Keywords

Generating function; Combinatorial identities; Randomness test; Distribution-free statistical test; Runs length test; Biological sequence analysis

Notes

Acknowledgments

This work was supported in part by the Clinical and Translational Science Award UL1 RR024139 from the National Center for Research Resources, National Institutes of Health.

Copyright information

© The Institute of Statistical Mathematics, Tokyo 2016

Authors and Affiliations

  1. Department of Molecular Biophysics and Biochemistry, W.M. Keck Foundation Biotechnology Resource Laboratory, Yale University, New Haven, USA
