Abstract
We consider the distribution of the number of successes in success runs of length at least k in a binary sequence. One important application of this statistic is in the detection of tandem repeats among DNA sequence segments. In the literature, its distribution has been computed for independent sequences and Markovian sequences of order one. We extend these results to Markovian sequences of a general order. We also show that the statistic can be represented as a function of the number of overlapping success runs of lengths k and k + 1 in the sequence, and give immediate consequences of this representation.
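To make the statistic and the run-count representation concrete, here is a minimal Python sketch. The abstract does not give the formula, so the identity used here, S = k·M_k − (k − 1)·M_{k+1}, where M_j is the number of overlapping success runs of length j, is an assumption. It follows from a direct count: a maximal run of length L ≥ k contributes L − k + 1 to M_k and L − k to M_{k+1}. The paper's own representation may be stated differently, and the function names are hypothetical.

```python
from itertools import groupby


def successes_in_long_runs(seq, k):
    """Count the 1s that lie in maximal runs of 1s of length at least k."""
    total = 0
    for value, group in groupby(seq):
        length = sum(1 for _ in group)
        if value == 1 and length >= k:
            total += length
    return total


def overlapping_runs(seq, k):
    """Count positions at which a window of k consecutive 1s ends."""
    count = 0
    current = 0  # length of the run of 1s ending at the current position
    for x in seq:
        current = current + 1 if x == 1 else 0
        if current >= k:
            count += 1
    return count


def via_overlapping_counts(seq, k):
    """Assumed identity: S = k * M_k - (k - 1) * M_{k+1}."""
    return k * overlapping_runs(seq, k) - (k - 1) * overlapping_runs(seq, k + 1)


# Illustrative sequence (not from the paper): maximal runs of lengths 2, 4, 3, 1.
seq = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1]
k = 3
direct = successes_in_long_runs(seq, k)
indirect = via_overlapping_counts(seq, k)
print(direct, indirect)
```

With k = 3, only the runs of lengths 4 and 3 qualify, so both computations give 7 for this sequence.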
AMS 2000 Subject Classification
60E05, 60J05
Cite this article
Martin, D.E.K. Distribution of the Number of Successes in Success Runs of Length at Least k in Higher-Order Markovian Sequences. Methodol Comput Appl Probab 7, 543–554 (2005). https://doi.org/10.1007/s11009-005-5007-9