Distribution on Contingency of Alignment of Two Literal Sequences Under Constrains
- Cite this article as:
- Jäntschi, L. & Bolboacă, S.D. Acta Biotheor (2015) 63: 55. doi:10.1007/s10441-014-9243-7
The case of ungapped alignment of two literal sequences under constraints is considered. The analysis led to general formulas for the probability mass function and the cumulative distribution function for the general case of an alphabet with a chosen number of letters (e.g. 4 for deoxyribonucleic acid sequences) used to express the literal sequences. Formulas for three statistics (mean, mode, and standard deviation) were obtained. Distributions are depicted for three important particular cases: alignment of binary sequences, alignment of trinomial series (such as those arising from the generalized Kronecker delta), and alignment of genetic sequences (with four literals in the alphabet). The particular case in which each letter of the alphabet occurs at least once in both sequences has also been analyzed, and some statistics for this restricted case are given.
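As a rough illustration of what a distribution of matches in an ungapped alignment looks like, the following Python sketch enumerates every pair of sequences of length n over a k-letter alphabet and tallies the number of positions at which the two sequences agree. It is not the article's derivation: the constraints studied in the article are not reproduced here, and the sketch assumes that letters are drawn independently and uniformly, in which case the match count is expected to follow a Binomial(n, 1/k) law. The function names `match_count_pmf` and `binomial_pmf` are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import product
from math import comb


def match_count_pmf(k, n):
    """PMF of the number of matching positions, by brute-force enumeration.

    Assumption (not from the article): both sequences have length n and
    every sequence over the k-letter alphabet is equally likely.
    """
    letters = range(k)
    counts = Counter()
    for a in product(letters, repeat=n):
        for b in product(letters, repeat=n):
            # Ungapped alignment: compare position i of a with position i of b.
            counts[sum(x == y for x, y in zip(a, b))] += 1
    total = k ** (2 * n)  # number of ordered pairs of sequences
    return [counts[m] / total for m in range(n + 1)]


def binomial_pmf(k, n):
    """Binomial(n, 1/k) PMF, the reference law under the assumption above."""
    p = 1 / k
    return [comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(n + 1)]


# Four-letter alphabet (as for DNA), short sequences to keep enumeration small.
pmf = match_count_pmf(k=4, n=3)
mean = sum(m * p for m, p in enumerate(pmf))
```

Under these assumptions the enumerated values agree with the binomial reference and the mean equals n/k. The enumeration grows as k^(2n), so it is only practical for very short sequences; it serves as a check on closed-form expressions rather than as a way to compute them.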