Abstract
A maximum contiguous subsequence of a real-valued sequence is a contiguous subsequence with the maximum cumulative sum. A minimal maximum contiguous subsequence is one that is minimal, with respect to containment, among all maximum contiguous subsequences of the sequence. We previously designed and implemented a domain-decomposed parallel algorithm for cluster systems, using the Message Passing Interface, that finds all successive minimal maximum subsequences of a random sample sequence drawn from a normal distribution with negative mean. That algorithm uses random-walk theory to derive an approximate probabilistic upper bound on the length of overlapping subsequences in an appropriate probabilistic setting; the bound is incorporated into the algorithm to facilitate the concurrent computation of all minimal maximum subsequences on the hosting processors. In this article we present: (1) a generalization of the parallel cluster algorithm, with improvements, to arbitrary real-valued input sequences, and (2) an empirical study of the speedup and efficiency that the parallel algorithm achieves on synthetic, normally distributed random sequences.
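To make the definitions concrete, the following is a minimal sequential sketch in Python, not the parallel algorithm described in this article. It assumes the usual recursive reading of "successive": find a minimal maximum subsequence with positive sum, then repeat on the parts to its left and right. The function names are hypothetical, the search is a naive quadratic scan suitable only for small inputs, and ties in the sum are broken by taking the shortest and then the leftmost candidate, which is an assumption of this sketch.

```python
from itertools import accumulate


def minimal_maximum_subsequence(x, lo, hi):
    """Return (i, j) such that x[i:j] has the largest positive sum in x[lo:hi].

    Among equal sums the shortest slice wins (a shortest maximum slice is
    necessarily minimal with respect to containment), then the leftmost.
    Returns None when no slice has a positive sum.
    """
    prefix = [0] + list(accumulate(x[lo:hi]))  # prefix[k] = sum(x[lo:lo + k])
    n = hi - lo
    best_sum, best = 0, None
    for i in range(n):
        for j in range(i + 1, n + 1):
            s = prefix[j] - prefix[i]
            shorter_tie = (
                best is not None and s == best_sum and j - i < best[1] - best[0]
            )
            if s > best_sum or shorter_tie:
                best_sum, best = s, (lo + i, lo + j)
    return best


def all_successive_mms(x, lo=0, hi=None):
    """List the (start, end) slices of all successive minimal maximum subsequences."""
    if hi is None:
        hi = len(x)
    found = minimal_maximum_subsequence(x, lo, hi)
    if found is None:
        return []
    i, j = found
    # Recurse on the parts to the left and to the right of the slice just found.
    return all_successive_mms(x, lo, i) + [(i, j)] + all_successive_mms(x, j, hi)


if __name__ == "__main__":
    sample = [4, -5, 3, -3, 1, 2, -2, 2, -2, 1, 5]
    print(all_successive_mms(sample))  # [(0, 1), (2, 3), (4, 11)]
```

In the sample, both `x[2:11]` and `x[4:11]` sum to 7; the sketch reports the shorter slice `x[4:11]` because the longer one has a zero-sum prefix and is therefore not minimal.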
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Dai, H.K. (2019). Finding All Minimal Maximum Subsequences in Parallel. In: Dang, T., Küng, J., Takizawa, M., Bui, S. (eds) Future Data and Security Engineering. FDSE 2019. Lecture Notes in Computer Science, vol 11814. Springer, Cham. https://doi.org/10.1007/978-3-030-35653-8_12
DOI: https://doi.org/10.1007/978-3-030-35653-8_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35652-1
Online ISBN: 978-3-030-35653-8
eBook Packages: Computer Science (R0)