Near Linear Time Construction of an Approximate Index for All Maximum Consecutive Sub-sums of a Sequence

  • Ferdinando Cicalese
  • Eduardo Laber
  • Oren Weimann
  • Raphael Yuster
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7354)

Abstract

We present a novel approach for computing all maximum consecutive subsums in a sequence of positive integers in near linear time. Solutions for this problem over binary sequences can be used for reporting the existence (and possibly one occurrence) of Parikh vectors in a bit string. Recently, several attempts have been made to build indexes for all Parikh vectors of a binary string in subquadratic time. However, to the best of our knowledge, no algorithm is known to date that beats the natural Θ(n^2) exhaustive procedure by more than a polylogarithmic factor. Our result implies an approximate construction of an index for all Parikh vectors of a binary string in O(n^(1+η)) time, for any constant η > 0. Such an index is approximate in the sense that it leaves a small chance of false positives, i.e., Parikh vectors that are not actually present in the string might be reported. No false negatives are possible. However, we can tune the parameters of the algorithm to strictly control this chance of error while still guaranteeing strongly subquadratic running time.
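For orientation, the natural Θ(n^2) exhaustive procedure mentioned above can be sketched as follows. This is only an illustrative baseline, not the near linear time construction of the paper. The function names (`extreme_consecutive_subsums`, `build_exact_index`, `occurs`) are hypothetical, and the query step assumes the standard interval property of binary strings: a Parikh vector with a given number of ones and zeros occurs if and only if the number of ones lies between the minimum and maximum number of ones over all windows of that total length.

```python
from itertools import accumulate


def extreme_consecutive_subsums(seq, pick):
    """For every length k, apply `pick` (max or min) to the sums of all
    windows of k consecutive elements. Entry 0 is 0 by convention."""
    n = len(seq)
    prefix = [0, *accumulate(seq)]  # prefix[i] = seq[0] + ... + seq[i-1]
    result = [0] * (n + 1)
    for k in range(1, n + 1):
        # n - k + 1 windows of length k, hence Theta(n^2) work overall
        result[k] = pick(prefix[i + k] - prefix[i] for i in range(n - k + 1))
    return result


def build_exact_index(bits):
    """Exact (non-approximate) index for a binary sequence: the pair of
    tables (max ones, min ones) for every window length."""
    return (extreme_consecutive_subsums(bits, max),
            extreme_consecutive_subsums(bits, min))


def occurs(index, ones, zeros):
    """Report whether some substring has exactly `ones` 1s and `zeros` 0s.
    Assumes the interval property of binary strings (see lead-in)."""
    max_ones, min_ones = index
    length = ones + zeros
    if length >= len(max_ones):  # longer than the string itself
        return False
    return min_ones[length] <= ones <= max_ones[length]


index = build_exact_index([1, 0, 1, 1, 0])
print(occurs(index, 2, 0))  # the substring "11" is present
print(occurs(index, 0, 2))  # the substring "00" is not
```

An index whose stored maxima are overestimates (or whose minima are underestimates) would still accept every vector that really occurs, which is one way to picture an index that allows false positives but no false negatives.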

Keywords

Parikh vectors, maximum subsequence sum, approximate pattern matching, approximation algorithms

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ferdinando Cicalese (1)
  • Eduardo Laber (2)
  • Oren Weimann (3)
  • Raphael Yuster (4)

  1. Department of Computer Science, University of Salerno, Italy
  2. Department of Informatics, PUC-Rio, Rio de Janeiro, Brazil
  3. Department of Computer Science, University of Haifa, Israel
  4. Department of Mathematics, University of Haifa, Israel