k-Maximum Subarrays for Small k: Divide-and-Conquer Made Simpler

Daescu, Ovidiu; Malik, Hemant

doi:10.1007/978-3-030-34029-2_29

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11544)

Included in the following conference series:

International Symposium on Experimental Algorithms

Abstract

Given an array A of n real numbers, the maximum subarray problem is to find a contiguous subarray which has the largest sum. The k-maximum subarrays problem is to find k such subarrays with the largest sums. For the 1-maximum subarray problem, the well-known divide-and-conquer algorithm presented in most textbooks, although suboptimal, is easy to implement and can be made optimal with a simple change that speeds up the combine phase. On the other hand, the only known divide-and-conquer algorithm for $k > 1$ that is efficient for small values of k is difficult to implement, due to the intricacies of the combine phase. In this paper, we show how to simplify the combine phase considerably while preserving the overall running time.

In the process of designing the combine phase of the algorithm we provide a simple, sublinear, $O(\sqrt{k} \log ^{3} k)$ time algorithm, for finding the k largest sums of $X + Y$, where X and Y are sorted arrays of size n and $k \le n^2$. The k largest sums are implicitly represented and can be enumerated with an additional O(k) time.

Our solution relies on simple operations such as merging sorted arrays, binary search and selecting the $k^{th}$ smallest number in an array. We have implemented our algorithm and report excellent performance as compared to previous results.

References

1. Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. In: ACM SIGMOD Record, vol. 22, pp. 207–216. ACM (1993)
2. Bae, S.E., Takaoka, T.: Improved algorithms for the k-maximum subarray problem. Comput. J. 49(3), 358–374 (2006)
3. Bengtsson, F., Chen, J.: Efficient algorithms for k maximum sums. Algorithmica 46(1), 27–41 (2006)
4. Bengtsson, F., Chen, J.: Ranking k maximum sums. Theor. Comput. Sci. 377(1–3), 229–237 (2007)
5. Bentley, J.: Algorithm design techniques. Commun. ACM 27(9), 865–871 (1984)
6. Bentley, J.: Programming pearls: algorithm design techniques. Commun. ACM 27(9), 865–873 (1984)
7. Blum, M., Floyd, R.W., Pratt, V.R., Rivest, R.L., Tarjan, R.E.: Time bounds for selection. J. Comput. Syst. Sci. 7(4), 448–461 (1973)
8. Chazelle, B.: The soft heap: an approximate priority queue with optimal error rate. J. ACM (JACM) 47(6), 1012–1027 (2000)
9. Cheng, C.H., Chen, K.Y., Tien, W.C., Chao, K.M.: Improved algorithms for the k maximum-sums problems. Theor. Comput. Sci. 362(1–3), 162–170 (2006)
10. Cormen, T.H.: Introduction to Algorithms. MIT Press, Cambridge (2009)
11. Frederickson, G.N., Johnson, D.B.: The complexity of selection and ranking in X + Y and matrices with sorted columns. J. Comput. Syst. Sci. 24(2), 197–208 (1982)
12. Frederickson, G.N., Johnson, D.B.: Generalized selection and ranking: sorted matrices. SIAM J. Comput. 13(1), 14–30 (1984)
13. Fukuda, T., Morimoto, Y., Morishita, S., Tokuyama, T.: Data mining using two-dimensional optimized association rules: scheme, algorithms, and visualization. ACM SIGMOD Record 25(2), 13–23 (1996)
14. Grenander, U.: Pattern analysis: lectures in pattern theory 2. Appl. Math. Sci. 24 (1978)
15. Kaplan, H., Kozma, L., Zamir, O., Zwick, U.: Selection from heaps, row-sorted matrices and $X + Y$ using soft heaps. arXiv preprint arXiv:1802.07041 (2018)

Author information

Authors and Affiliations

University of Texas at Dallas, Richardson, TX, 75080, USA
Ovidiu Daescu & Hemant Malik

Corresponding author

Correspondence to Hemant Malik.

Editor information

Editors and Affiliations

Wilfrid Laurier University, Waterloo, ON, Canada
Ilias Kotsireas
University of Florida, Gainesville, FL, USA
Panos Pardalos
University of Ioannina, Ioannina, Greece
Konstantinos E. Parsopoulos
Delft University of Technology, Delft, The Netherlands
Dimitris Souravlias
University of Florida, Gainesville, FL, USA
Arsenis Tsokas

Appendices

Appendix-1: Linear Time Divide-and-Conquer Maximum Subarray

In this section we give a detailed description of a simple, linear time divide and conquer algorithm to find the maximum subarray (k = 1), by placing the algorithm in [4] in a standard divide-and-conquer framework.

Given an array A of n real numbers, the maximum subarray problem is to find a contiguous subarray whose sum of elements is maximum over all possible subarrays, including A itself. The divide and conquer algorithm divides A into two subarrays of equal size, makes two recursive calls, and then proceeds with the combine step while keeping track of the maximum subarray sum found in the process.

In the combine phase, at an internal node, we have two subarrays, $A_{1}$ (from the left child) and $A_{2}$ (from the right child). We define the following variables which are used to find the maximum subarray (see also Fig. 3):

$max\_left \leftarrow -\infty$	maximum subarray starting from the leftmost index
$max\_right \leftarrow -\infty$	maximum subarray starting from the rightmost index
$sum \leftarrow 0$	sum of all elements in the array
$max\_cross \leftarrow -\infty$	maximum crossing subarray
$max\_sub \leftarrow -\infty$	maximum subarray

The idea is to make the combine phase run in O(1) time, instead of the O(n) time taken by the version described in [10]. For that, the values (and corresponding array indexes) of $max\_left$, $max\_right$, and $sum$ must also be passed up from the recursive calls. The $sum$ value at a given node is found by adding up the sums from the children. The value $max\_left$ is either the $max\_left$ from the left child or the $sum$ value from the left child plus the $max\_left$ value from the right child, whichever is larger. Similarly, the value $max\_right$ is either the $max\_right$ from the right child or the $sum$ value from the right child plus the $max\_right$ value from the left child. The value $max\_cross$ is the $max\_right$ value from the left child plus the $max\_left$ value from the right child, and $max\_sub$ is the largest of $max\_cross$ and the $max\_sub$ values of the two children. The following divide and conquer algorithm, $Maximum\_Subarray$, takes as input an array A of size n and two integers, low and high, which correspond to the start index and end index of the subarray $A[low \dots high]$, and finds and returns the maximum subarray of $A[low \dots high]$.

In the above algorithm, steps 1–7 take O(1) time. Steps 8–9 correspond to the recursive calls. Steps 10–15 take O(1) time. Therefore, the running time of Algorithm 1 satisfies:

$$ T(n) = 2T(n/2) + O(1) = O(n)$$
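
The combine rules described in this appendix can be sketched in a few lines of Python. This is a minimal illustration, not the authors' listing: the names `Summary` and `maximum_subarray` are hypothetical, and the sketch returns sums only, whereas the text also passes up the corresponding array indexes.

```python
from typing import NamedTuple, Sequence


class Summary(NamedTuple):
    """Values passed up from a recursive call (sums only, no indexes)."""
    total: float      # sum of all elements in A[low..high]
    max_left: float   # best sum of a subarray starting at index low
    max_right: float  # best sum of a subarray ending at index high
    max_sub: float    # best sum of any non-empty subarray


def maximum_subarray(a: Sequence[float], low: int, high: int) -> Summary:
    """Divide and conquer with an O(1) combine step: T(n) = 2T(n/2) + O(1)."""
    if low == high:  # base case: a single element
        x = a[low]
        return Summary(x, x, x, x)

    mid = (low + high) // 2
    left = maximum_subarray(a, low, mid)
    right = maximum_subarray(a, mid + 1, high)

    # Best subarray that crosses the midpoint.
    max_cross = left.max_right + right.max_left

    return Summary(
        total=left.total + right.total,
        max_left=max(left.max_left, left.total + right.max_left),
        max_right=max(right.max_right, right.total + left.max_right),
        max_sub=max(left.max_sub, right.max_sub, max_cross),
    )
```

For a non-empty array `a`, a call such as `maximum_subarray(a, 0, len(a) - 1).max_sub` gives the maximum subarray sum. Each call does a constant amount of work outside the two recursive calls, which matches the recurrence above.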

Appendix-2: An O($k \log k$) Algorithm for X + Y

The algorithm MAX_SUM_CROSS takes two input arrays, A and B, each of size k and sorted in non-increasing order, and outputs the k maximum sums of the pairwise addition of A and B. For our purpose, A would contain the k largest sums of $A_l$ for subarrays starting at the rightmost entry of $A_l$, while B would contain the k largest sums of $A_r$ for subarrays starting at the leftmost entry of $A_r$. We use a priority queue Q, implemented as a binary heap, to store pairwise sums as they are generated. An AVL tree T is also used, to avoid placing duplicate pairs (i, j) in Q.

Time Complexity of Algorithm MAX_SUM_CROSS: Lines 12 and 15 take $O(\log k)$ time for searching T, lines 13 and 16 take $O(\log k)$ time to store indices in T, lines 9, 14, and 17 take $O(\log k)$ time to add or remove an element in the priority queue, and the while loop in line 8 runs k times. Therefore, the time complexity of algorithm MAX_SUM_CROSS is O($k \log k$).
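
The heap-based procedure described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' listing: the function name `max_sum_cross` and its signature are hypothetical, Python's `heapq` (a min-heap, used here with negated keys) stands in for the priority queue Q, and a hash set replaces the AVL tree T for detecting duplicate pairs (i, j).

```python
import heapq
from typing import List, Sequence


def max_sum_cross(a: Sequence[float], b: Sequence[float], k: int) -> List[float]:
    """Return the k largest sums a[i] + b[j], in non-increasing order.

    Assumes a and b are sorted in non-increasing order.
    """
    if not a or not b or k <= 0:
        return []
    k = min(k, len(a) * len(b))  # there are only len(a) * len(b) pairs

    # The largest sum is always a[0] + b[0]; keys are negated for a max-heap.
    heap = [(-(a[0] + b[0]), 0, 0)]
    seen = {(0, 0)}  # pairs already placed in the heap (stand-in for T)
    result: List[float] = []

    while len(result) < k:
        neg_sum, i, j = heapq.heappop(heap)
        result.append(-neg_sum)
        # The only candidates that can follow (i, j) are its two neighbours.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-(a[ni] + b[nj]), ni, nj))

    return result
```

Each of the k iterations performs one pop and at most two pushes on a heap holding O(k) entries, so the sketch runs in O($k \log k$) time. With a hash set the duplicate check takes expected O(1) time rather than the $O(\log k)$ of a balanced tree, which does not change the overall bound.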

Cite this paper

Daescu, O., Malik, H. (2019). k-Maximum Subarrays for Small k: Divide-and-Conquer Made Simpler. In: Kotsireas, I., Pardalos, P., Parsopoulos, K., Souravlias, D., Tsokas, A. (eds) Analysis of Experimental Algorithms. SEA 2019. Lecture Notes in Computer Science(), vol 11544. Springer, Cham. https://doi.org/10.1007/978-3-030-34029-2_29

DOI: https://doi.org/10.1007/978-3-030-34029-2_29
Published: 14 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34028-5
Online ISBN: 978-3-030-34029-2
eBook Packages: Computer Science, Computer Science (R0)
