Abstract
Given an array A of n real numbers, the maximum subarray problem is to find a contiguous subarray with the largest sum. The k-maximum subarrays problem is to find k such subarrays with the largest sums. For the 1-maximum subarray problem, the well-known divide-and-conquer algorithm presented in most textbooks is suboptimal but easy to implement, and it can be made optimal with a simple change that speeds up the combine phase. In contrast, the only known divide-and-conquer algorithm for \(k > 1\) that is efficient for small values of k is difficult to implement because of the intricacies of its combine phase. In this paper, we show how to simplify the combine phase considerably while preserving the overall running time.
In the process of designing the combine phase of the algorithm, we provide a simple, sublinear, \(O(\sqrt{k} \log ^{3} k)\) time algorithm for finding the k largest sums of \(X + Y\), where X and Y are sorted arrays of size n and \(k \le n^2\). The k largest sums are represented implicitly and can be enumerated in an additional O(k) time.
Our solution relies on simple operations such as merging sorted arrays, binary search and selecting the \(k^{th}\) smallest number in an array. We have implemented our algorithm and report excellent performance as compared to previous results.
References
Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. In: ACM SIGMOD Record, vol. 22, pp. 207–216. ACM (1993)
Bae, S.E., Takaoka, T.: Improved algorithms for the k-maximum subarray problem. Comput. J. 49(3), 358–374 (2006)
Bengtsson, F., Chen, J.: Efficient algorithms for k maximum sums. Algorithmica 46(1), 27–41 (2006)
Bengtsson, F., Chen, J.: Ranking k maximum sums. Theor. Comput. Sci. 377(1–3), 229–237 (2007)
Bentley, J.: Algorithm design techniques. Commun. ACM 27(9), 865–871 (1984)
Bentley, J.: Programming pearls: algorithm design techniques. Commun. ACM 27(9), 865–873 (1984)
Blum, M., Floyd, R.W., Pratt, V.R., Rivest, R.L., Tarjan, R.E.: Time bounds for selection. J. Comput. Syst. Sci. 7(4), 448–461 (1973)
Chazelle, B.: The soft heap: an approximate priority queue with optimal error rate. J. ACM (JACM) 47(6), 1012–1027 (2000)
Cheng, C.H., Chen, K.Y., Tien, W.C., Chao, K.M.: Improved algorithms for the k maximum-sums problems. Theor. Comput. Sci. 362(1–3), 162–170 (2006)
Cormen, T.H.: Introduction to Algorithms. MIT Press, Cambridge (2009)
Frederickson, G.N., Johnson, D.B.: The complexity of selection and ranking in X + Y and matrices with sorted columns. J. Comput. Syst. Sci. 24(2), 197–208 (1982)
Frederickson, G.N., Johnson, D.B.: Generalized selection and ranking: sorted matrices. SIAM J. Comput. 13(1), 14–30 (1984)
Fukuda, T., Morimoto, Y., Morishita, S., Tokuyama, T.: Data mining using two-dimensional optimized association rules: scheme, algorithms, and visualization. ACM SIGMOD Record 25(2), 13–23 (1996)
Grenander, U.: Pattern analysis: lectures in pattern theory 2. Appl. Math. Sci. 24 (1978)
Kaplan, H., Kozma, L., Zamir, O., Zwick, U.: Selection from heaps, row-sorted matrices and \(X + Y\) using soft heaps. arXiv preprint arXiv:1802.07041 (2018)
Appendices
Appendix-1: Linear Time Divide-and-Conquer Maximum Subarray
In this section, we give a detailed description of a simple, linear-time divide-and-conquer algorithm for finding the maximum subarray (k = 1), obtained by placing the algorithm in [4] in a standard divide-and-conquer framework.
Given an array A of n real numbers, the maximum subarray problem is to find a contiguous subarray whose sum of elements is maximum over all possible subarrays, including A itself. The divide-and-conquer algorithm divides A into two subarrays of equal size, makes two recursive calls, and then proceeds with the combine step while keeping track of the maximum subarray sum found in the process.
In the combine phase, at an internal node, we have two subarrays, \(A_{1}\) (from the left child) and \(A_{2}\) (from the right child). We define the following variables which are used to find the maximum subarray (see also Fig. 3):
\(max\_left \leftarrow -\infty \) | maximum sum of a subarray starting at the leftmost index |
\(max\_right \leftarrow -\infty \) | maximum sum of a subarray starting at the rightmost index |
\(sum \leftarrow 0\) | sum of all elements in the array |
\(max\_cross \leftarrow -\infty \) | maximum sum of a crossing subarray |
\(max\_sub \leftarrow -\infty \) | maximum subarray sum |
The idea is to make the combine phase run in O(1) time rather than the O(n) time of the version described in [10]. To that end, the values (and corresponding array indexes) of \(max\_left\), \(max\_right\), and \(sum\) must also be passed up from the recursive calls. The \(sum\) value at a given node is obtained by adding the sums from its two children. The value \(max\_left\) is either the \(max\_left\) of the left child or the \(sum\) of the left child plus the \(max\_left\) of the right child, whichever is larger. Similarly, the value \(max\_right\) is either the \(max\_right\) of the right child or the \(sum\) of the right child plus the \(max\_right\) of the left child. The following divide-and-conquer algorithm, \(Maximum\_Subarray\), takes as input an array A of size n and two integers, low and high, which are the start and end indexes of the subarray \(A[low \dots high]\), and finds and returns the maximum subarray of \(A[low \dots high]\).
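The combine rules described above can be sketched in Python as follows. This is a minimal illustration, not the paper's Algorithm 1: the `Summary` record and the function name `maximum_subarray` are assumptions made for this sketch, only sums are tracked (the array indexes that the paper also passes up are omitted), and the crossing sum is taken to be the left child's \(max\_right\) plus the right child's \(max\_left\).

```python
from typing import NamedTuple, Sequence


class Summary(NamedTuple):
    """Values passed up from a recursive call (hypothetical record type)."""
    total: float      # sum of all elements (the text's `sum`)
    max_left: float   # best sum of a non-empty subarray starting at the leftmost index
    max_right: float  # best sum of a non-empty subarray ending at the rightmost index
    max_sub: float    # best sum of any non-empty contiguous subarray


def maximum_subarray(a: Sequence[float], low: int, high: int) -> Summary:
    """Summarise a[low..high] (inclusive bounds) with an O(1) combine step."""
    if low == high:  # base case: a single element
        x = a[low]
        return Summary(x, x, x, x)

    mid = (low + high) // 2
    left = maximum_subarray(a, low, mid)
    right = maximum_subarray(a, mid + 1, high)

    # Combine phase: a constant number of additions and comparisons.
    total = left.total + right.total
    max_left = max(left.max_left, left.total + right.max_left)
    max_right = max(right.max_right, right.total + left.max_right)
    max_cross = left.max_right + right.max_left  # best subarray spanning the split
    max_sub = max(left.max_sub, right.max_sub, max_cross)
    return Summary(total, max_left, max_right, max_sub)


if __name__ == "__main__":
    data = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
    print(maximum_subarray(data, 0, len(data) - 1).max_sub)  # 6, from [4, -1, 2, 1]
```

Returning all four values from each call is what removes the linear scan from the combine step: the parent never needs to look at the array elements again.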
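In the above algorithm, steps 1-7 take O(1) time, steps 8-9 are the two recursive calls on subarrays of half the size, and steps 10-15 take O(1) time. Therefore, the running time of Algorithm 1 satisfies the recurrence \(T(n) = 2T(n/2) + O(1)\), which solves to \(T(n) = O(n)\).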
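The recurrence behind this step count, two recursive calls on halves plus constant work, can be unrolled as a quick check. The sketch below assumes that n is a power of two and writes c for the constant cost of the non-recursive steps; both are simplifying assumptions introduced here.

\[
T(n) = 2T(n/2) + c = 4T(n/4) + 2c + c = \dots = 2^{i}\, T(n/2^{i}) + (2^{i} - 1)\, c
\]

\[
\text{With } i = \log_2 n: \quad T(n) = n\, T(1) + (n - 1)\, c = O(n)
\]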
Appendix-2: An \(O(k \log k)\) Algorithm for X + Y
The algorithm MAX_SUM_CROSS takes two input arrays, A and B, each of size k and sorted in non-increasing order, and outputs the k maximum sums among the pairwise sums of A and B. For our purpose, A contains the k largest sums of \(A_l\) for subarrays starting at the rightmost entry of \(A_l\), while B contains the k largest sums of \(A_r\) for subarrays starting at the leftmost entry of \(A_r\). We use a priority queue Q, implemented as a binary heap, to store pairwise sums as they are generated. An AVL tree T is also used to avoid placing duplicate pairs (i, j) in Q.
Time Complexity of Algorithm MAX_SUM_CROSS: Lines 12 and 15 take \(O(\log k)\) time to search T, lines 13 and 16 take \(O(\log k)\) time to store indices in T, lines 9, 14, and 17 take \(O(\log k)\) time to add or remove an element in the priority queue, and the while loop in line 8 runs k times. Therefore, the time complexity of algorithm MAX_SUM_CROSS is \(O(k \log k)\).
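A minimal Python sketch of this heap-based scheme is given below, under stated assumptions. It is not the paper's pseudocode: the function name `max_sum_cross` is borrowed from the text, the standard-library `heapq` module plays the role of the priority queue Q, and a hash set replaces the AVL tree T for duplicate detection. Since `heapq` is a min-heap, sums are stored negated.

```python
import heapq
from typing import List, Sequence


def max_sum_cross(a: Sequence[float], b: Sequence[float], k: int) -> List[float]:
    """Return the k largest values a[i] + b[j] in non-increasing order.

    Both inputs must already be sorted in non-increasing order.
    """
    if not a or not b or k <= 0:
        return []
    k = min(k, len(a) * len(b))  # there are only len(a) * len(b) pairs

    # The largest sum is always a[0] + b[0].
    heap = [(-(a[0] + b[0]), 0, 0)]
    seen = {(0, 0)}  # assumption: a set stands in for the AVL tree T
    result: List[float] = []

    while len(result) < k:
        neg_sum, i, j = heapq.heappop(heap)
        result.append(-neg_sum)
        # The only candidates that can follow (i, j) are its two neighbours.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-(a[ni] + b[nj]), ni, nj))
    return result


if __name__ == "__main__":
    print(max_sum_cross([5, 3, 1], [4, 2, 0], 4))  # [9, 7, 7, 5]
```

Each iteration pops one pair and pushes at most two, so the heap holds O(k) entries and every heap operation costs \(O(\log k)\). The set gives expected constant-time lookups rather than the tree's \(O(\log k)\), which leaves the overall bound unchanged.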
© 2019 Springer Nature Switzerland AG
About this paper
Daescu, O., Malik, H. (2019). k-Maximum Subarrays for Small k: Divide-and-Conquer Made Simpler. In: Kotsireas, I., Pardalos, P., Parsopoulos, K., Souravlias, D., Tsokas, A. (eds) Analysis of Experimental Algorithms. SEA 2019. Lecture Notes in Computer Science(), vol 11544. Springer, Cham. https://doi.org/10.1007/978-3-030-34029-2_29
DOI: https://doi.org/10.1007/978-3-030-34029-2_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34028-5
Online ISBN: 978-3-030-34029-2
eBook Packages: Computer Science, Computer Science (R0)