# On the time required to sum n semigroup elements on a parallel machine with simultaneous writes

## Abstract

Suppose we have a completely-connected network of random-access machines which communicate by reading and writing data from their neighbours, with simultaneous reads and writes allowed. In the case of write conflicts, we allow any protocol which results in one of the competing values being written into the target register. We consider the *semigroup summation problem,* that is, the problem of summing n semigroup elements. If the semigroup is finite, we find that it can be solved in time O(log n/log log n) using only n processors, regardless of the details of the write-conflict resolution scheme used. In contrast, we show that any parallel machine for solving the summation problem for infinite cancellative semigroups must take time ⌈log₃ n⌉, again regardless of the details of the conflict resolution scheme. We give an example where it is possible to sum n “polynomial-sized” elements in less than ⌈log₃ n⌉ time using only polynomially many processors. We are also able to show that such a machine must obey the ⌈log₃ n⌉ lower bound for elements which are only polynomially larger. Our upper bounds are for a machine with a reasonable local instruction-set, whilst the lower bounds are based on a communication argument, and thus hold no matter how much computational power is available to each processor. Similar results hold for a parallel machine whose processors communicate via a shared memory.
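To see where a bound of the form ⌈log₃ n⌉ comes from, it helps to picture summation by repeated bounded fan-in: if each parallel step can combine at most three partial sums into one, then ⌈log₃ n⌉ steps suffice. The sketch below is purely illustrative and is not the paper's machine model; it simulates the rounds sequentially and counts them (the function name `sum_rounds` and its interface are hypothetical).

```python
import functools

def sum_rounds(xs, op):
    """Combine the elements of xs under the associative operation op,
    three at a time per round, and return (result, number_of_rounds).

    Each round mimics one parallel step in which independent groups of
    up to three partial sums are combined simultaneously, so the round
    count is ceil(log_3 n)."""
    rounds = 0
    while len(xs) > 1:
        # One simulated parallel round: disjoint 3-way combinations.
        xs = [functools.reduce(op, xs[i:i + 3]) for i in range(0, len(xs), 3)]
        rounds += 1
    return xs[0], rounds

# Example: integer addition (an infinite cancellative semigroup) on n = 27.
result, rounds = sum_rounds(list(range(27)), lambda a, b: a + b)
# rounds == 3 == ceil(log_3(27)); result == 351
```

The paper's lower-bound argument shows that for infinite cancellative semigroups no conflict-resolution trickery can beat this fan-in-three round structure, whereas for finite semigroups the O(log n/log log n) upper bound does beat it.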

## Keywords

Shared Memory, Parallel Machine, Computation Graph, Communication Register, Input String
