Hierarchical clustering: A structure for scalable multiprocessor operating system design

Unrau, Ronald C.; Krieger, Orran; Gamsa, Benjamin; Stumm, Michael

doi:10.1007/BF01245400

Published: March 1995

Volume 9, pages 105–134, (1995)
The Journal of Supercomputing

Abstract

We introduce the concept of hierarchical clustering as a way to structure shared-memory multiprocessor operating systems for scalability. The concept is based on clustering and hierarchical system design. Hierarchical clustering leads to a modular system, composed of easy-to-design and efficient building blocks. The resulting structure is scalable because it 1) maximizes locality, which is key to good performance in NUMA (non-uniform memory access) systems and 2) provides for concurrency that increases linearly with the number of processors. At the same time, there is tight coupling within a cluster, so the system performs well for local interactions that are expected to constitute the common case. A clustered system can easily be adapted to different hardware configurations and architectures by changing the size of the clusters. We show how this structuring technique is applied to the design of a microkernel-based operating system called Hurricane. This prototype system is the first complete and running implementation of its kind and demonstrates the feasibility of a hierarchically clustered system. We present performance results based on the prototype, demonstrating the characteristics and behavior of a clustered system. In particular, we show how clustering trades off the efficiencies of tight coupling for the advantages of replication, increased locality, and decreased lock contention.
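
As a rough illustration of the clustering idea in the abstract, the following Python sketch partitions processors into fixed-size clusters, each with its own lock and its own replica of some kernel data. It is not taken from Hurricane: the names `Cluster`, `build_clusters`, `local_update`, and the `page_table` field are hypothetical, and a real kernel would use far more elaborate structures. The sketch only shows how a local operation can touch a single cluster's lock and replica, and how the cluster size acts as a tuning parameter.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One cluster: a group of processors sharing a lock and a local replica."""
    cluster_id: int
    processors: list
    # One lock per cluster, so contention is limited to the processors in it.
    lock: threading.Lock = field(default_factory=threading.Lock)
    # Hypothetical replicated kernel data; each cluster keeps its own copy.
    page_table: dict = field(default_factory=dict)


def build_clusters(num_processors, cluster_size):
    """Partition processor ids 0..num_processors-1 into clusters of cluster_size."""
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    clusters = []
    for start in range(0, num_processors, cluster_size):
        procs = list(range(start, min(start + cluster_size, num_processors)))
        clusters.append(Cluster(cluster_id=len(clusters), processors=procs))
    return clusters


def local_update(clusters, processor, cluster_size, key, value):
    """Update only the replica in the caller's own cluster, under that cluster's lock."""
    cluster = clusters[processor // cluster_size]
    with cluster.lock:
        cluster.page_table[key] = value
    return cluster.cluster_id


# 16 processors in clusters of 4: processor 5 belongs to cluster 1.
clusters = build_clusters(16, 4)
touched = local_update(clusters, 5, 4, "page", 1)
```

Setting `cluster_size` equal to the number of processors yields a single tightly coupled cluster, while a size of 1 yields one replica per processor; intermediate sizes trade shared state against replication and reduced lock contention.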

Author information

Authors and Affiliations

IBM Canada, 844 Don Mills Rd., M3C 1V7, North York, Ontario
Ronald C. Unrau
Department of Electrical and Computer Engineering, University of Toronto, M5S 1A4, Toronto, Canada
Orran Krieger
Department of Computer Science, University of Toronto, M5S 1A4, Toronto, Canada
Benjamin Gamsa
Department of Electrical and Computer Engineering, University of Toronto, M5S 1A4, Toronto, Canada
Michael Stumm

About this article

Cite this article

Unrau, R.C., Krieger, O., Gamsa, B. et al. Hierarchical clustering: A structure for scalable multiprocessor operating system design. J Supercomput 9, 105–134 (1995). https://doi.org/10.1007/BF01245400

Received: 15 March 1994
Accepted: 15 June 1994
Issue Date: March 1995
DOI: https://doi.org/10.1007/BF01245400
