Performance Evaluation of a Massively Parallel I/O Subsystem
This paper presents trace-driven simulation results from a study evaluating the performance of the internal parallel I/O subsystem of the Vulcan massively parallel processor (MPP) architecture. The system sizes evaluated range from 16 to 512 nodes. The results show that a compute-node-to-I/O-node ratio of four is the most cost-effective for all system sizes, suggesting high scalability. In addition, processor-to-processor communication effects are negligible for small message sizes, and the greater the fraction of I/O reads, the better the I/O performance. Worst-case I/O node placement is within 13% of more efficient placement strategies. Introducing parallelism into the internal I/O subsystem improves I/O performance significantly.
Keywords: Node Ratio, Request Rate, Node Placement, Read Request, Node Blocking