Abstract
A new file assignment strategy for parallel I/O on cluster computing systems, named the heuristic file sorted assignment algorithm, is proposed. While keeping the load balanced, it assigns files with similar service times to the same disk. First, the files are sorted in descending order of service time and stored in a set I. Then, whenever files are to be assigned, one disk of a cluster node is selected at random. Finally, consecutive files are taken in order from the set I and placed on that disk until the disk reaches its maximum load. The experimental results show that the new strategy improves performance by 20.2% when the system load is light and by 31.6% when the load is heavy. Moreover, the higher the data access rate, the more evident the performance improvement obtained by the heuristic file sorted assignment algorithm.
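The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `sorted_assignment` and its parameters are hypothetical, a file's load is assumed to equal its service time, and the per-disk maximum load is assumed to default to the average load per disk, since the abstract defines neither quantity.

```python
import random
from typing import Dict, List, Optional, Sequence


def sorted_assignment(
    service_times: Sequence[float],
    num_disks: int,
    max_load: Optional[float] = None,
    seed: Optional[int] = None,
) -> Dict[int, List[int]]:
    """Map each disk index to the list of file indices placed on it."""
    if num_disks <= 0:
        raise ValueError("num_disks must be positive")
    # Assumption: a file's load is its service time, and the default cap
    # is the average load per disk. The abstract defines neither.
    if max_load is None:
        max_load = sum(service_times) / num_disks

    # Set I: file indices in descending order of service time.
    ordered = sorted(range(len(service_times)),
                     key=lambda i: service_times[i], reverse=True)

    # Visiting disks in shuffled order stands in for "select a disk at random".
    rng = random.Random(seed)
    disks = list(range(num_disks))
    rng.shuffle(disks)

    assignment: Dict[int, List[int]] = {d: [] for d in range(num_disks)}
    pos = 0
    for disk in disks:
        load = 0.0
        # Take consecutive files from I until this disk reaches its cap.
        while pos < len(ordered) and load < max_load:
            assignment[disk].append(ordered[pos])
            load += service_times[ordered[pos]]
            pos += 1

    if pos < len(ordered):
        raise ValueError("max_load is too small to place every file")
    return assignment
```

For example, with service times `[5, 1, 9, 3, 7, 2, 8, 4]` and three disks, the default cap is 13, and the files are grouped by service time as {9, 8}, {7, 5, 4}, and {3, 2, 1}. Which disk receives which group depends on the random disk order. Files with similar service times end up together, which is the property the abstract emphasizes.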
Additional information
Foundation item: Project (20040533036) supported by the Specialized Research Fund for the Doctoral Program of Higher Education of China; Project (03JJY4054) supported by the Natural Science Foundation of Hunan Province
Cite this article
Chen, Zg., Zeng, Bq., Xiong, C. et al. Heuristic file sorted assignment algorithm of parallel I/O on cluster computing system. J Cent. South Univ. Technol. 12, 572–577 (2005). https://doi.org/10.1007/s11771-005-0125-7
DOI: https://doi.org/10.1007/s11771-005-0125-7