Cluster Computing

Volume 18, Issue 3, pp 1025–1039

The dispatch time aligning I/O scheduling for parallel file systems

  • Yonggang Liu
  • Jing Qin
  • Renato Figueiredo


In Parallel File Systems (PFSs), a file I/O request may be divided into multiple I/O sub-requests that are distributed across the storage system. The latency of the original I/O request is determined by the finish time of its last sub-request. Because of application multiplexing and the various file data layouts employed in PFSs, data servers may carry very different workloads, so the performance penalty caused by differing sub-request finish times can be significant. The Dispatch Time Aligning (DTA) I/O scheduling algorithm improves system throughput by prioritizing lagging sub-requests of PFS I/O requests. DTA associates the sub-requests that belong to the same I/O request, then detects and prioritizes those that lag behind. Its dual-queue scheduling scheme provides I/O request latency control while improving system throughput. Simulation results show that DTA can provide up to 83% higher total system throughput than the Earliest Deadline First algorithm while offering similar latency guarantees.
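The dependence of request latency on the slowest sub-request can be illustrated with a minimal Python sketch. It is not the DTA algorithm itself: the abstract does not specify DTA's dispatch rule or its dual-queue logic, so the sketch only computes the latency of one request and how far each sub-request lags behind the earliest one. The class name, field names, server labels, and timing values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SubRequest:
    """One piece of a striped I/O request (hypothetical model)."""
    request_id: int
    server: str          # data server holding this stripe (assumed label)
    finish_time: float   # time at which the server completes it, in ms


def request_latency(subs, issue_time=0.0):
    """Latency of the whole request: set by its last-finishing sub-request."""
    return max(s.finish_time for s in subs) - issue_time


def lag_per_server(subs):
    """How far each sub-request finishes behind the earliest one."""
    earliest = min(s.finish_time for s in subs)
    return {s.server: s.finish_time - earliest for s in subs}


# Illustrative values only: one request striped over three servers,
# with server "ds3" more heavily loaded than the others.
subs = [
    SubRequest(request_id=1, server="ds1", finish_time=2.0),
    SubRequest(request_id=1, server="ds2", finish_time=3.0),
    SubRequest(request_id=1, server="ds3", finish_time=9.0),
]

latency = request_latency(subs)
lags = lag_per_server(subs)
print(latency)
print(lags)
```

In this toy case two servers finish early, yet the request is not complete until the third does. A scheduler that prioritizes lagging sub-requests is aiming to narrow exactly this gap.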


Parallel file system; I/O scheduling; Workload imbalance; Earliest deadline first; Data layout



This research is sponsored by the National Science Foundation under Grants CCF-0937973 and CCF-0938045. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, University of Florida, Gainesville, USA
