The Journal of Supercomputing, Volume 47, Issue 3, pp 312–334

Fault tolerant file models for parallel file systems: introducing distribution patterns for every file

  • A. Calderón
  • F. García-Carballeira
  • L. M. Sánchez
  • J. D. García
  • J. Fernandez
Article

DOI: 10.1007/s11227-008-0199-8

Cite this article as:
Calderón, A., García-Carballeira, F., Sánchez, L.M. et al. J Supercomput (2009) 47: 312. doi:10.1007/s11227-008-0199-8

Abstract

Parallelism in file systems is obtained by using several independent server nodes, each supporting one or more secondary storage devices. This approach increases the performance and scalability of the system, but a fault in a single node can stop the whole system. To avoid this problem, data must be stored with some form of redundancy, so that any data stored on a faulty element can be recovered. Fault tolerance can be provided in I/O systems by using replication or RAID-based schemes. However, most current systems apply the same technique to all files in the system.

This paper describes the fault tolerance support provided by Expand, a parallel file system based on standard servers. This support can be applied to other parallel file systems and offers several benefits: fault tolerance at the file level, flexible definition of the fault tolerance scheme to be used, the ability to change the fault tolerance scheme applied to a file, and so on.
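As an illustration of choosing a redundancy scheme per file, the following minimal Python sketch places one stripe of a file on a set of servers using no redundancy, replication, or a RAID-like XOR parity block, and rebuilds the block lost when one server fails. It is not Expand's interface or the paper's algorithm: the function names `place` and `recover`, the scheme labels, and the one-block-per-server layout are all assumptions made for this example.

```python
from functools import reduce

def xor_blocks(blocks):
    # Byte-wise XOR of equally sized blocks.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def place(blocks, scheme):
    """Map one stripe onto servers, one block per server (hypothetical layout)."""
    if scheme == "none":
        return list(blocks)
    if scheme == "parity":
        # One extra server holds the XOR of all data blocks.
        return list(blocks) + [xor_blocks(blocks)]
    if scheme == "replication":
        # The second half of the servers mirrors the first half.
        return list(blocks) + list(blocks)
    raise ValueError(f"unknown scheme: {scheme}")

def recover(servers, failed, scheme):
    """Rebuild the block held by the server at index `failed`."""
    if scheme == "parity":
        survivors = [b for i, b in enumerate(servers) if i != failed]
        return xor_blocks(survivors)
    if scheme == "replication":
        half = len(servers) // 2
        return servers[(failed + half) % len(servers)]
    raise RuntimeError("file was stored without redundancy")

# Hypothetical per-file policy: each file names its own scheme.
file_schemes = {"scratch.tmp": "none", "results.out": "parity", "index.db": "replication"}

stripe = [b"AAAA", b"BBBB", b"CCCC"]
servers = place(stripe, file_schemes["results.out"])
rebuilt = recover(servers, failed=1, scheme=file_schemes["results.out"])
```

Under these assumptions, parity costs one extra block per stripe while replication doubles the storage, which is the kind of trade-off a per-file scheme lets each file make separately.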

Keywords

Parallel file system; Fault-tolerance support; Parallel I/O; Block distribution

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • A. Calderón (1)
  • F. García-Carballeira (1)
  • L. M. Sánchez (1)
  • J. D. García (1)
  • J. Fernandez (1)

  1. Computer Architecture Group, Computer Science Department, Universidad Carlos III de Madrid, Leganés, Spain
