Co-Design and Verification of an Available File System

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 10747)

Abstract

Distributed file systems play a vital role in large-scale enterprise services. However, the designer of a distributed file system faces a vexing choice between strong consistency and asynchronous replication. The former supports a standard sequential model by synchronising operations, but is slow and fragile. The latter is highly available and responsive, but exposes users to concurrency anomalies. In this paper, we describe a rigorous and general approach to navigating this trade-off, leveraging static verification tools that allow us to verify different file system designs. We show that common file system operations can run concurrently without synchronisation while still retaining a semantics reasonably similar to the POSIX hierarchical structure. The one exception is the \(\mathsf {move}\) operation, for which we prove that, unless synchronised, it exhibits anomalous behaviour.
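
To make the \(\mathsf {move}\) anomaly concrete, here is a minimal sketch of one way unsynchronised moves can break the tree structure. It is an illustration only, not the formal model or tooling used in the paper. It assumes a hypothetical representation of the directory tree as a child-to-parent map, and all function names (`can_move`, `apply_move`, `reaches_root`) are invented for this example. Each replica checks the usual precondition locally (the destination must not lie inside the subtree being moved), then the effects are exchanged without coordination.

```python
# Hypothetical model (an assumption, not taken from the paper):
# the directory tree is a child -> parent map, and "/" is the root.
ROOT = "/"

def is_ancestor(parent, a, b):
    """Return True if a equals b or a is an ancestor of b."""
    seen = set()
    node = b
    while node is not None and node not in seen:
        if node == a:
            return True
        seen.add(node)
        node = parent.get(node)
    return False

def can_move(parent, src, dst):
    """Local precondition: dst must not lie inside the subtree rooted at src."""
    return src != ROOT and not is_ancestor(parent, src, dst)

def apply_move(parent, src, dst):
    """Effect of a move: re-parent src under dst, with no further checks."""
    parent[src] = dst

def reaches_root(parent, node):
    """Tree invariant for one node: following parents ends at the root."""
    seen = set()
    while node != ROOT:
        if node in seen or node not in parent:
            return False
        seen.add(node)
        node = parent[node]
    return True

# Two replicas start from the same state: /a and /b.
initial = {"a": ROOT, "b": ROOT}
replica1 = dict(initial)
replica2 = dict(initial)

# Each replica validates its own move against its local state only.
ok1 = can_move(replica1, "a", "b")   # move a under b
ok2 = can_move(replica2, "b", "a")   # move b under a
apply_move(replica1, "a", "b")
apply_move(replica2, "b", "a")

# Asynchronous propagation: each replica applies the other's effect.
apply_move(replica1, "b", "a")
apply_move(replica2, "a", "b")

converged = replica1 == replica2
tree_ok = all(reaches_root(replica1, n) for n in replica1)
print(ok1, ok2, converged, tree_ok)
```

Both preconditions hold locally and the replicas converge, yet the converged state has `a` and `b` as each other's parent, so neither is reachable from the root. Serialising the two moves would let the second one see the first and fail its precondition.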

Author information

Corresponding author

Correspondence to Mahsa Najafzadeh.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Najafzadeh, M., Shapiro, M., Eugster, P. (2018). Co-Design and Verification of an Available File System. In: Dillig, I., Palsberg, J. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2018. Lecture Notes in Computer Science, vol 10747. Springer, Cham. https://doi.org/10.1007/978-3-319-73721-8_17

  • DOI: https://doi.org/10.1007/978-3-319-73721-8_17

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73720-1

  • Online ISBN: 978-3-319-73721-8

  • eBook Packages: Computer Science, Computer Science (R0)