Co-Design and Verification of an Available File System

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10747)

Abstract

Distributed file systems play a vital role in large-scale enterprise services. However, the designer of a distributed file system faces a vexing choice between strong consistency and asynchronous replication. The former supports a standard sequential model by synchronising operations, but is slow and fragile. The latter is highly available and responsive, but exposes users to concurrency anomalies. In this paper, we describe a rigorous and general approach to navigating this trade-off, using static verification tools to verify different file system designs. We show that common file system operations can run concurrently without synchronisation while still retaining a semantics reasonably close to the POSIX hierarchical structure. The one exception is the \(\mathsf {move}\) operation, for which we prove that, unless synchronised, it exhibits anomalous behaviour.
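
To make the \(\mathsf {move}\) anomaly concrete, the following is a minimal sketch of one well-known way in which unsynchronised moves can break a directory tree. It is not the paper's model: the parent-map representation, the function names, and the scenario (two replicas concurrently moving sibling directories under each other) are all assumptions chosen for illustration.

```python
# Illustrative sketch only: a directory tree as a child -> parent map.
# All names here are hypothetical and not taken from the paper.
ROOT = "root"


def is_ancestor(parent, anc, node):
    """Return True if `anc` lies on the path from `node` up to the root.

    Assumes `parent` is currently a well-formed tree (no cycles).
    """
    while node != ROOT:
        node = parent[node]
        if node == anc:
            return True
    return False


def can_move(parent, node, new_parent):
    """Local precondition: never move a directory under itself or a descendant."""
    return node != new_parent and not is_ancestor(parent, node, new_parent)


def apply_move(parent, node, new_parent):
    """Effect of a move: re-point the node at its new parent."""
    parent[node] = new_parent


def reaches_root(parent, node):
    """Walk upwards; return False if a cycle prevents reaching the root."""
    seen = set()
    while node != ROOT:
        if node in seen:
            return False
        seen.add(node)
        node = parent[node]
    return True


def run_concurrent_moves():
    """Two replicas each accept a move locally, then exchange effects."""
    initial = {"a": ROOT, "b": ROOT}
    replica1, replica2 = dict(initial), dict(initial)
    move1, move2 = ("a", "b"), ("b", "a")

    # Each precondition is checked against the local state only.
    ok1 = can_move(replica1, *move1)
    ok2 = can_move(replica2, *move2)
    apply_move(replica1, *move1)
    apply_move(replica2, *move2)

    # Asynchronous propagation: each replica applies the remote effect.
    apply_move(replica1, *move2)
    apply_move(replica2, *move1)
    return ok1, ok2, replica1, replica2


if __name__ == "__main__":
    ok1, ok2, r1, r2 = run_concurrent_moves()
    print("preconditions held locally:", ok1, ok2)
    print("replicas converged:", r1 == r2)
    print("'a' still reachable from root:", reaches_root(r1, "a"))
```

In this sketch both preconditions hold locally and the replicas converge, yet the converged state contains a cycle between `a` and `b`, so neither directory is reachable from the root. Checking the precondition of one move against a state that already includes the other (that is, synchronising the two moves) would reject the second one.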

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Mahsa Najafzadeh (1)
  • Marc Shapiro (2)
  • Patrick Eugster (1, 3)

  1. Purdue University, West Lafayette, USA
  2. INRIA-LIP6, Paris, France
  3. Darmstadt University, Darmstadt, Germany
