Abstract
Distributed file systems play a vital role in large-scale enterprise services. However, the designer of a distributed file system faces a vexing choice between strong consistency and asynchronous replication. The former supports a standard sequential model by synchronising operations, but is slow and fragile. The latter is highly available and responsive, but exposes users to concurrency anomalies. In this paper, we describe a rigorous and general approach to navigating this trade-off, leveraging static verification tools to verify different file system designs. We show that common file system operations can run concurrently without synchronisation, while still retaining a semantics reasonably similar to the POSIX hierarchical structure. The one exception is the \(\mathsf {move}\) operation, for which we prove that, unless synchronised, it exhibits anomalous behaviour.
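To make the claim about \(\mathsf {move}\) concrete, the following minimal sketch shows one way an unsynchronised move can go wrong. It is an illustration under assumptions, not the model used in the paper: the parent-map representation, the helper names (`can_move`, `apply_move`, `run_concurrent_moves`) and the choice of anomaly (two concurrent moves producing a cycle detached from the root) are all hypothetical.

```python
ROOT = "root"

def ancestors(parent, node):
    """Yield the strict ancestors of node, stopping if a cycle is hit."""
    seen = set()
    cur = parent.get(node)
    while cur is not None and cur not in seen:
        seen.add(cur)
        yield cur
        cur = parent.get(cur)

def reachable_from_root(parent, node):
    """A node belongs to the tree only if its ancestor chain reaches ROOT."""
    return node == ROOT or ROOT in ancestors(parent, node)

def can_move(parent, src, dst):
    """Local precondition (assumed): dst is neither src nor beneath src."""
    return src != dst and src not in ancestors(parent, dst)

def apply_move(parent, src, dst):
    """Effect of a move: re-parent src under dst."""
    parent[src] = dst

def run_concurrent_moves():
    # Two replicas start from the same tree: root/a and root/b.
    initial = {ROOT: None, "a": ROOT, "b": ROOT}
    r1, r2 = dict(initial), dict(initial)

    # Each replica checks the precondition against its own local state only.
    ok1 = can_move(r1, "a", "b")   # replica 1: move a under b
    ok2 = can_move(r2, "b", "a")   # replica 2: move b under a
    if ok1:
        apply_move(r1, "a", "b")
    if ok2:
        apply_move(r2, "b", "a")

    # Asynchronous replication: each effect is applied remotely without
    # re-checking the precondition.
    apply_move(r1, "b", "a")
    apply_move(r2, "a", "b")
    return ok1, ok2, r1, r2

if __name__ == "__main__":
    ok1, ok2, r1, r2 = run_concurrent_moves()
    print("both preconditions held locally:", ok1 and ok2)
    print("replicas converged:", r1 == r2)
    print("a still reachable from root:", reachable_from_root(r1, "a"))
```

In this sketch both replicas converge, but to a state in which `a` and `b` are each other's parent and neither is reachable from the root. If the second move were instead checked against the state produced by the first, `can_move` would reject it, which is the kind of coordination the abstract refers to as synchronisation.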
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Najafzadeh, M., Shapiro, M., Eugster, P. (2018). Co-Design and Verification of an Available File System. In: Dillig, I., Palsberg, J. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2018. Lecture Notes in Computer Science, vol 10747. Springer, Cham. https://doi.org/10.1007/978-3-319-73721-8_17
DOI: https://doi.org/10.1007/978-3-319-73721-8_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73720-1
Online ISBN: 978-3-319-73721-8
eBook Packages: Computer Science, Computer Science (R0)