File-Less Approach to Large Scale Data Management
With the continuously increasing amount of online resources and data, use cases such as discovery, maintenance and interoperation become more and more complex. In particular, data management is becoming one of the main issues for both scientific use cases (large scale simulations or data mining applications) and consumer use cases (accessing photos or email attachments on mobile devices). We believe that one of the main bottlenecks blocking the development of solutions providing a truly seamless developer and user experience is the concept of the file and the filesystem. We present Filess, a vision and architecture of file-less information systems in which files are not necessary in either the application or the operating system layer.
Keywords: File systems, Data management, Hypergraphs
Data is unnaturally clustered into files - once a data item is stored in a file it becomes locked in that file, whether or not it is actually part of a larger data structure or could be accessible on its own (consider for instance an image in a presentation, or a tag in an XML document),
Very large unnecessary data redundancy - file based data management results in very large duplication of data, because data must be included directly inside the file contents instead of being referenced (again images in presentations and rich text documents, attachments in emails, etc.) [7, 10]. Existing files and filesystems do not provide means for uniform referencing of other files on a global scale, a feature which, in the form of hyperlinks, is the basis of the WWW,
Inflexible hierarchical namespaces - although it was convenient to store files in tree based directory structures when users had hundreds of files, with tens or hundreds of thousands of files it is impossible to remember where a given file can be found without global filesystem indexing tools such as Spotlight, Tracker or Search Charm. These tools, however, only match queries against file names or perform string based search over the contents of textual documents. No semantic search can be achieved based on relations between elements contained in these files (for instance "Find all images included in this paper"),
Barrier for the operating system - it is impossible to address a specific piece of data inside a file from the level of the operating system (for instance for use from the command line). Thus it is in general impossible to get metadata about an image or music file without executing an application or library specific to that file format,
Lack of versioning support - versioning of data can be achieved only by storing new files under different names (or in some cases by storing text or binary differences between versions, in application specific ways).
Let us consider, for instance, a single file such as a text document, a presentation or even a simple e-mail. Such a file usually contains a large amount of information which is lost in the structure of the file and is thus not visible for querying by computers or even humans. For example, a corporate document prepared within some organization by several people contains several independent items (tables, charts, diagrams, paragraphs of text), each of which can have a distinct author and different authorization policies and can be used in several other documents. Currently all this provenance information is lost once the document is saved as a file. The first problem is that a figure from this document will be duplicated in every copy of the document and also in every new document that uses this figure. Additionally, the information about the picture provided within the document itself tells the reader very little and offers nothing at all that could be processed by computers. The main hypothesis of this research is that, in order to make information actually reusable as knowledge in a worldwide distributed setting, information must be stored in a way that allows it to be freely shared, reused and processed.
We propose to address this issue by introducing an architecture for information systems which departs completely from the concept of the file and the filesystem. We introduce a flexible and scalable data model based on hypergraphs, where data objects are stored in nodes and all relations are represented through hyperedges (edges which can connect more than 2 nodes). The hypergraph is provided to applications and operating systems via the same abstraction layer, which hides both the actual storage system used and the distribution of data between devices. This paper refines the initial vision and requirements defined in our previous work [12, 13].
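The data model described above can be illustrated with a minimal sketch, assuming a directed hypergraph in which each labelled hyperedge connects a set of source nodes to a set of target nodes. The names `Hypergraph`, `Hyperedge`, `add_node`, `add_edge` and `neighbours` are hypothetical and chosen only for illustration; they are not part of Filess.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class Hyperedge:
    """A labelled directed hyperedge with any number of sources and targets."""
    label: str
    sources: frozenset
    targets: frozenset


@dataclass
class Hypergraph:
    nodes: dict = field(default_factory=dict)    # node id -> stored value
    edges: list = field(default_factory=list)
    _ids: count = field(default_factory=count, repr=False)

    def add_node(self, value):
        node_id = next(self._ids)
        self.nodes[node_id] = value
        return node_id

    def add_edge(self, label, sources, targets):
        edge = Hyperedge(label, frozenset(sources), frozenset(targets))
        self.edges.append(edge)
        return edge

    def neighbours(self, node_id, label):
        """Ids of all targets reachable from node_id via edges with this label."""
        found = set()
        for edge in self.edges:
            if edge.label == label and node_id in edge.sources:
                found |= edge.targets
        return found


# Illustrative data only: a slide, an image shown on it, and their author.
g = Hypergraph()
slide = g.add_node("Slide 1")
image = g.add_node(b"image bytes")
author = g.add_node("Alice")
g.add_edge("contains", sources=[slide], targets=[image])
# A single hyperedge records that both the slide and the image share one author.
g.add_edge("createdBy", sources=[slide, image], targets=[author])
```

Because the image is a node of its own, it can be referenced from any number of documents instead of being copied into each of them.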
2 Related Work
Most research on making existing directory based file systems more flexible can be classified under semantic file systems, i.e. file systems where files have attached meaning. That work sketches a vision of file systems where files can be annotated in some way, and where basic file system operations such as copy or delete take as arguments not directory paths but semantic descriptions of the files. The problem with these solutions is that all the information is still either fragmented or clustered into files, and the semantics deal only with meta data attached to these files in the form of attributes. Nevertheless, these solutions are very important for our work, as they address important issues, mainly how information can be found in file based systems. One formal attempt at a file system implementation with a set theoretical basis is a file system using Formal Concept Analysis, which employs the FCA formal model of classification, neighborhood estimation and Boolean querying. A similar approach, although still bounded by the constraints of regular files, is the Logical File System project. The basic role of this file system is to allow searching for files using first-order logic formulas instead of conventional directory paths. Unfortunately, the use of first-order logic inference can seriously impair the scalability of the system in highly distributed settings. Until now, one major industrial attempt at abstracting the file concept from the operating system was WinFS (Windows Future Storage), a research effort from Microsoft. Its basic assumption is to store all information about data in the system, including what would usually be referred to as a file, in a relational database. Furthermore, at the low level of storage device controllers, there is a trend to move from block device based interfaces (i.e. those supporting file oriented systems) towards more flexible solutions such as OSD (Object-based Storage Device), where instead of being stored in fixed size chunks the data can be stored in custom clusters along with relevant meta data. Unfortunately, most operating system level approaches still use these devices to store files, even if more efficiently [22, 23]. However, once the concept of the file is removed altogether, this approach will become a significant factor, along with further adoption of SSD storage. In fact, Seagate has recently introduced an actual network attached object based device called Kinetic Storage, which provides a hardware back end for object based databases without any file system protocol access. Furthermore, certain technology enablers are emerging which provide insight into how future storage could be improved at the low level. These include, for instance, various NVRAM (Non-Volatile RAM) solutions, in particular the memristor. For prototype development, an interesting solution is SanDisk’s UlltraDIMM SSD, which is SSD storage in the form of DIMM memory units that can physically replace a computer's RAM. As we can see, there already exist several approaches and basic technologies which can support the proposed research concept. However, none of the existing solutions addresses abandoning the concept of a file as a whole, including all its repercussions at the storage, operating system, application and user interface levels.
3 Filess Vision
Since its emergence, Cloud computing has become a leading paradigm in computing. The main reason for this is that users have found always-online resources to be much easier and more efficient to use. Here resources include computing services, web pages and data. This is also one of the reasons why Cloud storage services such as Dropbox or Google Drive are so popular: users are mobile, have multiple devices and need access to their content wherever they are. However, all these services have to be built on top of existing operating systems, since none of the mainstream operating systems supports such functionality. In fact, most operating systems provide no mechanisms for supporting such scenarios as over-the-network data access, process migration or checkpointing, which would enable developers to provide users with a truly seamless experience when using multiple devices, such as working on a single file using multiple applications on different devices simultaneously. Imagine, for instance, the creation of a simple conference presentation. It consists of slides with text, images, equations and sometimes embedded movies. Whenever an image needs to be updated, it has to be edited in a separate application, saved into a file and imported into the selected slide in the presentation editor. If the user wants to preview the presentation on her tablet, it needs to be transferred there manually using yet another application. In our vision all these applications would operate on a global data space, managed entirely by the Filess middleware. Thus an image changed in a photo editing application would be automatically updated to its new version in the presentation, and whenever the contents of the presentation were modified, the changes would be instantly visible in the presentation preview on the user's tablet. Then, when the presentation is ready, all the user needs in order to present it during the talk is the ID of the root node in the graph data model representing the presentation.
There are no files - not in the storage, middleware, operating system or user interface layers. Of course, at the prototype stage it would be very expensive to remove files completely from existing operating systems, which use files even for communication with hardware devices,
Documents, e-mails, images, movies, web pages and all other concepts which are in practice today synonyms for files are, in our architecture, only manifestations/renderings of interconnected groups of objects shown to the user in a context dependent way,
Data and meta data exist at the same level - for instance there is no difference between the Image object and the object describing its author or authorization policy - we do not plan to introduce a meta data mechanism such as Dublin Core or even Semantic Web,
Data and information replication should be controlled by the middleware - it is not necessary for users to copy and store the information for either security or efficiency reasons. As a consequence data redundancy can be optimized by the middleware,
The proposed approach inherently supports the ubiquitous computing paradigm, i.e. there are no "Load document" or "Save document" operations. It is possible to work on a laptop, then literally just shut it down and switch to a pocket PC or mobile phone, and all the changes will be seamlessly available there, assuming of course that network access is omnipresent,
Security, especially authorization, is intertwined within the global information space along with the information itself, i.e. security assertions (and any annotations for that matter) are first class objects in the infrastructure.
4 Filess Data Model
Number - this is a union data type which allows storing any numeric data type while providing users with a simple API, which handles actual data type identification at the library level,
String - this data object provides means for storing any text in UTF-8,
List - most graph data modeling frameworks do not provide lists or arrays, and modeling them using graph nodes can be very inefficient. This data object provides a simple means for composing a set of data objects into an ordered structure,
Binary - this data object provides means for storing large binary data such as videos, where the actual data is hashed and stored in a separate distributed key-value store,
Composite - composite data objects are objects which do not need to store any actual value in their node, but provide links to other data objects. Any object containing a value can also be a Composite object, in which case the value represents a flattened representation of the object's structure. This situation can occur during decomposition of an object into a graph,
Stream - these buffer objects provide an abstraction over the I/O functionality of the operating system. They cannot be transferred between nodes and are volatile, i.e. their state and value cannot be synchronized and no version information is maintained for them; only read or write operations are allowed. These objects enable complete removal of the file and filesystem concepts from application code.
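The data object types listed above could be sketched as follows. This is a minimal illustration rather than the Filess implementation: the class names, the use of SHA-256, and the in-memory dictionary `BLOB_STORE` standing in for the distributed key-value store are all assumptions. The volatile Stream type is omitted.

```python
import hashlib
from dataclasses import dataclass, field
from fractions import Fraction
from typing import Union

# Assumption: an in-memory dict stands in for the distributed key-value store.
BLOB_STORE = {}


@dataclass
class Number:
    value: Union[int, float, Fraction]    # one wrapper for any numeric type


@dataclass
class String:
    value: str

    def encoded(self) -> bytes:
        return self.value.encode("utf-8")  # text is serialised as UTF-8


@dataclass
class Binary:
    digest: str                            # only the hash is kept in the node

    @classmethod
    def store(cls, payload: bytes) -> "Binary":
        digest = hashlib.sha256(payload).hexdigest()
        BLOB_STORE[digest] = payload       # identical payloads are stored once
        return cls(digest)

    def load(self) -> bytes:
        return BLOB_STORE[self.digest]


@dataclass
class ListObject:
    items: list = field(default_factory=list)    # ordered members


@dataclass
class Composite:
    links: dict = field(default_factory=dict)    # edge label -> data object
    value: object = None                         # optional flattened form


# Illustrative usage: the same clip stored twice occupies the store only once.
clip_a = Binary.store(b"frame-data")
clip_b = Binary.store(b"frame-data")
doc = Composite(links={"title": String("Demo"), "length": Number(42), "clip": clip_a})
```

Keeping only the hash in the Binary node means the large payload can be located, deduplicated and distributed independently of the graph that refers to it.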
4.3 Object Composition and Decomposition
The most important operations on data objects from the point of view of the abstract model are composition and decomposition.
5 Representing Existing Data Structures and Formats in Filess
Typical data structures can be represented in hypergraphs in the following ways. Sets can be trivially created by creating a 1-N directed hyperedge. Lists can be created by linking consecutive nodes through hyperedges with an identical ID such as “_:next”. The actual property name is irrelevant: as long as the application wants to interpret a path as a list, it is allowed to. However, for performance reasons, a special type of node which allows the creation of ordered lists has been added. Maps are naturally represented by creating a hyperedge from a head node to any number of tail nodes.
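The three representations above can be sketched with a deliberately small edge store. This is one possible reading of the description, not the Filess implementation: the helper names, the "_:member" label, and the choice of one labelled edge per map key are assumptions made for illustration.

```python
# Each hyperedge is a tuple: (label, source node, tuple of target nodes).
edges = []


def add_edge(label, source, targets):
    edges.append((label, source, tuple(targets)))


def targets_of(source, label):
    """Targets of the first edge with this label leaving source, or ()."""
    for edge_label, edge_source, edge_targets in edges:
        if edge_label == label and edge_source == source:
            return edge_targets
    return ()


# Set: a single 1-N hyperedge from the set node to all of its members.
add_edge("_:member", "colours", ["red", "green", "blue"])

# List: consecutive nodes chained by edges that all carry the same label.
for current, following in [("a", "b"), ("b", "c")]:
    add_edge("_:next", current, [following])


def walk_list(first):
    """Follow the "_:next" chain and return the nodes in order."""
    items = [first]
    while targets_of(items[-1], "_:next"):
        items.append(targets_of(items[-1], "_:next")[0])
    return items


# Map: labelled edges from the map node to its value nodes.
add_edge("title", "doc", ["Filess"])
add_edge("pages", "doc", [12])


def as_dict(source):
    """Collect the non-structural edges leaving source as key-value pairs."""
    return {label: targets[0] for label, s, targets in edges
            if s == source and not label.startswith("_:")}
```

Walking a chained list costs one edge lookup per element, which hints at why a dedicated ordered list node is worthwhile for performance.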
JSON is a text format used to represent key-value pairs, where keys are always strings, and values can be any of the following types: Number, String, Boolean, Array, Object and null. These types map almost naturally onto the Filess data model. Boolean values can be modelled using the Number data object type, arrays by creating lists, and null values by using hyperedges with empty head sets. Object values can be directly represented using Composite data objects. One issue is that of namespaces, as the edges created from the JSON keys must be attached to some namespace in order to disambiguate them from other edges. JSON by default has no concept of a namespace, so it is up to the application to provide one.
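A hedged sketch of this JSON mapping follows. The node and edge layout, the function `load_json` and the example namespace URL are assumptions for illustration; a null member is modelled here as an edge with no target node, which is one interpretation of the empty-set hyperedge mentioned above.

```python
import json
from itertools import count

nodes = {}    # node id -> (kind, value)
edges = []    # (namespaced label, source id, tuple of target ids)
_ids = count()


def new_node(kind, value=None):
    node_id = next(_ids)
    nodes[node_id] = (kind, value)
    return node_id


def load_json(value, namespace):
    """Map a decoded JSON value to a node id, or None for JSON null."""
    if value is None:
        return None
    if isinstance(value, (bool, int, float)):
        # Booleans are folded into the Number type as 0 or 1.
        return new_node("Number", int(value) if isinstance(value, bool) else value)
    if isinstance(value, str):
        return new_node("String", value)
    if isinstance(value, list):
        return new_node("List", [load_json(item, namespace) for item in value])
    obj = new_node("Composite")
    for key, member in value.items():
        target = load_json(member, namespace)
        # The application-supplied namespace disambiguates the edge label.
        edges.append((namespace + key, obj, () if target is None else (target,)))
    return obj


text = '{"title": "Filess", "draft": true, "tags": ["fs"], "doi": null}'
root = load_json(json.loads(text), namespace="http://example.org/paper#")
```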
XML (eXtensible Markup Language) is a W3C recommendation defining a tree based model for representing structured data on the Internet. In contrast to JSON, it provides means for specifying unique namespaces for all elements, for ordering the nodes, and for assigning (unordered) attributes to nodes. The mapping of XML data into a directed hypergraph can be achieved as follows. All simple tags (containing only values) are converted to simple data objects. All complex tags, which contain child tags, are converted to composite data objects. All tag attributes are added to the respective data objects using edges.
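The XML mapping rules above might look as follows when applied with Python's standard `xml.etree.ElementTree` parser. The dictionary based node layout, the edge tuple shape and the "@" prefix for attribute edges are assumptions made for this sketch, not Filess conventions.

```python
import xml.etree.ElementTree as ET


def xml_to_graph(element, edges):
    """Return a data object for element and record its edges in `edges`.

    Each edge is (label, position, source object, target object); position
    preserves document order for children and is None for attributes.
    """
    children = list(element)
    if children:
        # Complex tag: a composite object linked to its children in order.
        node = {"type": "Composite", "tag": element.tag}
        for position, child in enumerate(children):
            edges.append((child.tag, position, node, xml_to_graph(child, edges)))
    else:
        # Simple tag: a plain data object holding the text value.
        node = {"type": "String", "tag": element.tag, "value": element.text or ""}
    # Attributes are unordered, so they become plain labelled edges.
    for name, value in element.attrib.items():
        edges.append(("@" + name, None, node, {"type": "String", "value": value}))
    return node


edges = []
source = '<paper lang="en"><title>Filess</title><pages>12</pages></paper>'
root = xml_to_graph(ET.fromstring(source), edges)
```

For namespaced documents, ElementTree already expands each tag to the form `{namespace-uri}local-name`, so the edge labels produced by this sketch would carry the namespace without extra work.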
The relational model can be represented using directed hypergraphs as follows, assuming that the database is at least in 3rd normal form. Each relation is composed of a set of value tuples, called rows. Each row is simply mapped to a single composite data object, with edges representing the columns and their particular values as target data objects. Each relation (i.e. table) can be represented as a set of data objects representing rows. More interesting is the case of foreign key dependencies. In the relational model it is impossible to create n:m relations directly. Consider the relations Author and Book, where a single book could have many authors and a single author could have published several books. In the relational model this requires the introduction of an intermediate relation (e.g. BookAuthors), which assigns authors to books. In a hypergraph this intermediate relation is not necessary (i.e. it is not necessary to create a new data object), as the relevant property can be modelled directly using hyperedges.
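The Author and Book example can be sketched as below. The sample rows, the "writtenBy" label and the helper names are invented for illustration; the point is only that the n:m relation ends up as hyperedges rather than as a third set of row objects.

```python
# Illustrative relational data, including the intermediate n:m table.
authors = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
books = [{"id": 10, "title": "Graphs"}, {"id": 11, "title": "Storage"}]
book_authors = [(10, 1), (10, 2), (11, 1)]    # (book id, author id)


def row_node(table, row):
    """Each row becomes one composite object, identified by table and key."""
    return (table, row["id"])


edges = []    # (label, source node, frozenset of targets)

# Column edges: one per column, pointing at the column value.
for table, rows in (("Author", authors), ("Book", books)):
    for row in rows:
        for column, value in row.items():
            edges.append((column, row_node(table, row), frozenset([value])))

# The n:m relation: one hyperedge per book, targeting all of its authors.
# No object is created for the rows of the intermediate table.
for book in books:
    writers = frozenset(("Author", a) for b, a in book_authors if b == book["id"])
    edges.append(("writtenBy", row_node("Book", book), writers))


def books_of(author_id):
    """Ids of all books whose "writtenBy" hyperedge reaches this author."""
    return sorted(source[1] for label, source, targets in edges
                  if label == "writtenBy" and ("Author", author_id) in targets)
```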
6 Prototype Design and Implementation
Session - These operations enable the user to log in to and log out of the system. Each session combines the user's key with the current machine ID, so that the same user can be logged in from multiple devices simultaneously and see the same state of affairs from all of them,
Get - This category of operations allows searching for and accessing data objects. Currently the search is limited to node GUIDs, as the Filess layer aims to be agnostic of the actual graph database backend; development of an abstract query language for this purpose is ongoing work,
Put - These operations enable adding new data objects and relating them to other objects,
Join - These operations enable composing existing objects into more complex objects,
Split - These operations enable decomposing existing binary or text objects into graph form,
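To make the five operation groups above concrete, here is a small in-memory stand-in. It is not the Filess API: the class `FilessSketch`, its method signatures, the "part" label and the line based splitting rule are all hypothetical choices made for this illustration.

```python
import uuid


class FilessSketch:
    """In-memory stand-in for the five operation groups; not the real API."""

    def __init__(self):
        self.objects = {}        # GUID -> value
        self.links = []          # (source GUID, label, target GUID)
        self.sessions = set()    # (user key, machine id)

    def login(self, user_key, machine_id):
        # The same user key may be active on several machines at once.
        self.sessions.add((user_key, machine_id))

    def logout(self, user_key, machine_id):
        self.sessions.discard((user_key, machine_id))

    def put(self, value, related_to=None, label=None):
        """Add a new object and optionally relate it to an existing one."""
        guid = str(uuid.uuid4())
        self.objects[guid] = value
        if related_to is not None:
            self.links.append((related_to, label, guid))
        return guid

    def get(self, guid):
        """Look an object up by its GUID, the only search currently described."""
        return self.objects[guid]

    def join(self, label, part_guids):
        """Compose existing objects under a new, valueless composite object."""
        whole = self.put(None)
        self.links.extend((whole, label, part) for part in part_guids)
        return whole

    def split(self, guid, separator="\n"):
        """Decompose a text object into one linked object per piece."""
        return [self.put(piece, related_to=guid, label="part")
                for piece in self.objects[guid].split(separator)]


fs = FilessSketch()
fs.login("alice-key", "laptop")
fs.login("alice-key", "tablet")
note = fs.put("first line\nsecond line")
parts = fs.split(note)
summary = fs.join("contains", parts)
```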
In order to enable evaluation of the idea, a Filess prototype has been developed using available technologies in the area of graph databases. We evaluated several solutions, including [11, 15]. Finally we chose OrientDB, which is a multi-model database enabling modeling using the document, key-value and graph paradigms simultaneously. In order to support legacy applications, an intermediate FUSE filesystem plugin was implemented, which allows applications to access the information in the form of files that are composed on demand from the underlying graph when applications try to access the data object. The implementation is based on the fuse-jna Java FUSE provider, which allowed us to use the direct OrientDB Java bindings. Due to the very flexible graph model in OrientDB, it was possible to create a hypergraph structure by defining a custom edge class. Binary, read-only data objects are stored in a separate distributed database called IPFS (Interplanetary File System), which provides efficient hashing and distribution of large binary files between multiple nodes by dividing them into blocks and maintaining a tree structure based on the blocks' hash values.
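The block based storage idea mentioned above can be illustrated with a simplified, single-level hash tree. This sketch does not use or reproduce the IPFS API or its on-disk format; the function names, the tiny block size and the use of SHA-256 are assumptions chosen to keep the example short.

```python
import hashlib


def digest(payload):
    return hashlib.sha256(payload).hexdigest()


def store(data, blocks, block_size=4):
    """Store data as hashed blocks and return the root hash of the whole."""
    chunks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    leaves = [digest(chunk) for chunk in chunks]
    for leaf, chunk in zip(leaves, chunks):
        blocks[leaf] = chunk              # identical blocks are stored only once
    # The root is the hash of its children's hashes; it lists them in order.
    root = digest("".join(leaves).encode("ascii"))
    blocks[root] = leaves
    return root


def load(root, blocks):
    """Reassemble the original bytes by following the root's child hashes."""
    return b"".join(blocks[leaf] for leaf in blocks[root])


blocks = {}
first = store(b"aaaabbbbaaaa", blocks)     # three chunks, two of them identical
count_after_first = len(blocks)
second = store(b"aaaabbbb", blocks)        # reuses both existing leaf blocks
```

Because blocks are addressed by the hash of their contents, the second object adds only a new root entry to the store.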
In this paper we have presented a novel approach to data management and representation in information systems, which departs from filesystem based designs. The filesystem approach has already become intractable for average users for several reasons, such as the difficulty of searching for required files or the lack of OS level synchronization of data between the devices used to access the system. The presented approach addresses these problems and has the potential to enable much more natural access to information while minimizing redundancy and data transfer on a global scale. At the same time it allows for highly fine grained access control, based not on files but on actual data elements, which will enable the creation of much more sophisticated and natural computing infrastructures able to handle information processing tasks on a global scale. The presented approach requires both users and application developers to shift the paradigm in which applications are developed and used. Future work will include the design of a security layer enabling fine grained control over the operations performed by various users on such a global data model, practical evaluation of performance depending on the underlying storage solution, and development of a minimum viable prototype of a truly file-less operating system.
This research has been funded by Polish National Science Centre grant File-less architecture of large scale distributed information systems number: DEC-2012/05/N/ST6/03463.
- 1. Bandulet, C.: Object-based storage devices (2007). http://developers.sun.com/solaris/articles/osd.html
- 4. Drago, I., Mellia, M., Munafo, M.M., Sperotto, A., Sadre, R., Pras, A.: Inside dropbox: understanding personal cloud storage services. In: Proceedings of the 2012 ACM Conference on Internet Measurement Conference, IMC 2012, pp. 481–494. ACM, New York (2012)
- 7. Gantz, J., Reinsel, D.: The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East. International Data Corporation, December 2010. https://www.emc.com/collateral/analyst-reports/idc-digital-universe-united-states.pdf
- 8. Gifford, D.K., Jouvelot, P., Sheldon, M.A., O’Toole, Jr., J.W.: Semantic file systems. SIGOPS Oper. Syst. Rev. 25(5), 16–25 (1991)
- 9. Grimes, R.: Code name WinFS: Revolutionary file storage system lets users search and manage files based on content. MSDN Magazine 19(1) (2004). http://msdn.microsoft.com/msdnmag/issues/04/01/WinFS/
- 10. IDC iView: The Digital Universe Decade - Are You Ready? International Data Corporation, Framingham, MA, USA (2010). http://www.emc.com/digital_universe
- 12. Kryza, B., Kitowski, J.: Comparison of information representation formalisms for scalable file agnostic information infrastructures. Comput. Inf. 34, 473–494 (2015)
- 13. Kryza, B., Kitowski, J.: Filess - file-less architecture for future information systems. In: 2014 IEEE Fourth International Conference on Big Data and Cloud Computing, BDCloud 2014, Sydney, Australia, 3–5 December 2014, pp. 281–282 (2014)
- 14. Mylka, A., Mylka, A., Kryza, B., Kitowski, J.: Integration of heterogeneous data sources in an ontological knowledge base. Comput. Inf. 31(1), 189–223 (2012)
- 15. Orient Technologies: OrientDB project website. http://www.orientechnologies.com
- 16. Padioleau, Y., Ridoux, O.: A logic file system. In: Proceedings of the General Track: 2003 USENIX Annual Technical Conference, San Antonio, Texas, USA, 9–14 June 2003, pp. 99–112 (2003)
- 17. Rajimwale, A., Prabhakaran, V., Davis, J.D.: Block management in solid-state devices. In: Proceedings of the 2009 Conference on USENIX Annual Technical Conference, USENIX 2009, pp. 21–21. USENIX Association, Berkeley (2009)
- 18. Reiser, H.: Future vision of reiserfs (2006). https://reiser4.wiki.kernel.org/index.php/Future_Vision
- 20. SanDisk: Ulltradimm product page. http://www.sandisk.com/enterprise/ulltradimm-ssd/
- 21. Seagate Technology LLC: The seagate kinetic open storage vision. Seagate Technology LLC (2013). http://www.seagate.com/tech-insights/kinetic-vision-how-seagate-new-developertools-meets-the-needs-of-cloud-storage-platforms-master-ti/
- 22. Stender, J., Hogqvist, M., Kolbeck, B.: Loosely time-synchronized snapshots in object-based file systems. In: IPCCC, pp. 188–197. IEEE (2010)
- 23. Wang, F., Brandt, S.A., Miller, E.L., Long, D.D.E.: OBFS: a file system for object-based storage devices. In: Proceedings of the 21st IEEE/12TH NASA Goddard Conference on Mass Storage Systems and Technologies, College Park, MD, pp. 283–300 (2004)