File-based storage 

File-based storage is the most common storage model, and offers advantages over other storage models in ease of use and simplicity. It extends the traditional filesystem architecture, maintaining data as files in a hierarchical structure. Accordingly, this type of storage uses hierarchically arranged file paths as entry points for accessing data in physical storage. Big data platforms commonly use distributed file systems (DFS) as their basic storage layer. To access a saved file in file-based storage, users need to know its namespace and path. For cross-system file sharing, the path or namespace of a file has three main parts: the protocol, the domain name, and the path of the file. NFS-Family and HDFS are two common file-based storage models, defined as follows:
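As an illustration of those three parts, a cross-system file address (the hostname and path below are hypothetical examples) can be split with Python's standard `urllib.parse` module:

```python
from urllib.parse import urlparse

# Hypothetical cross-system file address: protocol + domain name + file path
uri = "hdfs://namenode.example.com:9000/user/logs/app.log"

parts = urlparse(uri)
print(parts.scheme)   # protocol used to reach the remote system -> "hdfs"
print(parts.netloc)   # domain name (and port) of the host -> "namenode.example.com:9000"
print(parts.path)     # hierarchical path of the file on that host -> "/user/logs/app.log"
```

The same decomposition applies to other protocols, such as `nfs://` or `file://` addresses.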

  • NFS-Family: This is a distributed filesystem protocol, originally developed by Sun Microsystems. In this system, a network file system (NFS) enables remote hosts to mount filesystems over a network and interact with them as though they were mounted locally. NFS has been widely used in Unix and Linux-based operating systems, and has influenced the development of modern distributed filesystems. 
  • HDFS: The Hadoop Distributed File System (HDFS) [14] is an open-source distributed filesystem, written in Java, that was created and optimized for processing large data volumes and for high availability. It spreads data across the local storage of a cluster consisting of many server nodes. As the open-source counterpart of the Google File System (GFS), it works as the core storage for Hadoop ecosystems and the majority of existing big data platforms. HDFS was designed for fault detection and recovery, and has the ability to handle huge datasets. 
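HDFS achieves scale and availability by splitting each file into fixed-size blocks and replicating every block on several DataNodes. The sketch below is purely illustrative, not HDFS code; the tiny block size and node names are assumptions chosen for demonstration (real HDFS defaults are 128 MB blocks and 3 replicas):

```python
# Illustrative sketch of HDFS-style block storage (not actual HDFS code).
# BLOCK_SIZE and the node names are assumed values for demonstration only.
BLOCK_SIZE = 8
REPLICATION = 3

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split a file's bytes into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, nodes, replication: int = REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin."""
    placement = {}
    for b in range(len(blocks)):
        placement[b] = [nodes[(b + r) % len(nodes)] for r in range(replication)]
    return placement

data = b"a large file stored across the cluster"
blocks = split_into_blocks(data)
nodes = ["node1", "node2", "node3", "node4"]
placement = place_replicas(blocks, nodes)

# Reading the file back means fetching the blocks in order from any replica.
assert b"".join(blocks) == data
```

Replicating each block on multiple nodes is what lets HDFS recover from the failure of individual servers: as long as one replica of every block survives, the file remains readable.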