Mounting Alluxio with FUSE (Beta)
- Operational assumptions and status
- Performance considerations
- Configuration Parameters For Alluxio-FUSE
Alluxio-FUSE is a new experimental feature that allows you to mount a distributed Alluxio file system within the local file system hierarchy of a Linux node. Using this feature, standard tools (for example, echo) and legacy POSIX applications get basic access to the distributed Alluxio data store.
Given the intrinsic characteristics of Alluxio, such as its write-once/read-many-times file data model, the mounted file system does not have full POSIX semantics and has specific limitations. Please read the rest of this document before using this feature to understand what it can and cannot do for you.
- Linux kernel 2.6.9 or newer
- JDK 1.8 or newer
- libfuse 2.9.3 or newer (2.8.3 has been reported to also work - with some warnings)
alluxio-fuse is only built with Alluxio when the fuse Maven profile is active. Maven activates this profile automatically when it detects that you are building Alluxio with JDK 8 or newer.
For compatibility with Java 7, binary Alluxio distributions may ship without alluxio-fuse support, so you will need to build your own Alluxio if you want to use alluxio-fuse in your deployment.
The best way to do so is either to clone the Alluxio GitHub repository and check out your preferred branch, or to grab a source distribution directly. Please refer to this page for building instructions.
After you have configured and started the Alluxio cluster, open a shell on the node where you wish to mount Alluxio, change to your $ALLUXIO_HOME and run:
$ integration/fuse/bin/alluxio-fuse.sh mount <mount_point>
This will spawn a background user-space Java process (alluxio-fuse) that will mount the file system on the specified <mount_point>. Note that <mount_point> must be an existing and empty path in your local file system hierarchy, and that the user who runs the alluxio-fuse.sh script must own the mount point and have read and write permissions on it. Also note that, currently, you are limited to one Alluxio-FUSE mount per node.
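The mount point requirements above can be checked before mounting. The following is a minimal sketch, not part of Alluxio: `check_mount_point` is a hypothetical helper name, and it assumes the mount point is meant to be a directory.

```shell
# Hypothetical helper (not shipped with Alluxio): verify the documented
# preconditions on a mount point before calling alluxio-fuse.sh.
check_mount_point() {
  dir=$1
  # Must be an existing directory.
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
  # Must be empty (ls -A also lists hidden entries).
  [ -z "$(ls -A "$dir")" ] || { echo "not empty: $dir" >&2; return 1; }
  # Must be owned by the current user (-O is widely supported by
  # sh implementations, although it is not strictly POSIX).
  [ -O "$dir" ] || { echo "not owned by current user: $dir" >&2; return 1; }
  # Must be readable and writable by the current user.
  [ -r "$dir" ] && [ -w "$dir" ] || { echo "no read/write access: $dir" >&2; return 1; }
}
```

It could be used as a guard, for example `check_mount_point /mnt/alluxio && integration/fuse/bin/alluxio-fuse.sh mount /mnt/alluxio`.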
To unmount a previously mounted Alluxio-FUSE file system, open a shell on the node where the file system is mounted, change to your $ALLUXIO_HOME and run:
$ integration/fuse/bin/alluxio-fuse.sh umount
This will stop the background alluxio-fuse java process and unmount the file system.
Check if Alluxio-FUSE is running
$ integration/fuse/bin/alluxio-fuse.sh stat
Optional configuration steps
Alluxio-FUSE uses the standard Java alluxio-core-client to perform its operations. You might want to customize the behaviour of the Alluxio client used by Alluxio-FUSE the same way you would for any other client application.
One possibility, for example, is to edit
$ALLUXIO_HOME/integration/fuse/bin/alluxio-fuse.sh and add your specific alluxio client options in the
Operational assumptions and status
Currently, most basic file system operations are supported. However, due to Alluxio's intrinsic characteristics, please be aware that:
- Files can be written only once, only sequentially, and never modified.
- Due to the above, any further access to a file must be read-only.
This translates into the following constraints on the UNIX system calls that operate on the file system:
open(const char* pathname, int flags, mode_t mode) (see also man 2 open)
If pathname indicates the path of a non-existing regular file in Alluxio, then an open will only succeed if:
- The base directory of pathname exists in Alluxio;
- O_WRONLY are passed among the
creat(const char* pathname) calls will succeed as long as (1) holds and pathname does not exist yet.
If pathname, instead, points to an existing regular file in Alluxio, then an open call will only succeed if:
- O_RDONLY is passed among the
Note that, in either case, the mode parameter is currently ignored by Alluxio-FUSE.
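The write-once rule can also be enforced on the client side, before a tool ever calls open. The sketch below uses a hypothetical helper, `write_once`, that copies standard input to a new file in a single sequential pass and refuses to touch an existing one. Which open flags a given shell passes for `>` and whether Alluxio-FUSE accepts them is not confirmed by this document, so treat this as an illustration of the constraints rather than a guaranteed recipe.

```shell
# Hypothetical helper: write stdin to a file exactly once.
write_once() {
  target=$1
  # Creating a file requires its parent directory to exist already.
  [ -d "$(dirname "$target")" ] || { echo "missing parent directory: $target" >&2; return 1; }
  # Existing files are read-only from now on, so refuse to write again.
  [ ! -e "$target" ] || { echo "already exists: $target" >&2; return 1; }
  # A single redirection writes the data sequentially, start to end.
  cat > "$target"
}
```

For example, `echo "hello" | write_once <mount_point>/greeting.txt` succeeds the first time, and a second call with the same path returns an error without modifying the file.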
read(int fd, void* buf, size_t count) (see also man 2 read)
A read system call will only succeed when fd refers to an Alluxio file that has been previously opened with the
lseek(int fd, off_t off, int whence) (see also man 2 lseek)
Seeking is supported only on files open for reading, i.e., on files that have been opened with an
write(int fd, const void* buf, size_t count) (see also man 2 write)
A write system call will only succeed when fd refers to an Alluxio file that has been previously opened with the
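Since seeking is only supported on files open for reading, random access is limited to reads. As a rough illustration, the hypothetical helper below, `read_range`, reads a byte range from a file that is opened for input only. It uses dd as a stand-in: dd implementations typically seek on regular files when given `skip`, but that is an assumption about the tool and not something stated in this document.

```shell
# Hypothetical helper: print COUNT bytes of FILE starting at byte OFFSET.
read_range() {
  file=$1
  offset=$2
  count=$3
  # bs=1 makes skip and count byte-granular; this is simple but slow,
  # so use a larger block size for big ranges.
  dd if="$file" bs=1 skip="$offset" count="$count" 2>/dev/null
}
```

For example, `read_range <mount_point>/greeting.txt 6 5` prints the five bytes that start at offset 6.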
Performance considerations
Due to the combined use of FUSE and JNR, the performance of the mounted file system is expected to be considerably worse than what you would see by using the alluxio-core-client directly. In other words, if you are concerned about performance rather than functionality, then Alluxio-FUSE is not what you are looking for.
Most of the overhead comes from the several memory copies made for each call on write operations, and from the fact that FUSE caps the size of a single write at 128KB. This could probably be improved to a large extent by leveraging the FUSE write-back cache feature introduced in kernel 3.15 (not yet supported, however, by the libfuse 2.x userspace libraries).
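To get a feel for the cost of the 128KB cap, the arithmetic can be sketched as follows. `write_upcalls` is a hypothetical helper that computes the minimum number of write calls needed for a file of a given size, assuming every call carries the maximum payload (131072 bytes unless overridden); real workloads may issue smaller writes and therefore more calls.

```shell
# Hypothetical helper: minimum number of FUSE write calls for SIZE bytes,
# given a maximum write size (default 131072 bytes = 128KB).
write_upcalls() {
  size=$1
  max=${2:-131072}
  # Integer ceiling division: ceil(size / max).
  echo $(( (size + max - 1) / max ))
}
```

For example, `write_upcalls 1073741824` prints 8192, so writing a 1 GiB file takes at least 8192 write calls, each with its own memory copies.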
Configuration Parameters For Alluxio-FUSE
These are the configuration parameters for Alluxio-FUSE.
|Parameter|Default Value|Description|
|---|---|---|
|alluxio.fuse.maxwrite.bytes|131072|The desired granularity of FUSE write upcalls in bytes. Note that 128K is currently an upper bound imposed by the Linux kernel.|
|alluxio.fuse.debug.enabled|false|Enable FUSE debug output. This output will be redirected to a `fuse.out` log file inside `alluxio.logs.dir`.|
|alluxio.fuse.cachedpaths.max|500|Defines the size of the internal Alluxio-FUSE cache that maintains the most frequently used translations between local file system paths and Alluxio file URIs.|
|alluxio.fuse.mount.default|/mnt/alluxio|Default mount point, only used when the user does not specify something else at mount time.|
|alluxio.fuse.fs.root|/mnt|Path, within the Alluxio namespace, that will be used as the root of the FUSE mount.|
|alluxio.fuse.fs.name|alluxio-fuse|Descriptive name used by FUSE to mount the file system.|
This project uses jnr-fuse for FUSE on Java.