Configuration Settings

This page explains the configuration system of Alluxio and provides recommendations on how to customize Alluxio's configuration in different contexts.

Configuration in Alluxio

Alluxio runtime respects three sources of configuration settings:

  1. Application settings. Setting Alluxio configuration in this way is application-specific and must be repeated for each application instance (e.g., a Spark job).
  2. Environment variables. This is an easy and fast way to set basic properties for managing Alluxio servers and running Alluxio shell commands. Note that configuration set through environment variables is not picked up by applications.
  3. Property files. This is the general approach for customizing any supported Alluxio configuration property. Configuration in these files is respected by Alluxio servers as well as applications.

Property values are loaded with the following priority, from highest to lowest: application settings (if any), environment variables, property files, and the defaults.
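The precedence above can be sketched as a simple first-match lookup. This is a hypothetical illustration, not Alluxio code; resolve_property and the dictionaries standing in for each source are invented for this example:

```python
# Illustrative sketch of Alluxio's configuration precedence (hypothetical
# helper, not part of Alluxio): each source is checked from highest to
# lowest priority, and the first source defining the key wins.

def resolve_property(key, application=None, environment=None,
                     property_file=None, defaults=None):
    """Return the value for `key`, honoring the precedence:
    application settings > environment variables > property files > defaults."""
    for source in (application or {}, environment or {},
                   property_file or {}, defaults or {}):
        if key in source:
            return source[key]
    raise KeyError(f"no value configured for {key}")

# Example: the property file sets MUST_CACHE, but an application-level
# -Dkey=value override takes precedence.
value = resolve_property(
    "alluxio.user.file.writetype.default",
    application={"alluxio.user.file.writetype.default": "CACHE_THROUGH"},
    property_file={"alluxio.user.file.writetype.default": "MUST_CACHE"},
    defaults={"alluxio.user.file.writetype.default": "MUST_CACHE"},
)
print(value)  # CACHE_THROUGH
```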

Application settings

Alluxio shell users can pass -Dkey=value to specify an Alluxio configuration value on the command line. For example,

$ bin/alluxio fs -Dalluxio.user.file.writetype.default=MUST_CACHE touch /foo

Spark users can add "-Dkey=value" to ${SPARK_DAEMON_JAVA_OPTS} in conf/spark-env.sh, or add it to spark.executor.extraJavaOptions (for Spark executors) and spark.driver.extraJavaOptions (for Spark drivers).
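For instance, a conf/spark-env.sh fragment might look like the following (the property and value shown are examples only):

```shell
# conf/spark-env.sh fragment: pass an Alluxio property to Spark daemons
# through their JVM options (example value, adjust to your needs).
SPARK_DAEMON_JAVA_OPTS="${SPARK_DAEMON_JAVA_OPTS} -Dalluxio.user.file.writetype.default=MUST_CACHE"
```

Per job, the same "-Dkey=value" string can instead be supplied through spark-submit with --conf "spark.executor.extraJavaOptions=..." and --conf "spark.driver.extraJavaOptions=...".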

Hadoop MapReduce users can set "-Dkey=value" in the hadoop jar command line to pass it down to Alluxio:

$ hadoop jar -Dalluxio.user.file.writetype.default=MUST_CACHE foo.jar

Note that setting Alluxio configuration in this way is application-specific and must be repeated for each job or command.

Environment variables

To start Alluxio server processes or use the Alluxio command line interface with specific configuration tuning, it is often fastest and easiest to set environment variables. However, these environment variables do not affect application processes such as Spark or MapReduce jobs that use Alluxio as a client.

Alluxio supports a few basic and very frequently used configuration properties via the environment variables in conf/alluxio-env.sh, including:

Environment Variable | Meaning
ALLUXIO_MASTER_HOSTNAME | Hostname of the Alluxio master; defaults to localhost.
ALLUXIO_MASTER_ADDRESS | Deprecated in favor of ALLUXIO_MASTER_HOSTNAME since version 1.1; will be removed in version 2.0.
ALLUXIO_UNDERFS_ADDRESS | Under storage system address; defaults to ${ALLUXIO_HOME}/underFSStorage, which is on the local file system.
ALLUXIO_RAM_FOLDER | Directory where a worker stores in-memory data; defaults to /mnt/ramdisk.
ALLUXIO_JAVA_OPTS | Java VM options for Master, Worker, and Alluxio shell configuration. Note that ALLUXIO_JAVA_OPTS is included by default in ALLUXIO_MASTER_JAVA_OPTS, ALLUXIO_WORKER_JAVA_OPTS, and ALLUXIO_USER_JAVA_OPTS.
ALLUXIO_MASTER_JAVA_OPTS | Additional Java VM options for Master configuration.
ALLUXIO_WORKER_JAVA_OPTS | Additional Java VM options for Worker configuration.
ALLUXIO_USER_JAVA_OPTS | Additional Java VM options for Alluxio shell configuration.

For example, if you would like to set up an Alluxio master at localhost that talks to an HDFS cluster with a namenode also running at localhost, and to enable Java remote debugging on port 7001, you can do so before starting the master process using:

$ export ALLUXIO_MASTER_HOSTNAME="localhost"
$ export ALLUXIO_UNDERFS_ADDRESS="hdfs://localhost:9000"
$ export ALLUXIO_MASTER_JAVA_OPTS="$ALLUXIO_JAVA_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=7001"

Users can set these variables either in the shell or in conf/alluxio-env.sh. If this file does not exist yet, Alluxio can help you bootstrap it by running

$ ./bin/alluxio bootstrapConf <ALLUXIO_MASTER_HOSTNAME> [local|hdfs|s3|gcs|glusterfs|swift]

Alternatively, you can create it from the template provided in the source code:

$ cp conf/alluxio-env.sh.template conf/alluxio-env.sh

Property files

The Alluxio site property file, alluxio-site.properties, can override Alluxio configuration regardless of whether the JVM is an Alluxio server process or an application using the Alluxio client. For the site property file to be loaded, either the parent directory of the file must be on the classpath of the target JVM process, or the file must be located in one of the pre-defined paths.

Using the Alluxio-supported environment variables has two limitations: first, they cover only basic Alluxio settings; second, they do not affect non-Alluxio JVMs such as Spark or MapReduce. To address both, Alluxio reads the site property file alluxio-site.properties, which lets users customize all supported configuration properties regardless of the JVM process. On startup, the Alluxio runtime checks whether the site property file exists and, if so, uses its content to override the default configuration. Specifically, it searches for alluxio-site.properties in ${HOME}/.alluxio/, /etc/alluxio/ (customizable by changing the default value of alluxio.site.conf.dir), and the classpath of the relevant JVM process, in that order, and stops at the first file found.

For example, ${ALLUXIO_HOME}/conf/ is on the classpath of the Alluxio master, worker, and shell JVM processes by default, so you can simply create ${ALLUXIO_HOME}/conf/alluxio-site.properties by

$ cp conf/alluxio-site.properties.template conf/alluxio-site.properties

Then customize it to fit your configuration tuning needs before starting Alluxio servers or using Alluxio shell commands.
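As an illustration, a minimal conf/alluxio-site.properties might look like the following (the values are examples only, using properties documented in the Appendix below):

```properties
# Example site properties (illustrative values, not recommendations)
alluxio.master.hostname=localhost
alluxio.underfs.address=hdfs://localhost:9000
alluxio.user.file.writetype.default=CACHE_THROUGH
```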

For applications such as Spark or MapReduce to read the Alluxio site property file, append the directory containing it to the application classpath. For example

$ export SPARK_CLASSPATH=${ALLUXIO_HOME}/conf:${SPARK_CLASSPATH} # for Spark jobs
$ export HADOOP_CLASSPATH=${ALLUXIO_HOME}/conf:${HADOOP_CLASSPATH} # for Hadoop jobs

Alternatively, with access to paths like /etc/, you can copy the site property file to /etc/alluxio/. This configuration is then shared across processes regardless of whether the JVM is an Alluxio server or an application using the Alluxio client.

Appendix

All Alluxio configuration properties fall into one of six categories: Common (shared by Master and Worker), Master specific, Worker specific, User specific, Cluster specific (used when running Alluxio with cluster managers such as Mesos and YARN), and Security specific (shared by Master, Worker, and User).

Common Configuration

The common configuration contains constants shared by different components.

Property Name | Default | Meaning
alluxio.conf.dir ${alluxio.home}/conf
alluxio.debug false Set to true to enable debug mode which has additional logging and info in the Web UI.
alluxio.home /mnt/alluxio_default_home Alluxio installation directory.
alluxio.logs.dir ${alluxio.home}/logs The path to store log files.
alluxio.keyvalue.enabled false Whether the key-value service is enabled.
alluxio.keyvalue.partition.size.bytes.max 512MB Maximum allowable size (in bytes) of a single key-value partition in a store. This value should be no larger than the block size (alluxio.user.block.size.bytes.default)
alluxio.metrics.conf.file ${alluxio.conf.dir}/metrics.properties The file path of the metrics system configuration file. By default it is `metrics.properties` in the `conf` directory.
alluxio.network.host.resolution.timeout.ms 5000 During startup of the master and worker processes, Alluxio needs to ensure that they are listening on externally resolvable and reachable hostnames. To do this, Alluxio automatically attempts to select an appropriate hostname if one was not explicitly specified. This value represents the maximum amount of time spent waiting to determine whether a candidate hostname is resolvable over the network.
alluxio.network.thrift.frame.size.bytes.max 16MB (Experimental) The largest allowable frame size used for Thrift RPC communication.
alluxio.site.conf.dir ${user.home}/.alluxio/,/etc/alluxio/ Default search path for configuration files to read
alluxio.test.mode false Flag used only during tests to allow special behavior.
alluxio.underfs.address ${alluxio.work.dir}/underFSStorage Alluxio directory in the under file system.
alluxio.underfs.gcs.owner.id.to.username.mapping No default Optionally, specify a preset gcs owner id to Alluxio username static mapping in the format "id1=user1;id2=user2". The Google Cloud Storage IDs can be found at the console address https://console.cloud.google.com/storage/settings . Please use the "Owners" one.
alluxio.underfs.glusterfs.impl org.apache.hadoop.fs.glusterfs.GlusterFileSystem The GlusterFS implementation class for Hadoop integration.
alluxio.underfs.glusterfs.mapred.system.dir glusterfs:///mapred/system Optionally, specify a subdirectory under GlusterFS for intermediate MapReduce data.
alluxio.underfs.hdfs.configuration ${alluxio.conf.dir}/core-site.xml Location of the hdfs configuration file.
alluxio.underfs.hdfs.impl org.apache.hadoop.hdfs.DistributedFileSystem The implementation class of HDFS as the under storage system.
alluxio.underfs.hdfs.prefixes hdfs://,glusterfs:///,maprfs:/// Optionally, specify which prefixes should run through the Apache Hadoop implementation of UnderFileSystem. The delimiter is any whitespace and/or ','.
alluxio.underfs.hdfs.remote false Boolean indicating whether or not the under storage worker nodes are remote with respect to Alluxio worker nodes. If set to true, Alluxio will not attempt to discover locality information from the under storage because locality is impossible. This will improve performance. The default value is false.
alluxio.underfs.listing.length 1000 The maximum number of directory entries to list in a single query to under file system. If the total number of entries is greater than the specified length, multiple queries will be issued.
alluxio.underfs.object.store.mount.shared.publicly false Whether or not to share object storage under storage system mounted point with all Alluxio users. Note that this configuration has no effect on HDFS nor local UFS. The default value is false.
alluxio.underfs.s3.owner.id.to.username.mapping No default Optionally, specify a preset s3 canonical id to Alluxio username static mapping, in the format "id1=user1;id2=user2". The AWS S3 canonical ID can be found at the console address https://console.aws.amazon.com/iam/home?#security_credential . Please expand the "Account Identifiers" tab and refer to "Canonical User ID".
alluxio.underfs.s3.endpoint No default Optionally, to reduce data latency or to access resources located in a different AWS region, specify a regional endpoint for AWS requests. An endpoint is a URL that serves as the entry point for a web service. For example, s3.cn-north-1.amazonaws.com.cn is the entry point for the Amazon S3 service in the Beijing region.
alluxio.underfs.s3.proxy.host No default Optionally, specify a proxy host for communicating with S3.
alluxio.underfs.s3.proxy.https.only true If using a proxy to communicate with S3, determine whether to talk to the proxy using https.
alluxio.underfs.s3.proxy.port No default Optionally, specify a proxy port for communicating with S3.
alluxio.underfs.s3.threads.max 40 The maximum number of threads to use for communicating with S3 and the maximum number of concurrent connections to S3. Includes both threads for data upload and metadata operations. This number should be at least as large as the max admin threads plus max upload threads. The default is 40 which is the sum of the default admin and upload thread pool sizes.
alluxio.underfs.s3.admin.threads.max 20 The maximum number of threads to use for metadata operations when communicating with S3. These operations may be fairly concurrent and frequent but should not take much time to process. The default is 20.
alluxio.underfs.s3.upload.threads.max 20 The maximum number of threads to use for uploading data to S3 for multipart uploads. These operations can be fairly expensive, so multiple threads are encouraged. However, this also splits the bandwidth between threads, meaning the overall latency for completing an upload will be higher for more threads. The default value is 20.
alluxio.underfs.s3.disable.dns.buckets false Optionally, specify to make all S3 requests path style. The default value is false.
alluxio.underfs.s3a.consistency.timeout.ms 60000 The duration to wait for metadata consistency from the under storage. This is only used by internal Alluxio operations which should be successful, but may appear unsuccessful due to eventual consistency. The default value is 60000 milliseconds (1 minute).
alluxio.underfs.s3a.request.timeout.ms 60000 The timeout for a single request to S3. Infinity if set to 0. Setting this property to a non-zero value can improve performance by avoiding the long tail of requests to S3. For very slow connections to S3, consider increasing this value or setting it to 0. The default value is 60000 milliseconds (1 minute).
alluxio.underfs.s3a.secure.http.enabled false Whether or not to use HTTPS protocol when communicating with s3. The default value is false.
alluxio.underfs.s3a.server.side.encryption.enabled false Whether or not to encrypt data stored in s3. The default value is false.
alluxio.underfs.s3a.socket.timeout.ms 50000 Length of the socket timeout when communicating with s3. The default value is 50000.
alluxio.underfs.s3a.inherit_acl true Set to false to disable inheriting bucket ACLs on objects. The default value is true.
alluxio.web.resources ${alluxio.home}/core/server/src/main/webapp Path to the web application resources.
alluxio.web.threads 1 How many threads to use for the web server.
alluxio.work.dir ${alluxio.home}
alluxio.zookeeper.address No default Address of ZooKeeper
alluxio.zookeeper.election.path /election Election directory in ZooKeeper.
alluxio.zookeeper.enabled false If true, setup master fault tolerant mode using ZooKeeper.
alluxio.zookeeper.leader.path /leader Leader directory in ZooKeeper.
alluxio.zookeeper.leader.inquiry.retry 10 The number of retries to inquire leader from ZooKeeper.

Master Configuration

The master configuration specifies information regarding the master node, such as the address and the port number.

Property Name | Default | Meaning
alluxio.master.bind.host 0.0.0.0 The hostname that Alluxio master binds to. See multi-homed networks
alluxio.master.heartbeat.interval.ms 1000 The interval (in milliseconds) between Alluxio master's heartbeats
alluxio.master.hostname localhost The hostname of Alluxio master.
alluxio.master.file.async.persist.handler alluxio.master.file.async.DefaultAsyncPersistHandler The handler for processing the async persistence requests.
alluxio.master.format.file_prefix "_format_" The file prefix of the file generated in the journal directory when the journal is formatted. The master searches for a file with this prefix when determining whether the journal has been formatted.
alluxio.master.journal.flush.batch.time.ms 5 Time (in milliseconds) to wait for batching journal writes.
alluxio.master.journal.flush.timeout.ms 300000 The amount of time (in milliseconds) to keep retrying journal writes before giving up and shutting down the master.
alluxio.master.journal.folder ${alluxio.work.dir}/journal The path to store master journal logs.
alluxio.master.journal.formatter.class alluxio.master.journal.ProtoBufJournalFormatter The class to serialize the journal in a specified format.
alluxio.master.journal.log.size.bytes.max 10MB If a log file is bigger than this value, it will rotate to the next file.
alluxio.master.journal.tailer.shutdown.quiet.wait.time.ms 5000 Before the standby master shuts down its tailer thread, there should be no update to the leader master's journal during this time period (in milliseconds).
alluxio.master.journal.tailer.sleep.time.ms 1000 Time (in milliseconds) the standby master sleeps for when it cannot find anything new in leader master's journal.
alluxio.master.lineage.checkpoint.interval.ms 600000 The interval (in milliseconds) between Alluxio's checkpoint scheduling.
alluxio.master.lineage.checkpoint.class alluxio.master.lineage.checkpoint.CheckpointLatestScheduler The class name of the checkpoint strategy for lineage output files. The default strategy checkpoints the latest completed lineage, i.e., the lineage whose output files are completed.
alluxio.master.lineage.recompute.interval.ms 600000 The interval (in milliseconds) between Alluxio's recompute executions. The executor scans all the lost files tracked by lineage and re-executes the corresponding jobs.
alluxio.master.lineage.recompute.log.path ${alluxio.logs.dir}/recompute.log The path to the log that the recompute executor redirects the job's stdout into.
alluxio.master.port 19998 The port that Alluxio master node runs on.
alluxio.master.retry 29 The number of times the client retries connecting to the master.
alluxio.master.startup.consistency.check.enabled true Whether the system should be checked for consistency with the underlying storage on startup. During the time the check is running, Alluxio will be in read only mode. Enabled by default.
alluxio.master.ttl.checker.interval.ms 3600000 Time interval (in milliseconds) to periodically delete the files with expired ttl value.
alluxio.master.web.bind.host 0.0.0.0 The hostname Alluxio master web UI binds to. See multi-homed networks
alluxio.master.web.hostname localhost The hostname of Alluxio Master web UI.
alluxio.master.web.port 19999 The port Alluxio web UI runs on.
alluxio.master.whitelist / A comma-separated list of path prefixes that are cacheable. Alluxio will try to cache a cacheable file when it is read for the first time.
alluxio.master.worker.threads.max 2048 The maximum number of incoming RPC requests to master that can be handled. This value is used to configure maximum number of threads in Thrift thread pool with master.
alluxio.master.worker.threads.min 512 The minimum number of threads used to handle incoming RPC requests to master. This value is used to configure minimum number of threads in Thrift thread pool with master.
alluxio.master.worker.timeout.ms 300000 Timeout (in milliseconds) between master and worker indicating a lost worker.
alluxio.master.tieredstore.global.levels 3 The total number of storage tiers in the system
alluxio.master.tieredstore.global.level0.alias MEM The name of the highest storage tier in the entire system
alluxio.master.tieredstore.global.level1.alias SSD The name of the second highest storage tier in the entire system
alluxio.master.tieredstore.global.level2.alias HDD The name of the third highest storage tier in the entire system
alluxio.master.keytab.file Kerberos keytab file for Alluxio master.
alluxio.master.principal Kerberos principal for Alluxio master.

Worker Configuration

The worker configuration specifies information regarding the worker nodes, such as the address and the port number.

Property Name | Default | Meaning
alluxio.worker.allocator.class alluxio.worker.block.allocator.MaxFreeAllocator The strategy that a worker uses to allocate space among storage directories in a given storage layer. Valid options include: `alluxio.worker.block.allocator.MaxFreeAllocator`, `alluxio.worker.block.allocator.GreedyAllocator`, `alluxio.worker.block.allocator.RoundRobinAllocator`.
alluxio.worker.bind.host 0.0.0.0 The hostname Alluxio's worker node binds to. See multi-homed networks
alluxio.worker.block.heartbeat.interval.ms 1000 The interval (in milliseconds) between block worker's heartbeats
alluxio.worker.block.heartbeat.timeout.ms 60000 The timeout value (in milliseconds) of block worker's heartbeat
alluxio.worker.block.threads.max 2048 The maximum number of incoming RPC requests to block worker that can be handled. This value is used to configure maximum number of threads in Thrift thread pool with block worker. This value should be greater than the sum of `alluxio.user.block.worker.client.threads` across concurrent Alluxio clients. Otherwise, the worker connection pool can be drained, preventing new connections from being established.
alluxio.worker.block.threads.min 256 The minimum number of threads used to handle incoming RPC requests to block worker. This value is used to configure minimum number of threads in Thrift thread pool with block worker.
alluxio.worker.data.bind.host 0.0.0.0 The hostname that the Alluxio worker's data server runs on. See multi-homed networks
alluxio.worker.data.folder /alluxioworker/ A relative path within each storage directory used as the data folder for Alluxio worker to put data for tiered store.
alluxio.worker.data.port 29999 The port Alluxio's worker's data server runs on.
alluxio.worker.data.server.class alluxio.worker.netty.NettyDataServer Selects the networking stack to run the worker with. Valid options are: `alluxio.worker.netty.NettyDataServer`.
alluxio.worker.evictor.class alluxio.worker.block.evictor.LRUEvictor The strategy that a worker uses to evict block files when a storage layer runs out of space. Valid options include `alluxio.worker.block.evictor.LRFUEvictor`, `alluxio.worker.block.evictor.GreedyEvictor`, `alluxio.worker.block.evictor.LRUEvictor`.
alluxio.worker.evictor.lrfu.attenuation.factor 2.0 An attenuation factor in [2, INF) to control the behavior of LRFU.
alluxio.worker.evictor.lrfu.step.factor 0.25 A factor in [0, 1] to control the behavior of LRFU: smaller value makes LRFU more similar to LFU; and larger value makes LRFU closer to LRU.
alluxio.worker.file.persist.pool.size 64 The size of the thread pool per worker, in which the thread persists an ASYNC_THROUGH file to under storage.
alluxio.worker.filesystem.heartbeat.interval.ms 1000 The heartbeat interval (in milliseconds) between the worker and file system master.
alluxio.worker.hostname localhost The hostname of Alluxio worker.
alluxio.worker.memory.size 128 MB Memory capacity of each worker node.
alluxio.worker.network.netty.boss.threads 1 How many threads to use for accepting new requests.
alluxio.worker.network.netty.file.transfer MAPPED When returning files to the user, select how the data is transferred; valid options are `MAPPED` (uses java MappedByteBuffer) and `TRANSFER` (uses Java FileChannel.transferTo).
alluxio.worker.network.netty.shutdown.quiet.period 2 The quiet period (in seconds). When the netty server is shutting down, it will ensure that no RPCs occur during the quiet period. If an RPC occurs, then the quiet period will restart before shutting down the netty server.
alluxio.worker.network.netty.shutdown.timeout 15 Maximum amount of time to wait (in seconds) until the netty server is shutdown (regardless of the quiet period).
alluxio.worker.network.netty.watermark.high 32768 Determines how many bytes can be in the write queue before switching to non-writable.
alluxio.worker.network.netty.watermark.low 8192 Once the high watermark limit is reached, the queue must be flushed down to the low watermark before switching back to writable.
alluxio.worker.network.netty.worker.threads 0 How many threads to use for processing requests. Zero defaults to #cpuCores * 2.
alluxio.worker.port 29998 The port Alluxio's worker node runs on.
alluxio.worker.session.timeout.ms 60000 Timeout (in milliseconds) between worker and client connection indicating a lost session connection.
alluxio.worker.tieredstore.block.lock.readers 1000 The max number of concurrent readers for a block lock.
alluxio.worker.tieredstore.block.locks 1000 Total number of block locks for an Alluxio block worker. Larger value leads to finer locking granularity, but uses more space.
alluxio.worker.tieredstore.levels 1 The number of storage tiers on the worker
alluxio.worker.tieredstore.level0.alias MEM The alias of the highest storage tier on this worker. It must match one of the global storage tiers from the master configuration. The ordering of aliases on a worker must preserve the ordering of the global hierarchy, so by default SSD cannot come before MEM on any worker.
alluxio.worker.tieredstore.level0.dirs.path /mnt/ramdisk/ The path of the storage directory for the top storage layer. Note that on MacOS the value should be `/Volumes/`.
alluxio.worker.tieredstore.level0.dirs.quota ${alluxio.worker.memory.size} The capacity of the top storage layer.
alluxio.worker.tieredstore.level0.reserved.ratio 0.1 The portion of space reserved in the top storage layer (a value between 0 and 1).
alluxio.worker.tieredstore.reserver.enabled false Whether to enable tiered store reserver service or not.
alluxio.worker.tieredstore.reserver.interval.ms 1000 The time period (in milliseconds) of space reserver service, which keeps certain portion of available space on each layer.
alluxio.worker.tieredstore.retry 3 The number of times the worker retries processing blocks.
alluxio.worker.web.bind.host 0.0.0.0 The hostname Alluxio worker's web server binds to. See multi-homed networks
alluxio.worker.web.hostname localhost The hostname Alluxio worker's web UI binds to.
alluxio.worker.web.port 30000 The port Alluxio worker's web UI runs on.
alluxio.worker.keytab.file Kerberos keytab file for Alluxio worker.
alluxio.worker.principal Kerberos principal for Alluxio worker.

User Configuration

The user configuration specifies values regarding file system access.

Property Name | Default | Meaning
alluxio.user.block.master.client.threads 10 The number of threads used by a block master client pool to talk to the block master.
alluxio.user.block.worker.client.threads 10 The number of threads used by a block worker client pool for heartbeating to a worker. Increase this value if worker failures affect client connections to healthy workers.
alluxio.user.block.remote.read.buffer.size.bytes 8 MB The size of the file buffer to read data from remote Alluxio worker.
alluxio.user.block.remote.reader.class alluxio.client.netty.NettyRemoteBlockReader Selects the networking stack to run the client with. Currently only `alluxio.client.netty.NettyRemoteBlockReader` (read remote data using netty) is valid. This is deprecated and will be removed in 2.0.0.
alluxio.user.block.remote.writer.class alluxio.client.netty.NettyRemoteBlockWriter Selects the networking stack to run the client with for block writes. This is deprecated and will be removed in 2.0.0.
alluxio.user.block.size.bytes.default 512MB Default block size for Alluxio files.
alluxio.user.failed.space.request.limits 3 The number of times to request space from the file system before aborting.
alluxio.user.file.buffer.bytes 1 MB The size of the file buffer to use for file system reads/writes.
alluxio.user.file.cache.partially.read.block true When read type is CACHE_PROMOTE or CACHE and this property is set to true, the entire block will be cached by Alluxio space even if the client only reads a part of this block.
alluxio.user.file.master.client.threads 10 The number of threads used by a file master client to talk to the file master.
alluxio.user.file.waitcompleted.poll.ms 1000 The time interval to poll a file for its completion status when using waitCompleted.
alluxio.user.file.worker.client.threads 10 How many threads to use for file worker clients to read from workers.
alluxio.user.file.write.location.policy.class alluxio.client.file.policy.LocalFirstPolicy The default location policy for choosing workers for writing a file's blocks
alluxio.user.file.write.avoid.eviction.policy.reserved.size.bytes 0MB The amount of space to reserve on a worker when the LocalFirstAvoidEvictionPolicy class is used as the file write location policy. The default is 0 MB.
alluxio.user.file.readtype.default CACHE_PROMOTE Default read type when creating Alluxio files. Valid options are `CACHE_PROMOTE` (move data to highest tier if already in Alluxio storage, write data into highest tier of local Alluxio if data needs to be read from under storage), `CACHE` (write data into highest tier of local Alluxio if data needs to be read from under storage), `NO_CACHE` (no data interaction with Alluxio, if the read is from Alluxio data migration or eviction will not occur).
alluxio.user.file.writetype.default MUST_CACHE Default write type when creating Alluxio files. Valid options are `MUST_CACHE` (write will only go to Alluxio and must be stored in Alluxio), `CACHE_THROUGH` (try to cache, write to UnderFS synchronously), `THROUGH` (no cache, write to UnderFS synchronously).
alluxio.user.file.write.tier.default 0 The default tier for choosing where to write a block. Valid option is any integer. Non-negative values identify tiers starting from top going down (0 identifies the first tier, 1 identifies the second tier, and so on). If the provided value is greater than the number of tiers, it identifies the last tier. Negative values identify tiers starting from the bottom going up (-1 identifies the last tier, -2 identifies the second to last tier, and so on). If the absolute value of the provided value is greater than the number of tiers, it identifies the first tier.
alluxio.user.heartbeat.interval.ms 1000 The interval (in milliseconds) between the Alluxio client's heartbeats.
alluxio.user.lineage.enabled false Flag to enable lineage feature.
alluxio.user.lineage.master.client.threads 10 The number of threads used by a lineage master client to talk to the lineage master.
alluxio.user.network.netty.timeout.ms 30000 The maximum number of milliseconds for a netty client (for block reads and block writes) to wait for a response from the data server.
alluxio.user.network.netty.worker.threads 0 How many threads to use for remote block worker client to read from remote block workers.
alluxio.user.ufs.delegation.enabled true Flag for delegating ufs data operations to the worker, enabled by default. When enabled, the client does not require any under storage libraries. Set this to false to use the client to directly communicate with the ufs, which requires the client to have the necessary under storage system libraries.
alluxio.user.ufs.delegation.read.buffer.size.bytes 8MB Size of the read buffer when reading from the ufs through the Alluxio worker. Each read request will fetch at least this many bytes. This property has no effect if the delegation flag is turned off.
alluxio.user.ufs.delegation.write.buffer.size.bytes 2MB Size of the write buffer when writing to the ufs through the Alluxio worker. Each write request will write at least this many bytes, unless the write is at the end of the file. This property has no effect if the delegation flag is turned off.
alluxio.user.ufs.file.reader.class alluxio.client.netty.NettyUnderFileSystemFileReader Selects the networking stack to run the client with for reading from the under file system through a worker's data server. Currently only `alluxio.client.netty.NettyUnderFileSystemFileReader` (remote read using netty) is valid.
alluxio.user.ufs.file.writer.class alluxio.client.netty.NettyUnderFileSystemFileWriter Selects the networking stack to run the client with for writing to the under file system through a worker's data server. Currently only `alluxio.client.netty.NettyUnderFileSystemFileWriter` (remote write using netty) is valid.
alluxio.user.packet.streaming.enabled false If set to true, the packet streaming data transfer protocol is used. Packet streaming is an experimental feature. It provides a more efficient data transfer mechanism.

Cluster Management

When running Alluxio with cluster managers like Mesos and YARN, Alluxio has additional configuration options.

Property Name | Default | Meaning
alluxio.integration.master.resource.cpu 1 CPU resource in terms of number of cores required to run an Alluxio master.
alluxio.integration.master.resource.mem 1024 MB Memory resource required to run an Alluxio master.
alluxio.integration.mesos.executor.dependency.path http://downloads.alluxio.org/downloads/files/${alluxio.version}/alluxio-${alluxio.version}-bin.tar.gz The URL from which Mesos executor can download Alluxio dependencies.
alluxio.integration.mesos.jdk.path jdk1.7.0_79
alluxio.integration.mesos.jdk.url https://alluxio-mesos.s3.amazonaws.com/jdk-7u79-linux-x64.tar.gz
alluxio.integration.mesos.master.name AlluxioMaster The Mesos task name for the Alluxio master task.
alluxio.integration.mesos.master.node.count 1 The number of Alluxio master processes to start.
alluxio.integration.mesos.principal alluxio Alluxio framework’s identity.
alluxio.integration.mesos.role * Role that Alluxio framework in Mesos cluster may belong to.
alluxio.integration.mesos.secret Alluxio framework’s secret.
alluxio.integration.mesos.user Account used by the Mesos executor to run Alluxio workers.
alluxio.integration.mesos.worker.name AlluxioWorker The Mesos task name for the Alluxio worker task.
alluxio.integration.worker.resource.cpu 1 CPU resource in terms of number of cores required to run an Alluxio worker.
alluxio.integration.worker.resource.mem 1024 MB Memory resource required to run an Alluxio worker. This memory does not include the memory configured for tiered storage.
alluxio.integration.yarn.workers.per.host.max 1

Security Configuration

The security configuration specifies information regarding the security features, such as authentication and file permission. Properties for authentication take effect for master, worker, and user. Properties for file permission only take effect for master. See Security for more information about security features.

Property Name | Default | Meaning
alluxio.security.authentication.type SIMPLE The authentication mode. Currently three modes are supported: NOSASL, SIMPLE, CUSTOM. The default value SIMPLE indicates that a simple authentication is enabled. Server trusts whoever the client claims to be.
alluxio.security.authentication.socket.timeout.ms 600000 The maximum amount of time (in milliseconds) for a user to create a Thrift socket which will connect to the master.
alluxio.security.authentication.custom.provider.class The class to provide customized authentication implementation, when alluxio.security.authentication.type is set to CUSTOM. It must implement the interface 'alluxio.security.authentication.AuthenticationProvider'.
alluxio.security.login.username When alluxio.security.authentication.type is set to SIMPLE or CUSTOM, user application uses this property to indicate the user requesting Alluxio service. If it is not set explicitly, the OS login user will be used.
alluxio.security.authorization.permission.enabled true Whether to enable access control based on file permission.
alluxio.security.authorization.permission.umask 022 The umask of creating file and directory. The initial creation permission is 777, and the difference between directory and file is 111. So for default umask value 022, the created directory has permission 755 and file has permission 644.
alluxio.security.authorization.permission.supergroup supergroup The super group of Alluxio file system. All users in this group have super permission.
alluxio.security.group.mapping.class alluxio.security.group.provider.ShellBasedUnixGroupsMapping The class that provides the user-to-groups mapping service, allowing the master to query the group memberships of a given user. It must implement the interface 'alluxio.security.group.GroupMappingService'. The default implementation executes the 'groups' shell command to fetch the group memberships of a given user.

Configure multihomed networks

Alluxio configuration provides a way to take advantage of multi-homed networks. If you have more than one NIC and want your Alluxio master to listen on all of them, set alluxio.master.bind.host to 0.0.0.0. Alluxio clients can then reach the master by connecting to any of its NICs. The same applies to the other properties ending in bind.host.
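For example, to have both the master RPC server and the master web UI listen on all interfaces, the site property file could contain (an illustrative fragment using the bind.host properties documented above):

```properties
# Listen on all NICs (illustrative fragment)
alluxio.master.bind.host=0.0.0.0
alluxio.master.web.bind.host=0.0.0.0
```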
