Running Alluxio with YARN on EC2

Alluxio can be started and managed by Apache YARN. This guide demonstrates how to launch Alluxio with YARN on EC2 machines using the Vagrant scripts that come with Alluxio.

Prerequisites

Install Vagrant and the AWS plugins

Download Vagrant

Install AWS Vagrant plugin:

$ vagrant plugin install vagrant-aws
$ vagrant box add dummy https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box

Install Alluxio

Download Alluxio to your local machine, and unzip it:

$ wget http://alluxio.org/downloads/files/1.4.0/alluxio-1.4.0-bin.tar.gz
$ tar xvfz alluxio-1.4.0-bin.tar.gz

Install python library dependencies

Install Python 2.7 or a later 2.x release; Python 3 is not supported.

From the deploy/vagrant directory under your Alluxio home directory, run:

$ sudo bash bin/install.sh

Alternatively, install pip manually, and then run the following in deploy/vagrant:

$ sudo pip install -r pip-req.txt

Launch a Cluster

To run an Alluxio cluster on EC2, first sign up for an Amazon EC2 account on the Amazon Web Services site.

Then create access keys and set the shell environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY:

$ export AWS_ACCESS_KEY_ID=<your access key>
$ export AWS_SECRET_ACCESS_KEY=<your secret access key>
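Before launching, it can help to confirm that both variables are visible in the shell that will run the Vagrant scripts. The following is a minimal sketch of such a check; the function name check_aws_env is hypothetical and is not part of the Alluxio scripts.

```shell
# Hypothetical helper (not shipped with Alluxio): fail early if either
# AWS credential variable is unset or empty in the current shell.
check_aws_env() {
  missing=0
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running check_aws_env after the two export commands returns a non-zero status and names the missing variable if either one was not set.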

Next, generate your EC2 key pair in the region you want to deploy to (us-east-1 by default). Make sure to set the permissions of your private key file so that only you can read it:

$ chmod 400 <your key pair>.pem

Copy deploy/vagrant/conf/ec2.yml.template to deploy/vagrant/conf/ec2.yml, then set the value of Keypair to your keypair name and Key_Path to the path to the pem key.
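The copy step can be scripted so that an ec2.yml you have already edited is never overwritten. This is only a sketch: the function name init_ec2_conf is hypothetical, and the Keypair and Key_Path values still have to be edited by hand afterwards.

```shell
# Hypothetical helper: create ec2.yml from the shipped template without
# overwriting an ec2.yml that already exists.
# $1 is the path to the deploy/vagrant directory.
init_ec2_conf() {
  conf_dir="$1/conf"
  if [ -e "$conf_dir/ec2.yml" ]; then
    echo "keeping existing $conf_dir/ec2.yml" >&2
    return 0
  fi
  cp "$conf_dir/ec2.yml.template" "$conf_dir/ec2.yml"
}
```

From the Alluxio home directory, this would be called as:

$ init_ec2_conf deploy/vagrant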

By default, the Vagrant script creates a security group named alluxio-vagrant-test in region us-east-1 and availability zone us-east-1b. The security group is set up automatically in the region with all inbound and outbound network traffic open. You can change the security group, region, and availability zone in ec2.yml.

Finally, set the “Type” field in deploy/vagrant/conf/ufs.yml to hadoop2.

Now you can launch the Alluxio cluster, with Hadoop 2.4.1 as the under filesystem, in us-east-1b by running the following script from deploy/vagrant:

$ ./create <number of machines> aws

Access the cluster

Access through Web UI

After the command ./create <number of machines> aws succeeds, two green lines like the following appear at the end of the shell output:

>>> AlluxioMaster public IP is xxx, visit xxx:19999 for Alluxio web UI<<<
>>> visit default port of the web UI of what you deployed <<<

The default port for the Alluxio Web UI is 19999.

The default port for the Hadoop Web UI is 50070.

Visit http://{MASTER_IP}:{PORT} in the browser to access the Web UIs.

You can also monitor the state of the instances through the AWS web console.

Access with ssh

The nodes are named AlluxioMaster, AlluxioWorker1, AlluxioWorker2, and so on.

To ssh into a node, run:

$ vagrant ssh <node name>

For example, you can ssh into AlluxioMaster with:

$ vagrant ssh AlluxioMaster

All software is installed under the root directory; for example, Alluxio is installed in /alluxio and Hadoop in /hadoop.

Configure Alluxio integration with YARN

On the EC2 machines, YARN is installed as part of Hadoop 2.4.1. Note that, by default, the Alluxio binaries built by the Vagrant script do not include the YARN integration. First stop the default Alluxio service, then recompile Alluxio with the “yarn” profile to build the YARN client and ApplicationMaster for Alluxio:

$ cd /alluxio
$ ./bin/alluxio-stop.sh all
$ mvn clean install -Dhadoop.version=2.4.1 -Pyarn -Dlicense.skip -DskipTests -Dfindbugs.skip -Dmaven.javadoc.skip -Dcheckstyle.skip

Note that the flags -DskipTests -Dfindbugs.skip -Dmaven.javadoc.skip -Dcheckstyle.skip are not strictly necessary, but they make the build run significantly faster.

To customize the Alluxio master and workers with specific properties (e.g., tiered storage setup on each worker), see Configuration settings. To ensure your configuration can be read by both the ApplicationMaster and the Alluxio master and workers, put alluxio-site.properties in ~/.alluxio under the home directory of every user that will launch an Alluxio client or server.
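Placing the properties file can be scripted per user. The sketch below assumes you have already prepared an alluxio-site.properties file; the function name install_site_properties is hypothetical and not part of Alluxio.

```shell
# Hypothetical helper: copy a prepared alluxio-site.properties into
# <home>/.alluxio for one user.
# $1 = path to the prepared alluxio-site.properties
# $2 = that user's home directory (defaults to $HOME)
install_site_properties() {
  src="$1"
  home_dir="${2:-$HOME}"
  mkdir -p "$home_dir/.alluxio"
  cp "$src" "$home_dir/.alluxio/alluxio-site.properties"
}
```

Call it once for each user that will launch an Alluxio client or server, passing that user's home directory as the second argument.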

Start Alluxio

If YARN does not reside in HADOOP_HOME, set the environment variable YARN_HOME to the base path of YARN.

Use the script integration/yarn/bin/alluxio-yarn.sh to start Alluxio. This script takes three arguments:

  1. The total number of Alluxio workers to start. (required)
  2. An HDFS path to distribute the binaries for Alluxio ApplicationMaster. (required)
  3. The YARN name of the node on which to run the Alluxio master. (optional, defaults to ALLUXIO_MASTER_HOSTNAME)

For example, the following launches an Alluxio cluster with 3 worker nodes, using hdfs://AlluxioMaster:9000/tmp/ as the HDFS temp directory and AlluxioMaster as the master hostname:

$ export HADOOP_HOME=/hadoop
$ /hadoop/bin/hadoop fs -mkdir hdfs://AlluxioMaster:9000/tmp
$ /alluxio/integration/yarn/bin/alluxio-yarn.sh 3 hdfs://AlluxioMaster:9000/tmp/ AlluxioMaster

This script launches an Alluxio ApplicationMaster on YARN, which then requests containers for the Alluxio master and workers. You can also open http://AlluxioMaster:8088 in the browser to view the YARN web UI and watch the status of the Alluxio job as well as its application ID.

The script produces output like the following:

Using $HADOOP_HOME set to '/hadoop'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/alluxio/clients/client/target/alluxio-core-client-1.4.0-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Initializing Client
Starting Client
15/10/22 00:01:17 INFO client.RMProxy: Connecting to ResourceManager at AlluxioMaster/172.31.22.124:8050
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/alluxio/clients/client/target/alluxio-core-client-1.4.0-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
ApplicationMaster command: /bin/java -Xmx256M alluxio.yarn.ApplicationMaster 3 /alluxio localhost 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
Submitting application of id application_1445469376652_0002 to ResourceManager
15/10/22 00:01:19 INFO impl.YarnClientImpl: Submitted application application_1445469376652_0002

From the output, we can see that the application ID under which Alluxio runs is application_1445469376652_0002. This application ID is needed to kill the application.
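If you save the script output to a file, the application ID can be pulled out of it rather than copied by hand. This is a minimal sketch: the function name extract_app_id and the log file name used below are hypothetical, and it assumes a grep that supports the -o option (GNU and BSD grep both do).

```shell
# Hypothetical helper: print the first YARN application ID found in a
# file that holds the saved output of alluxio-yarn.sh.
# $1 = path to the saved output
extract_app_id() {
  grep -o 'application_[0-9][0-9]*_[0-9][0-9]*' "$1" | head -n 1
}
```

For example, save the output with tee when starting Alluxio, then read the ID back:

$ /alluxio/integration/yarn/bin/alluxio-yarn.sh 3 hdfs://AlluxioMaster:9000/tmp/ AlluxioMaster 2>&1 | tee alluxio-yarn.log
$ APP_ID=$(extract_app_id alluxio-yarn.log)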

Test Alluxio

You can run tests against Alluxio to check its health:

$ /alluxio/bin/alluxio runTests

After the tests finish, visit the Alluxio web UI at http://ALLUXIO_MASTER_IP:19999 again. Click Browse in the navigation bar, and you should see the files written to Alluxio by the above tests.

Stop Alluxio

Alluxio can be stopped with the following YARN command. The application ID of Alluxio can be retrieved from either the YARN web UI or the output of alluxio-yarn.sh, as mentioned above. For instance, if the application ID is application_1445469376652_0002, you can stop Alluxio by killing the application:

$ /hadoop/bin/yarn application -kill application_1445469376652_0002

Destroy the cluster

In the deploy/vagrant directory on your local machine, from which the EC2 machines were launched, you can run:

$ ./destroy

to destroy the cluster that you created. Only one cluster can be created at a time. After the command succeeds, the EC2 instances are terminated.

Troubleshooting

1. If you compile Alluxio with the YARN integration using Maven and see compilation errors like the following:

 [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.2:compile (default-compile) on project alluxio-integration-yarn: Compilation failure: Compilation failure:
 [ERROR] /alluxio/upstream/integration/yarn/src/main/java/alluxio/yarn/Client.java:[273,49] cannot find symbol
 [ERROR] symbol:   method $$()
 [ERROR] location: variable JAVA_HOME of type org.apache.hadoop.yarn.api.ApplicationConstants.Environment
 [ERROR] /Work/alluxio/upstream/integration/yarn/src/main/java/alluxio/yarn/Client.java:[307,31] cannot find symbol
 [ERROR] symbol:   variable CLASS_PATH_SEPARATOR
 [ERROR] location: interface org.apache.hadoop.yarn.api.ApplicationConstants
 [ERROR] /alluxio/upstream/integration/yarn/src/main/java/alluxio/yarn/Client.java:[310,29] cannot find symbol
 [ERROR] symbol:   variable CLASS_PATH_SEPARATOR
 [ERROR] location: interface org.apache.hadoop.yarn.api.ApplicationConstants
 [ERROR] /alluxio/upstream/integration/yarn/src/main/java/alluxio/yarn/Client.java:[312,47] cannot find symbol
 [ERROR] symbol:   variable CLASS_PATH_SEPARATOR
 [ERROR] location: interface org.apache.hadoop.yarn.api.ApplicationConstants
 [ERROR] /alluxio/upstream/integration/yarn/src/main/java/alluxio/yarn/Client.java:[314,47] cannot find symbol
 [ERROR] symbol:   variable CLASS_PATH_SEPARATOR
 [ERROR] location: interface org.apache.hadoop.yarn.api.ApplicationConstants
 [ERROR] -> [Help 1]

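These errors typically indicate that the build compiled against a Hadoop version whose YARN API lacks these symbols. Make sure you specify the proper Hadoop version: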

$ mvn clean install -Dhadoop.version=2.4.1 -Pyarn
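To avoid typing the version by hand, it can be read from the Hadoop installation on the cluster. This sketch assumes that the first line of the output of the hadoop version command has the form "Hadoop <version>"; the function name parse_hadoop_version is hypothetical.

```shell
# Hypothetical helper: read the output of `hadoop version` on stdin and
# print the version token from its first line (assumed to look like
# "Hadoop 2.4.1"). Prints nothing if the first line has another form.
parse_hadoop_version() {
  sed -n '1s/^Hadoop \([^ ]*\).*$/\1/p'
}
```

On a cluster node, the result can be passed straight to Maven:

$ mvn clean install -Dhadoop.version="$(/hadoop/bin/hadoop version | parse_hadoop_version)" -Pyarn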