Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.
This is a guide on how to install Hadoop on a Cloud9 workspace.
The first step is to create a workspace. Once your workspace is up and ready, visit the Hadoop download page and copy the link to the latest build. At the time of writing, the latest stable build was
Once you’ve copied the full URL of the Hadoop build tar file, go back to your workspace and download the file with wget in the terminal:
This will start the download. The file is about 186 MB, so it might take a couple of minutes.
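As a rough sketch of the download step, the snippet below wraps the wget call so that it only runs once a link has been supplied. `HADOOP_URL` and `archive_name` are names made up for this illustration, and the link itself is not filled in here: it has to be the one you copied from the download page.

```shell
# HADOOP_URL is a placeholder (an assumption of this sketch): export it with
# the link copied from the Hadoop download page before running the snippet.
HADOOP_URL="${HADOOP_URL:-}"

# For a plain file link, wget saves the download under the last component
# of the URL path, so basename gives the expected file name.
archive_name() {
  basename "$1"
}

if [ -n "$HADOOP_URL" ]; then
  # -c resumes a partial download instead of starting over.
  wget -c "$HADOOP_URL"
  # Show the size of the downloaded file as a quick sanity check.
  ls -lh "$(archive_name "$HADOOP_URL")"
else
  echo "Set HADOOP_URL to the copied link first."
fi
```

The `-c` flag is optional, but it helps if the connection drops partway through a download of this size.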
Once the download finishes, go ahead and extract the tar file by running the following command:
tar xvf hadoop-2.6.0.tar.gz
This will create a new hadoop-2.6.0 folder within your workspace.
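If the download was interrupted, the extraction can fail halfway through. A minimal sketch of a safer variant, assuming GNU tar as found on a typical Ubuntu workspace, checks that the archive is readable before unpacking it. The helper name `extract_archive` is made up for this example.

```shell
# Hypothetical helper: verify a .tar.gz archive, then extract it into the
# current directory.
extract_archive() {
  archive="$1"
  # "tar tzf" lists the contents without extracting anything; it fails if
  # the file is missing or truncated.
  if ! tar tzf "$archive" > /dev/null 2>&1; then
    echo "Cannot read $archive; try downloading it again." >&2
    return 1
  fi
  # Extract without the v flag to keep the terminal output short.
  tar xzf "$archive"
}

# Only run the extraction when the archive is actually present.
if [ -f hadoop-2.6.0.tar.gz ]; then
  extract_archive hadoop-2.6.0.tar.gz && ls -d hadoop-2.6.0
fi
```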
Next we need to set the JAVA_HOME environment variable in Hadoop’s configuration file. The file to edit is etc/hadoop/hadoop-env.sh, inside the hadoop-2.6.0 folder. Just double-click the file to open it, as shown in the screenshot below.
Replace the line setting
and also add the
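The right value for JAVA_HOME depends on where Java is installed in your workspace. As a hedged sketch, assuming the java binary is on the PATH and that GNU `readlink -f` is available, the following prints a candidate path. `java_home_from_binary` is a helper name invented for this example.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the full path of a
# java binary by stripping the trailing /bin/java.
java_home_from_binary() {
  echo "${1%/bin/java}"
}

if command -v java > /dev/null 2>&1; then
  # readlink -f follows the symlink chain (for example from /usr/bin/java)
  # to the real binary inside the Java installation.
  java_home_from_binary "$(readlink -f "$(command -v java)")"
else
  echo "java not found on PATH"
fi
```

The printed directory is a candidate for the JAVA_HOME line in hadoop-env.sh. Check that it contains a bin/java file before using it.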
After the file has been saved, try starting Hadoop using the following command:
cd hadoop-2.6.0/
bin/hadoop
Run without arguments, bin/hadoop should print its usage message, similar to the following screenshot. If you see this message, Hadoop has been installed.
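For a check that does not rely on comparing against a screenshot, you can ask Hadoop for its version. The sketch below assumes it is run from the directory that contains the hadoop-2.6.0 folder. `check_hadoop` is a helper name made up for this example.

```shell
# Hypothetical helper: run "hadoop version" from a given install directory,
# or report that the launcher script is missing.
check_hadoop() {
  if [ -x "$1/bin/hadoop" ]; then
    # "hadoop version" prints the release number and build details.
    "$1/bin/hadoop" version
  else
    echo "$1/bin/hadoop not found or not executable" >&2
    return 1
  fi
}

check_hadoop hadoop-2.6.0 || echo "Hadoop is not ready yet"
```

If the version command itself fails with a Java-related error, revisit the JAVA_HOME setting in etc/hadoop/hadoop-env.sh.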
For further information on how to run Hadoop as a Single Node Cluster please visit the Hadoop help page.