Tutorial for creating a Hadoop environment for learning and testing in VirtualBox

With this tutorial you can set up your own Hadoop environment using virtual machines. To get started, download and install VirtualBox.

Next you will need to get the CentOS virtual image. Once it has downloaded, create a directory named VMs at the root of your drive and extract the contents of the CentOS zip file into that directory.

In VirtualBox, create a new virtual machine. Choose Linux as the type and Red Hat (64-bit) as the version. Allocate 2048 MB of memory and, for the hard disk, choose "Use an existing virtual hard disk file" and navigate to the CentOS image directory inside the VMs directory. Choose the vmdk file whose filename does not end with a number. Once you have done that, the machine is ready to start up.
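The same virtual machine can also be created from the host's command line with VBoxManage, which ships with VirtualBox. The sketch below is only an illustration: the helper functions, the VM name hadoop-centos, and the controller name SATA are assumptions, not part of the original setup. It picks the vmdk file whose name does not end in a digit and prints the commands instead of running them, so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: helper names, VM name and controller name are hypothetical.

# Print the first .vmdk in a directory whose file name does not end in a digit.
# (In split VMware disks the numbered files are typically the data extents.)
pick_base_vmdk() {
    for f in "$1"/*.vmdk; do
        [ -e "$f" ] || continue            # the glob matched nothing
        case "$f" in
            *[0-9].vmdk) ;;                # e.g. disk-s001.vmdk: skip it
            *) printf '%s\n' "$f"; return 0 ;;
        esac
    done
    return 1
}

# Print (do not run) VBoxManage commands that mirror the GUI steps above.
print_vm_commands() {
    vm=$1
    disk=$2
    cat <<EOF
VBoxManage createvm --name "$vm" --ostype RedHat_64 --register
VBoxManage modifyvm "$vm" --memory 2048
VBoxManage storagectl "$vm" --name SATA --add sata
VBoxManage storageattach "$vm" --storagectl SATA --port 0 --device 0 --type hdd --medium "$disk"
EOF
}
```

For example, print_vm_commands hadoop-centos "$(pick_base_vmdk /path/to/VMs/centos)" prints the four commands for an image extracted to that (placeholder) path.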

Next, install the VirtualBox Guest Additions, which give you better performance among other enhancements. Before you install Guest Additions, run the following commands as root to prepare for the installation:

# yum update
# yum install gcc
# yum install kernel-devel
# yum install bzip2
# shutdown -r now
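After the reboot, a quick check along the following lines can confirm that the build prerequisites are in place before the Guest Additions installer runs. This is a minimal sketch: missing_tools and check_build_prereqs are hypothetical helper names, and the rpm query assumes an RPM-based guest such as the CentOS image used here.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical, not part of VirtualBox.

# Print every named command that cannot be found on PATH, one per line.
missing_tools() {
    for t in "$@"; do
        command -v "$t" >/dev/null 2>&1 || printf '%s\n' "$t"
    done
}

# Report whether the packages installed above look usable.
check_build_prereqs() {
    missing=$(missing_tools gcc bzip2)
    if [ -n "$missing" ]; then
        echo "missing: $missing"
        return 1
    fi
    # kernel-devel should match the running kernel, which is why the
    # machine is rebooted after yum update.
    rpm -q "kernel-devel-$(uname -r)"
}
```

Running check_build_prereqs inside the guest prints the matching kernel-devel package name when everything lines up, and a non-zero exit status otherwise.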

If you need to return the cursor to Windows, press the host key, which is the right Ctrl key by default. Choose Devices from the VirtualBox menu and then Insert Guest Additions CD Image. Follow the prompts to install. Restart the virtual machine for the changes to take effect.

 

Installing Apache BigTop

BigTop is ideal for learning big data components like Hadoop. Let's get started with the installation.

First, get the repo file, which points to the download location of Hadoop and its dependencies.

wget -O /etc/yum.repos.d/bigtop.repo \
http://www.apache.org/dist/bigtop/bigtop-1.0.0/repos/centos6/bigtop.repo
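Because wget -O creates the output file even when the download fails, it can be worth confirming that the saved file really is a yum repo definition before installing anything. The following is a small sketch of such a check; repo_baseurls is a hypothetical helper, and it only assumes that a usable repo file contains at least one baseurl line.

```shell
#!/bin/sh
# Sketch only: repo_baseurls is a hypothetical helper.

# Print the value of every baseurl line in a yum .repo file.
# Fails if the file is empty, missing, or has no baseurl entry
# (for example, if an HTML error page was saved instead).
repo_baseurls() {
    [ -s "$1" ] || { echo "empty or missing: $1" >&2; return 1; }
    urls=$(sed -n 's/^[[:space:]]*baseurl[[:space:]]*=[[:space:]]*//p' "$1")
    [ -n "$urls" ] || { echo "no baseurl found in $1" >&2; return 1; }
    printf '%s\n' "$urls"
}
```

Calling repo_baseurls /etc/yum.repos.d/bigtop.repo should print the repository URL that yum will use.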

Next, select and install the Hadoop components:

yum install hadoop\* mahout\* oozie\* hbase\* hive\* hue\* pig\* zookeeper\*

Answer yes to the code-signing prompts. Once Hadoop and the selected components are installed, the next step is to configure Hadoop. After the configuration, Hadoop will be ready to start.

Download and install Java

yum install java-1.7.0-openjdk-devel.x86_64
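To confirm which Java the system will actually pick up, the version string can be inspected. The helper below is a hypothetical sketch that extracts the major version from the first line of java -version output; it is handed the text as an argument, so it makes no assumption about where Java is installed.

```shell
#!/bin/sh
# Sketch only: java_major is a hypothetical helper.

# Extract the major Java version from the first line of `java -version` output.
java_major() {
    v=$(printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p')
    case "$v" in
        "")  return 1 ;;                               # no version string found
        1.*) v=${v#1.}; printf '%s\n' "${v%%.*}" ;;    # legacy scheme: 1.7.0_79 -> 7
        *)   printf '%s\n' "${v%%.*}" ;;               # newer scheme: 11.0.2 -> 11
    esac
}
```

Since java -version writes to standard error, a typical call is java_major "$(java -version 2>&1)", which should report 7 for the package installed above.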

Format the namenode

sudo /etc/init.d/hadoop-hdfs-namenode init

Start the Hadoop services for your cluster

for i in hadoop-hdfs-namenode hadoop-hdfs-datanode ; do
    sudo service "$i" start
done

Create a sub-directory structure in HDFS

sudo /usr/lib/hadoop/libexec/init-hdfs.sh

Start the YARN daemons

sudo service hadoop-yarn-resourcemanager start
sudo service hadoop-yarn-nodemanager start

If everything went well, you now have a working Hadoop installation.
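One way to verify this is to ask each of the four services started above for its status. The sketch below is illustrative: check_services is a hypothetical helper, and it assumes the init scripts return a non-zero exit status when a daemon is stopped, as init scripts conventionally do. The status command is passed in as a parameter so the function can be tried without a running cluster.

```shell
#!/bin/sh
# Sketch only: check_services is a hypothetical helper.

# The services started in this tutorial, in start order.
HADOOP_SERVICES="hadoop-hdfs-namenode hadoop-hdfs-datanode hadoop-yarn-resourcemanager hadoop-yarn-nodemanager"

# Run "<status_cmd> <service> status" for each service and report the result.
# Returns non-zero if any service is reported as not running.
check_services() {
    status_cmd=${1:-service}
    failed=0
    for s in $HADOOP_SERVICES; do
        if "$status_cmd" "$s" status >/dev/null 2>&1; then
            echo "$s: running"
        else
            echo "$s: NOT running"
            failed=1
        fi
    done
    return $failed
}
```

Calling check_services with no argument uses the service command. If all four daemons are up, hadoop fs -ls / should also list the directories created by init-hdfs.sh.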

 
