Install Sun's JDK.
[root@localhost ~]# wget -O jdk-6u17-linux-i586-rpm.bin 'http://cds.sun.com/is-bin/INTERSHOP.enfinity/WFS/CDS-CDS_Developer-Site/en_US/-/USD/VerifyItem-Start/jdk-6u17-linux-i586-rpm.bin?BundledLineItemUUID=x.JIBe.pGdUAAAEloNgdaDYE&OrderID=fdZIBe.pyo0AAAElkNgdaDYE&ProductID=lBFIBe.oSOMAAAEkGehn5G0y&FileName=/jdk-6u17-linux-i586-rpm.bin'
[root@localhost ~]# sh jdk-6u17-linux-i586-rpm.bin
[root@localhost ~]# java -version
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)
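If another JDK (OpenJDK, GCJ, etc.) was already installed and java -version still shows that one instead of 1.6.0_17, you may need to point the system at the Sun JDK. On RHEL/CentOS-style systems this is typically done with alternatives (an optional aside; not needed if the output above already matches):

[root@localhost ~]# alternatives --install /usr/bin/java java /usr/java/default/bin/java 100
[root@localhost ~]# alternatives --config java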
Download and extract Hadoop (0.20.1, the latest as of 2009/11/27).
[hadoop@localhost ~]$ wget http://ftp.kddilabs.jp/infosystems/apache/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
[hadoop@localhost ~]$ tar -zxvf hadoop-0.20.1.tar.gz
Set up passwordless SSH login.
[hadoop@localhost ~]$ ssh-keygen -t rsa -P ""
[hadoop@localhost ~]$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys
[hadoop@localhost ~]$ chmod 600 .ssh/authorized_keys
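It is worth confirming the key works before going further. The first connection will ask you to accept the host key; after that, ssh localhost should log in without prompting for a password:

[hadoop@localhost ~]$ ssh localhost
[hadoop@localhost ~]$ exit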
Set JAVA_HOME in conf/hadoop-env.sh.
Also increase HADOOP_HEAPSIZE.
# The java implementation to use.  Required.
export JAVA_HOME=/usr/java/default

# Extra Java CLASSPATH elements.  Optional.
# export HADOOP_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=2000
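/usr/java/default is a symlink the Sun JDK RPM normally creates, pointing at the installed JDK. If Hadoop later complains that JAVA_HOME is not set correctly, check where it points:

[hadoop@localhost ~]$ ls -l /usr/java/default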
Configure conf/core-site.xml.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-0.20.1/tempdir</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
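hadoop.tmp.dir is the base under which the HDFS name/data directories and MapReduce local directories are placed. Hadoop normally creates it on format/start, but creating it up front (owned by the hadoop user) does no harm:

[hadoop@localhost hadoop-0.20.1]$ mkdir -p /home/hadoop/hadoop-0.20.1/tempdir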
Configure conf/hdfs-site.xml.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Configure conf/mapred-site.xml.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
Format the NameNode (HDFS).
[hadoop@localhost hadoop-0.20.1]$ ./bin/hadoop namenode -format
Start Hadoop.
[hadoop@localhost hadoop-0.20.1]$ ./bin/start-all.sh
[hadoop@localhost hadoop-0.20.1]$ /usr/java/default/bin/jps
2614 NameNode
2808 SecondaryNameNode
2703 DataNode
3022 TaskTracker
3117 Jps
2894 JobTracker
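The daemons can also be checked from a browser; in 0.20.x the NameNode and JobTracker web UIs listen on these default ports:

http://localhost:50070/   (NameNode / HDFS status)
http://localhost:50030/   (JobTracker / MapReduce jobs)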
Run the sample (wordcount).
[hadoop@localhost hadoop-0.20.1]$ mkdir input
[hadoop@localhost hadoop-0.20.1]$ emacs input/file1
[hadoop@localhost hadoop-0.20.1]$ ./bin/hadoop dfs -copyFromLocal input input
[hadoop@localhost hadoop-0.20.1]$ ./bin/hadoop dfs -ls input
[hadoop@localhost hadoop-0.20.1]$ ./bin/hadoop jar hadoop-0.20.1-examples.jar wordcount input output
[hadoop@localhost hadoop-0.20.1]$ ./bin/hadoop dfs -cat output/part-r-00000
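To copy the result back out of HDFS and stop the daemons when you are done (a minimal sketch, assuming the wordcount job above finished; output_local is just an arbitrary local directory name):

[hadoop@localhost hadoop-0.20.1]$ ./bin/hadoop dfs -copyToLocal output output_local
[hadoop@localhost hadoop-0.20.1]$ cat output_local/part-r-00000
[hadoop@localhost hadoop-0.20.1]$ ./bin/stop-all.sh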