HADOOP 2.6 WORDCOUNT EXAMPLE

root@hadoop2-VirtualBox:/usr/local/hadoop/share/hadoop/mapreduce# pwd
/usr/local/hadoop/share/hadoop/mapreduce
root@hadoop2-VirtualBox:/usr/local/hadoop/share/hadoop/mapreduce# cat f3.txt >> f2.txt
root@hadoop2-VirtualBox:/usr/local/hadoop/share/hadoop/mapreduce# wc -l f2.txt 
7395228 f2.txt
root@hadoop2-VirtualBox:/usr/local/hadoop/share/hadoop/mapreduce# hdfs dfs -put f2.txt /home/hadoop/input/f2.txt
15/02/06 21:06:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
root@hadoop2-VirtualBox:/usr/local/hadoop/share/hadoop/mapreduce# hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount /home/hadoop/input /home/hadoop/output4
15/02/06 21:07:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/06 21:07:26 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/06 21:07:27 INFO input.FileInputFormat: Total input paths to process : 2
15/02/06 21:07:27 INFO mapreduce.JobSubmitter: number of splits:2
15/02/06 21:07:27 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1423212832128_0004
15/02/06 21:07:28 INFO impl.YarnClientImpl: Submitted application application_1423212832128_0004
15/02/06 21:07:28 INFO mapreduce.Job: The url to track the job: http://hadoop2-VirtualBox:8088/proxy/application_1423212832128_0004/
15/02/06 21:07:28 INFO mapreduce.Job: Running job: job_1423212832128_0004
15/02/06 21:07:36 INFO mapreduce.Job: Job job_1423212832128_0004 running in uber mode : false
15/02/06 21:07:36 INFO mapreduce.Job:  map 0% reduce 0%
15/02/06 21:07:47 INFO mapreduce.Job:  map 50% reduce 0%
15/02/06 21:07:52 INFO mapreduce.Job:  map 58% reduce 0%
15/02/06 21:07:55 INFO mapreduce.Job:  map 63% reduce 0%
15/02/06 21:07:58 INFO mapreduce.Job:  map 65% reduce 0%
15/02/06 21:08:02 INFO mapreduce.Job:  map 68% reduce 0%
15/02/06 21:08:05 INFO mapreduce.Job:  map 71% reduce 0%
15/02/06 21:08:06 INFO mapreduce.Job:  map 71% reduce 17%
15/02/06 21:08:08 INFO mapreduce.Job:  map 77% reduce 17%
15/02/06 21:08:11 INFO mapreduce.Job:  map 80% reduce 17%
15/02/06 21:08:14 INFO mapreduce.Job:  map 100% reduce 17%
15/02/06 21:08:16 INFO mapreduce.Job:  map 100% reduce 100%
15/02/06 21:08:16 INFO mapreduce.Job: Job job_1423212832128_0004 completed successfully
15/02/06 21:08:16 INFO mapreduce.Job: Counters: 50
File System Counters
FILE: Number of bytes read=269
FILE: Number of bytes written=317703
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=125719110
HDFS: Number of bytes written=49
HDFS: Number of read operations=9
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters 
Killed map tasks=1
Launched map tasks=3
Launched reduce tasks=1
Data-local map tasks=3
Total time spent by all maps in occupied slots (ms)=65341
Total time spent by all reduces in occupied slots (ms)=26179
Total time spent by all map tasks (ms)=65341
Total time spent by all reduce tasks (ms)=26179
Total vcore-seconds taken by all map tasks=65341
Total vcore-seconds taken by all reduce tasks=26179
Total megabyte-seconds taken by all map tasks=66909184
Total megabyte-seconds taken by all reduce tasks=26807296
Map-Reduce Framework
Map input records=7395229
Map output records=14790458
Map output bytes=184880720
Map output materialized bytes=65
Input split bytes=222
Combine input records=14790470
Combine output records=16
Reduce input groups=4
Reduce shuffle bytes=65
Reduce input records=4
Reduce output records=4
Spilled Records=20
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=714
CPU time spent (ms)=14010
Physical memory (bytes) snapshot=560402432
Virtual memory (bytes) snapshot=2402254848
Total committed heap usage (bytes)=378994688
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters 
Bytes Read=125718888
File Output Format Counters 
Bytes Written=49
root@hadoop2-VirtualBox:/usr/local/hadoop/share/hadoop/mapreduce# hdfs dfs -ls  /home/hadoop/output4
15/02/06 21:16:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   1 root supergroup          0 2015-02-06 21:08 /home/hadoop/output4/_SUCCESS
-rw-r--r--   1 root supergroup         49 2015-02-06 21:08 /home/hadoop/output4/part-r-00000
root@hadoop2-VirtualBox:/usr/local/hadoop/share/hadoop/mapreduce# 
root@hadoop2-VirtualBox:/usr/local/hadoop/share/hadoop/mapreduce# hdfs dfs -cat  /home/hadoop/output4/part-r-00000
15/02/06 21:17:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
FKDSFJKS	7395228
HFLESFL	7395228
hello	1
world	1
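
The reduce output above has four unique tokens: the two repeated uppercase tokens appear once per line of f2.txt, plus one `hello` and one `world`. On a small sample the same tallies can be reproduced with plain coreutils (a local sketch, not the MapReduce job itself; the file name `sample.txt` is illustrative):

```shell
# Build a tiny file in the same shape as f2.txt: repeated two-token
# lines plus one "hello world" line.
for i in 1 2 3; do echo 'FKDSFJKS HFLESFL'; done > sample.txt
echo 'hello world' >> sample.txt

# Local equivalent of wordcount: one token per line, then count duplicates.
tr -s ' ' '\n' < sample.txt | sort | uniq -c
```

With a 3-line sample this prints counts of 3 for each repeated token and 1 each for `hello` and `world`, mirroring the shape of part-r-00000.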

Hadoop Single-Node Quick Deployment

sudo apt-get update
sudo apt-get install openjdk-7-jdk
java -version
cd /usr/lib/jvm
ln -s java-7-openjdk-amd64 jdk
sudo addgroup hadoop_group
sudo adduser --ingroup hadoop_group hduser1
sudo adduser hduser1 sudo
su - hduser1
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
su - hduser1
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
wget http://ftp.yz.yamagata-u.ac.jp/pub/network/apache/hadoop/common/current/hadoop-2.7.0.tar.gz
tar -zxvf hadoop-2.7.0.tar.gz 
sudo mv hadoop-2.7.0 /usr/local/hadoop 
vi ~/.bashrc
Append the following:
#Hadoop variables
export JAVA_HOME=/usr/lib/jvm/jdk/
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
###end of paste
vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Change the JAVA_HOME line to:
export JAVA_HOME=/usr/lib/jvm/jdk
vi /usr/local/hadoop/etc/hadoop/core-site.xml
Insert the property between the <configuration> tags, so the section reads:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
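
Note that `fs.default.name` is the legacy key; in Hadoop 2.x it is deprecated in favor of `fs.defaultFS`, which takes the same value (both keys still work in 2.x):

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>
```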
vi /usr/local/hadoop/etc/hadoop/yarn-site.xml
Edit it to:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
vi /usr/local/hadoop/etc/hadoop/mapred-site.xml 
Edit it to:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown hduser1 /usr/local/hadoop_store/hdfs/namenode
sudo chown hduser1 /usr/local/hadoop_store/hdfs/datanode
vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml
Edit it to:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
sudo chown hduser1:hadoop_group -R /usr/local/hadoop_store
sudo chmod 777 -R /usr/local/hadoop_store
cd /usr/local/hadoop/
hdfs namenode -format 
cd /usr/local/hadoop/
start-all.sh
jps
10477 SecondaryNameNode
10757 NodeManager
10974 Jps
10113 NameNode
10623 ResourceManager
10251 DataNode
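
After `start-all.sh`, all five daemons should show up in `jps` as above. A small sketch that checks a `jps`-style listing for the expected daemons (here fed the sample listing from above; on a live node set `listing="$(jps)"` instead):

```shell
# Sample jps output; on a real node use: listing="$(jps)"
listing='10477 SecondaryNameNode
10757 NodeManager
10113 NameNode
10623 ResourceManager
10251 DataNode'

# Check that every expected Hadoop daemon is present.
# grep -w matches whole words, so "NameNode" does not
# falsely match inside "SecondaryNameNode".
for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  if printf '%s\n' "$listing" | grep -qw "$daemon"; then
    echo "$daemon: OK"
  else
    echo "$daemon: MISSING"
  fi
done
```

If any daemon reports MISSING, check the corresponding log under `$HADOOP_INSTALL/logs` before submitting jobs.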
