Hadoop 3.1.2 Environment Setup

2019-06-12


Pitfall guide:

Hadoop startup: localhost: Error: JAVA_HOME is not set and could not be found.
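A common cause: the daemons are launched over ssh, and a non-interactive ssh session does not source the interactive shell profile, so exports made there are not inherited. JAVA_HOME therefore also has to be set in Hadoop's own environment file (covered in the hadoop-env.sh step below); a minimal sketch, assuming the JDK path used in this guide:

# in $HADOOP_INSTALL/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8.0_211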

1. Configure environment variables

# Configure the Java environment variables
export JAVA_HOME=/opt/jdk1.8.0_211
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JRE_HOME=$JAVA_HOME/jre

# Configure the Hadoop environment variables
export HADOOP_INSTALL=/opt/hadoop-3.1.2
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
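A minimal way to apply and verify these, assuming they were appended to /etc/profile (any shell profile works):

source /etc/profile
java -version       # should report 1.8.0_211
hadoop version      # should report Hadoop 3.1.2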

2. Modify the configuration files

core-site.xml

<configuration>
  <!-- Default filesystem: the NameNode address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:9000</value>
  </property>
  <!-- Size of read/write buffer used in SequenceFiles. -->
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <!-- Hadoop temp directory; create it yourself -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-3.1.2/tmp</value>
  </property>
</configuration>
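The hadoop.tmp.dir directory is not created for you; a quick sketch, with the path matching this guide's install location:

mkdir -p /opt/hadoop-3.1.2/tmp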

hdfs-site.xml

<configuration>
  <!-- Run the SecondaryNameNode on datanode1 -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>datanode1:50090</value>
  </property>
  <!-- Number of block replicas -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/hadoop-3.1.2/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/hadoop-3.1.2/hdfs/data</value>
  </property>
</configuration>
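It is safest to create the metadata and block directories up front (the daemons can usually create them themselves, but permission problems surface earlier this way); a sketch assuming the paths above:

mkdir -p /opt/hadoop-3.1.2/hdfs/name /opt/hadoop-3.1.2/hdfs/data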

mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
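Once the cluster is running, this setting can be checked by submitting one of the bundled example jobs to YARN; the jar path below is assumed from this guide's install directory:

hadoop jar /opt/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar pi 2 5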

yarn-site.xml

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>namenode:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>namenode:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>namenode:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>namenode:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>namenode:8088</value>
  </property>
</configuration>
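These addresses use the hostname namenode (and hdfs-site.xml uses datanode1), so every node must be able to resolve both names. A minimal /etc/hosts sketch; the first IP is the one from the end of this guide, the second is a hypothetical placeholder:

192.168.180.130   namenode     # this machine
192.168.180.131   datanode1    # hypothetical worker IP; substitute the real one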

hadoop-env.sh

export JAVA_HOME=/opt/jdk1.8.0_211
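A quick check that Hadoop's own scripts pick this up (the envvars subcommand is available in Hadoop 3):

hadoop envvars    # the printed JAVA_HOME should match the value above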

Format the NameNode

hdfs namenode -format
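A successful format populates the metadata directory configured in hdfs-site.xml; a quick sanity check, path as above:

ls /opt/hadoop-3.1.2/hdfs/name/current    # should contain VERSION and an fsimage file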

ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.

Fix: the scripts that start and stop HDFS and YARN need to be modified to define the daemon user variables.

Add the following at the top of start-dfs.sh and stop-dfs.sh:

HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

Add the following at the top of start-yarn.sh and stop-yarn.sh:

YARN_RESOURCEMANAGER_USER=root
YARN_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root

The NameNode starts successfully, but the web UI cannot be reached

Reference: https://blog.csdn.net/r_aider/article/details/80020518

Check the Linux firewall state:

firewall-cmd --state

CentOS 7 uses firewalld by default, unlike earlier releases, which used iptables. Common firewall commands:

systemctl stop firewalld.service       # stop the firewall
systemctl start firewalld.service      # start the firewall
systemctl disable firewalld.service    # disable start at boot
systemctl enable firewalld.service     # enable start at boot
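Instead of disabling the firewall entirely, just the ports this guide uses can be opened; a sketch with firewalld:

firewall-cmd --permanent --add-port=9870/tcp    # NameNode web UI
firewall-cmd --permanent --add-port=8088/tcp    # ResourceManager web UI
firewall-cmd --reload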

After stopping the firewall and disabling it at boot, the pages load successfully:

NameNode: http://192.168.180.130:9870/    (192.168.180.130 is this machine's local IP; Hadoop 3 moved the NameNode UI from port 50070 to 9870)
YARN ResourceManager: http://192.168.180.130:8088/
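Finally, a quick check that HDFS itself accepts commands (the directory name is just an example):

hdfs dfs -mkdir -p /user/root
hdfs dfs -ls /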