Setting Up a Hadoop Pseudo-Distributed Cluster on openEuler
| hadoop101 | hadoop102 | hadoop103 |
|---|---|---|
| 192.168.10.101 | 192.168.10.102 | 192.168.10.103 |
| NameNode | SecondaryNameNode | ResourceManager |
| DataNode | DataNode | DataNode |
| NodeManager | NodeManager | NodeManager |
| | JobHistoryServer | |
| job log | job log | job log |
- Update the system
- Install common utilities
- Disable the firewall
- Set hostnames and IP addresses
- Edit the hosts file
- Download the JDK and Hadoop and configure environment variables
- Set up passwordless SSH login
- Edit the configuration files
- Distribute the software
- Initialize the cluster
- Edit the Windows hosts file
- Test
- Metadata
1. Update the System
yum -y update
2. Install Common Utilities
```bash
yum -y install gcc gcc-c++ autoconf automake cmake make \
  zlib zlib-devel openssl openssl-devel pcre-devel \
  rsync openssh-server vim man zip unzip net-tools tcpdump lrzsz tar wget
```
3. Disable the Firewall and SELinux
```bash
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
setenforce 0

systemctl stop firewalld
systemctl disable firewalld
```
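To confirm that both changes took effect before moving on, a quick check:

```bash
# Expect "Permissive" now, or "Disabled" after the next reboot
getenforce
# Expect "inactive" once firewalld has been stopped
systemctl is-active firewalld
```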
4. Set Hostnames and IP Addresses
hadoop101
hostnamectl set-hostname hadoop101
hadoop102
hostnamectl set-hostname hadoop102
hadoop103
hostnamectl set-hostname hadoop103
vim /etc/sysconfig/network-scripts/ifcfg-ens32
For reference:
```
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=eui64
NAME=ens32
UUID=55e7ac28-39d7-4f24-b6bf-0f9fb40b7595
DEVICE=ens32
ONBOOT=yes
IPADDR=192.168.10.101
PREFIX=24
GATEWAY=192.168.10.2
DNS1=192.168.10.2
```
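After editing the file, the new address has to be applied. On an openEuler system managed by NetworkManager (an assumption; adjust if the legacy network service is in use), something along these lines should work:

```bash
# Reload the connection profile and bring the interface up with the new address
nmcli connection reload
nmcli connection up ens32
# Verify the static address
ip addr show ens32
```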
5. Edit the hosts File
vim /etc/hosts
Add the following entries:

```
192.168.10.101 hadoop101
192.168.10.102 hadoop102
192.168.10.103 hadoop103
```
Reboot the system
reboot
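After the reboot, it is worth confirming that every hostname resolves correctly; a minimal check:

```bash
# Each name should resolve to its 192.168.10.x address and reply
for h in hadoop101 hadoop102 hadoop103; do
  ping -c 1 "$h"
done
```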
6. Download the JDK and Hadoop and Configure Environment Variables
On hadoop101:
Create a software directory
mkdir -p /opt/soft
Enter the software directory
cd /opt/soft
Download the JDK (jdk-8u411-linux-x64.tar.gz)

Download Hadoop
wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
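Before unpacking, the tarball can be verified against the digest Apache publishes next to it (the .sha512 URL below follows the dlcdn layout and is an assumption):

```bash
wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz.sha512
# Compare the two digests by eye
sha512sum hadoop-3.3.6.tar.gz
cat hadoop-3.3.6.tar.gz.sha512
```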
Unpack the JDK and rename it
tar -zxvf jdk-8u411-linux-x64.tar.gz
mv jdk1.8.0_411 jdk-8
Unpack Hadoop and rename it
tar -zxvf hadoop-3.3.6.tar.gz
mv hadoop-3.3.6 hadoop-3
Delete the installation archives (not recommended)
rm -f *.gz
Configure environment variables
vim /etc/profile.d/my_env.sh
Add the following:
```bash
export JAVA_HOME=/opt/soft/jdk-8
export HDFS_NAMENODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_ZKFC_USER=root
export HDFS_JOURNALNODE_USER=root
export HADOOP_SHELL_EXECNAME=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HADOOP_HOME=/opt/soft/hadoop-3
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```
Load the new environment variables
source /etc/profile
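A quick sanity check that both toolchains are now on the PATH:

```bash
java -version      # should report 1.8.0_411
hadoop version     # should report Hadoop 3.3.6
```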
7. Set Up Passwordless SSH Login
Generate a local key pair and copy the public key to each node's authorized_keys file
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
ssh-copy-id root@hadoop101
ssh-copy-id root@hadoop102
ssh-copy-id root@hadoop103
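Passwordless login can be confirmed in one pass; each command should print the remote hostname without asking for a password:

```bash
for h in hadoop101 hadoop102 hadoop103; do
  ssh "root@$h" hostname
done
```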
8. Edit the Configuration Files

All of the following files live in $HADOOP_HOME/etc/hadoop:

- hadoop-env.sh
- core-site.xml
- hdfs-site.xml
- workers
- mapred-site.xml
- yarn-site.xml
hadoop-env.sh
Append the following at the end of the file:
```bash
export JAVA_HOME=/opt/soft/jdk-8
export HDFS_NAMENODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_ZKFC_USER=root
export HDFS_JOURNALNODE_USER=root
export HADOOP_SHELL_EXECNAME=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
```
core-site.xml
```xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop101:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop_data</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>root</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>
</configuration>
```
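hadoop.tmp.dir above places HDFS data under /home/hadoop_data. Formatting the NameNode should create it, but pre-creating the directory on each node is a harmless extra (optional; not part of the original steps):

```bash
mkdir -p /home/hadoop_data
```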
hdfs-site.xml
```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop102:9868</value>
    </property>
</configuration>
```
workers
Note: in Hadoop 2.x this file is named slaves; in Hadoop 3.x it is named workers.
```
hadoop101
hadoop102
hadoop103
```
mapred-site.xml
```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop102:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop102:19888</value>
    </property>
</configuration>
```
yarn-site.xml
```xml
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop103</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOME</value>
    </property>
    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://hadoop102:19888/jobhistory/logs</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
</configuration>
```
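A malformed tag in any of these XML files will only surface when the daemons start. If xmllint is available (it ships with libxml2; its availability on openEuler is an assumption), the syntax can be checked up front:

```bash
cd $HADOOP_HOME/etc/hadoop
xmllint --noout core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml
```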
9. Distribute the Software and Environment Variables

Distribute the hosts file:

```bash
scp -r /etc/hosts root@hadoop102:/etc
scp -r /etc/hosts root@hadoop103:/etc
```

Distribute the SSH passwordless-login configuration:

```bash
scp -r /root/.ssh root@hadoop102:/root
scp -r /root/.ssh root@hadoop103:/root
```

Distribute the software:

```bash
scp -r /opt/soft root@hadoop102:/opt
scp -r /opt/soft root@hadoop103:/opt
```

Distribute the environment variables:

```bash
scp /etc/profile.d/my_env.sh root@hadoop102:/etc/profile.d
scp /etc/profile.d/my_env.sh root@hadoop103:/etc/profile.d
```

Load the new environment variables on all nodes:
source /etc/profile
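To confirm each node ended up with a working copy, the version check from step 6 can be repeated remotely (a sketch relying on the passwordless SSH set up above):

```bash
for h in hadoop102 hadoop103; do
  ssh "root@$h" 'source /etc/profile && hadoop version | head -1'
done
```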
10. Initialize the Cluster
On hadoop101:
Format the file system:

hdfs namenode -format

Start the NameNode, SecondaryNameNode, and DataNodes:

start-dfs.sh

Check the running processes:

jps

- hadoop101 should show NameNode and DataNode
- hadoop102 should show SecondaryNameNode and DataNode
- hadoop103 should show DataNode
On hadoop103:
Start the ResourceManager and NodeManager daemons:

start-yarn.sh

Check the running processes:

jps

- hadoop101 should show NameNode, DataNode, and NodeManager
- hadoop102 should show SecondaryNameNode, DataNode, and NodeManager
- hadoop103 should show DataNode, ResourceManager, and NodeManager
On hadoop102:
Start the JobHistoryServer:

mapred --daemon start historyserver

Check the running processes:

jps

- hadoop101 should show NameNode, DataNode, and NodeManager
- hadoop102 should show SecondaryNameNode, DataNode, NodeManager, and JobHistoryServer
- hadoop103 should show DataNode, ResourceManager, and NodeManager
Important:

Before shutting down, stop the services in order:

- hadoop102: mapred --daemon stop historyserver
- hadoop103: stop-yarn.sh
- hadoop101: stop-dfs.sh

After booting, start the services in order:

- hadoop101: start-dfs.sh
- hadoop103: start-yarn.sh
- hadoop102: mapred --daemon start historyserver
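Because the order matters, the whole sequence can be wrapped in a small helper run from hadoop101. This is a convenience sketch, not part of the original guide; it assumes the passwordless SSH and /etc/profile setup from the earlier steps:

```bash
#!/bin/bash
# cluster.sh - start or stop all cluster services in the correct order (run on hadoop101)
case "$1" in
  start)
    start-dfs.sh                                                  # HDFS first, locally
    ssh root@hadoop103 'source /etc/profile && start-yarn.sh'     # then YARN on hadoop103
    ssh root@hadoop102 'source /etc/profile && mapred --daemon start historyserver'
    ;;
  stop)
    ssh root@hadoop102 'source /etc/profile && mapred --daemon stop historyserver'
    ssh root@hadoop103 'source /etc/profile && stop-yarn.sh'
    stop-dfs.sh                                                   # HDFS last
    ;;
  *)
    echo "usage: $0 start|stop"
    ;;
esac
```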
11. Edit the Windows hosts File
C:\Windows\System32\drivers\etc\hosts
Append the following:
```
192.168.10.101 hadoop101
192.168.10.102 hadoop102
192.168.10.103 hadoop103
```
Note for Windows 11: editing the file requires administrator permissions.

- In the Start menu, search for cmd and run Command Prompt as administrator.
- Change to the C:\Windows\System32\drivers\etc directory (an elevated prompt starts in C:\Windows\System32):
  cd drivers\etc
- Open the hosts file:
  start hosts
- Append the following, then save:
  192.168.10.101 hadoop101
  192.168.10.102 hadoop102
  192.168.10.103 hadoop103
12. Test
12.1 Access Hadoop from a Browser
NameNode web UI: http://hadoop101:9870
SecondaryNameNode web UI: http://hadoop102:9868/
ResourceManager (YARN) web UI: http://hadoop103:8088
JobHistoryServer web UI: http://hadoop102:19888
12.2 Test HDFS
Create a test file wcdata.txt on the local file system
vim wcdata.txt
Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase 
StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase HiveHadoop Spark HBase StormHBase Hadoop Hive FlinkHBase Flink Hive StormHive Flink HadoopHBase Hive Spark HBaseHive Flink Storm Hadoop HBase SparkFlinkHBase StormHBase Hadoop Hive
Create the directory /wordcount/input on HDFS
hdfs dfs -mkdir -p /wordcount/input
Inspect the HDFS directory tree
hdfs dfs -ls /
hdfs dfs -ls /wordcount
hdfs dfs -ls /wordcount/input
Upload the local test file wcdata.txt to /wordcount/input on HDFS
hdfs dfs -put wcdata.txt /wordcount/input
Check that the upload succeeded
hdfs dfs -ls /wordcount/input
hdfs dfs -cat /wordcount/input/wcdata.txt
12.3 Test MapReduce
Estimate the value of pi
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar pi 10 10
Word count
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar wordcount /wordcount/input/wcdata.txt /wordcount/result
hdfs dfs -ls /wordcount/result
hdfs dfs -cat /wordcount/result/part-r-00000
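Note that MapReduce refuses to overwrite an existing output directory, so remove /wordcount/result before re-running the word count:

```bash
hdfs dfs -rm -r /wordcount/result
```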