
Building a Hadoop Cluster Across Servers on Huawei Cloud and Alibaba Cloud

Posted: 2021-01-20 04:29:09



Contents

Building a Hadoop Cluster Across Servers on Huawei Cloud and Alibaba Cloud
  Overview
  Problems Encountered Along the Way
    Switching CentOS 6/7/8 to the Alibaba yum Mirror
    Changing the Server Hostname
    Installing the JDK
    Installing Hadoop
  Writing the Cluster Distribution Script xsync
    scp (secure copy)
    rsync remote synchronization tool
    xsync cluster distribution script
  Passwordless SSH Access
  Cluster Configuration (skip straight here if you're in a hurry)
    Configure /etc/hosts (hadoop102, hadoop103, hadoop104)
    Core configuration file
    HDFS configuration file
    YARN configuration file
    MapReduce configuration file
    Distribute the configuration
  Starting the Whole Cluster
    Configure workers
    Start the cluster
  Stopping the Cluster
  Starting Individual Processes
    Start HDFS processes individually
    Start YARN processes individually

Building a Hadoop Cluster Across Servers on Huawei Cloud and Alibaba Cloud

Overview

I have three servers: Huawei Cloud (102) and two Alibaba Cloud machines (103 and 104). The goal is to build one Hadoop cluster across servers in different data centers.

Problems Encountered Along the Way

Switching CentOS 6/7/8 to the Alibaba yum Mirror

Alibaba Cloud Linux mirror address:

Configuration steps:

1. Back up the existing repo file

mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup

2. Download the new CentOS-Base.repo into /etc/yum.repos.d

CentOS 6:
wget -O /etc/yum.repos.d/CentOS-Base.repo /repo/Centos-6.repo

CentOS 7:
wget -O /etc/yum.repos.d/CentOS-Base.repo /repo/Centos-7.repo

CentOS 8:
wget -O /etc/yum.repos.d/CentOS-Base.repo /repo/Centos-8.repo

Step 2 may fail with the following error:

CentOS 7 error while downloading the yum repo: Resolving host ()... failed: Name or service not known.

Fix:

Log in as root, open /etc/resolv.conf with vim, and add DNS server addresses:

nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 223.5.5.5
nameserver 223.6.6.6

(Adding either nameserver 223.5.5.5 or nameserver 223.6.6.6 is enough.)

If that does not fix it, check the network configuration: use ifconfig or ip addr to find the NIC name, then open /etc/sysconfig/network-scripts/ifcfg-<NIC name> with vim and verify the network parameters are correct.

3. Rebuild the yum cache

yum clean all && yum makecache
yum update -y

At this point the mirror switch is complete.

Make sure you can ping Baidu

ping baidu.com

At this point I found my server could not ping Baidu: pinging IP addresses worked, but domain names did not. I went through a lot of tutorials online and none of them were right.

My own fix

vim /etc/resolv.conf

Change it to:

nameserver 8.8.8.8

After that change, domain names resolve and ping works.

Changing the Server Hostname

Optional: I started out with a single server, but more machines will definitely be added later, so setting hostnames now makes future configuration management easier.

vim /etc/hostname
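On CentOS 7/8 you can also set the hostname directly with hostnamectl instead of editing the file; a minimal sketch (hadoop102 here stands for whichever name you are assigning to this machine):

hostnamectl set-hostname hadoop102   # persistently set the hostname
hostname                             # verify the new name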

Quick tip: press the Tab key to auto-complete commands.

Installing the JDK

To get the tarball, follow the 后端码匠 account and reply "电脑环境", or download it yourself from the Oracle website.

Configure environment variables

vim /etc/profile
cd /etc/profile.d
vim my_env.sh

#JAVA_HOME
export JAVA_HOME=/usr/java/jdk1.8.0_221
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

source /etc/profile
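A quick way to confirm the environment variables took effect; a small check assuming the paths above:

source /etc/profile
echo $JAVA_HOME   # should print /usr/java/jdk1.8.0_221
java -version     # should report version 1.8.0_221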

Installing Hadoop

Extract the archive and configure environment variables

tar -zxvf hadoop-3.1.1.tar.gz -C /opt/module/
pwd
/opt/module/hadoop-3.1.1

cd /etc/profile.d
vim my_env.sh

#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-3.1.1
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

source /etc/profile
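As with the JDK, it is worth verifying the Hadoop installation before going further; a small check assuming the paths above:

source /etc/profile
echo $HADOOP_HOME   # should print /opt/module/hadoop-3.1.1
hadoop version      # should report Hadoop 3.1.1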

Commands to delete a directory

rmdir dir

rm -rf dir/

Test with the wordcount example
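The wordcount job below expects an input directory to exist already; a minimal sketch for preparing one (the file name word.txt and its contents are just placeholders):

cd /opt/module/hadoop-3.1.1
mkdir wcinput
echo "hadoop yarn hadoop mapreduce" > wcinput/word.txt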

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount wcinput/ ./wcoutput

[root@linuxmxz hadoop-3.1.1]# cd wcoutput
[root@linuxmxz wcoutput]# ll
total 4
-rw-r--r-- 1 root root 41 Apr  1 11:24 part-r-00000
-rw-r--r-- 1 root root  0 Apr  1 11:24 _SUCCESS
[root@linuxmxz wcoutput]# cat part-r-00000
abnzhang	1
bobo	1
cls	2
mike	1
s	1
sss	1

Writing the Cluster Distribution Script xsync

scp (secure copy)

scp copies data between servers (from server1 to server2).

scp -r $pdir/$fname $user@$host:$pdir/$fname

command | recursive | path/name of the file to copy | destination user@host:destination path/name

For example:

Push data from the local machine to a remote host:

scp -r jdk1.8.0_212/ root@ip:/opt/module/

Pull data from a remote host to the local machine:

scp -r root@ip:/opt/module/hadoop-3.1.1 ./

From the local machine, transfer data between two other machines:

scp -r root@ip:/opt/module/* root@ip:/opt/module/

To remove the test directories: rm -rf wcinput/ wcoutput/ (deletes both directories).

rsync remote synchronization tool

rsync only transfers files that differ, whereas scp copies every file over again.

Basic syntax

rsync -av $pdir/$fname $user@$host:$pdir/$fname

command | options | path/name of the file to copy | destination user@host:destination path/name

rsync -av hadoop-3.1.1/ root@ip:/opt/module/hadoop-3.1.1/

xsync cluster distribution script

#!/bin/bash
#1. Get the number of arguments; exit if none were given
pcount=$#
if [ $pcount -lt 1 ]
then
    echo Not Enough Arguments!
    exit
fi
#2. Loop over every machine in the cluster
for host in hadoop102 hadoop103 hadoop104
do
    echo ==================== $host ====================
    #3. Loop over every file or directory passed in
    for file in $@
    do
        #4. Check whether the file exists
        if [ -e $file ]
        then
            #5. Get the absolute path of its parent directory
            pdir=$(cd -P $(dirname $file); pwd)
            echo pdir=$pdir
            #6. Get the file name
            fname=$(basename $file)
            echo fname=$fname
            #7. Over ssh, create the directory on $host (no-op if it already exists)
            ssh $host "source /etc/profile; mkdir -p $pdir"
            #8. Sync the file to $pdir on $host as user $USER
            rsync -av $pdir/$fname $USER@$host:$pdir
        else
            echo $file Does Not Exist!
        fi
    done
done
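To call the script as xsync from anywhere, it needs to be executable and on your PATH; a minimal sketch (placing it in ~/bin is an assumption, any directory on PATH works):

mkdir -p ~/bin
mv xsync ~/bin/
chmod +x ~/bin/xsync
# distribute a directory to hadoop102/103/104 in one go
xsync /opt/module/hadoop-3.1.1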

Passwordless SSH Access

adduser codingce
passwd codingce
chown -R codingce:codingce hadoop-3.1.1/
chmod 770 hadoop-3.1.1/
ls -al            # list all files, including hidden ones
ssh-keygen -t rsa
cat id_rsa        # private key
cat id_rsa.pub    # public key
# copy the public key into the remote host's .ssh folder
[codingce@linuxmxz .ssh]# ssh-copy-id 66.108.177.66
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
codingce@39.108.177.66's password:
Number of key(s) added: 1
# After this you can log in to that server directly with "ssh <ip>".
# Also run ssh-copy-id against your own host so passwordless login to yourself works.
ssh ip
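Note that the node you run start-dfs.sh from must reach every worker (including itself) over passwordless SSH; a small sketch of repeating the key copy for all three hosts, assuming the hostnames configured in /etc/hosts below:

for host in hadoop102 hadoop103 hadoop104
do
    ssh-copy-id $host   # will prompt for that host's password once
done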

Cluster Configuration (skip straight here if you're in a hurry)

Cluster deployment plan

Note: do not install the NameNode and the SecondaryNameNode on the same server.

The ResourceManager is also memory-hungry, so do not put it on the same server as the NameNode or the SecondaryNameNode.
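For reference, a deployment plan consistent with the configuration files that follow (NameNode on hadoop102, ResourceManager on hadoop103, SecondaryNameNode on hadoop104, with DataNode and NodeManager on every node since workers lists all three) would be:

        hadoop102             hadoop103                      hadoop104
HDFS    NameNode, DataNode    DataNode                       SecondaryNameNode, DataNode
YARN    NodeManager           ResourceManager, NodeManager   NodeManager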

Configure /etc/hosts

hadoop102

This is exactly where I lost a whole afternoon to pitfalls.

[root@linuxmxz hadoop-3.1.1]# vim /etc/hosts
# 102 uses its internal (private) IP; the other two use their public IPs
<internal ip> hadoop102
<public ip> hadoop103
<public ip> hadoop104

hadoop103

[root@linuxmxz hadoop-3.1.1]# vim /etc/hosts
<public ip> hadoop102
<internal ip> hadoop103
<public ip> hadoop104

hadoop104

[root@linuxmxz hadoop-3.1.1]# vim /etc/hosts
<public ip> hadoop102
<public ip> hadoop103
<internal ip> hadoop104
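After editing /etc/hosts on all three machines, a quick sanity check is to confirm that each hostname resolves and responds from every node (your cloud security groups must allow ICMP for ping to get through); a small sketch:

ping -c 1 hadoop102
ping -c 1 hadoop103
ping -c 1 hadoop104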

Core Configuration File

Core configuration file: core-site.xml

[root@linuxmxz hadoop]# cd $HADOOP_HOME/etc/hadoop
[codingce@linuxmxz hadoop]$ vim core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- NameNode address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop102:8020</value>
    </property>
    <!-- Hadoop data storage directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop-3.1.1/data</value>
    </property>
    <!-- Static user for the HDFS web UI: codingce -->
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>codingce</value>
    </property>
</configuration>

HDFS Configuration File

Configure hdfs-site.xml

[codingce@linuxmxz hadoop]$ vim hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- NameNode web UI address -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop102:9870</value>
    </property>
    <!-- SecondaryNameNode web UI address -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop104:9868</value>
    </property>
</configuration>

YARN Configuration File

Configure yarn-site.xml

[codingce@linuxmxz hadoop]$ vim yarn-site.xml

<configuration>
    <!-- Use the MapReduce shuffle service -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- ResourceManager address -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop103</value>
    </property>
    <!-- Environment variables to inherit -->
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>

MapReduce Configuration File

Configure mapred-site.xml

[codingce@linuxmxz hadoop]$ vim mapred-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Run MapReduce programs on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=/opt/module/hadoop-3.1.1</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=/opt/module/hadoop-3.1.1</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=/opt/module/hadoop-3.1.1</value>
    </property>
</configuration>

Distribute the Configuration

# 1
[codingce@linuxmxz hadoop]$ rsync -av core-site.xml codingce@66.108.177.66:/opt/module/hadoop-3.1.1/etc/hadoop/
sending incremental file list
core-site.xml
sent 599 bytes  received 47 bytes  1,292.00 bytes/sec
total size is 1,176  speedup is 1.82
[codingce@linuxmxz hadoop]$ rsync -av core-site.xml codingce@119.23.69.66:/opt/module/hadoop-3.1.1/etc/hadoop/
sending incremental file list
core-site.xml
sent 599 bytes  received 47 bytes  1,292.00 bytes/sec
total size is 1,176  speedup is 1.82

# 2
[codingce@linuxmxz hadoop]$ rsync -av hdfs-site.xml codingce@119.23.69.66:/opt/module/hadoop-3.1.1/etc/hadoop/
sending incremental file list
hdfs-site.xml
sent 511 bytes  received 47 bytes  1,116.00 bytes/sec
total size is 1,088  speedup is 1.95
[codingce@linuxmxz hadoop]$ rsync -av hdfs-site.xml codingce@66.108.177.66:/opt/module/hadoop-3.1.1/etc/hadoop/
sending incremental file list
hdfs-site.xml
sent 511 bytes  received 47 bytes  1,116.00 bytes/sec
total size is 1,088  speedup is 1.95

# 3
[codingce@linuxmxz hadoop]$ rsync -av yarn-site.xml codingce@66.108.177.66:/opt/module/hadoop-3.1.1/etc/hadoop/
sending incremental file list
yarn-site.xml
sent 651 bytes  received 47 bytes  1,396.00 bytes/sec
total size is 1,228  speedup is 1.76
[codingce@linuxmxz hadoop]$ rsync -av yarn-site.xml codingce@119.23.69.66:/opt/module/hadoop-3.1.1/etc/hadoop/
sending incremental file list
yarn-site.xml
sent 651 bytes  received 47 bytes  1,396.00 bytes/sec
total size is 1,228  speedup is 1.76

# 4
[codingce@linuxmxz hadoop]$ rsync -av mapred-site.xml codingce@119.23.69.66:/opt/module/hadoop-3.1.1/etc/hadoop/
sending incremental file list
sent 73 bytes  received 12 bytes  170.00 bytes/sec
total size is 1,340  speedup is 15.76
[codingce@linuxmxz hadoop]$ rsync -av mapred-site.xml codingce@66.108.177.66:/opt/module/hadoop-3.1.1/etc/hadoop/
sending incremental file list
sent 73 bytes  received 12 bytes  170.00 bytes/sec
total size is 1,340  speedup is 15.76

Starting the Whole Cluster

Configure workers

[codingce@linuxmxz hadoop]$ vim workers
hadoop102
hadoop103
hadoop104

Note: entries in this file must not have trailing spaces, and the file must not contain blank lines.

Sync the configuration file to all nodes

[codingce@linuxmxz hadoop]$ rsync -av workers codingce@39.108.177.65:/opt/module/hadoop-3.1.1/etc/hadoop/
sending incremental file list
workers
sent 143 bytes  received 41 bytes  368.00 bytes/sec
total size is 30  speedup is 0.16
[codingce@linuxmxz hadoop]$ rsync -av workers codingce@119.23.69.213:/opt/module/hadoop-3.1.1/etc/hadoop/
sending incremental file list
workers
sent 143 bytes  received 41 bytes  122.67 bytes/sec
total size is 30  speedup is 0.16

Start the cluster

(1) If this is the first time the cluster is started, format the NameNode on the hadoop102 node. (Note: formatting the NameNode generates a new cluster ID; if the NameNode and DataNode cluster IDs then disagree, the cluster cannot find its previous data. If the cluster has been running and you need to re-format the NameNode, be sure to stop the namenode and datanode processes first, delete the data and logs directories on every machine, and only then format.)
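If you ever do need to re-format, a minimal cleanup sketch based on the hadoop.tmp.dir configured in core-site.xml above (/opt/module/hadoop-3.1.1/data) and the default logs directory under HADOOP_HOME; run it on every node after stopping all processes:

# on every node, after stopping namenode/datanode
rm -rf /opt/module/hadoop-3.1.1/data /opt/module/hadoop-3.1.1/logs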

[codingce@linuxmxz hadoop-3.1.1]$ hdfs namenode -format

(2) Start HDFS

[codingce@linuxmxz hadoop-3.1.1]$ sbin/start-dfs.sh
Starting namenodes on [39.108.177.65]
Starting datanodes
39.108.177.65: datanode is running as process 17487. Stop it first.
119.23.69.213: datanode is running as process 7274. Stop it first.
Starting secondary namenodes [119.23.69.213]
[codingce@linuxmxz hadoop-3.1.1]$
[codingce@linuxmxz ~]$ jps
23621 NodeManager
23766 Jps
23339 DataNode

[codingce@linuxmxz hadoop-3.1.1]$ ssh 66.108.177.66
[codingce@hyf hadoop-3.1.1]$ sbin/start-yarn.sh
Starting resourcemanager
Starting nodemanagers
[codingce@hyf ~]$ jps
19204 Jps
18533 NodeManager
17487 DataNode

[codingce@hyf ~]$ ssh 119.23.69.66
[codingce@zjx ~]$ jps
7824 NodeManager
7274 DataNode
7965 Jps

(3) Start YARN on the node where the ResourceManager is configured (hadoop103)

sbin/start-dfs.sh
stop-dfs.sh
stop-yarn.sh
sbin/start-yarn.sh
netstat -tlpn    # list all listening addresses and ports
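Once HDFS and YARN are up, the web UIs are a convenient health check; a small sketch using curl (9870 comes from dfs.namenode.http-address above, 8088 is YARN's default ResourceManager web port):

curl -s -o /dev/null -w "%{http_code}\n" http://hadoop102:9870   # NameNode web UI
curl -s -o /dev/null -w "%{http_code}\n" http://hadoop103:8088   # ResourceManager web UI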

Stopping the Cluster

Stop HDFS (run on any node):

stop-dfs.sh

Stop YARN (run on the YARN master node):

stop-yarn.sh

In a pseudo-distributed environment you can also simply run:

stop-all.sh

Starting Individual Processes

If some processes fail to come up while starting the cluster, you can try starting them individually:

Start HDFS processes individually

hdfs --daemon start <hdfs process name>

hdfs --daemon start namenode
hdfs --daemon start datanode
hdfs --daemon start secondarynamenode

Start YARN processes individually

yarn --daemon start <yarn process name>

yarn --daemon start resourcemanager
yarn --daemon start nodemanager
