Deploying a Hadoop (3.3.0) Pseudo-Distributed Cluster

 

Preface:

This article walks through deploying a Hadoop (3.3.0) pseudo-distributed cluster.

Note: it assumes that Hadoop (3.3.0) and a working Java environment (JDK) are already installed.


I. What is pseudo-distributed mode?

As the name suggests, pseudo-distributed mode is "fake" distribution: a single machine, rather than several, carries out the work, but the distributed workflow is fully simulated. All of Hadoop's daemons are configured on the one host, yet they cooperate exactly as they would across a real cluster, so everything a distributed deployment requires still happens. The biggest difference between pseudo-distributed Hadoop and standalone mode is that pseudo-distributed mode requires HDFS to be configured.

II. Configuring the pseudo-distributed Hadoop cluster

1. Edit the configuration file core-site.xml

File location: /home/hadoop/hadoop/etc/hadoop/

[[email protected] hadoop]$ vim core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
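fs.defaultFS tells every Hadoop client which filesystem URI to use by default. After saving the file, a quick sanity check (run from /home/hadoop/hadoop) is to ask Hadoop what it actually parsed:

bin/hdfs getconf -confKey fs.defaultFS   # should print: hdfs://localhost:9000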

2. Edit the configuration file hdfs-site.xml

File location: /home/hadoop/hadoop/etc/hadoop/

[[email protected] hadoop]$ vim hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <!-- this machine is the only DataNode, so keep a single replica -->
        <value>1</value>
    </property>
</configuration>
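Likewise for the replication factor: with only one DataNode there can only ever be one copy of each block, so the value read back should be 1:

bin/hdfs getconf -confKey dfs.replication   # should print: 1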

 

3. Generate an SSH key for passwordless login

[[email protected] ~]$ ssh-keygen    # press Enter at every prompt

[[email protected] ~]$ ssh-copy-id localhost      # distribute the public key

Now try logging into the machine, with:   "ssh 'localhost'"

and check to make sure that only the key(s) you wanted were added.

[[email protected] ~]$ ssh localhost    # test the connection

Last login: Wed Oct 28 23:01:23 2020

[[email protected] ~]$ exit  # passwordless login works; log out

logout

Connection to localhost closed.
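If you ever script this setup, the two interactive steps can be collapsed. A minimal sketch, assuming no key pair exists yet at ~/.ssh/id_rsa:

# generate an RSA key pair with an empty passphrase, no prompts
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
# append the public key to localhost's authorized_keys (asks for the password once)
ssh-copy-id -i ~/.ssh/id_rsa.pub localhost
# verify: run a no-op remotely; it should not prompt for a password
ssh localhost true && echo "passwordless SSH OK"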

 

4. Format HDFS and start the services

Note: HDFS must be formatted once after installation. Do not reformat an existing cluster unless absolutely necessary: formatting gives the NameNode a new clusterID that no longer matches what the DataNodes recorded, and they will fail to register.

[[email protected] hadoop]# pwd

/home/hadoop/hadoop

[[email protected] hadoop]# bin/hdfs namenode -format 
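Since reformatting is exactly the mistake the note above warns against, a script can guard this step. A sketch under one assumption: the path below is the default NameNode metadata directory (hadoop.tmp.dir defaults to /tmp/hadoop-${USER}); adjust it if you changed that setting.

NAME_DIR=/tmp/hadoop-${USER}/dfs/name
if [ ! -d "$NAME_DIR/current" ]; then
    bin/hdfs namenode -format      # first run: safe to format
else
    echo "already formatted at $NAME_DIR; skipping to avoid a clusterID mismatch"
fi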

5. Starting the services fails: no permission to write the logs

[[email protected] hadoop]# pwd

/home/hadoop/hadoop/sbin

[[email protected] sbin]$ ./start-dfs.sh   # start the services

Starting namenodes on [localhost]

localhost: ERROR: Unable to write in /home/hadoop/hadoop-3.3.0/logs. Aborting.

Starting datanodes

localhost: ERROR: Unable to write in /home/hadoop/hadoop-3.3.0/logs. Aborting.


Starting secondary namenodes [localhost.localdomain]

localhost.localdomain: ERROR: Unable to write in /home/hadoop/hadoop-3.3.0/logs. Aborting.

Fix:

# switch to root and grant 777 permissions

[[email protected] sbin]# chmod 777 /home/hadoop/hadoop-3.3.0/logs/SecurityAuth-root.audit

[[email protected] sbin]# ll /home/hadoop/hadoop-3.3.0/logs

total 0

-rwxrwxrwx. 1 root root 0 Oct 28 23:05 SecurityAuth-root.audit
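chmod 777 gets things moving, but it opens the log to every user on the machine. Since the daemons run as the hadoop user, a more targeted alternative (also run as root) is to hand the whole logs directory to that user, which also covers the directory-write error itself:

chown -R hadoop:hadoop /home/hadoop/hadoop-3.3.0/logs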

Restart the services:

[[email protected] sbin]$ pwd

/home/hadoop/hadoop/sbin

[[email protected] sbin]$ ./start-dfs.sh

Starting namenodes on [localhost]

Starting datanodes

Starting secondary namenodes [localhost.localdomain]

 

6. jps shows that the NameNode and DataNode (plus the SecondaryNameNode) are now running

[[email protected] sbin]$ jps

3328 NameNode

3442 DataNode

3767 Jps

3642 SecondaryNameNode
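jps only proves the Java processes exist; asking HDFS itself confirms that the DataNode actually registered with the NameNode:

bin/hdfs dfsadmin -report   # should list 1 live DataNode with nonzero capacity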

7. View in the browser

http://192.168.56.88:9870

# stop the firewall, otherwise the web page at the address above is unreachable

[[email protected] sbin]# systemctl stop firewalld
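Stopping the firewall entirely is the quick route. On a machine that should stay firewalled, opening just the NameNode web port is a narrower alternative (firewalld syntax):

firewall-cmd --permanent --add-port=9870/tcp
firewall-cmd --reload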

 

8. Test

# create a directory and upload files

[[email protected] hadoop]$ bin/hdfs dfs -mkdir -p /user/hadoop

[[email protected] hadoop]$ bin/hdfs dfs -ls

[[email protected] hadoop]$ bin/hdfs dfs -put input

[[email protected] hadoop]$ bin/hdfs dfs -ls

Found 1 items

drwxr-xr-x   - hadoop supergroup          0 2020-10-28 23:23 input
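To double-check that data really round-trips through HDFS, here is a small sketch; the file name hello.txt is only an illustration:

echo "hello hdfs" > /tmp/hello.txt
bin/hdfs dfs -put /tmp/hello.txt hello.txt   # relative paths resolve under /user/hadoop
bin/hdfs dfs -cat hello.txt                  # should print: hello hdfs
bin/hdfs dfs -rm hello.txt                   # clean up before the next step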

 

 

9. Delete the local input and output folders

[[email protected] hadoop]# pwd

/home/hadoop/hadoop

[[email protected] hadoop]# ls

bin  include  lib      LICENSE-binary   LICENSE.txt  NOTICE-binary  output      sbin

etc  input    libexec  licenses-binary  logs         NOTICE.txt     README.txt  share

[[email protected] hadoop]# rm -fr input/ output/

[[email protected] hadoop]# ls

bin  include  libexec         licenses-binary  logs           NOTICE.txt  sbin

etc  lib      LICENSE-binary  LICENSE.txt      NOTICE-binary  README.txt  share

**At this point input and output no longer appear in the local directory: input now lives in the distributed file system (and the job in the next step will write output there too), which you can confirm in the web UI.**
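The same thing can be confirmed from the command line, which shows exactly what the web UI's file browser displays:

bin/hdfs dfs -ls /user/hadoop   # input is listed here, not in the local directory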

 

10. Run the example job and get the output directory from HDFS

[[email protected] hadoop]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.0.jar grep input output 'dfs[a-z.]+'
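The example job reads input from HDFS and writes its result back to HDFS under output. If you only want to inspect the result, you can read it in place instead of copying it down:

bin/hdfs dfs -cat output/*   # view the job output directly in HDFS, without -get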

[[email protected] hadoop]# ls

bin  etc  include  lib  libexec  LICENSE-binary  licenses-binary  LICENSE.txt  logs  NOTICE-binary  N

[[email protected] hadoop]# bin/hdfs dfs -get /user/hadoop/output

[[email protected] hadoop]# ls

bin  include  libexec         licenses-binary  logs           NOTICE.txt  README.txt  share

etc  lib      LICENSE-binary  LICENSE.txt      NOTICE-binary  output      sbin

[[email protected] hadoop]# cd output/

[[email protected] output]# ls

part-r-00000  _SUCCESS

[[email protected] output]# cat *

1	dfsadmin

The dfsadmin command is used to administer an HDFS cluster. Only the HDFS administrator can use these commands.
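For illustration, two commonly used dfsadmin subcommands:

bin/hdfs dfsadmin -safemode get     # is the NameNode in safe mode?
bin/hdfs dfsadmin -printTopology    # rack/DataNode layout of the cluster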


That's all for this article. Thanks for reading, and stay tuned for more!