Hadoop HDFS + Spark Configuration

Hadoop HDFS configuration (version 2.7)

All of the following files live under the etc/hadoop/ directory of the Hadoop installation.

hadoop-env.sh

export JAVA_HOME=/home/java/jdk1.8.0_45

hdfs-site.xml

<configuration>
<property>
    <name>dfs.nameservices</name>
    <value>guanjian</value>
</property>
<property>
    <name>dfs.ha.namenodes.guanjian</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.guanjian.nn1</name>
    <value>host1:8020</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.guanjian.nn2</name>
    <value>host2:8020</value>
</property>
<property>
    <name>dfs.namenode.http-address.guanjian.nn1</name>
    <value>host1:50070</value>
</property>
<property>
    <name>dfs.namenode.http-address.guanjian.nn2</name>
    <value>host2:50070</value>
</property>
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://host1:8485;host2:8485/guanjian</value>
</property>
<property>
    <name>dfs.client.failover.proxy.provider.guanjian</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_dsa</value>
</property>
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/jn/data</value>
</property>
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
</configuration>
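The HA properties above follow a naming convention: every NameNode id listed in dfs.ha.namenodes.&lt;nameservice&gt; must have matching rpc-address and http-address entries. A minimal sketch of reading that structure back out, using only Python's standard library (the helper function is illustrative, not part of Hadoop; the XML fragment is abridged from the config above):

```python
import xml.etree.ElementTree as ET

# Abridged fragment of the hdfs-site.xml above (naming-related keys only).
HDFS_SITE = """
<configuration>
  <property><name>dfs.nameservices</name><value>guanjian</value></property>
  <property><name>dfs.ha.namenodes.guanjian</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.guanjian.nn1</name><value>host1:8020</value></property>
  <property><name>dfs.namenode.rpc-address.guanjian.nn2</name><value>host2:8020</value></property>
  <property><name>dfs.namenode.http-address.guanjian.nn1</name><value>host1:50070</value></property>
  <property><name>dfs.namenode.http-address.guanjian.nn2</name><value>host2:50070</value></property>
</configuration>
"""

def ha_addresses(xml_text):
    """Map each NameNode id of the nameservice to its (rpc, http) address pair."""
    props = {p.findtext("name"): p.findtext("value")
             for p in ET.fromstring(xml_text).iter("property")}
    ns = props["dfs.nameservices"]
    nn_ids = props[f"dfs.ha.namenodes.{ns}"].split(",")
    return {nn: (props[f"dfs.namenode.rpc-address.{ns}.{nn}"],
                 props[f"dfs.namenode.http-address.{ns}.{nn}"])
            for nn in nn_ids}

print(ha_addresses(HDFS_SITE))
```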

core-site.xml

<configuration>
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://guanjian</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>192.168.5.129:2181</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop2</value>
</property>
</configuration>

slaves

host1
host2

In /etc/hosts, host1 and host2 are mapped as:

192.168.5.129 host1
192.168.5.182 host2

Manually create the two directories:

mkdir -p /opt/jn/data

mkdir /opt/hadoop2

Start the JournalNode, from the sbin directory:

./hadoop-daemon.sh start journalnode

Format the NameNode, from the bin directory (only one machine needs to be formatted here):

./hdfs namenode -format

Start the NameNode on the same machine, from sbin:

./hadoop-daemon.sh start namenode

On the machine that was not formatted, bootstrap it as the standby, from bin:

./hdfs namenode -bootstrapStandby

then start its NameNode, from sbin:

./hadoop-daemon.sh start namenode

Stop all of DFS, from sbin:

./stop-dfs.sh

Format the ZKFC (ZooKeeper Failover Controller), from bin:

./hdfs zkfc -formatZK

Enter ZooKeeper and check:

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hadoop-ha, guanjian]

We can see that a new hadoop-ha node has appeared.

Start all of HDFS in one go, from sbin:

./start-dfs.sh

Visit 192.168.5.182:50070 (active)

Visit 192.168.5.129:50070 (standby)
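Clients never hard-code which NameNode is active: the ConfiguredFailoverProxyProvider configured above makes the client try the configured NameNodes in order, and a standby's rejection triggers failover to the next one. A toy model of that behavior (this is an illustration in Python, not Hadoop's actual Java implementation):

```python
# Toy model of client-side failover: try each configured NameNode in
# order; a node in standby state rejects the call, so the client moves on.

class StandbyError(Exception):
    """Raised by a NameNode that is currently in standby state."""

def make_namenode(name, active):
    def handle(request):
        if not active:
            raise StandbyError(name)
        return f"{name} handled {request}"
    return handle

def failover_call(namenodes, request):
    last_err = None
    for nn in namenodes:        # nn1 first, then nn2, in configured order
        try:
            return nn(request)
        except StandbyError as e:
            last_err = e
    raise RuntimeError("no active NameNode") from last_err

nn1 = make_namenode("host1:8020", active=False)  # standby
nn2 = make_namenode("host2:8020", active=True)   # active
print(failover_call([nn1, nn2], "mkdir /usr/file"))
```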

Create a directory, from bin:

./hdfs dfs -mkdir -p /usr/file

Upload a file, from bin:

./hdfs dfs -put /home/soft/jdk-8u45-linux-x64.tar.gz /usr/file

Clicking jdk-XXX.tar.gz in the web UI shows that it occupies 2 blocks (one block is 128 MB).
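The block count is simple arithmetic: HDFS splits a file into fixed-size blocks (128 MB by default in Hadoop 2.x), and the last block holds the remainder. Assuming the JDK tarball is roughly 173 MB (an assumed size for illustration), it needs two blocks:

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024          # HDFS default block size in Hadoop 2.x

def block_count(file_size_bytes):
    """Number of HDFS blocks a file of the given size occupies."""
    return math.ceil(file_size_bytes / BLOCK_SIZE)

# jdk-8u45-linux-x64.tar.gz is assumed here to be about 173 MB:
print(block_count(173 * 1024 * 1024))   # one full 128 MB block + a 45 MB tail -> 2
```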

Spark configuration (version 2.2.0)

spark-env.sh

export JAVA_HOME=/home/java/jdk1.8.0_45
#export SPARK_MASTER_HOST=192.168.5.182
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=192.168.5.129:2181 -Dspark.deploy.zookeeper.dir=/spark"
export SPARK_MASTER_PORT=7077

slaves

host1
host2

Change the web UI port, in sbin:

start-master.sh

if [ "$SPARK_MASTER_WEBUI_PORT" = "" ]; then
  SPARK_MASTER_WEBUI_PORT=8091       # original port is 8080, which easily conflicts with other services
fi

Start on one of the machines, e.g. on host2, from sbin:

./start-all.sh

On the other machine, host1, start the master, from sbin:

./start-master.sh

host2:alive

host1:standby

Enter ZooKeeper and check; a new spark node has appeared:

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, spark, hadoop-ha, guanjian]
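With spark.deploy.recoveryMode=ZOOKEEPER, the two masters compete for leadership through that /spark node: the first to register becomes ALIVE, the other waits in STANDBY, and if the leader dies the standby takes over. A simplified simulation of that election order (an illustration only, not Spark's actual Curator-based implementation):

```python
# Simplified leader election: candidates queue up in registration order,
# the head of the queue is the leader, and a failure promotes the next one.

class Election:
    def __init__(self):
        self.candidates = []            # ordered as they registered

    def register(self, master):
        self.candidates.append(master)

    def leader(self):
        return self.candidates[0] if self.candidates else None

    def fail(self, master):             # leader dies -> next candidate takes over
        self.candidates.remove(master)

election = Election()
election.register("host2")              # started first via start-all.sh -> ALIVE
election.register("host1")              # started later via start-master.sh -> STANDBY
print(election.leader())                # host2

election.fail("host2")                  # kill the alive master
print(election.leader())                # host1 becomes the new leader
```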
