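These are the main steps I followed to set up hadoop + zookeeper + hbase + solr on three ECS nodes. The write-up has not been polished, so please read it alongside other blog posts and remember to adjust the parameters to your own environment.
JDK: 1.8 recommended
Disable the firewall
Open the required ports in the ECS security group
Set up passwordless SSH login among the three machines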
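The passwordless-login step can be sketched as a dry run that only prints the commands to execute on each node. The hostnames are the ones used later in this article; the `root` account and the RSA key type are assumptions, so adjust them before running anything for real.

```shell
# Dry run: print the key-distribution commands instead of executing them.
HOSTS="Gwj Ssj Pyf"   # hostnames used in this article
SSH_USER="root"       # assumption: the same account exists on every node

print_ssh_setup() {
  # Generate one key pair with an empty passphrase
  echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
  # Copy the public key to every node, including the local one
  for host in $HOSTS; do
    echo "ssh-copy-id ${SSH_USER}@${host}"
  done
}

print_ssh_setup
```

Review the printed commands, then run them by hand on each of the three machines.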
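IP mapping: 【question1】Hadoop fails at startup with java.net.BindException: Cannot assign requested address
This means the IP mapping is misconfigured. The correct approach is the "internal, external, external" pattern on every node: map the node's own hostname to its own private (intranet) IP, and map the other machines' hostnames to their public IPs.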
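As a minimal sketch of the "internal, external, external" rule, the function below prints the hosts entries one node would need, presumably for /etc/hosts. Every IP address here is a made-up placeholder, not a real ECS address; only the hostnames come from this article.

```shell
# name:private_ip:public_ip for each node (all IPs are placeholders)
NODES="Gwj:172.16.0.1:203.0.113.1 Ssj:172.16.0.2:203.0.113.2 Pyf:172.16.0.3:203.0.113.3"

# Print the hosts entries for the node named in $1:
# its own hostname gets the private IP, every other hostname gets the public IP.
hosts_for() (
  self="$1"
  for entry in $NODES; do
    name=${entry%%:*}
    rest=${entry#*:}
    private=${rest%%:*}
    public=${rest#*:}
    if [ "$name" = "$self" ]; then
      printf '%s %s\n' "$private" "$name"
    else
      printf '%s %s\n' "$public" "$name"
    fi
  done
)

hosts_for Gwj
```

Running `hosts_for Ssj` or `hosts_for Pyf` gives the entries for the other two nodes, so each machine ends up with a different file.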
The files below must be modified on every node.
1. vi /etc/profile
# Hadoop install directory: /opt/hadoop/hadoop-2.7.7
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.7
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=.:${JAVA_HOME}/bin:${HADOOP_HOME}/bin:$PATH
# apply the environment variables
source /etc/profile
# verify
hadoop version
2. vi /opt/hadoop/hadoop-2.7.7/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Gwj:8020</value>
    <description>Host and port of the default file system</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
    <description>Buffer size for stream files: 4 KB</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/opt/hadoop/hadoop-2.7.7/tempdata</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
3. vi /opt/hadoop/hadoop-2.7.7/etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop/hadoop-2.7.7/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop/hadoop-2.7.7/dfs/data</value>
  </property>
  <!-- Added later: to store solr indexes in HDFS, the following two properties are also needed -->
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <!-- 【question2】By default the SecondaryNameNode runs on the same node as the NameNode, which is a risk in production. Fix: add the settings below to pin the NameNode and the SecondaryNameNode to specific nodes. Note that dfs.http.address is the older, deprecated name of dfs.namenode.http-address. -->
  <property>
    <name>dfs.http.address</name>
    <value>Gwj:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>Ssj:50090</value>
  </property>
</configuration>
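4. vi /opt/hadoop/hadoop-2.7.7/etc/hadoop/mapred-site.xml
<configuration>
  <!-- "local" runs MapReduce jobs in a single local JVM; set it to "yarn" to run them on the YARN cluster -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>local</value>
  </property>
  <!-- MapReduce JobHistory server address -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>0.0.0.0:10020</value>
  </property>
  <!-- Web address of the JobHistory server -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
  </property>
</configuration>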
5. vi /opt/hadoop/hadoop-2.7.7/etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Gwj</value>
    <description>Hostname of the node running the ResourceManager</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>Auxiliary service run on each NodeManager. It must be set to mapreduce_shuffle for MapReduce programs to run.</description>
  </property>
</configuration>
Older versions use the slaves file; 3.0.3 replaces the slaves file with a workers file.
Remove localhost and add the hostnames of the DataNode nodes:
[root@Gwj ~]# cat /opt/hadoop/hadoop-2.7.7/etc/hadoop/slaves
Ssj
Pyf
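If the same notes are reused across Hadoop versions, the file name can be chosen from the version string. This is only an illustrative helper (the function name is hypothetical); it encodes the rule stated above, that 2.x releases read slaves and 3.x releases read workers.

```shell
# Print the name of the worker-list file for a given Hadoop version string.
worker_list_file() {
  case "$1" in
    [12].*) echo "slaves" ;;   # Hadoop 1.x and 2.x
    *)      echo "workers" ;;  # Hadoop 3.x and later
  esac
}

worker_list_file 2.7.7
worker_list_file 3.0.3
```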
hdfs namenode -format
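Start everything:
/.../hadoop-2.7.7/sbin/start-all.sh
HDFS only:
/.../hadoop-2.7.7/sbin/start-dfs.sh
YARN only:
/.../hadoop-2.7.7/sbin/start-yarn.sh
# Replace "start" with "stop" to shut the services down. Hadoop 2.7.7 ships no status-* scripts; use jps to check status.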
Check with jps:
HDFS
  Master --- NameNode (SecondaryNameNode)
  Slave --- DataNode
YARN
  Master --- ResourceManager
  Slave --- NodeManager
Or open "Master IP:50070" in a browser to reach the NameNode web UI.
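The jps check can be scripted roughly as follows. The helper name and the sample jps output are made up for illustration; the expected daemon names per role are the ones listed above.

```shell
# Print the daemons that should be running for a role but are absent from jps output.
# $1 = role (master or slave), $2 = text in jps format ("<pid> <class name>" per line).
missing_daemons() (
  role="$1"
  jps_output="$2"
  case "$role" in
    master) expected="NameNode ResourceManager" ;;
    slave)  expected="DataNode NodeManager" ;;
    *) echo "unknown role: $role" >&2; exit 2 ;;
  esac
  for d in $expected; do
    # Compare the second column exactly, so SecondaryNameNode does not count as NameNode
    printf '%s\n' "$jps_output" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' || echo "$d"
  done
)

# Made-up sample output in which the ResourceManager is not running
sample="2101 NameNode
2388 Jps"
missing_daemons master "$sample"
```

On a real node the call would be `missing_daemons master "$(jps)"`; empty output means every expected daemon was found.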
--- The YARN properties below were not applied in this setup. If you use them, mind the <configuration> tag!!!
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
  </property>
  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>${yarn.resourcemanager.hostname}:8030</value>
  </property>
  <property>
    <description>The http address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>${yarn.resourcemanager.hostname}:8088</value>
  </property>
  <property>
    <description>The https address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.https.address</name>
    <value>${yarn.resourcemanager.hostname}:8090</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>${yarn.resourcemanager.hostname}:8031</value>
  </property>
  <property>
    <description>The address of the RM admin interface.</description>
    <name>yarn.resourcemanager.admin.address</name>
    <value>${yarn.resourcemanager.hostname}:8033</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
    <description>Maximum memory a single container request can be granted, in MB (default 8192). Set to 2048 MB to match the Alibaba Cloud ECS instance size.</description>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
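To see how the memory values above relate, the sketch below multiplies a container's physical memory by yarn.nodemanager.vmem-pmem-ratio to get its virtual memory ceiling. The helper name is hypothetical. Since yarn.nodemanager.vmem-check-enabled is false in this configuration, the ceiling would not actually be enforced; the number is only for orientation.

```shell
# Virtual memory ceiling (MB) = container physical memory (MB) x vmem-pmem ratio
vmem_limit_mb() {
  awk -v mem="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", mem * ratio }'
}

# Largest container allowed above (2048 MB) with the ratio 2.1
vmem_limit_mb 2048 2.1   # prints 4300.8
```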