I recently had another chance to work with Hadoop, so I'm sharing a colleague's setup notes.
1、Master server setup
1. Create the hadoop user
#useradd hadoop
2. Set its password
#passwd hadoop
3. Switch to the hadoop user
#su - hadoop
4. Unpack the Hadoop tarball
#tar zxvf hadoop-1.0.3.tar.gz
5. Set directory ownership
#chown -R hadoop.hadoop hadoop-1.0.3
6. Edit the Hadoop environment script
#vim hadoop-1.0.3/conf/hadoop-env.sh
Set the JAVA_HOME path:
export JAVA_HOME=/usr/local/jdk1.7.0_05
Add one line (suppresses the "$HADOOP_HOME is deprecated" warning):
export HADOOP_HOME_WARN_SUPPRESS=1
7. Edit the system environment variables (as root)
#vim /etc/profile
Append the following (note: $ANT_HOME is referenced but never set here; it expands to nothing and is harmless):
export JAVA_HOME=/usr/local/jdk1.7.0_05
export PATH=$JAVA_HOME/bin:$ANT_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/home/hadoop/hadoop-1.0.3
export PATH=$PATH:$HADOOP_HOME/bin
8. Apply the environment settings
#source /etc/profile
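A quick sanity check that the settings took effect (paths are the ones exported above):
#java -version
#echo $HADOOP_HOME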
9. Set the master IP
#vim hadoop-1.0.3/conf/masters
Replace localhost with 192.168.1.247 (note: in Hadoop 1.x, conf/masters actually names the host that runs the SecondaryNameNode)
10. Set the slave IP
#vim hadoop-1.0.3/conf/slaves
Replace localhost with 192.168.1.248
11. Configure HDFS
#vim hadoop-1.0.3/conf/hdfs-site.xml
Add the following:
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/hadoop-1.0.3/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/hadoop-1.0.3/data</value>
</property>
<property>
<name>dfs.replication</name>
<!-- 3 is the stock default, but this cluster has a single datanode; 1 would match the topology -->
<value>3</value>
</property>
</configuration>
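The name/data directories above (and the tmp/var ones configured later) can be pre-created as the hadoop user on each node to avoid permission surprises; this is optional, since Hadoop creates them on format/first start:
#mkdir -p /home/hadoop/hadoop-1.0.3/{name,data,tmp,var}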
12. Configure MapReduce
#vim hadoop-1.0.3/conf/mapred-site.xml
Add the following:
<configuration>
<property>
<name>mapred.job.tracker</name>
<!-- value missing in the original notes; the master host with the conventional 9001 port is assumed -->
<value>192.168.1.247:9001</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/home/hadoop/hadoop-1.0.3/var</value>
</property>
</configuration>
13. Configure core-site.xml
#vim hadoop-1.0.3/conf/core-site.xml
<configuration>
<property>
<!-- this property was left empty in the original notes; fs.default.name pointing at the master NameNode on the conventional 9000 port is assumed -->
<name>fs.default.name</name>
<value>hdfs://192.168.1.247:9000</value>
</property>
<property>
<name>fs.checkpoint.period</name>
<value>3600</value>
</property>
<property>
<name>fs.checkpoint.size</name>
<!-- value missing in the original notes; the Hadoop 1.x default (64 MB) is assumed -->
<value>67108864</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-1.0.3/tmp</value>
</property>
</configuration>
14. Set up passwordless master-to-slave login (one direction)
Generate a key pair:
#ssh-keygen -t rsa
Allow SSH to the local machine:
#cat .ssh/id_rsa.pub >> .ssh/authorized_keys
Set the file permissions:
#chmod 700 .ssh/authorized_keys
Test it; if no password is requested, it worked:
#ssh localhost
#exit
2、Slave server setup
1. Create the hadoop user
#useradd hadoop
2. Set its password
#passwd hadoop
3. Edit the system environment variables (as root)
#vim /etc/profile
Append the following:
export JAVA_HOME=/usr/local/jdk1.7.0_05
export PATH=$JAVA_HOME/bin:$ANT_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/home/hadoop/hadoop-1.0.3
export PATH=$PATH:$HADOOP_HOME/bin
4. Apply the environment settings
#source /etc/profile
5. Unpack the Hadoop tarball
#tar zxvf hadoop-1.0.3.tar.gz
6. Set directory ownership
#chown -R hadoop.hadoop hadoop-1.0.3
3、Back on the master server
1. Switch to the hadoop user
#su - hadoop
2. Copy the public key to the slave server
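The original notes omit the command; assuming password login is still enabled on the slave, ssh-copy-id does it in one step:
#ssh-copy-id hadoop@192.168.1.248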
3. Test it
#ssh 192.168.1.248
If everything is configured correctly, you can log in without a password.
4. Push the configuration files to the slave server
#scp -r hadoop-1.0.3/conf hadoop@192.168.1.248:/home/hadoop/hadoop-1.0.3
5. Format the distributed filesystem
#hadoop-1.0.3/bin/hadoop namenode -format
6. Start the Hadoop services
#hadoop-1.0.3/bin/start-dfs.sh
#hadoop-1.0.3/bin/start-mapred.sh
7. Check that everything is running
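The original notes don't give a command here; jps (ships with the JDK) and the HDFS report are the usual checks, and Hadoop 1.x also serves web UIs on the standard ports:
#jps
#hadoop-1.0.3/bin/hadoop dfsadmin -report
NameNode UI: http://192.168.1.247:50070 ; JobTracker UI: http://192.168.1.247:50030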
4、Install Hive (on the master server)
1. Unpack the archive and move it under the Hadoop directory (as the hadoop user)
#tar zxvf hive-0.9.0.tar.gz
#mv hive-0.9.0 hadoop-1.0.3
2. Configure the Hive environment
#cp hadoop-1.0.3/hive-0.9.0/conf/hive-env.sh.template hadoop-1.0.3/hive-0.9.0/conf/hive-env.sh
#vim hadoop-1.0.3/hive-0.9.0/conf/hive-env.sh
Add one line (it picks up the value exported in /etc/profile):
HADOOP_HOME=$HADOOP_HOME
3. Store the Hive metastore in MySQL
Create the database and a user for it; the database uses the latin1 character set:
mysql>CREATE DATABASE hive CHARACTER SET latin1;
mysql>GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hivepasswd';
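A quick check that the new account works (connecting over TCP so the '%' grant matches):
#mysql -h127.0.0.1 -uhive -phivepasswd -e 'show databases;'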
Edit the configuration file:
#cp hadoop-1.0.3/hive-0.9.0/conf/hive-default.xml.template hadoop-1.0.3/hive-0.9.0/conf/hive-site.xml
#vim hadoop-1.0.3/hive-0.9.0/conf/hive-site.xml
Modify four places:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<!-- value missing in the original notes; a local MySQL on the default 3306 port is assumed -->
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hivepasswd</value>
<description>password to use against metastore database</description>
</property>
4. Copy the MySQL connector JAR into Hive's lib directory
#cp mysql-connector-java-5.1.11-bin.jar hadoop-1.0.3/hive-0.9.0/lib
5. Start Hive
#hadoop-1.0.3/hive-0.9.0/bin/hive
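Once the CLI comes up, a quick smoke test (the table name is illustrative):
hive> show tables;
hive> create table t1 (id int);
hive> drop table t1;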
6. Silence the log4j deprecation warning
#cp hadoop-1.0.3/hive-0.9.0/conf/hive-log4j.properties.template hadoop-1.0.3/hive-0.9.0/conf/hive-log4j.properties
#vim hadoop-1.0.3/hive-0.9.0/conf/hive-log4j.properties
Find log4j.appender.EventCounter and change its value to org.apache.hadoop.log.metrics.EventCounter
5、Install the PHP plugin for accessing the Hive database
1. Unpack the Thrift tarball
#tar zxvf thrift-0.8.0.tar.gz
2. Configure without Ruby support, otherwise the build errors out
#cd thrift-0.8.0
#./configure --without-ruby
#make && make install
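Confirm the compiler installed:
#thrift -version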
3. Start Hive as a background service
#hadoop-1.0.3/hive-0.9.0/bin/hive --service hiveserver>/dev/null 2>&1 &
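HiveServer listens on port 10000 by default; verify it is accepting connections:
#netstat -nlt | grep 10000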
4. Place the Thrift PHP library under /home/hadoop; the package can be downloaded from the web
5. Write the PHP program
<?php
// Path to the Thrift PHP library the Hive client depends on
$GLOBALS['THRIFT_ROOT'] = '/home/hadoop/Thrift/';
// Load the required files for connecting to Hive
require_once $GLOBALS['THRIFT_ROOT'] . 'packages/hive_service/ThriftHive.php';
require_once $GLOBALS['THRIFT_ROOT'] . 'transport/TSocket.php';
require_once $GLOBALS['THRIFT_ROOT'] . 'protocol/TBinaryProtocol.php';
// Set up the transport/protocol/client
$transport = new TSocket('192.168.1.247', 10000);
$protocol = new TBinaryProtocol($transport);
$client = new ThriftHiveClient($protocol);
$transport->open();
// Run queries, metadata calls, etc.
$client->execute('show tables');
var_dump($client->fetchAll());
$transport->close();
?>
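Save the script (e.g. as hive_test.php; the filename is illustrative) and run it from the command line; it should dump the table list as an array:
#php hive_test.php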