1、Environment setup
1. Hadoop
http://my.oschina.net/u/204498/blog/519789
2. Sqoop 2.x
http://my.oschina.net/u/204498/blog/518941
3. MySQL
2、Importing from MySQL into HDFS
1. Create the MySQL database, table, and test data
xxxxxxxx$ mysql -uroot -p
Enter password:
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

test => the newly created database

mysql> use test;
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| test           |
+----------------+
1 row in set (0.00 sec)

test => the newly created table

mysql> desc test;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| name  | varchar(45) | YES  |     | NULL    |                |
| age   | int(11)     | YES  |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

mysql> select * from test;
+----+------+------+
| id | name | age  |
+----+------+------+
|  7 | a    |    1 |
|  8 | b    |    2 |
|  9 | c    |    3 |
+----+------+------+
3 rows in set (0.00 sec)
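For reference, DDL along these lines would produce the table shown above (a minimal sketch reconstructed from the desc and select output; the ids 7-9 in the sample just reflect earlier inserts against the auto_increment counter, a fresh table would start at 1):

mysql> create database if not exists test;
mysql> use test;
mysql> create table test (
    ->   id   int(11)     not null auto_increment,
    ->   name varchar(45) default null,
    ->   age  int(11)     default null,
    ->   primary key (id)
    -> );
mysql> insert into test (name, age) values ('a', 1), ('b', 2), ('c', 3);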
2. Grant database privileges to the relevant users
Note: after Sqoop submits a job, every node will access the database during the map phase, so the privileges must be granted in advance.
mysql> grant [all | select | ...] on {db}.{table} to {user}@{host} identified by {passwd};
mysql> flush privileges;

# grant for a specific hostname: username root, password root, all privileges on every table in db test
mysql> grant all on test.* to 'root'@{host} identified by 'root';
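Before submitting a job it is worth confirming the grant actually works from outside the MySQL host. A quick check (the hostnames below are placeholders, not from this cluster):

mysql> show grants for 'root'@'{host}';

# from one of the Hadoop nodes, with the same credentials the link will use:
$ mysql -h <mysql-host> -uroot -p -e "select count(*) from test.test;"

If this fails with "Access denied", the map tasks will fail the same way.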
3. Start sqoop2-server
[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ pwd
/home/hadoop/sqoop-1.99.6-bin-hadoop200
[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ ./bin/sqoop2-server start
...
...
You can verify it via the web UI, and also check the logs.
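Besides the web UI, the server can be probed over its REST interface. A quick check, assuming the default Sqoop2 port 12000 and webapp path sqoop:

[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ curl http://localhost:12000/sqoop/version

A JSON response containing the build version means the server is up; no response means it is time to read the server log.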
4. Start sqoop2-shell
[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ pwd
/home/hadoop/sqoop-1.99.6-bin-hadoop200
[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ ./bin/sqoop2-shell
...
...
sqoop:000> show version
...
...
sqoop:000> show connector
+----+------------------------+---------+------------------------------------------------------+----------------------+
| Id | Name                   | Version | Class                                                | Supported Directions |
+----+------------------------+---------+------------------------------------------------------+----------------------+
| 1  | generic-jdbc-connector | 1.99.6  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO              |
| 2  | kite-connector         | 1.99.6  | org.apache.sqoop.connector.kite.KiteConnector        | FROM/TO              |
| 3  | hdfs-connector         | 1.99.6  | org.apache.sqoop.connector.hdfs.HdfsConnector        | FROM/TO              |
| 4  | kafka-connector        | 1.99.6  | org.apache.sqoop.connector.kafka.KafkaConnector      | TO                   |
+----+------------------------+---------+------------------------------------------------------+----------------------+

Create links for the connectors you need:

sqoop:000> create link -c 1    => create the JDBC link first; you are prompted for name, jdbc-driver, url, username, passwd, etc.
sqoop:000> create link -c 3    => create the HDFS link; you are prompted for name, HDFS url, etc.

sqoop:000> show link
+----+-------------+--------------+------------------------+---------+
| Id | Name        | Connector Id | Connector Name         | Enabled |
+----+-------------+--------------+------------------------+---------+
| 3  | 10-21_jdbc1 | 1            | generic-jdbc-connector | true    |
| 4  | 10-21_hdfs1 | 3            | hdfs-connector         | true    |
+----+-------------+--------------+------------------------+---------+

Create the job: -f => from, -t => to, i.e. what to import from and where to write it.

sqoop:000> create job -f 3 -t 4
You are prompted for the corresponding table details and the HDFS details.

sqoop:000> show job
+----+---------------+----------------+--------------+---------+
| Id | Name          | From Connector | To Connector | Enabled |
+----+---------------+----------------+--------------+---------+
| 1  | 10-20_sqoopy2 | 1              | 3            | true    |
+----+---------------+----------------+--------------+---------+

# start the job
sqoop:000> start job -j 2
...
...
You can follow the progress in the web UI, or query it from the shell:

sqoop:000> status job -j 2
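If the shell runs on a different machine than the server, point it at the server before creating links; verbose mode also makes the interactive prompts and errors easier to read. These are standard sqoop2-shell commands, with the host as a placeholder:

sqoop:000> set server --host <sqoop2-server-host> --port 12000 --webapp sqoop
sqoop:000> set option --name verbose --value true
sqoop:000> show server --all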
See also: the Sqoop guide.
5. Troubleshooting
Read the logs and work through problems patiently.
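Concretely, there are two places to look: the Sqoop2 server log for link/job submission errors, and the YARN container logs for map-phase failures such as the database-permission issue from step 2 above. Paths depend on your configuration, so treat these as examples:

# server log; the log directory is set in conf/sqoop.properties
[hadoop@hftclclw0001 sqoop-1.99.6-bin-hadoop200]$ tail -f <log-dir>/sqoop.log

# map-phase failures: pull the logs of the failed YARN application
[hadoop@hftclclw0001 ~]$ yarn logs -applicationId <application_id>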