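Sqoop is a tool for transferring data between Hadoop and relational databases. It can import data from a relational database into Hadoop's HDFS, and it can export data from HDFS back into a relational database.
Kafka is an open-source distributed publish-subscribe messaging system.
I. Installing Sqoop
1. Download sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz from http://www-eu.apache.org/dist/sqoop/1.4.7/ and extract it under /home/jun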
[jun@master sqoop-1.4.7.bin__hadoop-2.6.0]$ ls -l
total 2020
drwxr-xr-x. 2 jun jun    4096 Dec 19  2017 bin
-rw-rw-r--. 1 jun jun   55089 Dec 19  2017 build.xml
-rw-rw-r--. 1 jun jun   47426 Dec 19  2017 CHANGELOG.txt
-rw-rw-r--. 1 jun jun    9880 Dec 19  2017 COMPILING.txt
drwxr-xr-x. 2 jun jun     150 Dec 19  2017 conf
drwxr-xr-x. 5 jun jun     169 Dec 19  2017 docs
drwxr-xr-x. 2 jun jun      96 Dec 19  2017 ivy
-rw-rw-r--. 1 jun jun   11163 Dec 19  2017 ivy.xml
drwxr-xr-x. 2 jun jun    4096 Dec 19  2017 lib
-rw-rw-r--. 1 jun jun   15419 Dec 19  2017 LICENSE.txt
-rw-rw-r--. 1 jun jun     505 Dec 19  2017 NOTICE.txt
-rw-rw-r--. 1 jun jun   18772 Dec 19  2017 pom-old.xml
-rw-rw-r--. 1 jun jun    1096 Dec 19  2017 README.txt
-rw-rw-r--. 1 jun jun 1108073 Dec 19  2017 sqoop-1.4.7.jar
-rw-rw-r--. 1 jun jun    6554 Dec 19  2017 sqoop-patch-review.py
-rw-rw-r--. 1 jun jun  765184 Dec 19  2017 sqoop-test-1.4.7.jar
drwxr-xr-x. 7 jun jun      73 Dec 19  2017 src
drwxr-xr-x. 4 jun jun     114 Dec 19  2017 testdata
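The download and extraction in step 1 could be scripted roughly as follows. This is a minimal sketch, assuming wget and tar are installed; the variable names and the fetch_sqoop function are illustrative and not part of Sqoop. The mirror URL is the one quoted above, so adjust SQOOP_MIRROR if that mirror no longer carries the release.

```shell
# Sketch only: variable and function names are illustrative.
SQOOP_VERSION="1.4.7"
SQOOP_DIST="sqoop-${SQOOP_VERSION}.bin__hadoop-2.6.0"
SQOOP_MIRROR="http://www-eu.apache.org/dist/sqoop/${SQOOP_VERSION}"
INSTALL_PARENT="/home/jun"

fetch_sqoop() {
    # Download the tarball into the install parent, then unpack it there.
    wget -P "$INSTALL_PARENT" "${SQOOP_MIRROR}/${SQOOP_DIST}.tar.gz" &&
        tar -xzf "${INSTALL_PARENT}/${SQOOP_DIST}.tar.gz" -C "$INSTALL_PARENT"
}

# Print what would be fetched; call fetch_sqoop to actually download.
echo "${SQOOP_MIRROR}/${SQOOP_DIST}.tar.gz"
```

Running fetch_sqoop afterwards should leave the directory listed above at /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0.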
2. Configure the MySQL connector
[jun@master sqoop-1.4.7.bin__hadoop-2.6.0]$ cp /home/jun/Resources/mysql-connector-java-5.1.46/mysql-connector-java-5.1.46.jar /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/lib/
3. Configure the Sqoop environment variables
Edit the configuration file
[jun@master lib]$ cd /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/conf/
[jun@master conf]$ ls
oraoop-site-template.xml  sqoop-env-template.cmd  sqoop-env-template.sh  sqoop-site-template.xml  sqoop-site.xml
[jun@master conf]$ cp sqoop-env-template.sh sqoop-env.sh
[jun@master conf]$ gedit sqoop-env.sh
Add the following configuration
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/jun/hadoop
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/jun/hadoop
#set the path to where bin/hbase is available
export HBASE_HOME=/home/jun/hbase-1.2.6.1
#Set the path to where bin/hive is available
export HIVE_HOME=/home/jun/apache-hive-2.3.3-bin
#Set the path for where zookeper config dir is
export ZOOCFGDIR=/usr/local/zk
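Since every export in sqoop-env.sh points at a directory on this particular machine, it can help to confirm that each path exists before going further. The helper below is a minimal sketch; check_env_dirs is a hypothetical name, not a Sqoop command, and it only understands plain `export NAME=/path` lines like the ones above.

```shell
# Sketch only: check_env_dirs is an illustrative helper, not part of Sqoop.
check_env_dirs() {
    env_file="$1"
    missing=0
    # Read each "export NAME=/path" line and test whether the path exists.
    while IFS='=' read -r name path; do
        [ -n "$name" ] || continue
        name="${name#export }"
        if [ -d "$path" ]; then
            echo "ok       $name=$path"
        else
            echo "missing  $name=$path"
            missing=1
        fi
    done <<EOF
$(grep -E '^export [A-Z_]+=' "$env_file")
EOF
    return "$missing"
}
```

For the setup described here it would be called as `check_env_dirs /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/conf/sqoop-env.sh`; a non-zero exit status means at least one configured directory was not found.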
4. Configure the Linux environment variables
#sqoop
export SQOOP_HOME=/home/jun/sqoop-1.4.7.bin__hadoop-2.6.0
export PATH=$PATH:$SQOOP_HOME/bin
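The text does not say which file these lines go into. As a minimal sketch, assuming a per-user profile such as ~/.bashrc, the hypothetical helper below appends them only once, so rerunning it does not duplicate the PATH entry.

```shell
# Sketch only: add_sqoop_to_profile is an illustrative helper; the profile
# file is an assumption, since the text does not name one.
add_sqoop_to_profile() {
    profile="$1"
    sqoop_home="$2"
    # Skip the append if SQOOP_HOME is already exported in this file.
    if grep -q '^export SQOOP_HOME=' "$profile" 2>/dev/null; then
        return 0
    fi
    {
        echo '#sqoop'
        echo "export SQOOP_HOME=${sqoop_home}"
        # Single quotes keep $PATH and $SQOOP_HOME unexpanded in the file.
        echo 'export PATH=$PATH:$SQOOP_HOME/bin'
    } >> "$profile"
}
```

A typical call would be `add_sqoop_to_profile ~/.bashrc /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0`, followed by `source ~/.bashrc` so the current shell picks up the change.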
5. Start Sqoop. If output like the following appears, the installation succeeded
[jun@master ~]$ sqoop-help
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/23 15:56:36 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jun/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jun/hbase-1.2.6.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.
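The useful line in this output is the one reporting the running version. The warnings concern HCatalog and Accumulo, which this walkthrough does not use. As a minimal sketch, the hypothetical helper below pulls the version out of the console output so the check can be scripted; it relies only on the "Running Sqoop version:" text visible in the log above.

```shell
# Sketch only: sqoop_version_from_log is an illustrative helper.
sqoop_version_from_log() {
    # Read Sqoop's console output on stdin and print the reported version.
    sed -n 's/.*Running Sqoop version: \([0-9][0-9.]*\).*/\1/p' | head -n 1
}
```

It would be used as `sqoop-help 2>&1 | sqoop_version_from_log`; an empty result suggests that Sqoop did not start.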
6. Test the connection to MySQL
(1) List all MySQL databases
[jun@master ~]$ sqoop-list-databases --connect jdbc:mysql://localhost:3306 --username root -P
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/23 16:03:01 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jun/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jun/hbase-1.2.6.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Enter password:
18/07/23 16:03:05 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Mon Jul 23 16:03:05 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
information_schema
hive_db
mysql
performance_schema
sys
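The SSL warning in this output names its own remedy: set useSSL explicitly in the connection string. As a minimal sketch, the hypothetical helper below builds a JDBC URL with useSSL=false appended. That is reasonable for a local test database like this one, but it is an assumption about your environment and not something to carry over to a connection that crosses an untrusted network.

```shell
# Sketch only: mysql_jdbc_url is an illustrative helper.
mysql_jdbc_url() {
    host="$1"; port="$2"; db="$3"
    # useSSL=false is the option named in the Connector/J warning; only
    # disable SSL where an unencrypted connection is acceptable.
    echo "jdbc:mysql://${host}:${port}/${db}?useSSL=false"
}

# An empty database name gives a server-level URL.
mysql_jdbc_url localhost 3306 ""
```

The result would be passed in quotes, because `?` is a shell glob character: `sqoop-list-databases --connect "$(mysql_jdbc_url localhost 3306 "")" --username root -P`.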
(2) List all tables in a database
[jun@master ~]$ sqoop-list-tables --connect jdbc:mysql://localhost:3306/mysql --username root -P
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/23 16:06:06 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jun/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jun/hbase-1.2.6.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Enter password:
18/07/23 16:06:09 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Mon Jul 23 16:06:09 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
columns_priv
db
engine_cost
event
func
general_log
gtid_executed
help_category
help_keyword
help_relation
help_topic
innodb_index_stats
innodb_table_stats
ndb_binlog_index
plugin
proc
procs_priv
proxies_priv
server_cost
servers
slave_master_info
slave_relay_log_info
slave_worker_info
slow_log
tables_priv
time_zone
time_zone_leap_second
time_zone_name
time_zone_transition
time_zone_transition_type
user
(3) Execute a MySQL query
[jun@master ~]$ sqoop-eval --connect jdbc:mysql://localhost:3306/mysql --username root -P --query "select * from plugin"
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/jun/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/23 16:09:33 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jun/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jun/hbase-1.2.6.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Enter password:
18/07/23 16:09:36 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Mon Jul 23 16:09:37 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
-----------------------------------------------
| name                 | dl                   |
-----------------------------------------------
| validate_password    | validate_password.so |
-----------------------------------------------
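All three tests above prompt for the password with -P, which makes them awkward to run from a script. Sqoop also accepts a --password-file option. The helper below is a minimal sketch of preparing such a file; write_password_file and the file location are hypothetical. The file is written without a trailing newline because Sqoop uses the entire file content as the password.

```shell
# Sketch only: write_password_file is an illustrative helper.
write_password_file() {
    file="$1"
    password="$2"
    # printf '%s' writes no trailing newline; Sqoop would otherwise treat
    # the newline as part of the password.
    ( umask 077 && printf '%s' "$password" > "$file" ) && chmod 400 "$file"
}
```

After something like `write_password_file /home/jun/.mysql.pw 'your-password'`, the query above could be run as `sqoop-eval --connect jdbc:mysql://localhost:3306/mysql --username root --password-file file:///home/jun/.mysql.pw --query "select * from plugin"`. The file:// prefix assumes the file is on the local filesystem and not in HDFS. Passing the password as a command argument also leaves it in the shell history, so clear it afterwards or write the file by another means.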