Mac OS X or Linux (tested on CentOS 7.2)
Java 8 Update 92 or higher (8u92+), 64-bit (tested with 1.8.0_121, 64-bit)
Presto version: 0.172
Hadoop version: Apache Hadoop 2.6.4
Hive version: Apache Hive 1.2.1
MongoDB version: mongodb-linux-x86_64-rhel70-3.4.2
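The Java requirement can be checked before installing. The following is a minimal sketch, assuming the version string has the usual 1.8.0_NNN form printed by java -version; the function name java8_update_ok is made up for this example, and newer major Java versions are simply rejected by it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a version string such as "1.8.0_121"
# denotes Java 8 update 92 or later.
java8_update_ok() {
    case "$1" in
        1.8.0_*) update="${1#1.8.0_}" ;;
        *) return 1 ;;
    esac
    # Drop any non-numeric suffix (for example "-b13").
    update="${update%%[!0-9]*}"
    [ -n "$update" ] && [ "$update" -ge 92 ]
}

# Usage: java -version prints to stderr; the first line carries the version in quotes.
if command -v java >/dev/null 2>&1; then
    installed=$(java -version 2>&1 | sed -n '1s/.*version "\([^"]*\)".*/\1/p')
    if java8_update_ok "$installed"; then
        echo "Java $installed satisfies 8u92+"
    else
        echo "Java $installed does not satisfy 8u92+" >&2
    fi
fi
```

The check covers only the update number; whether the JVM is 64-bit still has to be confirmed separately.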
Download the installation package to /opt/beh/core, extract it, and create a symbolic link:
cd /opt/beh/core
tar zxf presto-server-0.172.tar.gz
ln -s presto-server-0.172 presto
cd presto
Create the configuration directory and the related configuration files:
cd /opt/beh/core/presto
mkdir data
mkdir etc
cd etc
touch config.properties
touch jvm.config
touch node.properties
touch log.properties
Notes:
Config Properties: configuration for the Presto server
JVM Config: command line options for the Java Virtual Machine
Node Properties: environmental configuration specific to each node
Catalog Properties: configuration for Connectors (data sources)
The data directory created above corresponds to the node.data-dir parameter in Node Properties.
config.properties:
coordinator=true
discovery-server.enabled=true
discovery.uri=http://master:8080
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=60GB
query.max-memory-per-node=20GB
Notes:
These properties require some explanation:
coordinator: Allow this Presto instance to function as a coordinator (accept queries from clients and manage query execution).
node-scheduler.include-coordinator: Allow scheduling work on the coordinator. For larger clusters, processing work on the coordinator can impact query performance because the machine’s resources are not available for the critical task of scheduling, managing and monitoring query execution.
http-server.http.port: Specifies the port for the HTTP server. Presto uses HTTP for all communication, internal and external.
query.max-memory: The maximum amount of distributed memory that a query may use.
query.max-memory-per-node: The maximum amount of memory that a query may use on any one machine.
discovery-server.enabled: Presto uses the Discovery service to find all the nodes in the cluster. Every Presto instance will register itself with the Discovery service on startup. In order to simplify deployment and avoid running an additional service, the Presto coordinator can run an embedded version of the Discovery service. It shares the HTTP server with Presto and thus uses the same port.
discovery.uri: The URI to the Discovery server. Because we have enabled the embedded version of Discovery in the Presto coordinator, this should be the URI of the Presto coordinator. Replace master:8080 to match the host and port of the Presto coordinator. This URI must not end in a slash.
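The settings above describe a single node acting as both coordinator and worker. For additional worker-only nodes, a config.properties along the following lines is typical. This is a sketch, assuming the coordinator is reachable as master:8080 as in the example above; the ETC_DIR variable is hypothetical and should point at the worker's etc directory.

```shell
#!/bin/sh
# Sketch: write a worker-only config.properties.
# ETC_DIR is a hypothetical variable; it defaults to a scratch directory here.
ETC_DIR="${ETC_DIR:-$(mktemp -d)}"

# A worker does not act as coordinator and does not run the embedded
# Discovery service; it only points at the coordinator's discovery URI.
cat > "$ETC_DIR/config.properties" <<'EOF'
coordinator=false
http-server.http.port=8080
query.max-memory=60GB
query.max-memory-per-node=20GB
discovery.uri=http://master:8080
EOF

# The discovery URI must not end in a slash; fail loudly if it does.
if grep -q '^discovery\.uri=.*/$' "$ETC_DIR/config.properties"; then
    echo "discovery.uri must not end in a slash" >&2
    exit 1
fi
echo "wrote $ETC_DIR/config.properties"
```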
jvm.config:
-server
-Xmx40G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
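The heap size in jvm.config and the memory limits in config.properties are related: a query cannot use more memory on one node than the JVM heap provides. A rough consistency check, assuming both values are written in whole gigabytes (as NG and NGB) and using the hypothetical helper names gigabytes and per_node_fits_heap, might look like this:

```shell
#!/bin/sh
# Hypothetical helper: print the number of gigabytes in values such as "40G" or "20GB".
# Only whole-gigabyte values are handled in this sketch.
gigabytes() {
    case "$1" in
        *GB) echo "${1%GB}" ;;
        *G)  echo "${1%G}" ;;
        *)   return 1 ;;
    esac
}

# Succeeds when the per-node query limit is smaller than the JVM heap.
per_node_fits_heap() {
    heap=$(gigabytes "$1") || return 1
    per_node=$(gigabytes "$2") || return 1
    [ "$per_node" -lt "$heap" ]
}

# Values taken from this article: -Xmx40G and query.max-memory-per-node=20GB.
if per_node_fits_heap "40G" "20GB"; then
    echo "query.max-memory-per-node fits within the heap"
else
    echo "query.max-memory-per-node exceeds the heap" >&2
fi
```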
node.properties:
node.environment=production
node.id=ffffffff-ffff-ffff-ffff-fffffffffff1
node.data-dir=/opt/beh/core/presto/data
Notes:
The above properties are described below:
node.environment: The name of the environment. All Presto nodes in a cluster must have the same environment name.
node.id: The unique identifier for this installation of Presto. This must be unique for every node. This identifier should remain consistent across reboots or upgrades of Presto. If running multiple installations of Presto on a single machine (i.e. multiple nodes on the same machine), each installation must have a unique identifier.
node.data-dir: The location (filesystem path) of the data directory. Presto stores logs and other data here. After startup the directory contains two symbolic links, etc and plugin, together with var/run, which holds the server PID file, and var/log, which holds the log files.
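Because node.id has to be unique per node yet stable across restarts, one option is to generate it once and never overwrite it. A sketch, assuming uuidgen is installed (with a Linux kernel fallback otherwise); ETC_DIR and write_node_properties are hypothetical names.

```shell
#!/bin/sh
# ETC_DIR is a hypothetical variable; it defaults to a scratch directory here.
ETC_DIR="${ETC_DIR:-$(mktemp -d)}"
NODE_PROPS="$ETC_DIR/node.properties"

# Create node.properties only if it does not exist yet, so that an existing
# node.id survives restarts and upgrades.
write_node_properties() {
    if [ -f "$NODE_PROPS" ]; then
        return 0
    fi
    if command -v uuidgen >/dev/null 2>&1; then
        id=$(uuidgen)
    else
        # Linux kernel fallback when uuidgen is missing.
        id=$(cat /proc/sys/kernel/random/uuid)
    fi
    cat > "$NODE_PROPS" <<EOF
node.environment=production
node.id=$id
node.data-dir=/opt/beh/core/presto/data
EOF
}

write_node_properties
grep '^node.id=' "$NODE_PROPS"
```

Running the script a second time leaves the file, and therefore the identifier, unchanged.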
log.properties:
com.facebook.presto=INFO
com.facebook.presto.server=INFO
com.facebook.presto.hive=INFO
Notes:
The default minimum level is INFO (thus the above example does not actually change anything). There are four levels: DEBUG, INFO, WARN and ERROR.
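To get more detail from a single component, its logger can be raised to DEBUG while the others stay at INFO. A minimal sketch, assuming a log.properties with the three loggers used in this article; LOG_PROPS is a hypothetical path variable.

```shell
#!/bin/sh
# LOG_PROPS is a hypothetical variable; it defaults to a scratch file here.
LOG_PROPS="${LOG_PROPS:-$(mktemp)}"

# Start from the three loggers used in this article.
cat > "$LOG_PROPS" <<'EOF'
com.facebook.presto=INFO
com.facebook.presto.server=INFO
com.facebook.presto.hive=INFO
EOF

# Rewrite only the Hive connector logger to DEBUG.
# A temporary file is used because sed -i behaves differently in GNU and BSD sed.
sed 's/^com\.facebook\.presto\.hive=.*/com.facebook.presto.hive=DEBUG/' "$LOG_PROPS" > "$LOG_PROPS.tmp" &&
    mv "$LOG_PROPS.tmp" "$LOG_PROPS"

cat "$LOG_PROPS"
```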
Create the connector (catalog) configuration directory and configure the connectors:
cd /opt/beh/core/presto/etc
mkdir catalog
cd catalog
touch hive.properties
touch jmx.properties
touch mongodb.properties
touch mysql.properties
hive.properties:
connector.name=hive-hadoop2
hive.metastore.uri=thrift://localhost:9083
hive.config.resources=/opt/beh/core/hadoop/etc/hadoop/core-site.xml,/opt/beh/core/hadoop/etc/hadoop/hdfs-site.xml
jmx.properties:
connector.name=jmx
jmx.dump-tables=java.lang:type=Runtime,com.facebook.presto.execution.scheduler:name=NodeScheduler
jmx.dump-period=10s
jmx.max-entries=86400
mongodb.properties:
connector.name=mongodb
mongodb.seeds=hadoop001:37025,hadoop002:37025,hadoop003:37025
mysql.properties:
connector.name=mysql
connection-url=jdbc:mysql://mysqlhost:3306
connection-user=mysqluser
connection-password=mysqlpassword
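Each file in etc/catalog defines one catalog, named after the file, and needs a connector.name entry. A small check along these lines can catch an empty or incomplete file before startup; CATALOG_DIR and check_catalogs are hypothetical names for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: report catalog files that lack a connector.name entry.
# Prints one line per catalog and returns non-zero if any file is incomplete.
check_catalogs() {
    dir="$1"
    status=0
    for file in "$dir"/*.properties; do
        [ -e "$file" ] || continue   # the directory holds no .properties files
        catalog=$(basename "$file" .properties)
        if grep -q '^connector\.name=' "$file"; then
            echo "$catalog: ok"
        else
            echo "$catalog: missing connector.name" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage, assuming the directory layout from this article:
CATALOG_DIR="${CATALOG_DIR:-/opt/beh/core/presto/etc/catalog}"
if [ -d "$CATALOG_DIR" ]; then
    check_catalogs "$CATALOG_DIR" || echo "fix the catalog files listed above" >&2
fi
```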
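Set the environment variables, either by running the following on the command line or by adding the lines to /opt/beh/conf/beh_env:
export PRESTO_HOME=/opt/beh/core/presto
export PATH=$PATH:$PRESTO_HOME/bin
Start the server: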
cd /opt/beh/core/presto
./bin/launcher start
Notes:
The installation directory contains the launcher script in bin/launcher. Presto can be started as a daemon by running the following:
bin/launcher start
Alternatively, it can be run in the foreground, with the logs and other output being written to stdout/stderr (both streams should be captured if using a supervision system like daemontools):
bin/launcher run
Run the launcher with --help to see the supported commands and command line options. In particular, the --verbose option is very useful for debugging the installation.
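When Presto is started as a daemon, the launcher may return before the server has finished initializing, so scripts that run queries right afterwards may need to wait. One possible approach is a generic retry loop around a probe command. wait_for is a hypothetical helper, and the probe in the usage comment assumes that curl is available and that the server answers on its /v1/info endpoint at port 8080.

```shell
#!/bin/sh
# Hypothetical helper: run a probe command once per second until it succeeds
# or the given number of attempts is used up.
wait_for() {
    attempts="$1"
    shift
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Usage (assumes curl and the /v1/info endpoint on port 8080):
#   bin/launcher start
#   wait_for 60 curl -sf http://localhost:8080/v1/info && echo "Presto is up"
```

Keeping the probe as an argument means the same loop works with any other readiness check, for example bin/launcher status.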
Logs:
After launching, you can find the log files in var/log:
launcher.log: This log is created by the launcher and is connected to the stdout and stderr streams of the server. It will contain a few log messages that occur while the server logging is being initialized and any errors or diagnostics produced by the JVM.
server.log: This is the main log file used by Presto. It will typically contain the relevant information if the server fails during initialization. It is automatically rotated and compressed.
http-request.log: This is the HTTP request log which contains every HTTP request received by the server. It is automatically rotated and compressed.
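When startup fails, the most recent warnings and errors in server.log are usually the first thing to look at. A sketch, assuming the log level appears as a separate whitespace-delimited field on each line; recent_problems is a hypothetical helper, and the default path follows the node.data-dir used in this article.

```shell
#!/bin/sh
# Hypothetical helper: print the last N lines of a log file that contain
# WARN or ERROR as a standalone field (N defaults to 20).
recent_problems() {
    log_file="$1"
    count="${2:-20}"
    grep -E '(^|[[:space:]])(WARN|ERROR)([[:space:]]|$)' "$log_file" | tail -n "$count"
}

# Usage, assuming node.data-dir=/opt/beh/core/presto/data:
SERVER_LOG="${SERVER_LOG:-/opt/beh/core/presto/data/var/log/server.log}"
if [ -f "$SERVER_LOG" ]; then
    recent_problems "$SERVER_LOG" 20
fi
```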
Download the command line interface (CLI) program and copy it to /opt/beh/core/presto/bin:
cd /opt/beh/core/presto/bin
chmod +x presto-cli-0.172-executable.jar
ln -s presto-cli-0.172-executable.jar presto
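Test the connection. The --server port has to match http-server.http.port in config.properties (8080 in the configuration above; the recorded session below was run against a server on port 8580):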
./presto --server localhost:8080 --catalog hive --schema default
[hadoop@sparktest bin]$ ./presto --server localhost:8580 --catalog hive --schema default
presto:default> HELP
Supported commands:
QUIT
EXPLAIN [ ( option [, ...] ) ] <query>
    options: FORMAT { TEXT | GRAPHVIZ }
             TYPE { LOGICAL | DISTRIBUTED }
DESCRIBE <table>
SHOW COLUMNS FROM <table>
SHOW FUNCTIONS
SHOW CATALOGS [LIKE <pattern>]
SHOW SCHEMAS [FROM <catalog>] [LIKE <pattern>]
SHOW TABLES [FROM <schema>] [LIKE <pattern>]
SHOW PARTITIONS FROM <table> [WHERE ...] [ORDER BY ...] [LIMIT n]
USE [<catalog>.]<schema>
presto:default> SHOW CATALOGS;
Catalog
---------
hive
jmx
mongodb
mysql
system
(5 rows)
Query 20170418_121353_00035_yr3tu, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]
presto:default> SHOW SCHEMAS FROM HIVE;
Schema
--------------------
default
information_schema
tmp
tpc100g
(4 rows)
Query 20170418_121409_00036_yr3tu, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [4 rows, 55B] [43 rows/s, 601B/s]
presto:default> USE hive.tmp;
presto:tmp> show tables;
Table
-------------
date_dim
item
store_sales
(3 rows)
Query 20170418_121459_00040_yr3tu, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [3 rows, 62B] [40 rows/s, 830B/s]
presto:tmp> select count(*) from item;
_col0
--------
204000
(1 row)
Query 20170418_121540_00041_yr3tu, FINISHED, 3 nodes
Splits: 20 total, 20 done (100.00%)
0:02 [204K rows, 11.8MB] [81.8K rows/s, 4.74MB/s]
presto:tmp> quit
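The same checks can be scripted without the interactive prompt by passing the statement with --execute. A sketch, assuming the CLI was linked as ./presto as described above; PRESTO_CLI, PRESTO_SERVER and run_query are hypothetical names, and the server address has to match your own installation.

```shell
#!/bin/sh
# Hypothetical variables: path to the CLI and address of the coordinator.
PRESTO_CLI="${PRESTO_CLI:-./presto}"
PRESTO_SERVER="${PRESTO_SERVER:-localhost:8080}"

# Hypothetical helper: run one statement and print the result as CSV.
run_query() {
    "$PRESTO_CLI" --server "$PRESTO_SERVER" \
        --catalog hive --schema default \
        --output-format CSV \
        --execute "$1"
}

# Usage:
#   run_query "SHOW CATALOGS"
#   run_query "SELECT count(*) FROM tmp.item"
```

CSV output is convenient for piping into other tools; the progress lines shown in the interactive session do not appear in the result.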