| \ | master1 | master2 | slave1 | slave2 | slave3 |
|---|---|---|---|---|---|
| Components | scheduler, webserver, flower, airflow-scheduler-failover-controller | webserver, airflow-scheduler-failover-controller | worker | worker | worker |
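The node names above are assumed to resolve on every machine. A minimal `/etc/hosts` sketch (only 172.19.131.108 appears elsewhere in this document; the other IPs are placeholders):

```
# /etc/hosts on every node -- hostname-to-IP mapping (IPs are illustrative)
172.19.131.108 master1
172.19.131.109 master2
172.19.131.110 slave1
172.19.131.111 slave2
172.19.131.112 slave3
```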
Run the following on every machine:
```
pip install apache-airflow
pip install apache-airflow[mysql]
pip install celery
pip install redis
```
Since Airflow is written in Python, please install a Python environment and pip beforehand for convenience. This document uses Python 2.7, pip 19.2.1, and Airflow 1.10.1.
Troubleshooting: if `pip install apache-airflow[mysql]` fails with a `mysql_config` not found error, run `yum install python-devel mysql-devel`.
Next, set the Airflow home directory on every machine:
```
export AIRFLOW_HOME=~/airflow
```
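To keep the variable across logins, it may help to persist it, e.g. in `~/.bashrc` (a minimal sketch; adjust to your shell):

```
echo 'export AIRFLOW_HOME=~/airflow' >> ~/.bashrc
source ~/.bashrc
```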
Then run the `airflow` command once on every machine so that the configuration file `airflow.cfg` is generated under the home directory:

```
airflow
```
Edit the generated `airflow.cfg`:

```
## Time zone
default_timezone = Asia/Shanghai
## Do not load the example DAGs
load_examples = False
## Default webserver port
web_server_port = 9999
## Database connection
sql_alchemy_conn = mysql://airflow:123456@172.19.131.108/airflow
## Executor to use
executor = CeleryExecutor
## Message broker
broker_url = redis://redis:Kingsoftcom_123@172.19.131.108:6379/1
## Result backend (Redis also works: result_backend = redis://redis:Kingsoftcom_123@172.19.131.108:6379/1)
result_backend = db+mysql://airflow:123456@172.19.131.108/airflow
```
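The `sql_alchemy_conn` setting above assumes a MySQL database named `airflow` and a user `airflow` with password `123456` already exist on 172.19.131.108. A minimal sketch for creating them (MySQL 5.x-style `GRANT`; adjust credentials to your environment):

```
# Create the metadata database and user (statements are an assumption,
# matching the connection string above)
mysql -u root -p -e "
CREATE DATABASE airflow CHARACTER SET utf8;
GRANT ALL PRIVILEGES ON airflow.* TO 'airflow'@'%' IDENTIFIED BY '123456';
FLUSH PRIVILEGES;"
# Initialize Airflow's metadata tables once the connection string is in place
airflow initdb
```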
PS: for installing and configuring Redis, see https://blog.csdn.net/crazy__hope/article/details/83688986
Run on master1 and master2:
```
pip install git+git://github.com/teamclairvoyant/airflow-scheduler-failover-controller.git@v1.0.2
```
```
scheduler_failover_controller init
```
Initialization appends content to `airflow.cfg`, so Airflow must be installed and initialized (as above) beforehand.
In the section appended to `airflow.cfg`, configure the scheduler nodes:

```
scheduler_nodes_in_cluster=master1,master2
```
The host names can be obtained with the `scheduler_failover_controller get_current_host` command.
```
scheduler_failover_controller test_connection
```
PS: passwordless SSH login between master1 and master2 must be configured first.
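A minimal sketch of that setup, assuming the same user runs Airflow on both masters:

```
# On master1 (repeat on master2 in the other direction)
ssh-keygen -t rsa       # accept the defaults; generates ~/.ssh/id_rsa(.pub)
ssh-copy-id master2     # appends the public key to master2's authorized_keys
ssh master2 hostname    # verify that login now works without a password
```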
```
nohup scheduler_failover_controller start > /dev/null &
```
Note: do not run this command yet; execute it only after all the other Airflow components have started.
Synchronize DAGs with a sync script, running it every time the `dags` directory is updated; see the sketch below.
Reference script: https://www.jianshu.com/p/e74fbb091144
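A minimal sketch of such a script, assuming rsync is available, the passwordless SSH configured above is extended to all nodes, and the `~/airflow/dags` path is identical everywhere (hosts and paths are assumptions):

```
#!/bin/bash
# sync_dags.sh -- push the local dags directory to every other node
DAGS_DIR=~/airflow/dags
for host in master2 slave1 slave2 slave3; do
    # --delete keeps each remote copy identical to the local one
    rsync -az --delete "$DAGS_DIR/" "$host:$DAGS_DIR/"
done
```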
Start the components according to the table above. On master1:

```
airflow scheduler -D
airflow webserver -D
```

On master2:

```
airflow webserver -D
```

On slave1, slave2, and slave3:

```
airflow worker -D
```
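The table also lists flower (Celery's monitoring UI) on master1; assuming the standard `airflow flower` subcommand of Airflow 1.10, a suggested overall startup order, consistent with the failover-controller note above, would be:

```
# 1. On the slaves: start the workers
airflow worker -D
# 2. On master1: scheduler, webserver, and flower
airflow scheduler -D
airflow webserver -D
airflow flower -D
# 3. On master2: webserver
airflow webserver -D
# 4. On both masters, only after everything above is up:
nohup scheduler_failover_controller start > /dev/null &
```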
Troubleshooting fix: run `pip install -U werkzeug`.