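Some time ago, while testing a single-node offline deployment of Tencent BlueKing (藍鯨智雲), I ran into several installation problems. This post records how I resolved item 3.2, "app_mgr component installation failed". The problem kept me stuck for a long time (probably in part because I am not familiar enough with Python or with the BlueKing product). It was solved in the end, but the process itself is what is most worth recording.
The offline installation of the app_mgr component failed:
Install command: ./bk_install app_mgr
The error output was as follows: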
create virtualenv for paas_agent
Requirement already satisfied: pbr in /usr/local/lib/python2.7/site-packages
Requirement already satisfied: virtualenvwrapper in /usr/local/lib/python2.7/site-packages
Requirement already satisfied: virtualenv-clone in /usr/local/lib/python2.7/site-packages (from virtualenvwrapper)
Requirement already satisfied: stevedore in /usr/local/lib/python2.7/site-packages (from virtualenvwrapper)
Requirement already satisfied: virtualenv in /usr/local/lib/python2.7/site-packages (from virtualenvwrapper)
Requirement already satisfied: pbr>=1.6 in /usr/local/lib/python2.7/site-packages (from stevedore->virtualenvwrapper)
Requirement already satisfied: six>=1.9.0 in /usr/local/lib/python2.7/site-packages (from stevedore->virtualenvwrapper)
[192.168.1.6]20200303-174651 224 mkvirtualenv -a /data/bkce/paas_agent/paas_agent --extra-search-dir=/data/install/pip --no-download -p /usr/local/bin/python paas_agent
Already using interpreter /usr/local/bin/python
New python executable in /data/bkce/.envs/paas_agent/bin/python
Installing setuptools, pip, wheel...done.
Setting project for paas_agent to /data/bkce/paas_agent/paas_agent
Ignoring indexes: http://mirrors.cloud.tencent.com/pypi/simple
Requirement already satisfied (use --upgrade to upgrade): pbr in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages
Ignoring indexes: http://mirrors.cloud.tencent.com/pypi/simple
Requirement already satisfied (use --upgrade to upgrade): virtualenvwrapper in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages
Requirement already satisfied (use --upgrade to upgrade): virtualenv-clone in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages (from virtualenvwrapper)
Requirement already satisfied (use --upgrade to upgrade): stevedore in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages (from virtualenvwrapper)
Requirement already satisfied (use --upgrade to upgrade): virtualenv in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages (from virtualenvwrapper)
Requirement already satisfied (use --upgrade to upgrade): pbr>=1.6 in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages (from stevedore->virtualenvwrapper)
Requirement already satisfied (use --upgrade to upgrade): six>=1.9.0 in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages (from stevedore->virtualenvwrapper)
Ignoring indexes: http://mirrors.cloud.tencent.com/pypi/simple
Requirement already satisfied (use --upgrade to upgrade): supervisor in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages
Requirement already satisfied (use --upgrade to upgrade): six in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages
Requirement already satisfied (use --upgrade to upgrade): meld3>=0.6.5 in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages (from supervisor)
[192.168.1.6]20200303-174801 233 generate env variable settings.
[192.168.1.6]20200303-174801 151 exec: pip install --no-cache-dir -r requirements.txt (/data/bkce/paas_agent/paas_agent)
Collecting Django==1.8.11 (from -r requirements.txt (line 1))
Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.HTTPConnection object at 0x7f7b58e91150>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /pypi/simple/django/
Retrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.HTTPConnection object at 0x7f7b58e91d50>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /pypi/simple/django/
Retrying (Retry(total=2, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.HTTPConnection object at 0x7f7b58e91f10>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /pypi/simple/django/
Retrying (Retry(total=1, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.HTTPConnection object at 0x7f7b58e5c110>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /pypi/simple/django/
Retrying (Retry(total=0, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.HTTPConnection object at 0x7f7b58e5c2d0>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /pypi/simple/django/
Could not find a version that satisfies the requirement Django==1.8.11 (from -r requirements.txt (line 1)) (from versions: )
No matching distribution found for Django==1.8.11 (from -r requirements.txt (line 1))
[192.168.1.6]20200303-174900 177 pip install (--no-cache-dir ) for paas_agent. FAILED
[192.168.1.6]20200303-174900 47 Abort
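Note: offline installation means that the installation environment cannot connect to the Internet. If your deployment environment is allowed to reach the external network, this component installs without trouble (I have tested this).
First, the odd part is that app_mgr is the only component whose offline installation fails with a network-unreachable error. Looking back at the error log above, installing this component runs: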
[192.168.1.6]20200303-174801 233 generate env variable settings.
[192.168.1.6]20200303-174801 151 exec: pip install --no-cache-dir -r requirements.txt (/data/bkce/paas_agent/paas_agent)
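It looks as though this pip command does not use the --find-links parameter to point at a local path, so it tries to reach an external pip index.
When the other components are installed, this parameter is always set to their own local path: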
-- For example, installing fta:
[192.168.1.6]20200302-001610 233 generate env variable settings.
[192.168.1.6]20200302-001610 151 exec: pip install --no-cache-dir --no-index --find-links=/data/src/fta/support-files/pkgs -r requirements.txt (/data/bkce/fta/fta)
-- For example, installing bkdata:
[192.168.1.6]20200302-003237 233 generate env variable settings.
[192.168.1.6]20200302-003237 151 exec: pip install --no-cache-dir --no-index --find-links=/data/src/bkdata/support-files/pkgs -r requirements.txt (/data/bkce/bkdata/dataapi)
As you can see, at the equivalent step these components all pass --no-index together with --find-links, each pointing at its own local package directory.
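When several components need to be compared, the check above can be automated. The following is a minimal sketch, assuming the installer output has been saved to a file; the function name is hypothetical, and it relies only on the "exec: pip install" log format shown above.

```shell
# Print every logged "exec: pip install" command that lacks --find-links.
# $1: path to a saved copy of the installer output (hypothetical file).
pip_cmds_without_find_links() {
    # "|| true" keeps the exit status at 0 when nothing is reported.
    grep -F 'exec: pip install' "$1" | grep -v -e '--find-links' || true
}
```

Any line it prints is a pip invocation that would try to contact a remote index.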
I started with a few attempts:
2.1 Install the packages offline with pip directly, then try installing app_mgr again
pip install --no-cache-dir --no-index --find-links=/data/src/paas_agent/support-files/pkgs -r /data/bkce/paas_agent/paas_agent/requirements.txt
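The offline pip install succeeded, but running ./bk_install app_mgr afterwards still failed, so installing the packages manually in advance had no effect.
This is probably because the installer runs pip inside the component's own virtualenv, and a virtual environment is largely isolated from the system site-packages.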
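If that explanation is right, the packages would have to be installed with the virtualenv's own pip rather than the system one. Below is a minimal sketch, assuming the environment lives in /data/bkce/.envs/paas_agent (the path printed in the log above). The function name is hypothetical, and whether bk_install would then skip its own download step was not tested in this post.

```shell
# Install requirements offline using the pip that belongs to a virtualenv.
# $1: virtualenv directory, $2: local package directory, $3: requirements file.
venv_offline_install() {
    local venv_dir="$1"
    local pkg_dir="$2"
    local reqr_file="$3"
    "$venv_dir/bin/pip" install --no-cache-dir --no-index \
        --find-links="$pkg_dir" -r "$reqr_file"
}

# Example call (paths taken from the logs in this post):
# venv_offline_install /data/bkce/.envs/paas_agent \
#     /data/src/paas_agent/support-files/pkgs \
#     /data/bkce/paas_agent/paas_agent/requirements.txt
```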
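2.2 Find the pip.conf configuration files, back up the originals, and point them at the local path
The configuration files I tried modifying were /data/src/.pip/pip.conf and /data/install/pip/pip.conf, with the content changed to: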
[global]
find-links = /data/src/paas_agent/support-files/pkgs
[install]
find-links = /data/src/paas_agent/support-files/pkgs
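But running ./bk_install app_mgr still produced the same error, so this had no effect.
Later attempts turned up more pip.conf files; modifying all of them did not help either.
2.3 Set an environment variable
The official pip documentation describes an environment variable, PIP_FIND_LINKS: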
export PIP_FIND_LINKS=/data/src/paas_agent/support-files/pkgs
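I ran ./bk_install app_mgr again to install the component; the error was unchanged.
This is probably because the options are hard-coded in the installer. Much as with a crontab job, setting a variable from the outside does not help; the setting inside the program has to be found.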
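For reference, pip derives these variable names from its long options: the prefix PIP_, then the option name in upper case with dashes replaced by underscores. The sketch below only illustrates that naming rule; the helper name is hypothetical.

```shell
# Map a pip long option to its environment variable name,
# e.g. --find-links -> PIP_FIND_LINKS.
pip_env_name() {
    printf 'PIP_%s\n' "$(printf '%s' "${1#--}" | tr 'a-z-' 'A-Z_')"
}

# Example: the variables that would mirror "--no-index --find-links=DIR".
# export "$(pip_env_name --no-index)=1"
# export "$(pip_env_name --find-links)=/data/src/paas_agent/support-files/pkgs"
```

Such variables only affect a pip process that actually inherits them, which may be why exporting one in the interactive shell changed nothing here.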
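2.4 Other attempts
For example, I manually added the environment variable above to the app_mgr section of bk_install. That did not work either; the error was unchanged.
At this point I was stuck, and something was clearly wrong. I shared the analysis above with the customer, and we agreed that it was very likely a bug, so I reported it to BlueKing customer support.
Support's answer was that for offline installation they recommend setting up a complete local pip mirror. Since a full mirror would require a request for close to 2 TB of space, I switched to building a local pip index that contains only the required packages.
This solution is also more of a workaround that sidesteps the root cause, because none of the other components need it: they use the find-links parameter to point at a local package directory.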
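For a package-limited index, pip expects the "simple" repository layout: one directory per project under simple/, named with the normalized project name. The following is a minimal sketch of building that layout, assuming the package files are already available locally; both function names are hypothetical, and the normalization rule is the one described in PEP 503.

```shell
# Normalize a project name as PEP 503 describes: lowercase, with runs of
# "-", "_" and "." collapsed into a single "-".
normalize_project_name() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[-_.]+/-/g'
}

# Copy one package file into <index_root>/simple/<normalized-name>/.
# $1: index root, $2: project name, $3: package file.
add_to_simple_index() {
    local dest
    dest="$1/simple/$(normalize_project_name "$2")"
    mkdir -p "$dest" && cp "$3" "$dest/"
}
```

The index root can then be published by a static HTTP server that produces directory listings, so that a URL ending in /simple serves the project directories.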
Since I had never done this before, configuring a local pip index also took a fair amount of time to research and verify:
[root@rbtnode1 bin]# find /data -name pip.conf
/data/install/pip/pip.conf
/data/install/pip.conf
/data/src/service/.pip/pip.conf
/data/src/.pip/pip.conf
/data/src/pip.conf

cat /data/install/pip/pip.conf
cat /data/install/pip.conf
cat /data/src/service/.pip/pip.conf
cat /data/src/.pip/pip.conf
cat /data/src/pip.conf
cat ~/.pip/pip.conf
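I could not tell which pip.conf would actually be used, so I backed up every one of them and then changed them all to point at the local pip index: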
[global]
trusted-host = 192.168.1.6
index-url = http://192.168.1.6:8080/simple
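The backup step can be scripted so that no configuration file is edited without a copy of the original. This is a minimal sketch; the function name and the .bak suffix are assumptions, not part of the BlueKing tooling.

```shell
# Back up every pip.conf below a directory as pip.conf.bak,
# without overwriting a backup that already exists.
# $1: directory to search, e.g. /data
backup_pip_confs() {
    find "$1" -type f -name pip.conf | while IFS= read -r conf; do
        [ -e "$conf.bak" ] || cp -p "$conf" "$conf.bak"
    done
}
```

Because an existing backup is never overwritten, running it a second time after editing the files still leaves the original content in the .bak copies.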
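For the details of configuring the local pip index, I followed two articles found online.
But the installation still failed. I then modified the globals.env configuration file: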
# HTTP proxy address used to access network resources such as yum repositories, e.g. BK_PROXY=http://192.168.0.1:8833
export BK_PROXY=http://192.168.1.6:8080/simple
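I also talked this over with a colleague. Logically, the real fix should be to make this component pass the find-links parameter in the same way as the other components.
The only option was to trace the scripts myself and see whether a corresponding setting existed, starting from the main script, bk_install.
Not long after I started reading the scripts I could not go on, because I rarely write scripts and it is a weak spot of mine. Just going from bk_install to bkcec, a large number of files are called, and for a while I could not find a thread to follow. I then went back to the original error log and noticed that, just before the error, there are lines that look like script output: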
[192.168.1.6]20200303-174801 233 generate env variable settings.
[192.168.1.6]20200303-174801 151 exec: pip install --no-cache-dir -r requirements.txt (/data/bkce/paas_agent/paas_agent)
Searching the files in /data/install for "generate env variable settings" showed that, among the top-level files, only utils.fc contains it:
[root@rbtnode1 install]# grep "generate env variable settings" *
grep: agent_setup: Is a directory
grep: appmgr: Is a directory
grep: bcs: Is a directory
grep: bin: Is a directory
grep: build: Is a directory
grep: deck: Is a directory
grep: extra: Is a directory
grep: health_check: Is a directory
grep: migrate: Is a directory
grep: pip: Is a directory
grep: scripts: Is a directory
grep: setuptools-36.0.1: Is a directory
grep: support-files: Is a directory
grep: templates: Is a directory
grep: uninstall: Is a directory
utils.fc: log "generate env variable settings."
grep: verify: Is a directory
[root@rbtnode1 install]# ls -l utils.fc
-rw-r--r-- 1 root root 38897 Jan 9 16:11 utils.fc
[root@rbtnode1 install]# scp utils.fc 192.168.1.61:/tmp/
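The grep above inspects only the files at the top level of /data/install, which is why every subdirectory is reported as "Is a directory". A recursive search would also cover the scripts inside those directories. A minimal sketch follows; the function name is hypothetical.

```shell
# List the files under a directory that contain a fixed string.
# $1: directory to search, $2: literal text (not a regular expression).
find_files_containing() {
    grep -rlF -- "$2" "$1"
}

# Example:
# find_files_containing /data/install "generate env variable settings"
```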
After copying the file over and reading it, I found a piece of code that looked relevant:
_install_pypkgs () {
    local module=$1
    local project=$2
    local local_pip_src=$PKG_SRC_PATH/$module/support-files/pkgs
    local pip_options="--no-cache-dir "

    local _ordered_requirement_files=( $( shopt -s nullglob; echo 0[0-9]_requirements*.txt) )
    if [ "${#_ordered_requirement_files[@]}" -eq 0 ]; then
        _ordered_requirement_files=( requirements.txt )
    fi

    for reqr_file in ${_ordered_requirement_files[@]}; do
        if [ "${reqr_file//_local/}" != "$reqr_file" -o -f SELF_CONTAINED_PIP_PKG ]; then
            pip_options="--no-cache-dir --no-index --find-links=$local_pip_src"
        fi

        log "exec: pip install $pip_options -r $reqr_file ($PWD)"
        http_proxy=$BK_PROXY https_proxy=$BK_PROXY \
            pip install $pip_options -r $reqr_file    # <-- the $pip_options passed here very likely has no find-links parameter
        nassert "pip install ($pip_options) for $venv_name"
    done
    #shopt -s nullglob
}
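The annotated line above marks the spot: the $pip_options passed to pip install very likely contains no find-links parameter, because the assignment that adds it sits inside the if condition, which holds only when the requirements file name contains _local or a file named SELF_CONTAINED_PIP_PKG exists in the current directory. I did not have time to analyse the whole script, so I tried editing utils.fc directly to set pip_options: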
_install_pypkgs () {
    local module=$1
    local project=$2
    local local_pip_src=$PKG_SRC_PATH/$module/support-files/pkgs
    local pip_options="--no-cache-dir "

    local _ordered_requirement_files=( $( shopt -s nullglob; echo 0[0-9]_requirements*.txt) )
    if [ "${#_ordered_requirement_files[@]}" -eq 0 ]; then
        _ordered_requirement_files=( requirements.txt )
    fi

    for reqr_file in ${_ordered_requirement_files[@]}; do
        if [ "${reqr_file//_local/}" != "$reqr_file" -o -f SELF_CONTAINED_PIP_PKG ]; then
            pip_options="--no-cache-dir --no-index --find-links=$local_pip_src"
        fi

        log "exec: pip install $pip_options -r $reqr_file ($PWD)"
        http_proxy=$BK_PROXY https_proxy=$BK_PROXY \
        #pip install $pip_options -r $reqr_file    <-- the original line, now commented out; the two lines below are new: set pip_options, then call pip install
        pip_options="--no-cache-dir --no-index --find-links=$local_pip_src"
        pip install $pip_options -r $reqr_file
        nassert "pip install ($pip_options) for $venv_name"
    done
    #shopt -s nullglob
}
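The if test in _install_pypkgs is what decides whether pip runs against the local package directory. It can be isolated as below; this is a sketch that restates the condition from the code above, with a hypothetical function name, and it takes the project directory as an argument instead of relying on the current directory.

```shell
# Return 0 when _install_pypkgs would switch to the local package directory:
# the requirements file name contains "_local", or a file named
# SELF_CONTAINED_PIP_PKG exists in the project directory.
# $1: requirements file name, $2: project directory (defaults to ".").
uses_local_pkgs() {
    case $1 in
        *_local*) return 0 ;;
    esac
    [ -f "${2:-.}/SELF_CONTAINED_PIP_PKG" ]
}

# Example:
# uses_local_pkgs requirements.txt /data/bkce/paas_agent/paas_agent || echo "pip will use the network"
```

Read this way, the condition suggests a less invasive alternative to editing utils.fc: placing an empty SELF_CONTAINED_PIP_PKG file in the project directory. That was not tried in this post, so it remains an untested assumption.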
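After modifying utils.fc I tested again, and the step that had failed before no longer failed. The logged command still shows no find-links parameter, because the log line is written before pip_options is reassigned, but the parameter is in fact in effect: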
[192.168.1.6]20200303-214725 235 generate env variable settings.
[192.168.1.6]20200303-214726 151 exec: pip install --no-cache-dir -r requirements.txt (/data/bkce/paas_agent/paas_agent)
Ignoring indexes: http://192.168.1.6:8080/simple
Collecting Django==1.8.11 (from -r requirements.txt (line 1))
Collecting PyMySQL==0.6.7 (from -r requirements.txt (line 2))
(part of the output omitted)
Collecting idna<2.9,>=2.5 (from requests==2.21.0->-r requirements.txt (line 3))
Could not find a version that satisfies the requirement idna<2.9,>=2.5 (from requests==2.21.0->-r requirements.txt (line 3)) (from versions: )
No matching distribution found for idna<2.9,>=2.5 (from requests==2.21.0->-r requirements.txt (line 3))
[192.168.1.6]20200303-214856 177 pip install (--no-cache-dir --no-index --find-links=/data/src/paas_agent/support-files/pkgs) for paas_agent. FAILED
[192.168.1.6]20200303-214856 47 Abort
[root@rbtnode1 install]#
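In the end, however, the installation aborted again, this time because of a missing package.
The requirement idna<2.9,>=2.5 is not listed in paas_agent's requirements.txt, but it is still needed: as the log shows, it is pulled in by requests==2.21.0. One way to fix this is to gather the packages from all the other locations into a single directory (/data/localpip) and then copy whatever is missing into paas_agent's package directory (cp -n does not overwrite files that already exist):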
[root@rbtnode1 pkgs]# pwd
/data/src/paas_agent/support-files/pkgs
[root@rbtnode1 pkgs]# ls -l |wc -l
62
[root@rbtnode1 pkgs]# cp -n /data/localpip/* ./
[root@rbtnode1 pkgs]# pwd
/data/src/paas_agent/support-files/pkgs
[root@rbtnode1 pkgs]# ls -l |wc -l
281
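Before rerunning the installer, it can help to check whether a given set of distributions is present in the package directory. The sketch below is an approximation with a hypothetical function name: it matches on file names only (a normalized name followed by a version that starts with a digit) and does not verify version constraints.

```shell
# Print each requested distribution that has no matching file in a directory.
# $1: package directory; remaining arguments: distribution names.
missing_from_pkgs() {
    local pkg_dir="$1"
    shift
    local name norm file base found
    for name in "$@"; do
        # Compare names case-insensitively, treating "-", "_" and "." alike.
        norm=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]' | sed -E 's/[-_.]+/-/g')
        found=no
        for file in "$pkg_dir"/*; do
            [ -e "$file" ] || continue
            base=$(basename "$file" | tr '[:upper:]' '[:lower:]' | sed -E 's/[-_.]+/-/g')
            case $base in
                "$norm"-[0-9]*) found=yes; break ;;
            esac
        done
        [ "$found" = yes ] || printf '%s\n' "$name"
    done
}

# Example:
# missing_from_pkgs /data/src/paas_agent/support-files/pkgs Django requests idna
```

Transitive dependencies such as idna are easy to miss because they never appear in requirements.txt, so the names to check should come from pip's "Collecting" lines rather than from that file alone.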
Then I tried installing app_mgr again:
[root@rbtnode1 pkgs]# cd /data/install/
[root@rbtnode1 install]# ./bk_install app_mgr
This time it finally succeeded. The log is below: once appt has been installed, the installer moves on to appo, and both succeed:
Collecting chardet<3.1.0,>=3.0.2 (from requests==2.21.0->-r requirements.txt (line 3))
Collecting idna<2.9,>=2.5 (from requests==2.21.0->-r requirements.txt (line 3))
Collecting certifi>=2017.4.17 (from requests==2.21.0->-r requirements.txt (line 3))
Installing collected packages: Django, PyMySQL, urllib3, chardet, idna, certifi, requests, pytz, amqp, anyjson, kombu, billiard, celery, django-celery, redis, httplib2, xlrd, xlwt, MarkupSafe, Mako, Jinja2, pycrypto, gunicorn, six, SQLAlchemy, suds, supervisor, uWSGI, pytest-runner, setuptools-scm
Running setup.py install for anyjson: started
Running setup.py install for anyjson: finished with status 'done'
Running setup.py install for billiard: started
Running setup.py install for billiard: finished with status 'done'
(part of the output omitted)
Successfully installed Django-1.8.11 Jinja2-2.8 Mako-1.0.4 MarkupSafe-0.23 PyMySQL-0.6.7 SQLAlchemy-1.0.12 amqp-1.4.9 anyjson-0.3.3 billiard-3.3.0.23 celery-3.1.18 certifi-2019.3.9 chardet-3.0.4 django-celery-3.2.1 gunicorn-19.6.0 httplib2-0.9.1 idna-2.8 kombu-3.0.35 pycrypto-2.6.1 pytest-runner-2.8 pytz-2016.6.1 redis-2.10.5 requests-2.21.0 setuptools-scm-1.11.1 six-1.10.0 suds-0.4 supervisor-3.3.1 uWSGI-2.0.13.1 urllib3-1.24.1 xlrd-1.0.0 xlwt-1.1.2
[192.168.1.6]20200303-222848 175 pip install (--no-cache-dir --no-index --find-links=/data/src/paas_agent/support-files/pkgs) for paas_agent. OK
[192.168.1.6]20200303-222858 453 apps isolate mode: virutalenv
Ignoring indexes: http://192.168.1.6:8080/simple
Requirement already satisfied (use --upgrade to upgrade): Django==1.8.11 in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages (from -r requirements.txt (line 1))
Requirement already satisfied (use --upgrade to upgrade): PyMySQL==0.6.7 in /data/bkce/.envs/paas_agent/lib/python2.7/site-packages (from -r requirements.txt (line 2))
(part of the output omitted)
[192.168.1.6]20200303-222926 151 install python package for virtualenv paas_agent done.
[192.168.1.6]20200303-222927 468 local nginx is required for paas_agent. going to install it.
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Package 1:nginx-1.12.2-2.el7.x86_64 already installed and latest version
Nothing to do
[192.168.1.6]20200303-222934 175 render: #etc#nginx.conf -> /data/bkce//etc/nginx.conf. OK
[192.168.1.6]20200303-222935 175 render: #etc#nginx#paasagent.conf -> /data/bkce//etc/nginx/paasagent.conf. OK
[192.168.1.6]20200303-222936 322 PLACE HOLDER __SID__ is replaced into empty
[192.168.1.6]20200303-222937 322 PLACE HOLDER __TOKEN__ is replaced into empty
[192.168.1.6]20200303-222937 175 render: #etc#paas_agent_config.yaml.tpl -> /data/bkce//etc/paas_agent_config.yaml. OK
[192.168.1.6]20200303-222938 175 render: #etc#supervisor-paas_agent.conf -> /data/bkce//etc/supervisor-paas_agent.conf. OK
[192.168.1.6]20200303-222939 56 install appt(allproject) done
initdata for appt()
[192.168.1.6]20200303-222946 182 exec initdata_appt on 192.168.1.6
[192.168.1.6]20200303-222958 262 update config file: paas_agent_config.yaml
[192.168.1.6]20200303-222958 268 register appt succeded.
[192.168.1.6]20200303-222958 502 create database bksuite_common
[192.168.1.6]20200303-222958 504 add version info to db
[192.168.1.6]20200303-223001 98 starting appt(ALL) on host: 192.168.1.6
[192.168.1.6]20200303-223052 77 activate appt(192.168.1.6) succeded
# appt is now installed; appo is installed next
(part of the output omitted)
install appo(all)
[192.168.1.6]20200303-223102 112 check dependences for paas_agent
(part of the output omitted)
initdata for appo()
[192.168.1.6]20200303-223509 182 exec initdata_appo on 192.168.1.6
[192.168.1.6]20200303-223533 262 update config file: paas_agent_config.yaml
[192.168.1.6]20200303-223534 268 register appo succeded.
[192.168.1.6]20200303-223535 502 create database bksuite_common
[192.168.1.6]20200303-223535 504 add version info to db
[192.168.1.6]20200303-223541 98 starting appo(ALL) on host: 192.168.1.6
[192.168.1.6]20200303-223613 77 activate appo(192.168.1.6) succeded
[192.168.1.6] paas_agent() paas_agent RUNNING pid 23792, uptime 0:06:10
[192.168.1.6] nginx: RUNNING
[192.168.1.6] paas_agent() paas_agent RUNNING pid 23792, uptime 0:06:42
[192.168.1.6] nginx: RUNNING
[192.168.1.6] rabbitmq: RUNNING
If the steps above reported no errors, you can now complete the deployment of the production and test environments. You can:
1. deploy the node management app with ./bk_install saas-o bk_nodeman, or
2. deploy apps through the developer center.
To install BlueKing monitoring or log search, first install bkdata with ./bk_install bkdata
[root@rbtnode1 install]#
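After a lot of stumbling, the problem that had puzzled me for so long was finally solved. Going forward, I still need to strengthen my Python and shell scripting skills.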