Installing pyspider on Ubuntu: starting and managing pyspider processes with supervisord (configuration and notes)

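First, thanks to the user "imperat0r_" on segmentfault.com and the user "小菜一碟" on Sina for their articles. Their configuration files are reproduced here. I wrote my own based on theirs; it comes last.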

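The key point comes first. After I set up pyspider under supervisord, pyspider kept misbehaving and would not run properly. I spent a long time looking for the cause. Eventually it occurred to me to check whether the processes started by supervisord were actually running. So I used the supervisorctl command to list every managed process, and sure enough, two of them had failed to start. The fix? Correct the wrong parameters right away!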
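
To make that check repeatable, here is a minimal Python sketch that picks out every process whose state is not RUNNING from the text that supervisorctl status prints. The sample output is made up for illustration (the names follow the group:program pattern used in this post), and failed_processes is a hypothetical helper, not part of supervisor.

```python
# Hypothetical sample of `supervisorctl status` output; real names, PIDs and
# messages will differ on your machine.
SAMPLE_STATUS = """\
pyspider:pyspider-fetcher        RUNNING   pid 2101, uptime 0:05:12
pyspider:pyspider-processor      FATAL     Exited too quickly (process log may have details)
pyspider:pyspider-scheduler      RUNNING   pid 2099, uptime 0:05:12
pyspider:pyspider-webui          BACKOFF   Exited too quickly (process log may have details)
"""


def failed_processes(status_text):
    """Return (name, state) pairs for every process that is not RUNNING."""
    failed = []
    for line in status_text.splitlines():
        parts = line.split(None, 2)  # name, state, free-form description
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, state = parts[0], parts[1]
        if state != "RUNNING":
            failed.append((name, state))
    return failed


if __name__ == "__main__":
    for name, state in failed_processes(SAMPLE_STATUS):
        print(f"{name}: {state}")
```

To run it against a live system, you could pass in the output of subprocess.run(["supervisorctl", "status"], capture_output=True, text=True).stdout instead of the sample text.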

Parameters, parameters, parameters! Getting every parameter right is what matters most.
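
Some parameter mistakes can be caught before supervisord ever loads the file. The following is a rough Python sketch, not an official tool: it reads a config with the standard configparser module and reports a group section without a programs key, a group that lists a program with no [program:x] section, and a command that does not start with an absolute path. The last one is only a warning, since supervisord can also find a relative command through PATH. The sample config and the check_conf helper are hypothetical.

```python
import configparser

# Hypothetical config fragment with two deliberate mistakes: the group lists a
# program that has no section, and one command has a stray leading character.
SAMPLE_CONF = """\
[group:pyspider]
programs=pyspider-webui,pyspider-scheduler

[program:pyspider-webui]
command=/usr/local/bin/pyspider -c /root/config.json webui

[program:pyspider-fetcher]
command=p/usr/local/bin/pyspider -c /root/config.json fetcher
"""


def check_conf(text):
    """Return a list of human-readable problems found in a supervisor config."""
    # RawConfigParser leaves %(program_name)s-style values untouched;
    # strict=False tolerates duplicate keys instead of raising.
    parser = configparser.RawConfigParser(
        inline_comment_prefixes=(";",), strict=False
    )
    parser.read_string(text)
    defined = {
        s.split(":", 1)[1] for s in parser.sections() if s.startswith("program:")
    }
    problems = []
    for section in parser.sections():
        if section.startswith("group:"):
            if not parser.has_option(section, "programs"):
                problems.append(f"[{section}] has no 'programs' key")
                continue
            for name in parser.get(section, "programs").split(","):
                if name.strip() not in defined:
                    problems.append(
                        f"[{section}] lists undefined program '{name.strip()}'"
                    )
        elif section.startswith("program:"):
            command = parser.get(section, "command", fallback="")
            if not command.startswith("/"):
                problems.append(
                    f"[{section}] command is not an absolute path: {command!r}"
                )
    return problems


if __name__ == "__main__":
    for problem in check_conf(SAMPLE_CONF):
        print(problem)
```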

Configuration by "imperat0r_"

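If you start pyspider from source, you can use this configuration. If you use an installed pyspider package, refer to the next configuration instead. The only difference is the launch path. My own configuration file comes with a brief explanation of the parameters.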

[group:pyspider]
programs=pyspider-webui,pyspider-scheduler,pyspider-processor,pyspider-result_worker,pyspider-fetcher,pyspider-phantomjs
priority=999

[program:pyspider-webui]
command=/usr/local/bin/pyspider/run.py -c /root/config.json webui
directory=/root
autostart=true
autorestart=true
priority=905
user=root

[program:pyspider-scheduler]
command=/usr/local/bin/pyspider/run.py -c /root/config.json scheduler
directory=/root
autostart=true
autorestart=true
priority=900
user=root

[program:pyspider-processor]
command=/usr/local/bin/pyspider/run.py -c /root/config.json processor
directory=/root
autostart=true
autorestart=true
priority=903
user=root

[program:pyspider-result_worker]
command=/usr/local/bin/pyspider/run.py -c /root/config.json result_worker
directory=/root
autostart=true
autorestart=true
priority=904
user=root

[program:pyspider-fetcher]
command=/usr/local/bin/pyspider/run.py -c /root/config.json --phantomjs-proxy="localhost:25555" fetcher
directory=/root
autostart=true
autorestart=true
priority=902
user=root

[program:pyspider-phantomjs]
command=/usr/local/bin/pyspider/run.py -c /root/config.json phantomjs
directory=/root
autostart=true
autorestart=true
priority=901
user=root

Configuration by "小菜一碟" on Sina:

If you use an installed pyspider package, refer to this configuration. The only difference is the launch path.

[group:pyspider]
programs=pyspider-webui,pyspider-scheduler,pyspider-processor,pyspider-result_worker,pyspider-fetcher,pyspider-phantomjs
priority=999

[program:pyspider-webui]
command=pyspider -c config.json webui
autostart=true
autorestart=true
priority=905
user=root
directory=/usr/pyspider/

[program:pyspider-scheduler]
command=pyspider -c config.json scheduler
directory=/usr/pyspider/
autostart=true
autorestart=true
priority=900
user=root

[program:pyspider-processor]
command=pyspider -c config.json processor
directory=/usr/pyspider/
autostart=true
autorestart=true
priority=903
user=root

[program:pyspider-result_worker]
command=pyspider -c config.json result_worker
directory=/usr/pyspider/
autostart=true
autorestart=true
priority=904
user=root

[program:pyspider-fetcher]
command=pyspider -c config.json --phantomjs-proxy="localhost:25555" fetcher
directory=/usr/pyspider/
autostart=true
autorestart=true
priority=902
user=root

[program:pyspider-phantomjs]
command=pyspider -c config.json phantomjs --phantomjs-path ./phantomjs/bin/phantomjs
directory=/usr/pyspider/
autostart=true
autorestart=true
priority=901
user=root

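My own configuration file

With this configuration, each pyspider component starts as its own process and is managed separately, so a problem in one does not affect the rest. I spent a long time studying this file, and I record the details here in the hope that they help newcomers. See the next section for an explanation of each parameter.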

[group:pyspider]
programs=pyspider-webui,pyspider-scheduler,pyspider-processor,pyspider-result_worker,pyspider-fetcher,pyspider-phantomjs
priority=999
; The supervisor docs list only programs and priority for [group:x]; the log
; paths below are repeated in each [program:x] section, which is where they apply.
stderr_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider_err.log
stdout_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider.log

[program:pyspider-webui]
command=/home/chg/py3env-pyspider/bin/pyspider -c /home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/config.json webui
directory=/home/chg/py3env-pyspider/bin/
autostart=true
autorestart=true
priority=905
user=chg
stderr_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider_err.log
stdout_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider.log

[program:pyspider-scheduler]
command=/home/chg/py3env-pyspider/bin/pyspider -c /home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/config.json scheduler
directory=/home/chg/py3env-pyspider/bin/
autostart=true
autorestart=true
priority=900
user=chg
stderr_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider_err.log
stdout_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider.log


[program:pyspider-processor]
command=/home/chg/py3env-pyspider/bin/pyspider -c /home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/config.json processor
directory=/home/chg/py3env-pyspider/bin/
autostart=true
autorestart=true
priority=903
user=chg
stderr_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider_err.log
stdout_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider.log

[program:pyspider-result_worker]
command=/home/chg/py3env-pyspider/bin/pyspider -c /home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/config.json result_worker
directory=/home/chg/py3env-pyspider/bin/
autostart=true
autorestart=true
priority=904
user=chg
stderr_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider_err.log
stdout_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider.log

[program:pyspider-fetcher]
command=/home/chg/py3env-pyspider/bin/pyspider -c /home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/config.json --phantomjs-proxy="localhost:25555" fetcher
directory=/home/chg/py3env-pyspider/bin/
autostart=true
autorestart=true
priority=902
user=chg
stderr_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider_err.log
stdout_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider.log

[program:pyspider-phantomjs]
command=/home/chg/py3env-pyspider/bin/pyspider -c /home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/config.json phantomjs
directory=/home/chg/py3env-pyspider/bin/
autostart=true
autorestart=true
priority=901
user=chg
stderr_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider_err.log
stdout_logfile=/home/chg/py3env-pyspider/lib/python3.5/site-packages/pyspider/pyspider.log
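
The priority values decide the start order: in supervisor, a lower priority means the program starts earlier and shuts down later. As a small illustration, this Python sketch sorts the six components by the priorities used in the configuration above. The start_order helper is hypothetical and only mirrors the ordering rule; it does not talk to supervisord.

```python
# Priorities copied from the [program:x] sections of the configuration above.
PRIORITIES = {
    "pyspider-webui": 905,
    "pyspider-scheduler": 900,
    "pyspider-processor": 903,
    "pyspider-result_worker": 904,
    "pyspider-fetcher": 902,
    "pyspider-phantomjs": 901,
}


def start_order(priorities):
    """Return program names sorted so that the lowest priority comes first."""
    return sorted(priorities, key=priorities.get)


if __name__ == "__main__":
    for position, name in enumerate(start_order(PRIORITIES), start=1):
        print(position, name, PRIORITIES[name])
```

With these numbers the scheduler comes first and the webui last, with phantomjs ahead of the fetcher that sends requests to it through --phantomjs-proxy.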

Parameter reference

Thanks to the article "使用supervisord來管理process" (Managing processes with supervisord).

; Sample supervisor config file.
;
; For more information on the config file, please see:
; http://supervisord.org/configuration.html
;
; Note: shell expansion ("~" or "$HOME") is not supported.  Environment
; variables can be expanded using this syntax: "%(ENV_HOME)s".
 
[unix_http_server]          ; unix socket server settings for supervisord
file=/tmp/supervisor.sock   ; path to the socket file
;chmod=0700                 ; socket file mode (default 0700)
;chown=nobody:nogroup       ; socket file owner and group
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)
 
;[inet_http_server]         ; TCP server settings for supervisord
;port=127.0.0.1:9001        ; TCP address and port
;username=user              ; username for the TCP server
;password=123               ; password for the TCP server
 
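[supervisord]                ; settings for the main supervisord process
logfile=/tmp/supervisord.log ; main log file
logfile_maxbytes=50MB        ; max main logfile size before rotation; default 50MB
logfile_backups=10           ; number of main logfile backups; default 10
loglevel=info                ; log level; default info; others: debug,warn,trace
pidfile=/tmp/supervisord.pid ; supervisord pidfile
nodaemon=false               ; start in the foreground if true; default false
minfds=1024                  ; min. avail startup file descriptors; default 1024
minprocs=200                 ; min. avail process descriptors; default 200
;umask=022                   ; process file creation umask; default 022
;user=chrism                 ; default is current user, required if root
;identifier=supervisor       ; supervisord identifier, default is 'supervisor'
;directory=/tmp              ; default is not to cd during start
;nocleanup=true              ; don't clean up tempfiles at start; default false
;childlogdir=/tmp            ; ('AUTO' child log dir, default $TEMP)
;environment=KEY=value       ; key value pairs to add to environment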
;strip_ansi=false            ; (strip ansi escape codes in logs; def. false)
 
; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
 
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket
;serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris              ; should be same as http_username if set
;password=123                ; should be same as http_password if set
;prompt=mysupervisor         ; command-line prompt (default "supervisor")
;history_file=~/.sc_history  ; command-line history file
 
; The below sample program section shows all possible program subsection values,
; create one or more 'real' program: sections to be able to control them under
; supervisor.
 
;[program:theprogramname]
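;command=/bin/cat              ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1                    ; number of process copies to start (def 1)
;directory=/tmp                ; directory to cwd to before exec (def no cwd)
;umask=022                     ; umask for process (default None)
;priority=999                  ; the relative start priority (default 999)
;autostart=true                ; start at supervisord start (default: true)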
;autorestart=unexpected        ; whether/when to restart (default: unexpected)
;startsecs=1                   ; number of secs prog must stay running (def. 1)
;startretries=3                ; max # of serial start failures (default 3)
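;exitcodes=0,2                 ; 'expected' exit codes for process (default 0,2)
;stopsignal=QUIT               ; signal used to kill process (default TERM)
;stopwaitsecs=10               ; max num secs to wait b4 SIGKILL (default 10)
;stopasgroup=false             ; send stop signal to the UNIX process group (default false)
;killasgroup=false             ; SIGKILL the UNIX process group (def false)
;user=chrism                   ; setuid to this UNIX account to run the program
;redirect_stderr=true          ; redirect proc stderr to stdout (default false)
;stdout_logfile=/a/path        ; stdout log path, NONE for none; default AUTO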
;stdout_logfile_maxbytes=1MB   ; max # logfile bytes b4 rotation (default 50MB)
;stdout_logfile_backups=10     ; # of stdout logfile backups (default 10)
;stdout_capture_maxbytes=1MB   ; number of bytes in 'capturemode' (default 0)
;stdout_events_enabled=false   ; emit events on stdout writes (default false)
;stderr_logfile=/a/path        ; stderr log path, NONE for none; default AUTO
;stderr_logfile_maxbytes=1MB   ; max # logfile bytes b4 rotation (default 50MB)
;stderr_logfile_backups=10     ; # of stderr logfile backups (default 10)
;stderr_capture_maxbytes=1MB   ; number of bytes in 'capturemode' (default 0)
;stderr_events_enabled=false   ; emit events on stderr writes (default false)
;environment=A=1,B=2           ; process environment additions (def no adds)
;serverurl=AUTO                ; override serverurl computation (childutils)
 
; The below sample eventlistener section shows all possible
; eventlistener subsection values, create one or more 'real'
; eventlistener: sections to be able to handle event notifications
; sent by supervisor.
 
;[eventlistener:theeventlistenername]
;command=/bin/eventlistener    ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1                    ; number of process copies to start (def 1)
;events=EVENT                  ; event notif. types to subscribe to (req'd)
;buffer_size=10                ; event buffer queue size (default 10)
;directory=/tmp                ; directory to cwd to before exec (def no cwd)
;umask=022                     ; umask for process (default None)
;priority=-1                   ; the relative start priority (default -1)
;autostart=true                ; start at supervisord start (default: true)
;autorestart=unexpected        ; whether/when to restart (default: unexpected)
;startsecs=1                   ; number of secs prog must stay running (def. 1)
;startretries=3                ; max # of serial start failures (default 3)
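;exitcodes=0,2                 ; 'expected' exit codes for process (default 0,2)
;stopsignal=QUIT               ; signal used to kill process (default TERM)
;stopwaitsecs=10               ; max num secs to wait b4 SIGKILL (default 10)
;stopasgroup=false             ; send stop signal to the UNIX process group (default false)
;killasgroup=false             ; SIGKILL the UNIX process group (def false)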
;user=chrism                   ; setuid to this UNIX account to run the program
;redirect_stderr=true          ; redirect proc stderr to stdout (default false)
;stdout_logfile=/a/path        ; stdout log path, NONE for none; default AUTO
;stdout_logfile_maxbytes=1MB   ; max # logfile bytes b4 rotation (default 50MB)
;stdout_logfile_backups=10     ; # of stdout logfile backups (default 10)
;stdout_events_enabled=false   ; emit events on stdout writes (default false)
;stderr_logfile=/a/path        ; stderr log path, NONE for none; default AUTO
;stderr_logfile_maxbytes=1MB   ; max # logfile bytes b4 rotation (default 50MB)
;stderr_logfile_backups=10     ; # of stderr logfile backups (default 10)
;stderr_events_enabled=false   ; emit events on stderr writes (default false)
;environment=A=1,B=2           ; process environment additions
;serverurl=AUTO                ; override serverurl computation (childutils)
 
; The below sample group section shows all possible group values,
; create one or more 'real' group: sections to create "heterogeneous"
; process groups.
 
;[group:thegroupname]
;programs=progname1,progname2  ; each refers to 'x' in [program:x] definitions
;priority=999                  ; the relative start priority (default 999)
 
; The [include] section can just contain the "files" setting.  This
; setting can list multiple files (separated by whitespace or
; newlines).  It can also contain wildcards.  The filenames are
; interpreted as relative to this file.  Included files *cannot*
; include files themselves.
 
;[include]
;files = relative/directory/*.ini
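
The sample file mentions expansions such as %(ENV_HOME)s and %(program_name)s. The syntax matches Python's %-style formatting with named keys, so a small Python sketch can show what such a value turns into. The expand helper and the values passed to it are hypothetical, and this only imitates the syntax; it is not supervisord's own code.

```python
def expand(template, **context):
    """Fill %(name)s placeholders in template from the keyword arguments."""
    return template % context


if __name__ == "__main__":
    log_path = expand(
        "%(ENV_HOME)s/logs/%(program_name)s.log",
        ENV_HOME="/home/chg",               # stands in for the HOME environment variable
        program_name="pyspider-webui",      # stands in for the [program:x] name
    )
    print(log_path)  # /home/chg/logs/pyspider-webui.log
```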