1. Following the installation method recommended by the official pyspider documentation, install pyspider directly with pip
# sudo pip install pyspider
Installing collected packages: click, itsdangerous, Werkzeug, Flask, chardet, cssselect, lxml, pyquery, ordereddict, backports.ssl-match-hostname, singledispatch, certifi, backports-abc, tornado, Flask-Login, u-msgpack-python, wsgidav, pyspider
Running setup.py install for click
Running setup.py install for itsdangerous
Running setup.py install for chardet
Running setup.py install for lxml
Complete output from command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-build-rpK3h8/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-mmscE4-record/install-record.txt --single-version-externally-managed --compile:
Building lxml version 3.6.0.
Building without Cython.
ERROR: /bin/sh: xslt-config: command not found
** make sure the development packages of libxml2 and libxslt are installed **
Using build configuration of libxslt
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.6
creating build/lib.linux-x86_64-2.6/lxml
copying src/lxml/_elementpath.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/doctestcompare.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/cssselect.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/ElementInclude.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/__init__.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/builder.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/pyclasslookup.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/usedoctest.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/sax.py -> build/lib.linux-x86_64-2.6/lxml
creating build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/__init__.py -> build/lib.linux-x86_64-2.6/lxml/includes
creating build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/defs.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/html5parser.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/_html5builder.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/_setmixin.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/_diffcommand.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/__init__.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/builder.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/clean.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/ElementSoup.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/usedoctest.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/soupparser.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/formfill.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/diff.py -> build/lib.linux-x86_64-2.6/lxml/html
creating build/lib.linux-x86_64-2.6/lxml/isoschematron
copying src/lxml/isoschematron/__init__.py -> build/lib.linux-x86_64-2.6/lxml/isoschematron
copying src/lxml/lxml.etree.h -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/lxml.etree_api.h -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/includes/config.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/schematron.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xmlschema.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/relaxng.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/c14n.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xslt.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xpath.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/uri.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/etreepublic.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/tree.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xmlerror.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/htmlparser.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/dtdvalid.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xmlparser.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xinclude.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/etree_defs.h -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/lxml-version.h -> build/lib.linux-x86_64-2.6/lxml/includes
creating build/lib.linux-x86_64-2.6/lxml/isoschematron/resources
creating build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/rng
copying src/lxml/isoschematron/resources/rng/iso-schematron.rng -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/rng
creating build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl
copying src/lxml/isoschematron/resources/xsl/RNG2Schtrn.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl
copying src/lxml/isoschematron/resources/xsl/XSD2Schtrn.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl
creating build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_message.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_abstract_expand.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_svrl_for_xslt1.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_dsdl_include.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_skeleton_for_xslt1.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/readme.txt -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
running build_ext
building 'lxml.etree' extension
creating build/temp.linux-x86_64-2.6
creating build/temp.linux-x86_64-2.6/src
creating build/temp.linux-x86_64-2.6/src/lxml
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -Isrc/lxml/includes -I/usr/include/python2.6 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.6/src/lxml/lxml.etree.o -w
src/lxml/lxml.etree.c:82:20: error: Python.h: No such file or directory
src/lxml/lxml.etree.c:84:6: error: #error Python headers needed to compile C extensions, please install development version of Python.
Compile failed: command 'gcc' failed with exit status 1
creating tmp
cc -I/usr/include/libxml2 -c /tmp/xmlXPathInitlSjq1o.c -o tmp/xmlXPathInitlSjq1o.o
/tmp/xmlXPathInitlSjq1o.c:1:26: error: libxml/xpath.h: No such file or directory
*********************************************************************************
Could not find function xmlCheckVersion in library libxml2. Is libxml2 installed?
*********************************************************************************
error: command 'gcc' failed with exit status 1
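The error output shows that the installation failed because the libxml2 and libxslt libraries (including their development files) are not installed, so the xslt-config executable cannot be found. On CentOS, libxslt-devel should also pull in libxml2-devel as a dependency. To fix this: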
# sudo yum install libxml2
# sudo yum install libxslt
# sudo yum install libxslt-devel
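Before rerunning pip, it can help to confirm that the config scripts the lxml build looks for are actually on the PATH. The following is a minimal sketch, assuming Python 3 (the log above comes from Python 2.6, where `shutil.which` does not exist). The helper name `missing_tools` is hypothetical, and `xml2-config` is included as an assumption; the log itself only complains about `xslt-config`.

```python
import shutil

# Config scripts queried during the lxml build. xslt-config is the one named
# in the error log; xml2-config is added here as an assumption.
REQUIRED_TOOLS = ("xml2-config", "xslt-config")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing: " + ", ".join(missing))
    else:
        print("xml2-config and xslt-config are both available")
```

If the script reports a missing tool after the yum commands above, the corresponding -devel package is probably still not installed.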
2. Run the pip command again to install pyspider
# sudo pip install pyspider
...
Running setup.py install for lxml
Complete output from command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-build-EQanqI/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-WzvLSh-record/install-record.txt --single-version-externally-managed --compile:
Building lxml version 3.6.0.
Building without Cython.
Using build configuration of libxslt 1.1.26
Building against libxml2/libxslt in the following directory: /usr/lib64
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.6
creating build/lib.linux-x86_64-2.6/lxml
copying src/lxml/_elementpath.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/doctestcompare.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/cssselect.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/ElementInclude.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/__init__.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/builder.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/pyclasslookup.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/usedoctest.py -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/sax.py -> build/lib.linux-x86_64-2.6/lxml
creating build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/__init__.py -> build/lib.linux-x86_64-2.6/lxml/includes
creating build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/defs.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/html5parser.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/_html5builder.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/_setmixin.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/_diffcommand.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/__init__.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/builder.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/clean.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/ElementSoup.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/usedoctest.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/soupparser.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/formfill.py -> build/lib.linux-x86_64-2.6/lxml/html
copying src/lxml/html/diff.py -> build/lib.linux-x86_64-2.6/lxml/html
creating build/lib.linux-x86_64-2.6/lxml/isoschematron
copying src/lxml/isoschematron/__init__.py -> build/lib.linux-x86_64-2.6/lxml/isoschematron
copying src/lxml/lxml.etree.h -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/lxml.etree_api.h -> build/lib.linux-x86_64-2.6/lxml
copying src/lxml/includes/config.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/schematron.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xmlschema.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/relaxng.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/c14n.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xslt.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xpath.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/uri.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/etreepublic.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/tree.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xmlerror.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/htmlparser.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/dtdvalid.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xmlparser.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/xinclude.pxd -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/etree_defs.h -> build/lib.linux-x86_64-2.6/lxml/includes
copying src/lxml/includes/lxml-version.h -> build/lib.linux-x86_64-2.6/lxml/includes
creating build/lib.linux-x86_64-2.6/lxml/isoschematron/resources
creating build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/rng
copying src/lxml/isoschematron/resources/rng/iso-schematron.rng -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/rng
creating build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl
copying src/lxml/isoschematron/resources/xsl/RNG2Schtrn.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl
copying src/lxml/isoschematron/resources/xsl/XSD2Schtrn.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl
creating build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_message.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_abstract_expand.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_svrl_for_xslt1.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_dsdl_include.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_skeleton_for_xslt1.xsl -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/readme.txt -> build/lib.linux-x86_64-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
running build_ext
building 'lxml.etree' extension
creating build/temp.linux-x86_64-2.6
creating build/temp.linux-x86_64-2.6/src
creating build/temp.linux-x86_64-2.6/src/lxml
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/libxml2 -Isrc/lxml/includes -I/usr/include/python2.6 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.6/src/lxml/lxml.etree.o -w
src/lxml/lxml.etree.c:82:20: error: Python.h: No such file or directory
src/lxml/lxml.etree.c:84:6: error: #error Python headers needed to compile C extensions, please install development version of Python.
Compile failed: command 'gcc' failed with exit status 1
creating tmp
cc -I/usr/include/libxml2 -I/usr/include/libxml2 -c /tmp/xmlXPathInitd0Umi6.c -o tmp/xmlXPathInitd0Umi6.o
cc tmp/xmlXPathInitd0Umi6.o -L/usr/lib64 -lxml2 -o a.out
error: command 'gcc' failed with exit status 1
The error output shows that the installation failed with "Python.h: No such file or directory", meaning the Python.h header file cannot be found. To fix this:
# sudo yum install python-devel
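Whether the header is now in place can be checked from Python itself. This is a rough sketch, assuming Python 2.7 or 3 (the `sysconfig` module is not available on the Python 2.6 used above); the function names are hypothetical.

```python
import os
import sysconfig


def python_header_path():
    """Return the path where this interpreter expects Python.h to live."""
    return os.path.join(sysconfig.get_paths()["include"], "Python.h")


def has_python_headers():
    """True if Python.h exists, i.e. the development package is installed."""
    return os.path.isfile(python_header_path())


if __name__ == "__main__":
    status = "found" if has_python_headers() else "MISSING"
    print(python_header_path(), status)
```

Run it with the same interpreter that pip uses, since each Python installation has its own include directory.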
After this, running sudo pip install pyspider once more completes the pyspider installation.
3. Start pyspider to verify that the installation succeeded
# pyspider
Traceback (most recent call last):
File "/usr/bin/pyspider", line 5, in <module>
from pkg_resources import load_entry_point
File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
working_set.require(__requires__)
File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
needed = self.resolve(parse_requirements(requirements))
File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: pyquery
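The error shows that pyquery cannot be found when pyspider is run, even though pyquery was in fact installed together with pyspider, as pip freeze confirms. The cause of the problem is unknown; upgrading setuptools resolved it: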
# sudo pip install -U setuptools
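To see which distributions the interpreter can actually discover, a small check along the following lines may be useful. It is only a sketch: it assumes Python 3.8 or later and uses the standard-library `importlib.metadata`, which is not the `pkg_resources` mechanism that raised the error above. The helper names `installed_version` and `report` are hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if not found."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(dist_names):
    """Map each distribution name to its version (None means not discoverable)."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    for name, version in report(["pyspider", "pyquery", "setuptools"]).items():
        print(name, version or "NOT FOUND")
```

A distribution that pip freeze lists but this check cannot find would point to a metadata or path problem rather than a missing package.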
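4. Install the drivers that pyspider needs
When pyspider is started with the bare pyspider command, it creates a data directory under the current directory by default and stores its sqlite database files there; in this case no database connector needs to be installed. To run pyspider as a distributed cluster, you need mysql, mongodb, or postgresql as the backend database and rabbitmq, redis, beanstalk, or kombu as the message queue. The officially recommended approach is to install all drivers with pip install pyspider[all], but this installs drivers you will not use and is also quite likely to fail. So if the backend database and message queue have already been chosen, install only the drivers you need. For example, with mongodb as the backend database and redis as the message queue, install them as follows: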
# sudo pip install pymongo
# sudo pip install redis
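Once the two drivers are installed, a quick import check can confirm that Python can locate them. This is a minimal sketch, assuming Python 3; the `DRIVERS` mapping and the `missing_drivers` helper are hypothetical names introduced for illustration, and the check only looks for the modules without connecting to any server.

```python
import importlib.util

# Import names of the drivers chosen in the text: pymongo for mongodb,
# redis for redis.
DRIVERS = {"mongodb": "pymongo", "redis": "redis"}


def missing_drivers(drivers=DRIVERS):
    """Return the backends whose Python driver module cannot be found."""
    return sorted(
        backend
        for backend, module in drivers.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_drivers()
    print("missing drivers: " + (", ".join(missing) if missing else "none"))
```

For a different backend or message queue, replace the entries in the mapping with the import names of the drivers you installed.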