While learning web scraping I searched through plenty of tutorials online and initially decided to develop on Linux, which meant running a virtual machine. But the VM was laggy and ate system resources, so I decided to try installing Scrapy directly on Windows 7. The install process turned out to be one pitfall after another, exhausting. After a lot of digging I finally found a genuine lifesaver for installing Scrapy: Anaconda. Download it here: https://www.continuum.io/downloads
System: Windows 7, 64-bit
I chose the 2.7 build, since 2.7 is the more mature branch. Click download, then install straight through. One screen asks whether to overwrite the locally installed Python; choose yes. It is best to use the Python bundled with the installer, otherwise unpredictable errors can appear. Alternatively, uninstall any locally installed Python first and delete its directory by hand. That is what I did: uninstall the local version, delete its directory, then click next all the way through, which is the least hassle. By default the installer sets up the latest bundled Python.
After the install finishes, check the Python version. Open cmd as administrator:
Run: python
The output shows the latest version, so everything is in order.
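The same check can be done from inside the interpreter. A quick sketch that prints the version and the install path, so you can confirm the shell is picking up the Anaconda Python rather than a leftover system install:

```python
import sys

# Print the interpreter version and its location on disk. After the
# Anaconda install above, the path should point inside the Anaconda2
# directory rather than any previously installed Python.
print(sys.version.split()[0])
print(sys.executable)
```

If `sys.executable` still points at an old install directory, the PATH entry for the previous Python is taking precedence.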
Run: conda install scrapy
C:\Windows\System32>conda install scrapy
Fetching package metadata .........
Solving package specifications: ..........

Package plan for installation in environment C:\Program Files\Anaconda2:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    twisted-16.6.0             |           py27_0         4.4 MB
    service_identity-16.0.0    |           py27_0          13 KB
    scrapy-1.1.1               |           py27_0         378 KB
    ------------------------------------------------------------
                                           Total:         4.8 MB

The following NEW packages will be INSTALLED:

    attrs:            15.2.0-py27_0
    conda-env:        2.6.0-0
    constantly:       15.1.0-py27_0
    cssselect:        1.0.0-py27_0
    incremental:      16.10.1-py27_0
    parsel:           1.0.3-py27_0
    pyasn1-modules:   0.0.8-py27_0
    pydispatcher:     2.0.5-py27_0
    queuelib:         1.4.2-py27_0
    scrapy:           1.1.1-py27_0
    service_identity: 16.0.0-py27_0
    twisted:          16.6.0-py27_0
    w3lib:            1.16.0-py27_0
    zope:             1.0-py27_0
    zope.interface:   4.3.2-py27_0

The following packages will be UPDATED:

    conda:            4.2.9-py27_0 --> 4.2.13-py27_0

Proceed ([y]/n)? y

Fetching packages ...
An unexpected error has occurred.
| ETA:  0:11:48   4.17 kB/s

Please consider posting the following information to the
conda GitHub issue tracker at:

    https://github.com/conda/conda/issues

Current conda install:

               platform : win-64
          conda version : 4.2.9
       conda is private : False
      conda-env version : 4.2.9
    conda-build version : 2.0.2
         python version : 2.7.12.final.0
       requests version : 2.11.1
       root environment : C:\Program Files\Anaconda2  (writable)
    default environment : C:\Program Files\Anaconda2
       envs directories : C:\Program Files\Anaconda2\envs
          package cache : C:\Program Files\Anaconda2\pkgs
           channel URLs : https://repo.continuum.io/pkgs/free/win-64/
                          https://repo.continuum.io/pkgs/free/noarch/
                          https://repo.continuum.io/pkgs/pro/win-64/
                          https://repo.continuum.io/pkgs/pro/noarch/
                          https://repo.continuum.io/pkgs/msys2/win-64/
                          https://repo.continuum.io/pkgs/msys2/noarch/
            config file : None
           offline mode : False

`$ C:\Program Files\Anaconda2\Scripts\conda-script.py install scrapy`

Traceback (most recent call last):
  File "C:\Program Files\Anaconda2\lib\site-packages\conda\exceptions.py", line 473, in conda_exception_handler
    return_value = func(*args, **kwargs)
  File "C:\Program Files\Anaconda2\lib\site-packages\conda\cli\main.py", line 144, in _main
    exit_code = args.func(args, p)
  File "C:\Program Files\Anaconda2\lib\site-packages\conda\cli\main_install.py", line 80, in execute
    install(args, parser, 'install')
  File "C:\Program Files\Anaconda2\lib\site-packages\conda\cli\install.py", line 420, in install
    raise CondaRuntimeError('RuntimeError: %s' % e)
CondaRuntimeError: Runtime error: RuntimeError: Runtime error: Could not open
u'C:\\Program Files\\Anaconda2\\pkgs\\twisted-16.6.0-py27_0.tar.bz2.part' for writing
(HTTPSConnectionPool(host='repo.continuum.io', port=443): Read timed out.)
The install timed out while downloading the Twisted package, so install that library on its own first.
Run: conda install twisted
C:\Windows\System32>conda install twisted
Fetching package metadata .........
Solving package specifications: ..........

Package plan for installation in environment C:\Program Files\Anaconda2:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    twisted-16.6.0             |           py27_0         4.4 MB

The following NEW packages will be INSTALLED:

    conda-env:      2.6.0-0
    constantly:     15.1.0-py27_0
    incremental:    16.10.1-py27_0
    twisted:        16.6.0-py27_0
    zope:           1.0-py27_0
    zope.interface: 4.3.2-py27_0

The following packages will be UPDATED:

    conda:          4.2.9-py27_0 --> 4.2.13-py27_0

Proceed ([y]/n)? y

Fetching packages ...
twisted-16.6.0 100% |###############################| Time: 0:01:09  66.89 kB/s
Extracting packages ...
[      COMPLETE      ]|##################################################| 100%
Unlinking packages ...
[      COMPLETE      ]|##################################################| 100%
Linking packages ...
[      COMPLETE      ]|##################################################| 100%
It installs cleanly with no errors; now install Scrapy itself.
Run: conda install scrapy
C:\Windows\System32>conda install scrapy
Fetching package metadata .........
Solving package specifications: ..........

Package plan for installation in environment C:\Program Files\Anaconda2:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    service_identity-16.0.0    |           py27_0          13 KB
    scrapy-1.1.1               |           py27_0         378 KB
    ------------------------------------------------------------
                                           Total:         391 KB

The following NEW packages will be INSTALLED:

    attrs:            15.2.0-py27_0
    cssselect:        1.0.0-py27_0
    parsel:           1.0.3-py27_0
    pyasn1-modules:   0.0.8-py27_0
    pydispatcher:     2.0.5-py27_0
    queuelib:         1.4.2-py27_0
    scrapy:           1.1.1-py27_0
    service_identity: 16.0.0-py27_0
    w3lib:            1.16.0-py27_0

Proceed ([y]/n)? y

Fetching packages ...
service_identi 100% |###############################| Time: 0:00:00  68.39 kB/s
scrapy-1.1.1-p 100% |###############################| Time: 0:00:05  65.50 kB/s
Extracting packages ...
[      COMPLETE      ]|##################################################| 100%
Linking packages ...
[      COMPLETE      ]|##################################################| 100%
With Twisted already installed, this run no longer times out and finishes without any errors.
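If you would rather avoid the two-step workaround, later conda releases also let you raise the download timeout in the user's .condarc file. A sketch, assuming a conda version that supports these keys (the conda 4.2 shown above may not; check your version's `conda config` documentation):

```
# %USERPROFILE%\.condarc on Windows (~/.condarc elsewhere)
remote_read_timeout_secs: 180.0
remote_max_retries: 3
```

With a longer read timeout, slow downloads of large packages such as Twisted are less likely to abort mid-transfer.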
Now test whether the install succeeded with the following commands:
scrapy
scrapy startproject hello
C:\Windows\System32>scrapy
Scrapy 1.1.1 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  commands
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command

C:\Windows\System32>d:

D:\>dir
 Volume in drive D has no label.
 Volume Serial Number is 0002-9E3C

 Directory of D:\

2016/12/03  12:20       399,546,128 Anaconda2-4.2.0-Windows-x86_64.exe
2016/12/03  09:43    <DIR>          Program Files (x86)
2016/12/03  16:57    <DIR>          python-project
2016/12/03  09:43    <DIR>          新建文件夾
2016/12/03  12:19    <DIR>          迅雷下載
               1 File(s)    399,546,128 bytes
               4 Dir(s)  38,932,201,472 bytes free

D:\>cd python-project

D:\python-project>scrapy startproject hello
New Scrapy project 'hello', using template directory 'C:\\Program Files\\Anaconda2\\lib\\site-packages\\scrapy\\templates\\project', created in:
    D:\python-project\hello

You can start your first spider with:
    cd hello
    scrapy genspider example example.com

D:\python-project>tree /f
Folder PATH listing
Volume serial number is 0002-9E3C
D:.
└─hello
    │  scrapy.cfg
    │
    └─hello
        │  items.py
        │  pipelines.py
        │  settings.py
        │  __init__.py
        │
        └─spiders
                __init__.py

D:\python-project>
As you can see, the scrapy command can now create a crawler project. The rest is happy coding.
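As one last sanity check, you can confirm from Python itself that the freshly installed package is importable. A minimal sketch (the helper name scrapy_status is just for illustration):

```python
def scrapy_status():
    """Return a one-line status string for the Scrapy install."""
    try:
        import scrapy  # succeeds only if the conda install above worked
        return "Scrapy %s is importable" % scrapy.__version__
    except ImportError:
        return "Scrapy is not importable; try: conda install scrapy"

print(scrapy_status())
```

Running this in the Anaconda Python should report the installed version (1.1.1 in the session above); any other interpreter on the machine will report that Scrapy is not importable, which is a quick way to spot a PATH mix-up.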