A Summary for Python Beginners Who Already Have Programming Experience

Date: 2019-11-21


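When I started learning Python, there were things I wish I had known from the start. It took me quite a while to learn them, and I wanted to compile those points into a single article. The target reader is an experienced programmer who is just starting out with Python and wants to skip the first few months of researching the Python equivalents of tools they already use. The sections on package management and standard tools should also be helpful for beginners.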

My experience is based mainly on Python 2.7, but most of these tools work with any version.

If you have never used Python, I strongly recommend reading the Python introduction, because you will need to know the basic syntax and types.

Package Management

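One of the best things about the Python world is the huge number of third-party packages. Managing those packages is also very easy. By convention, the packages a project needs are listed in a requirements.txt file, one package per line, usually with a version number. Here is an example; this blog uses Pelican: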

pelican==3.3
Markdown
pelican-extended-sitemap==1.0.0
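To make the file format concrete, here is a minimal sketch of reading such a file. The function name parse_requirements is made up for illustration, and it only understands bare names and exact == pins; pip itself accepts a much richer syntax.

```python
def parse_requirements(text):
    """Return (name, version) pairs; version is None for unpinned packages.

    Illustrative only: handles bare names and exact '==' pins, nothing else.
    """
    pins = []
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, separator, version = line.partition("==")
        pins.append((name.strip(), version.strip() if separator else None))
    return pins


SAMPLE = """pelican==3.3
Markdown
pelican-extended-sitemap==1.0.0
"""

for name, version in parse_requirements(SAMPLE):
    print(name, version if version is not None else "(any version)")
```

An unpinned line such as Markdown means "whatever version pip picks at install time", which is why pinning is the usual choice for reproducible builds.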

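One drawback of Python packages is that they are installed globally by default. We are going to use a tool called virtualenv that gives each project its own isolated environment. We will also install a more advanced package manager called pip, which works well with virtualenv.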

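First, we need to install pip. Most Python installations already include easy_install (Python's default package management tool), so we will use easy_install pip to install pip. This should be the last time you use easy_install. If you do not have easy_install, on Linux systems it appears to be available from the python-setuptools package. (Python 2.7.9 and later, and Python 3.4 and later, already ship with pip, so this step can be skipped there.)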

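If you are using Python 3.3 or later, the venv module, which covers the core of what virtualenv does, is already part of the standard library, so there is no need to install virtualenv separately.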
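As a rough sketch of what the standard-library route looks like on Python 3, the snippet below creates an environment with venv.create. The helper name create_environment and the directory name test are assumptions for illustration, and with_pip=False is used only to keep the example quick.

```python
import os
import tempfile
import venv


def create_environment(parent_dir, name="test"):
    """Create an isolated environment under parent_dir and return its path."""
    env_dir = os.path.join(parent_dir, name)
    # with_pip=False skips bootstrapping pip, which keeps this sketch fast.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        env_dir = create_environment(scratch)
        # Every venv-created environment has a pyvenv.cfg at its root.
        print(os.path.isfile(os.path.join(env_dir, "pyvenv.cfg")))
```

From a shell, python -m venv followed by a directory name does the same thing.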

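Next, you will want to install virtualenv and virtualenvwrapper. Virtualenv lets you create an isolated environment for each project, which is especially useful when different projects use different versions of the same package. Virtualenvwrapper provides some nice scripts that make things easier.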

sudo pip install virtualenvwrapper

virtualenvwrapper lists virtualenv as a dependency, so installing it installs virtualenv automatically.

Open a new shell and type mkvirtualenv test. If you open another shell, you will not be in this virtualenv; you can activate it with workon test. When you are done working, use deactivate to leave it.
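If you are ever unsure whether a script is running inside an activated environment, a small check like the following can help. The function name in_virtual_environment is made up for this sketch; the sys attributes it reads are the ones virtualenv and venv set.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv (and newer virtualenv) make sys.prefix differ from sys.base_prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtual environment" if in_virtual_environment() else "global interpreter")
    print(sys.prefix)
```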

IPython

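IPython is a replacement for the standard Python interactive shell. It supports auto-completion, quick access to documentation, and many other features that the standard interactive shell really should have.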

When you are inside a virtual environment, you can install it simply with pip install ipython and start it from the command line with ipython.

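Another nice feature is the "notebook", which requires additional components. Once they are installed, you can run ipython notebook and get a nice web UI in which you can create notebooks. This is very popular in scientific computing. (In current releases the notebook lives in the Jupyter project and is started with jupyter notebook.)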

Testing

I recommend nose or py.test. I mostly use nose. They are basically similar, and I will go into some detail on nose.

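Here is a laughably contrived example of a test written for nose. Every function whose name starts with test_, in a file whose name starts with test_, will be called: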

def test_equality():
    assert True == False

Unsurprisingly, our test fails when we run nose.

(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ nosetests                                                                                                                                      
F
======================================================================
FAIL: test_nose_example.test_equality
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jhaddad/.virtualenvs/test/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/jhaddad/.virtualenvs/test/src/test_nose_example.py", line 3, in test_equality
    assert True == False
AssertionError

----------------------------------------------------------------------
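The naming convention is easier to picture with a toy runner. The sketch below is a simplified illustration of the idea, not nose's actual implementation; collect_tests, run_tests, and the throwaway module are all hypothetical names.

```python
import types


def collect_tests(module):
    """Return (name, function) pairs for functions whose names start with test_."""
    return sorted(
        (name, obj)
        for name, obj in vars(module).items()
        if name.startswith("test_") and isinstance(obj, types.FunctionType)
    )


def run_tests(module):
    """Call each collected test and record whether its assertions held."""
    outcomes = {}
    for name, func in collect_tests(module):
        try:
            func()
            outcomes[name] = "ok"
        except AssertionError:
            outcomes[name] = "FAIL"
    return outcomes


def _always_fails():
    assert True == False


def _always_passes():
    assert True


# A throwaway module standing in for a file named test_nose_example.py.
example = types.ModuleType("test_nose_example")
example.test_equality = _always_fails
example.test_truth = _always_passes
example.helper = _always_passes  # ignored: the name does not start with test_

if __name__ == "__main__":
    print(run_tests(example))
```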

nose.tools also has some convenience functions you can call:

from nose.tools import assert_true
def test_equality():
    assert_true(False)

If you prefer a more JUnit-like approach, that works too:

from nose.tools import assert_true
from unittest import TestCase

class ExampleTest(TestCase):

    def setUp(self): # setUp & tearDown are both available
        self.blah = False

    def test_blah(self):
        self.assertTrue(self.blah)

Run the tests:

(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ nosetests                                                                                                                                      
F
======================================================================
FAIL: test_blah (test_nose_example.ExampleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jhaddad/.virtualenvs/test/src/test_nose_example.py", line 11, in test_blah
    self.assertTrue(self.blah)
AssertionError: False is not true

----------------------------------------------------------------------
Ran 1 test in 0.003s

FAILED (failures=1)
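Because these are plain unittest test cases, they can also be run without nose. The following is a minimal sketch using only the standard library; the class ResourceTest and the helper run_example are invented for illustration, and it shows setUp and tearDown running around the test method.

```python
import io
import unittest


class ResourceTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method.
        self.resource = ["open"]

    def tearDown(self):
        # Runs after every test method, even when the test fails.
        self.resource.clear()

    def test_resource_is_ready(self):
        self.assertEqual(self.resource, ["open"])


def run_example():
    """Run ResourceTest with the stock unittest runner and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ResourceTest)
    # Send the runner's report to a buffer so the caller decides what to print.
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)


if __name__ == "__main__":
    result = run_example()
    print(result.testsRun, result.wasSuccessful())
```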

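The excellent Mock library is included in Python 3 (as unittest.mock, since Python 3.3); if you are using Python 2, you can install it from PyPI. The test below makes a remote call, and that call takes 10 seconds. The example is obviously contrived. We will use mock to return sample data instead of actually making the call.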

import mock

from mock import patch
from time import sleep

class Sweetness(object):
    def slow_remote_call(self):
        sleep(10)
        return "some_data" # lets pretend we get this back from our remote api call

def test_long_call():
    s = Sweetness()
    result = s.slow_remote_call()
    assert result == "some_data"

Of course, our test takes a long time.

(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ nosetests test_mock.py                                                                                                                         

Ran 1 test in 10.001s

OK

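Too slow! So we ask ourselves: what are we testing? Do we need to test that the remote call works, or do we want to test what we do with the data once we have it? In most cases it is the latter. Let's get rid of this silly remote call: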

import mock

from mock import patch
from time import sleep

class Sweetness(object):
    def slow_remote_call(self):
        sleep(10)
        return "some_data" # lets pretend we get this back from our remote api call

def test_long_call():
    s = Sweetness()
    with patch.object(s, "slow_remote_call", return_value="some_data"):
        result = s.slow_remote_call()
    assert result == "some_data"

OK, let's try again:

(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ nosetests test_mock.py                                                                                                                         
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK

Much better. Keep in mind that this example is absurdly simplified. Personally, I only mock out calls to remote systems, not calls to my database.
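For Python 3 readers, here is a sketch of the same idea using unittest.mock from the standard library. The summarize function is a hypothetical stand-in for "what we do with the data once we have it", and the final assertion on the mock checks that the call was still made exactly once.

```python
from time import sleep
from unittest import mock  # standard library since Python 3.3


class Sweetness(object):
    def slow_remote_call(self):
        sleep(10)
        return "some_data"  # pretend this came back from a remote API


def summarize(client):
    """Hypothetical code under test: post-process whatever the remote call returns."""
    return client.slow_remote_call().upper()


def check_summarize():
    s = Sweetness()
    # Replace the slow method on this one instance for the duration of the block.
    with mock.patch.object(s, "slow_remote_call", return_value="some_data") as fake_call:
        result = summarize(s)
    # The mock records how it was used, so the interaction can be verified too.
    fake_call.assert_called_once_with()
    return result


if __name__ == "__main__":
    print(check_summarize())
```

Testing summarize rather than the mocked method itself keeps the test about our own logic, which is the point made above.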

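nose-progressive is a nice module that improves nose's output so that errors are shown as they happen rather than held until the end. That is a good thing if your tests take a while to run.
Install it with pip install nose-progressive and add --with-progressive to your nosetests command.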

Debugging

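iPDB is an excellent tool; I have used it to track down a lot of baffling bugs. Install it with pip install ipdb, then put import ipdb; ipdb.set_trace() in your code, and you will get a nice interactive prompt while your program runs. From there you can step through the program one line at a time and inspect variables.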

Python has a nice built-in trace module that has helped me figure out what is going on. Here is a rather useless Python program:

a = 1
b = 2
a = b

And here is the trace of that program:

(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ python -m trace --trace tracing.py
 --- modulename: tracing, funcname: <module>
tracing.py(1): a = 1
tracing.py(2): b = 2
tracing.py(3): a = b
 --- modulename: trace, funcname: _unsettrace
trace.py(80):         sys.settrace(None)

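This is very useful when you are trying to understand the internals of someone else's program. If you have used strace before, it works in a very similar way.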
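The last line of the trace output hints at the mechanism: the interpreter's sys.settrace hook. As a rough sketch of that hook (not the trace module's own code), the helper below records which lines of a function run, as offsets from its def line. The names trace_lines and tracing_demo are made up for illustration.

```python
import sys


def trace_lines(func, *args):
    """Run func(*args) and return the executed lines as offsets from its def line."""
    executed = []
    target = func.__code__

    def tracer(frame, event, arg):
        # Record only 'line' events that belong to the function being studied.
        if event == "line" and frame.f_code is target:
            executed.append(frame.f_lineno - target.co_firstlineno)
        return tracer  # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        # Always restore whatever hook was installed before (usually None).
        sys.settrace(previous)
    return executed


def tracing_demo():
    a = 1
    b = 2
    a = b
    return a


if __name__ == "__main__":
    print(trace_lines(tracing_demo))
```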

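On a few occasions I have used pycallgraph to track down performance problems. It produces a graph of function calls showing how long they took and how many times they were made.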

Finally, objgraph is very useful for finding memory leaks. Here is a good article on how to use it to find memory leaks.

Gevent

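Gevent is a nice library that wraps Greenlets and gives Python asynchronous calls. Yes, it is great. My favorite feature is Pool, which abstracts away the asynchronous part and gives us something simple to use: an asynchronous map() function: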

from gevent import monkey
monkey.patch_all()

from time import sleep, time

def fetch_url(url):
    print("Fetching %s" % url)
    sleep(10)  # stands in for a slow network request
    print("Done fetching %s" % url)

from gevent.pool import Pool

urls = ["http://test.com", "http://bacon.com", "http://eggs.com"]

p = Pool(10)  # at most 10 greenlets run at once

start = time()
p.map(fetch_url, urls)
print(time() - start)

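It is very important to note the gevent monkey patching at the top of this code. Without it, sleep would block the whole process and the calls would run one after another. If we had Python call fetch_url three times in a row, we would normally expect it to take 30 seconds. With gevent: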

(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ python g.py                                                                                                                                    
Fetching http://test.com
Fetching http://bacon.com
Fetching http://eggs.com
Done fetching http://test.com
Done fetching http://bacon.com
Done fetching http://eggs.com
10.001791954

This is very useful if you have a lot of database calls or fetches from remote URLs. I am not a big fan of callbacks, so this abstraction works well for me.
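To see the shape of the speed-up without installing gevent, here is a sketch that swaps in the standard library's ThreadPoolExecutor. It uses threads rather than greenlets, so it is an analogy for the pooled map() pattern, not gevent itself; the DELAY value and helper names are assumptions chosen to keep the run short.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep

DELAY = 0.2  # seconds; a short stand-in for the 10-second sleep above

urls = ["http://test.com", "http://bacon.com", "http://eggs.com"]


def fetch_url(url):
    sleep(DELAY)  # pretend to wait on the network
    return "Done fetching %s" % url


def fetch_sequentially():
    return [fetch_url(url) for url in urls]


def fetch_concurrently():
    # pool.map keeps results in input order, like the built-in map().
    with ThreadPoolExecutor(max_workers=10) as pool:
        return list(pool.map(fetch_url, urls))


def timed(run):
    """Call run() and return (results, elapsed_seconds)."""
    start = perf_counter()
    results = run()
    return results, perf_counter() - start


if __name__ == "__main__":
    _, sequential_seconds = timed(fetch_sequentially)
    _, concurrent_seconds = timed(fetch_concurrently)
    print("sequential: %.2fs, pooled: %.2fs" % (sequential_seconds, concurrent_seconds))
```

The sequential version waits for each delay in turn, while the pooled version overlaps the waits, which is the same effect the gevent timing above demonstrates.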