Memory Not Reclaimed (Memory Leak) When Using urllib2's HttpResponse

  • Environment where the problem occurs: Python 2.7.1(X) and below, on Windows (or CentOS)

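The problem originates at line 1174 of lib/urllib2.py (Python 2.7.1), where a reference cycle is created. As a result, even calling gc.collect() does not release the HttpResponse and its related objects (they can be inspected through gc.garbage).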

    r.recv = r.read  # stores a bound method of r on r itself: r -> r.recv -> r
    fp = socket._fileobject(r, close=True)

    resp = addinfourl(fp, r.msg, req.get_full_url())
    resp.code = r.status
    resp.msg = r.reason
    return resp
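
The cycle created by `r.recv = r.read` can be reproduced without any network access. The following is a minimal sketch, not the urllib2 code itself: `FakeResponse` and `freed_by_refcount` are hypothetical names introduced only for illustration, and the snippet is written to run on both Python 2.7 and Python 3. It checks whether an object is freed by reference counting alone, first with the bound method left in place and then after the attribute is reset.

```python
import gc
import weakref


class FakeResponse(object):
    """Hypothetical stand-in for httplib.HTTPResponse (not the real class)."""

    def read(self, amt=None):
        return b""


def freed_by_refcount(break_cycle):
    """Return True if the object is freed as soon as its last name is deleted."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector out of the way during the check
    try:
        r = FakeResponse()
        r.recv = r.read      # same pattern as urllib2: r -> bound method -> r
        ref = weakref.ref(r)
        if break_cycle:
            r.recv = None    # drop the bound method so nothing points back at r
        del r
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()         # clean up the cycle left behind in the other case
```

Calling `freed_by_refcount(False)` leaves the object alive after `del r`, while `freed_by_refcount(True)` does not. Resetting the attribute is only one way to break such a cycle.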

This bug was reported on the official Python bug tracker long ago (see the two issues below), but it has not been formally fixed. Workarounds can be found in those two threads.

 

http://bugs.python.org/issue1208304

http://bugs.python.org/issue7464


  • As an aside, Python code written like the following (a mistake I made in my own code) produces the same kind of cycle and therefore the same memory leak.
    class T(object):
        def __init__(self):
            # Storing a bound method on the instance creates a cycle:
            # self -> self.__dict__ -> bound method -> self
            self.test = self.test0

        def test0(self, d={}):
            d['a'] = 1

Running it in the Python shell gives:

    Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import gc
    >>> gc.set_debug(gc.DEBUG_LEAK)
    >>> class T(object):
    ...     def __init__(self):
    ...         self.test = self.test0
    ...
    ...     def test0(self, d={}):
    ...         d['a'] = 1
    ...
    >>> t=T()
    >>> del t
    >>> gc.collect()
    gc: collectable <T 0260D870>
    gc: collectable <instancemethod 01DCFDF0>
    gc: collectable <dict 0260EA50>
    3
    >>> for _item in gc.garbage:
    ...     print _item
    ...
    <__main__.T object at 0x0260D870>
    <bound method T.test0 of <__main__.T object at 0x0260D870>>
    {'test': <bound method T.test0 of <__main__.T object at 0x0260D870>>}

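The objects listed in the output above are the ones that are not released. Note that gc reports them as collectable: gc.DEBUG_LEAK includes gc.DEBUG_SAVEALL, which stores unreachable objects in gc.garbage instead of freeing them. The underlying problem is still a reference cycle, which reference counting alone cannot free. Two functions from the gc module show why the cycle forms.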

    >>> t2=T()
    >>> gc.get_referrers(t2)
    [<bound method T.test0 of <__main__.T object at 0x0260D890>>, {'__builtins__': <module '__builtin__' (built-in)>, 't2': <__main__.T object at 0x0260D890>, '__package__': None, 'gc': <module 'gc' (built-in)>, 'T': <class '__main__.T'>, '__name__': '__main__', '__doc__': None, '_item': {'test': <bound method T.test0 of <__main__.T object at 0x0260D870>>}}]
    >>> for _item in gc.get_referrers(t2):
    ...     print _item
    ...
    <bound method T.test0 of <__main__.T object at 0x0260D890>>
    {'__builtins__': <module '__builtin__' (built-in)>, 't2': <__main__.T object at 0x0260D890>, '__package__': None, 'gc': <module 'gc' (built-in)>, 'T': <class '__main__.T'>, '__name__': '__main__', '__doc__': None, '_item': {...}}
    >>> for _item in gc.get_referents(t2):
    ...     print _item
    ...
    {'test': <bound method T.test0 of <__main__.T object at 0x0260D890>>}
    <class '__main__.T'>
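
gc.get_referrers: Return the list of objects that directly refer to any of objs.
Here it returns the objects that refer to t2, including the <bound method T.test0 of <__main__.T object at 0x0260D890>> object.

gc.get_referents: Return a list of objects directly referred to by any of the arguments.
Here it returns the objects that t2 refers to, including its instance dict, which holds the <bound method T.test0 of <__main__.T object at 0x0260D890>> object. Taken together, t2 -> dict -> bound method -> t2 is the cycle.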
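
The manual inspection above can be turned into a small check. The sketch below filters the result of gc.get_referrers for bound methods whose `__self__` is the object in question. `bound_method_referrers` and the control class `Plain` are hypothetical names introduced for illustration, and the code is written to run on both Python 2.7 and Python 3.

```python
import gc
import types


class T(object):
    def __init__(self):
        self.test = self.test0   # bound method stored on the instance

    def test0(self, d={}):
        d['a'] = 1


class Plain(object):
    """Hypothetical control class that stores no bound method."""

    def test0(self):
        return None


def bound_method_referrers(obj):
    """Return the bound methods known to gc that are bound to obj."""
    return [r for r in gc.get_referrers(obj)
            if isinstance(r, types.MethodType) and r.__self__ is obj]
```

For an instance of `T` this finds the method stored in `self.test`; for `Plain` it finds nothing. A non-empty result is a hint that some object holds a method bound to `obj`, which becomes a cycle when that holder is `obj` itself.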

  • The following cases do not produce a cycle:
    class T2(object):
        def __init__(self):
            pass

        def test(self):
            # The bound method is created per call and never stored on the instance.
            return self.test0()

        def test0(self, d={}):
            d['a'] = 1

    class T3(object):
        def __init__(self):
            # test0 is a classmethod, so the stored bound method refers to the class, not to self.
            self.test = self.test0

        @classmethod
        def test0(cls, d={}):
            d['a'] = 1
        kkk = test0
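
One way to confirm which variants form a cycle is to check whether an instance disappears as soon as its last name is deleted, with the cyclic collector switched off. This is a minimal sketch under that assumption: `freed_without_collector` is a hypothetical helper, the classes are copied from the examples in this post, and the code is written to run on both Python 2.7 and Python 3.

```python
import gc
import weakref


class T(object):
    def __init__(self):
        self.test = self.test0   # instance method bound to self: cycle

    def test0(self, d={}):
        d['a'] = 1


class T2(object):
    def test(self):
        return self.test0()      # bound method is never stored

    def test0(self, d={}):
        d['a'] = 1


class T3(object):
    def __init__(self):
        self.test = self.test0   # classmethod: bound to the class, not to self

    @classmethod
    def test0(cls, d={}):
        d['a'] = 1
    kkk = test0


def freed_without_collector(cls):
    """Return True if an instance of cls is freed by reference counting alone."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        obj = cls()
        ref = weakref.ref(obj)
        del obj
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()             # reclaim any cycle left behind


results = dict((cls.__name__, freed_without_collector(cls))
               for cls in (T, T2, T3))
```

Here `results` maps `T` to False and both `T2` and `T3` to True, which matches the distinction drawn above.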