Scraping WeChat Friend Info with Python and itchat

Date: 2019-12-07

Original article: https://mp.weixin.qq.com/s/4EXgR4GkriTnAzVxluJxmg

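itchat is an open-source API for personal WeChat accounts. Today we will use it to scrape information about our WeChat friends. No pictures, no proof, so here are the results first.
The three images are the friend avatar collage, the gender statistics chart, and the signature statistics chart.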

[Image: WeChat friend avatar collage]

[Image: gender statistics chart]

[Image: signature statistics chart]

Installation

pip3 install itchat

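The main methods used:
itchat.login() logs in by scanning a QR code with WeChat.
itchat.get_friends() returns the full friend list. Each friend is a dictionary, and the first entry is your own account. Passing update=True, as in get_friends(update=True), refreshes the friend list before returning it.
itchat.get_head_img(userName="") fetches a friend's avatar by userName.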
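Because the first entry of the friend list is your own account, it is often convenient to drop it before computing any statistics. The following is a minimal sketch, not part of the original project: `friend_rows` is a hypothetical helper, and it works on plain dictionaries shaped like the entries returned by get_friends(), so it can be tried without logging in. The field names NickName, Sex, Province, City and Signature are the ones itchat exposes on friend entries.

```python
def friend_rows(friends):
    """Skip the first entry (your own account) and keep a few profile fields.

    `friends` is assumed to be a list of dict-like objects, as returned by
    itchat.get_friends(); missing fields default to an empty string.
    """
    fields = ("NickName", "Sex", "Province", "City", "Signature")
    return [{key: f.get(key, "") for key in fields} for f in friends[1:]]


if __name__ == "__main__":
    # Stand-in data; in real use this would be itchat.get_friends(update=True)
    sample = [
        {"NickName": "me", "Sex": 1},
        {"NickName": "alice", "Sex": 2, "City": "Hangzhou"},
    ]
    for row in friend_rows(sample):
        print(row)
```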

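WeChat friend avatar collage

Fetch the friend list, use get_head_img to download each friend's avatar, save it to a file, then shrink the avatars and stitch them into one large image.
First, download the avatars: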

import itchat

def headImg():
    itchat.login()
    friends = itchat.get_friends(update=True)
    # get_head_img() returns the avatar as raw bytes; save one file per friend
    for count, f in enumerate(friends):
        # Fetch the avatar by UserName
        img = itchat.get_head_img(userName=f["UserName"])
        # The img/ folder must already exist; the file is named by list index
        with open("img/" + str(count) + ".jpg", "wb") as imgFile:
            imgFile.write(img)

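You need to create a folder named img in the same directory beforehand, otherwise the code fails with a "No such file or directory" error. The img folder holds the avatar images: the code walks the friend list and names each avatar after its index, count. At this point the folder contains the avatars of all your friends.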
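If you would rather not create the folder by hand, the saving step can create it on demand. This is a sketch under stated assumptions, not the author's code: `save_avatars` is a hypothetical helper that takes the avatar bytes as a plain list, so it can be exercised without a WeChat login, and it skips empty entries in case a download comes back with no data.

```python
import os


def save_avatars(avatars, out_dir="img"):
    """Write each non-empty avatar (raw bytes) to out_dir/<index>.jpg.

    Returns the list of paths that were written. The folder is created
    if it does not exist, which avoids the "No such file or directory" error.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, data in enumerate(avatars):
        if not data:
            # Nothing to write for this friend; keep the index gap
            continue
        path = os.path.join(out_dir, f"{index}.jpg")
        with open(path, "wb") as fh:
            fh.write(data)
        saved.append(path)
    return saved
```

In real use the list would be built from the friend list, for example `[itchat.get_head_img(userName=f["UserName"]) for f in friends]`.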

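Next, stitch the avatars together.

List the images in the folder and shuffle their order with random.shuffle(imgs).

A 640*640 canvas is divided evenly among the avatars: compute the side length of each square tile, shrink every avatar to that size, and paste the tiles row by row, starting a new row whenever the current one is full. If you have many friends, you can enlarge the canvas accordingly. The code is as follows: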

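import math
import os
import random

from PIL import Image

def createImg():
    x = 0
    y = 0
    imgs = os.listdir("img")
    random.shuffle(imgs)
    # Create a 640*640 canvas to hold the small images
    newImg = Image.new('RGBA', (640, 640))
    # Divide the canvas area evenly; the square root gives each tile's side length
    width = int(math.sqrt(640 * 640 / len(imgs)))
    # Number of images per row
    numLine = int(640 / width)

    for i in imgs:
        img = Image.open("img/" + i)
        # Shrink the image (Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter)
        img = img.resize((width, width), Image.LANCZOS)
        # Paste the tile; when a row is full, move on to the next row
        newImg.paste(img, (x * width, y * width))
        x += 1
        if x >= numLine:
            x = 0
            y += 1

    newImg.save("all.png")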

The avatar collage is done, with the avatars stitched in random order.
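One detail of the tile-size formula is worth checking: dividing the canvas area by the number of images and rounding down can leave too few slots. With 10 avatars, for example, it yields 202-pixel tiles at three per row, so the tenth tile starts at y = 606 and is cut off at the bottom edge. The sketch below shows an alternative layout, not the author's method: `grid_layout` and `tile_positions` are hypothetical helpers that pick the smallest square grid holding every image, so all tiles fit on the canvas.

```python
import math


def grid_layout(n_images, canvas=640):
    """Return (tile_side, tiles_per_row) for a square grid that fits n_images."""
    if n_images < 1:
        raise ValueError("need at least one image")
    # Smallest k with k * k >= n_images, computed with exact integer arithmetic
    per_row = math.isqrt(n_images - 1) + 1
    return canvas // per_row, per_row


def tile_positions(n_images, canvas=640):
    """Top-left pixel coordinates for each tile, filled row by row."""
    side, per_row = grid_layout(n_images, canvas)
    return [((i % per_row) * side, (i // per_row) * side) for i in range(n_images)]


if __name__ == "__main__":
    print(grid_layout(10))          # tile side and tiles per row for 10 avatars
    print(tile_positions(10)[-1])   # where the last avatar is pasted
```

The trade-off is that the grid may leave empty cells in the last row, whereas the original formula fills the canvas more tightly at the cost of possible clipping.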

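Gender statistics chart

As before, log in with itchat.login() and fetch the friend list, then determine gender from the Sex field: 1 means male (man), 2 means female (women), and any other value is counted as unknown (profiles with no gender set typically report 0).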

import itchat
import matplotlib.pyplot as plt

def getSex():
    itchat.login()
    friends = itchat.get_friends(update=True)
    sex = dict()
    for f in friends:
        if f["Sex"] == 1:  # male
            sex["man"] = sex.get("man", 0) + 1
        elif f["Sex"] == 2:  # female
            sex["women"] = sex.get("women", 0) + 1
        else:  # unknown
            sex["unknown"] = sex.get("unknown", 0) + 1
    # Show the counts as a bar chart, one bar per category
    for key, count in sex.items():
        plt.bar(key, count)
    plt.show()

[Image: gender statistics bar chart]
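The counting step can also be written with collections.Counter, which keeps it separate from login and plotting. This is a small sketch rather than the author's code: `count_sex` and `SEX_LABELS` are hypothetical names, and the input is any list of dict-like friend entries.

```python
from collections import Counter

# Sex codes as described in this article; anything else is counted as unknown
SEX_LABELS = {1: "man", 2: "women"}


def count_sex(friends):
    """Count friends per gender label, tolerating missing Sex fields."""
    return Counter(SEX_LABELS.get(f.get("Sex"), "unknown") for f in friends)


if __name__ == "__main__":
    sample = [{"Sex": 1}, {"Sex": 2}, {"Sex": 0}, {"Sex": 1}, {}]
    print(count_sex(sample))
```

The resulting counter can be plotted in one call, for example `plt.bar(list(counts.keys()), list(counts.values()))`.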

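Signature statistics chart

Fetch the friend list; the Signature field holds each friend's signature. Save the signatures to a .txt file. Emoticons in some signatures show up as emoji markup, so strip that markup and other special symbols.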

import re

import itchat

def getSignature():
    itchat.login()
    friends = itchat.get_friends(update=True)
    # Regex: drop emoji codes (1f...) and leftover markup characters
    rec = re.compile(r"1f\d+\w*|[<>/=]")
    # 'a' appends, so delete sign.txt before re-running to avoid duplicate lines
    with open('sign.txt', 'a', encoding='utf-8') as file:
        for f in friends:
            signature = f["Signature"].strip().replace("emoji", "").replace("span", "").replace("class", "")
            signature = rec.sub("", signature)
            file.write(signature + "\n")
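Note that chaining replace("emoji", ""), replace("span", "") and replace("class", "") also deletes those words when they appear in ordinary text, and can leave stray quote characters behind. A narrower alternative is to remove the whole emoji tag in one pass. The sketch below assumes the markup looks like `<span class="emoji emoji1f60a"></span>`, which is inferred from the tokens the code above strips rather than confirmed by the source; `clean_signature` and `EMOJI_SPAN` are hypothetical names.

```python
import re

# Assumed markup for an emoji inside a signature (inferred, not confirmed)
EMOJI_SPAN = re.compile(r'<span class="emoji[^"]*"></span>')


def clean_signature(raw):
    """Remove emoji span tags and surrounding whitespace from a signature."""
    return EMOJI_SPAN.sub("", raw).strip()


if __name__ == "__main__":
    print(clean_signature('  hi<span class="emoji emoji1f60a"></span> there '))
    print(clean_signature("class of 2019"))  # ordinary words are left alone
```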

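sign.txt now contains the signatures of all your friends. Use the wordcloud package to generate a word cloud: pip install wordcloud
You can also segment the text with jieba first. Without segmentation the cloud shows whole sentences; with jieba segmentation you may want to raise max_font_size, for example to 100.
One caveat from the author: do not run this inside a virtual environment; run deactivate to leave the virtual environment first. The full code is as follows: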

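import matplotlib.pyplot as plt
from wordcloud import WordCloud
# import jieba  # only needed for the optional segmentation below

# Generate the word cloud
def create_word_cloud(filename):
    # Read the file contents
    with open("{}.txt".format(filename), encoding='utf-8') as f:
        text = f.read()

    # Optional (commented out): segment the text with jieba
    # wordlist = jieba.cut(text, cut_all=True)
    # wl = " ".join(wordlist)

    # Configure the word cloud
    wc = WordCloud(
        # Background color
        background_color="white",
        # Maximum number of words to display
        max_words=2000,
        # Use a font installed on your machine that covers Chinese: on Windows look in
        # C:\Windows\Fonts\, on macOS /System/Library/Fonts/PingFang.ttc is an option
        font_path='C:\\Windows\\Fonts\\simfang.ttf',
        height=500,
        width=500,
        # Maximum font size
        max_font_size=60,
        # Seed for the random generator, so layout and colors are reproducible
        random_state=30,
    )

    myword = wc.generate(text)  # with jieba segmentation, pass wl instead of text
    # Display the word cloud
    plt.imshow(myword)
    plt.axis("off")
    plt.show()
    wc.to_file('signature.png')  # save the word cloud to a file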

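[Image: word cloud of whole sentences]

[Image: word cloud generated with jieba segmentation]

Evidently "hard work" (努力) and "life" (生活) still matter a great deal.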
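To check which words dominate without reading them off the image, the segmented tokens can be counted directly. This is a minimal sketch, not part of the original project: `token_frequencies` is a hypothetical helper that accepts any iterable of tokens (for instance the output of jieba.cut) and drops single-character tokens, which in Chinese are mostly particles and punctuation.

```python
from collections import Counter


def token_frequencies(tokens, min_len=2):
    """Count tokens, ignoring whitespace and tokens shorter than min_len."""
    cleaned = (t.strip() for t in tokens)
    return Counter(t for t in cleaned if len(t) >= min_len)


if __name__ == "__main__":
    # Stand-in for list(jieba.cut(text)); no segmentation library is needed here
    sample = ["努力", "生活", "的", " ", "努力"]
    for word, count in token_frequencies(sample).most_common(5):
        print(word, count)
```

The resulting counts can also be handed to the wordcloud package through WordCloud.generate_from_frequencies instead of generate(text).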

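Beyond the fields used above, itchat also exposes other information such as province and city, and it can be used to build features such as an auto-reply chatbot. Those are not covered here.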

Finally, the code is on GitHub: https://github.com/taixiang/itchat_wechat

You are welcome to follow my blog: https://blog.manjiexiang.cn/
You are welcome to follow my WeChat account: 春風十里不如認識你