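Exporting the followers of a WeChat Official Account takes three steps:

1. Obtain an access_token.
2. Fetch the list of followers' open_ids.
3. Fetch the user details for those open_ids in batches.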
The Official Account documentation describes the access_token endpoint as follows:
URL: https://api.weixin.qq.com/cgi-bin/token?grant_type=client_credential&appid=APPID&secret=APPSECRET

Response: {"access_token":"ACCESS_TOKEN","expires_in":7200}

Parameters:
grant_type   required   Set to client_credential to obtain an access_token
appid        required   Unique credential of the third-party user
secret       required   Secret key for that credential, i.e. the appsecret
Code to obtain the access_token:
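    import json

    # get() is the helper from http.py, shown at the end of this post.
    access_token = None

    def get_access_token():
        """Request an access_token and store it in the module-level variable."""
        global access_token
        url = 'https://api.weixin.qq.com/cgi-bin/token'
        params = {
            'grant_type': 'client_credential',
            'appid': 'xxxxx',   # your own appid, from the Official Account settings
            'secret': 'xxxx',   # your own secret, from the Official Account settings
        }
        response = get(url, params)
        # print(response.text)
        json_data = json.loads(response.text)
        access_token = json_data['access_token']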
The access_token is valid for 7200 seconds; once it expires, it has to be requested again.
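One way to handle the 7200-second lifetime is to cache the token together with its expiry time and request a new one only when needed. The sketch below is not from the original post: `TokenCache`, its `fetch` callable, the injectable `clock`, and the 60-second safety `margin` are all hypothetical names chosen for illustration.

```python
import time


class TokenCache:
    """Caches a token and fetches a new one shortly before the old one expires."""

    def __init__(self, fetch, clock=time.time, margin=60):
        self._fetch = fetch      # hypothetical callable returning (token, expires_in_seconds)
        self._clock = clock      # injectable clock, which makes the class easy to test
        self._margin = margin    # refresh this many seconds before the real expiry
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return the cached token, fetching a fresh one if it is missing or stale."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + expires_in - self._margin
        return self._token


if __name__ == '__main__':
    # Stand-in fetcher; a real one would call the /cgi-bin/token endpoint.
    cache = TokenCache(lambda: ('ACCESS_TOKEN', 7200))
    print(cache.get())
```

In a real script, `fetch` could wrap the request made in `get_access_token()` and return both `access_token` and `expires_in` from the response.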
The Official Account documentation for fetching the list of followers' open_ids:
URL: https://api.weixin.qq.com/cgi-bin/user/get?access_token=ACCESS_TOKEN&next_openid=NEXT_OPENID

Response:

    {
        "total": 2,
        "count": 2,
        "data": {
            "openid": ["OPENID1", "OPENID2"]
        },
        "next_openid": "NEXT_OPENID"
    }

Fields:
total         Total number of users following this Official Account
count         Number of OPENIDs returned in this call, at most 10000
data          List data: the list of OPENIDs
next_openid   OPENID of the last user in the returned list
Python code to fetch the open_id list:
    import logging

    logger = logging.getLogger(__name__)
    openids = []

    def get_openids(next_openid=''):
        """Fetch one page of open_ids and recurse while pages come back full."""
        global openids
        url = 'https://api.weixin.qq.com/cgi-bin/user/get'
        params = {
            'access_token': access_token,
            'next_openid': next_openid,
        }
        response = get(url, params)
        # print(response.text)
        json_data = json.loads(response.text)
        count = json_data['count']
        openid_list = json_data['data']['openid']
        openids.extend(openid_list)
        next_openid = json_data['next_openid']
        logger.info('>>> count=%s, next_openid=%s' % (count, next_openid))
        # A full page (10000 entries) means there may be more to fetch.
        if count == 10000:
            get_openids(next_openid)
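Each request returns at most 10000 open_ids. On the first request, next_openid is empty. Here I simply check how many open_ids the previous response returned: if it returned 10000, I try to request the next batch.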
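The same paging rule can also be written as a loop instead of recursion. The following is only a sketch: `collect_openids` and its `fetch_page` parameter are hypothetical, with `fetch_page(next_openid)` standing in for the HTTP call and returning the decoded JSON as a dict. It also assumes that a final empty page might omit the `data` field, so that field is read defensively.

```python
def collect_openids(fetch_page, page_size=10000):
    """Collect all open_ids by following next_openid until a page is not full.

    fetch_page is a hypothetical callable: it takes the next_openid string
    and returns the decoded response dict for that page.
    """
    openids = []
    next_openid = ''  # empty on the first request
    while True:
        page = fetch_page(next_openid)
        count = page['count']
        # Assumption: an empty page may carry no "data" field at all.
        openids.extend(page.get('data', {}).get('openid', []))
        next_openid = page.get('next_openid', '')
        # A page that is not full (or has no cursor) is the last one.
        if count < page_size or not next_openid:
            break
    return openids


if __name__ == '__main__':
    # Fake pages with a page size of 2, to show the flow without network access.
    fake = {
        '': {'total': 3, 'count': 2, 'data': {'openid': ['a', 'b']}, 'next_openid': 'b'},
        'b': {'total': 3, 'count': 1, 'data': {'openid': ['c']}, 'next_openid': 'c'},
    }
    print(collect_openids(lambda cursor: fake[cursor], page_size=2))
```

A loop avoids growing the call stack by one frame per page, which matters little at 10000 ids per page but keeps the control flow easier to follow.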
The Official Account documentation for fetching user information:
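URL: https://api.weixin.qq.com/cgi-bin/user/info/batchget?access_token=ACCESS_TOKEN

Request body:

    {
        "user_list": [
            { "openid": "otvxTs4dckWG7imySrJd6jSi0CWE", "lang": "zh_CN" },
            { "openid": "otvxTs_JZ6SEiP0imdhpi50fuSZg", "lang": "zh_CN" }
        ]
    }

Response (each entry of the returned user_info_list looks like this):

    {
        "subscribe": 1,
        "openid": "o6_bmjrPTlm6_2sgVt7hMZOPfL2M",
        "nickname": "Band",
        "sex": 1,
        "language": "zh_CN",
        "city": "廣州",
        "province": "廣東",
        "country": "中國",
        "headimgurl": "http://thirdwx.qlogo.cn/mmopen/g3MonUZtNHkdmzicIlibx6iaFqAc56vxLSUfpb6n5WKSYVY0ChQKkiaJSgQ1dZuTOgvLLrhJbERQQ4eMsv84eavHiaiceqxibJxCfHe/0",
        "subscribe_time": 1382694957,
        "unionid": " o6_bmasdasdsad6_2sgVt7hMZOPfL",
        "remark": "",
        "groupid": 0,
        "tagid_list": [128, 2],
        "subscribe_scene": "ADD_SCENE_QR_CODE",
        "qr_scene": 98765,
        "qr_scene_str": ""
    }

Fields:
subscribe         Whether the user follows this Official Account. A value of 0 means the user does not follow it, and no other information can be retrieved.
openid            The user's identifier, unique within the current Official Account
nickname          The user's nickname
sex               The user's gender: 1 is male, 2 is female, 0 is unknown
city              The user's city
country           The user's country
province          The user's province
language          The user's language; Simplified Chinese is zh_CN
headimgurl        The user's avatar. The last number in the URL is the size of the square avatar (0, 46, 64, 96 or 132; 0 means a 640*640 square avatar). Empty when the user has no avatar. If the user changes avatar, the old avatar URL stops working.
subscribe_time    When the user followed, as a timestamp. If the user has followed more than once, this is the most recent time.
unionid           Present only once the Official Account has been bound to a WeChat Open Platform account
remark            The operator's note about the follower; it can be set in the user management page of the WeChat Official Account Platform
groupid           ID of the group the user belongs to (kept for compatibility with the old user-group API)
tagid_list        List of tag IDs attached to the user
subscribe_scene   The channel through which the user followed: ADD_SCENE_SEARCH (Official Account search), ADD_SCENE_ACCOUNT_MIGRATION (account migration), ADD_SCENE_PROFILE_CARD (shared name card), ADD_SCENE_QR_CODE (QR code scan), ADD_SCENE_PROFILE_LINK (tap on the account name inside an article), ADD_SCENE_PROFILE_ITEM (menu at the top right of an article), ADD_SCENE_PAID (followed after payment), ADD_SCENE_OTHERS (other)
qr_scene          QR code scan scene (defined by the developer)
qr_scene_str      QR code scan scene description (defined by the developer)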
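To make the field table concrete, here is a small sketch that turns one returned record into readable values, following the meanings listed above for subscribe, sex and subscribe_time. The `describe_user` function and the `SEX_LABELS` mapping are illustrative names, not part of the WeChat API, and the timestamp is rendered in UTC purely as an assumption.

```python
from datetime import datetime, timezone

# Labels taken from the field table: 1 is male, 2 is female, 0 is unknown.
SEX_LABELS = {0: 'unknown', 1: 'male', 2: 'female'}


def describe_user(info):
    """Return a readable summary of one user record (hypothetical helper)."""
    if info.get('subscribe') == 0:
        # For non-followers no other information is available.
        return {'openid': info['openid'], 'subscribed': False}
    # subscribe_time is a Unix timestamp; UTC is an arbitrary choice here.
    subscribed_at = datetime.fromtimestamp(info['subscribe_time'], tz=timezone.utc)
    return {
        'openid': info['openid'],
        'subscribed': True,
        'nickname': info.get('nickname', ''),
        'sex': SEX_LABELS.get(info.get('sex'), 'unknown'),
        'subscribed_at': subscribed_at.isoformat(),
    }


if __name__ == '__main__':
    sample = {
        'subscribe': 1,
        'openid': 'o6_bmjrPTlm6_2sgVt7hMZOPfL2M',
        'nickname': 'Band',
        'sex': 1,
        'subscribe_time': 1382694957,
    }
    print(describe_user(sample))
```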
Code to fetch the user information:
    # Uses access_token, openids, logger and post_json() defined elsewhere in this post.
    user_info_list = []

    def get_unionids():
        """Fetch user details for every collected open_id, 100 per request."""
        global user_info_list
        url = 'https://api.weixin.qq.com/cgi-bin/user/info/batchget?access_token=%s' % access_token
        total = len(openids)
        step = 100  # at most 100 open_ids per request
        repeat = int(total / step + (1 if total % step > 0 else 0))
        for i in range(repeat):
            low_idx = step * i
            high_idx = len(openids) if len(openids) <= step * (i + 1) else step * (i + 1)
            if len(openids) <= low_idx:
                break
            openid_params = []
            for openid in openids[low_idx:high_idx]:
                openid_params.append({'openid': openid, 'lang': 'zh_CN'})
            payload = {'user_list': openid_params}
            logger.info('step=%s,low index=%s,high index=%s' % (i, low_idx, high_idx))
            response = post_json(url, payload)
            json_data = json.loads(response.text)
            try:
                user_infos = json_data['user_info_list']
                user_info_list.extend(user_infos)
                # store_user_infos()
            except KeyError:
                # No user_info_list in the reply: log the raw response for inspection.
                logger.error(response.status_code)
                logger.error(response.text)
This request has to be sent with POST, and each request can carry at most 100 open_ids.
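The 100-per-request limit comes down to slicing the open_id list into chunks and building one request body per chunk. A minimal sketch of just that step, with `build_batches` as a hypothetical helper name and the payload shape taken from the request example in the documentation excerpt:

```python
def build_batches(openids, batch_size=100, lang='zh_CN'):
    """Yield one batchget request body per chunk of at most batch_size open_ids."""
    for start in range(0, len(openids), batch_size):
        chunk = openids[start:start + batch_size]
        yield {'user_list': [{'openid': oid, 'lang': lang} for oid in chunk]}


if __name__ == '__main__':
    ids = ['OPENID%d' % n for n in range(250)]
    for payload in build_batches(ids):
        print(len(payload['user_list']))
```

Each yielded dict could then be passed to `post_json(url, payload)`. Stepping `range` by the batch size removes the need to compute the number of rounds or the upper index by hand.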
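Two problems came up here, both related to character encoding: one with console output and one with database storage.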
    import sys
    import io

    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='gb18030')
The snippet above sets the encoding of the standard output stream to gb18030, which makes Chinese output work.
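The post does not say which encoding the console used before the change. Assuming it was a narrower codec such as gbk, the difference can be checked directly: gb18030 covers the whole Unicode range, while gbk cannot represent characters such as emoji. The `can_encode` helper below is a hypothetical name used only for this check.

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec without errors."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == '__main__':
    # A nickname mixing ASCII, Chinese characters and an emoji.
    nickname = 'Band廣州\U0001F600'
    for codec in ('ascii', 'gbk', 'gb18030', 'utf-8'):
        print(codec, can_encode(nickname, codec))
```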
The script that changes the database encoding is as follows:
    ALTER DATABASE hello_moto CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
    ALTER TABLE t_yown_user CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    ALTER TABLE tb_user CHANGE name name VARCHAR(1000) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Database configuration (/etc/my.cnf):
    [mysql]
    default-character-set=utf8mb4

    [mysqld]
    character_set_server=utf8mb4
    init_connect='SET NAMES utf8mb4'
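The post does not spell out why utf8mb4 was needed. A likely reason (an assumption) is nicknames containing emoji or other characters outside the Basic Multilingual Plane: these take four bytes in UTF-8, one more than MySQL's older three-byte utf8 character set can store. The hypothetical helper `needs_utf8mb4` below shows how to detect such text.

```python
def needs_utf8mb4(text):
    """Return True if any character in text takes four bytes in UTF-8."""
    return any(len(ch.encode('utf-8')) == 4 for ch in text)


if __name__ == '__main__':
    for sample in ('Band', '廣州', 'Band\U0001F600'):
        # Print each sample with its UTF-8 length and whether it needs utf8mb4.
        print(repr(sample), len(sample.encode('utf-8')), needs_utf8mb4(sample))
```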
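To simplify and reuse the code that sends HTTP requests, I wrote a small utility module, http.py. If the file sits at the top level of the import path, the name http.py shadows Python's standard-library http package, which requests relies on, so it may need to live inside a package or be renamed.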
    import requests

    # verify=False is used below, so silence the certificate warnings it triggers.
    requests.packages.urllib3.disable_warnings()
    s = requests.session()  # shared session object

    def get(url, data=None):
        """Send a GET request, passing data as query parameters when given."""
        response = s.get(url, params=data) if data else s.get(url)
        return response

    def post_json(url, payload):
        """Send payload as a JSON body in a POST request."""
        myheader = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0",
            "Accept": "application/json, text/javascript, */*; q=0.01",
            "Accept-Language": "zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
            "Accept-Encoding": "gzip, deflate, br",
            "Content-Type": "application/json; charset=utf-8",
            "Connection": "keep-alive"
        }
        response = s.post(url, headers=myheader, json=payload, verify=False)
        response.encoding = 'utf-8'
        return response
The get and post_json functions defined here are the ones used in the code earlier in this post.