Twitter: Where Our Bots Talk to Each Other

Date: 2019-11-25

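Full text: 4,713 characters; estimated reading time: 9 minutes

Image by Jonny Lindner, from Pixabay

Read on to see how to build a Twitter bot with Python for a controversial experiment. This article contains all of the code and detailed explanations.

Summary

This article explains in detail how I built a Twitter bot that exploits some basic human emotions to gain followers. This kind of thing happens around us every day, and I think it is important for you to know how easy it is. I will share some interesting experiences and show you examples of bots at work on Twitter.

The second half of the article describes the bot's code and functionality in detail. The current production code is on GitHub. I run the bot on AWS and deploy it with the Serverless framework; it uses less than 5% of the monthly AWS free tier, so it is worth a try.

Motivation

Image by Patrick Fore, from Unsplash

I wanted to write some articles that help people learn Python and, in doing so, broaden their skills. Part of the reason is that some members of my team wanted to learn Python. At the start I told myself I would do this for about two months and see how it went. In the worst case, I would at least end up with some clearly structured articles for teaching Python. I believe everyone can learn a bit of Python from them.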

Twitter

Image by Nicholas Green

Tweepy

Image by Safar Safarov, from Unsplash

Tweepy is a Python library for accessing the Twitter API. Its documentation looks tidy and the code is well maintained, so I wanted to give it a try.

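Credentials

Setting up authentication and getting credentials is easier than you might expect:

1. Go to Twitter's developer page

2. Log in with your Twitter account

3. Create an application and get the credentials (under Keys and Tokens, see the red circle)

Steps for creating a Twitter application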

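Structure of the bot

I tested the credentials in a Jupyter Notebook with a simple API call, and everything seemed fine. Now it was time for the next step. My Twitter bot should do two things:

· Gain followers

· Promote my Medium articles

Gaining followers

A few years ago I ran an experiment with an Instagram bot, so I knew how to achieve the first goal. It is unethical, because it exploits people's desire to be liked and recognized, and everything it creates is fake. But then again, this is Twitter. Besides, I think it is worth talking about how it is actually done. It is important to show how these bots work and how effective they are, and also to show that such bots are in widespread use every day.

The bot works by giving people recognition and attention:

1. Interact with users (for example retweet, comment on their tweets, and follow them)

2. Wait and watch

3. Watch them follow you back

4. Wait a while longer, then unfollow them

So, setting all ethical concerns aside for now, here is the corresponding code.

① Interact with users

In my projects I usually use a config module as an abstraction layer for configuration settings.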

import logging
import os

import yaml as _yaml

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def get_config():
    """Load the Twitter credentials from config/production.yml."""
    config_path = os.path.join(os.path.dirname(__file__), '..', 'config', 'production.yml')
    try:
        with open(config_path) as config_file:
            # safe_load parses plain YAML without the unsafe default loader of a bare yaml.load()
            return _yaml.safe_load(config_file)
    except FileNotFoundError:
        logger.error(f'You probably forgot to create a production.yml, as we could not find {config_path}')
        raise


def get_post_data():
    """Load the list of Medium posts from config/post_data.yml."""
    data_path = os.path.join(os.path.dirname(__file__), '..', 'config', 'post_data.yml')
    with open(data_path) as config_file:
        return _yaml.safe_load(config_file)

bots.config

The configuration looks like this:

# Write Access also

API_KEY : "YOUR API KEY HERE"

API_KEY_SECRET : "YOUR API SECRET HERE"

ACCESS_TOKEN : "YOUR ACCESS TOKEN HERE"

ACCESS_TOKEN_SECRET : "YOUR ACCESS TOKEN SECRET HERE"

production.yml
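
Before handing these values to Tweepy, it can help to check that no placeholder survived in the file. The following is a minimal sketch and not part of the original project: the helper name `missing_credentials` and the rule "a value starting with `YOUR ` is a placeholder" are assumptions made for illustration; only the four key names come from the sample config above.

```python
REQUIRED_KEYS = ("API_KEY", "API_KEY_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")


def missing_credentials(config):
    """Return the required keys that are absent, empty, or still placeholders.

    `config` is the dict loaded from production.yml. Treating values that
    start with "YOUR " as placeholders is an assumption based on the sample
    file, not a rule of Tweepy or Twitter.
    """
    missing = []
    for key in REQUIRED_KEYS:
        value = str(config.get(key) or "").strip()
        if not value or value.startswith("YOUR "):
            missing.append(key)
    return missing


sample = {"API_KEY": "abc123", "API_KEY_SECRET": "YOUR API SECRET HERE"}
print(missing_credentials(sample))
```

A check like this could run once at start-up, so that a forgotten key fails early with a clear message rather than as an authentication error from the API.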

Next, a module can be set up to provide the Twitter API, as follows:

import tweepy

from bots.config import get_config

__API = None


def configure_twitter_api():
    """Build an authenticated Tweepy client from the stored credentials."""
    config = get_config()  # read the file once instead of once per key
    auth = tweepy.OAuthHandler(config['API_KEY'], config['API_KEY_SECRET'])
    auth.set_access_token(config['ACCESS_TOKEN'], config['ACCESS_TOKEN_SECRET'])
    # Written against Tweepy 3.x; Tweepy 4 removed wait_on_rate_limit_notify.
    return tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)


def get_twitter_api():
    """Return the shared client, creating it on first use."""
    global __API
    if not __API:
        __API = configure_twitter_api()
    return __API

bots.twitter_api
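
The module above builds the client once and hands out the same instance on every later call. The same lazy-initialization idea can be sketched in a generic, library-free form with `functools.lru_cache`. Here `build_client` is a hypothetical stand-in for `configure_twitter_api` and returns a dummy object; nothing in this block talks to Twitter.

```python
from functools import lru_cache

build_calls = 0  # counts how often the expensive setup runs


def build_client():
    """Hypothetical stand-in for configure_twitter_api(); returns a dummy object."""
    global build_calls
    build_calls += 1
    return object()


@lru_cache(maxsize=None)
def get_client():
    """Create the client on first use and return the cached instance afterwards."""
    return build_client()


first = get_client()
second = get_client()
print(first is second, build_calls)
```

Within a single Lambda invocation this means the credentials are read and the client is built only once, however many helper functions ask for it.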

The code below contains the interaction logic.

import datetime
import logging
import random
import time

import tweepy

import bots.utils as _utils
from bots.twitter_api import get_twitter_api

# Written against Tweepy 3.x (for example, API.search was renamed search_tweets in Tweepy 4).

logger = logging.getLogger()
logger.setLevel(logging.INFO)

COMMENTS = [
    'Nice piece!', 'Interesting', '', 'I am going to read up on this', 'Thanks for sharing!', 'This is helpful',
    'Insightful', 'thought-provoking', 'Will check this out'
]

HASHTAG_SETS = [
    {'Python', 'DataScience', 'Machinelearning'},
    {'Python', 'Keras'},
    {'Python', 'DataScience'},
    {'Python', 'Pandas'},
    {'Python', 'PyTorch', 'Machinelearning'},
    {'Python', 'Scikitlearn'},
    {'Python', 'Statistics'},
]


def fetch_most_original_tweets(user):
    """Return the user's recent tweets that are neither retweets nor replies, each with a score."""
    results = []
    for tweet in get_twitter_api().user_timeline(user.id, count=20):
        if not (tweet.retweeted or tweet.in_reply_to_status_id):
            tweet.score = score_tweet(tweet)
            results.append(tweet)
    return results


def interact_with_user(user, following_history, hashtags):
    """Follow the user, then randomly comment on, like, or retweet up to two matching tweets."""
    if not user.following:
        logger.info(f"Following {user.name}")
        user.follow()
        following_history[user.id_str] = {'followed_at': datetime.datetime.now().isoformat()}

    user_tweets = sorted(fetch_most_original_tweets(user), key=lambda x: x.score, reverse=True)
    lower_given_tag = {tag.lower() for tag in hashtags}
    interactions = 0
    for tweet in user_tweets:
        tags = {tag['text'].lower() for tag in tweet.entities.get('hashtags', [])}
        found_tag_in_text = any(tag in tweet.text.lower() for tag in lower_given_tag)
        # only touch tweets that carry one of our hashtags, as an entity or in the text
        if (tags & lower_given_tag) or found_tag_in_text:
            interaction = 0
            if random.random() > 0.95:
                comment_tweet(user, tweet)
                time.sleep(random.random() / 2)
                interaction |= 1
            if not tweet.favorited and (random.random() > .5) and tweet.lang == 'en':
                logger.info(f"Hearting: {tweet.id} with text: {tweet.text}")
                get_twitter_api().create_favorite(tweet.id)
                time.sleep(random.random() * 5)
                interaction |= 1
            if random.random() > 0.95:
                logger.info(f"Retweeting: {tweet.id}")
                logger.info(f"Text: {tweet.text}")
                get_twitter_api().retweet(tweet.id)
                time.sleep(random.random())
                interaction |= 1
            interactions += interaction
            if interactions == 2:
                break


def score_tweet(tweet):
    """Score a tweet: fewer favorites and retweets score higher; age is scored by created_at_score."""
    favorites = _utils.scaled_sigmoid(x=-tweet.favorite_count, stretch=2, max_score=50, center=3)
    retweets = _utils.scaled_sigmoid(x=-tweet.retweet_count, stretch=1, max_score=50, center=2)
    age = _utils.created_at_score(tweet, stretch=2, max_score=30, center=3)
    return favorites + retweets + age


def score_user(user):
    """Score a user by follower/following ratio, follower count, and account age."""
    followed_to_following = _utils.followed_to_following_ratio(user)
    followers = _utils.scaled_sigmoid(x=-user.followers_count, stretch=200, max_score=100, center=300)
    age = _utils.created_at_score(user, stretch=50, max_score=30, center=60)
    return followed_to_following + followers + age


def get_users_from_recent_tweets(cnt=10, hashtags=None):
    """Search recent English tweets containing all given hashtags and return their authors."""
    q = ' AND '.join([f'#{tag}' for tag in hashtags])
    users = []
    for tweet in tweepy.Cursor(get_twitter_api().search, q=q, lang="en", count=cnt, result_type='recent').items(cnt):
        users.append(tweet.user)
    return users


def fetchfollow(event=None, context=None):
    """Lambda entry point: find promising users for a random hashtag set and interact with them."""
    hashtags = random.choice(HASHTAG_SETS)
    # monkey-patch the tweepy User class by adding a hash function, which we need to quickly get unique users
    tweepy.models.User.__hash__ = lambda self: hash(self.id_str)
    users = list(set(get_users_from_recent_tweets(cnt=250, hashtags=hashtags)))

    # score users
    for user in users:
        user.score = score_user(user)

    # sort users by score
    users = sorted(users, key=lambda x: x.score, reverse=True)
    logger.info(f"Found {len(users)}")

    following_history = _utils.get_s3_data('following.json')
    max_interactions = 10
    interactions = 0
    for user in users:
        time.sleep(random.random() * 10 + 2)
        if user.id_str not in following_history:
            try:
                logger.info(f"Interacting with {user.name}")
                interact_with_user(user, following_history, hashtags)
                interactions += 1
            except Exception as e:
                # persist what we have so far before giving up
                logger.error(f'Syncing followers history on error: {e}')
                _utils.sync_s3_data(following_history)
                raise
        if interactions >= max_interactions:
            break

    logger.info('Syncing followers history on ordinary termination')
    _utils.sync_s3_data(following_history)


def comment_tweet(user, tweet):
    """Reply to the tweet with a random generic comment."""
    comment = f'@{user.screen_name} {random.choice(COMMENTS)}'
    logger.info(f"Commenting: {tweet.id} with: {comment}")
    get_twitter_api().update_status(
        comment,
        in_reply_to_status_id=tweet.id_str,
        auto_populate_reply_metadata=True
    )


if __name__ == '__main__':
    fetchfollow()

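bots.fetchfollow

Let's start with two variables, COMMENTS and HASHTAG_SETS, which are referenced later on; given their names and contents, their purpose is fairly self-evident. The COMMENTS list stores a set of generic positive comments, and HASHTAG_SETS stores a series of different hashtag combinations used for searching.

The main function is fetchfollow, which does the following:

· Searches Twitter using a randomly chosen hashtag set from HASHTAG_SETS.

· Finds the users behind those tweets. Users are scored by follower count (the fewer the better), follower-to-following ratio (the lower the better), and account age (the newer the better), then sorted by score, so that the highest-scoring users (those most likely to follow back) come first and the lowest-scoring come last.

· Fetches following_history from S3. This file records the date on which each user was followed (and, later, the date on which they were unfollowed).

· Interacts with users who are not in following_history, from highest to lowest score (at most 10 users per run; after all, we do not want to trip any bot alarms). During the interaction it scores the user's original tweets and then randomly likes, comments on, and retweets those that contain our hashtags.

· Adds the user to following_history and syncs it to S3. After all, we do not want to follow them again.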
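
The scoring helpers live in bots.utils, which this article does not show. The block below is a minimal sketch of what a helper like `scaled_sigmoid` might look like, assuming a logistic curve; the formula and the clamping are guesses chosen so that the calls above (which pass a negated count as `x`) give high scores to small counts. `follower_score` is a hypothetical wrapper that mirrors the follower term of score_user.

```python
import math


def scaled_sigmoid(x, stretch, max_score, center):
    """Logistic curve scaled to (0, max_score), equal to max_score / 2 at x == -center.

    This is a guess at the shape of bots.utils.scaled_sigmoid, not the
    original implementation.
    """
    z = -(x + center) / stretch
    z = max(min(z, 700.0), -700.0)  # clamp so math.exp stays within float range
    return max_score / (1.0 + math.exp(z))


def follower_score(followers_count):
    """Mirror the follower term of score_user: fewer followers, higher score."""
    return scaled_sigmoid(x=-followers_count, stretch=200, max_score=100, center=300)


for count in (10, 300, 1000, 50000):
    print(count, round(follower_score(count), 2))
```

Under these assumptions an account with 300 followers lands exactly in the middle of the range, and very large accounts score close to zero, which matches the stated goal of targeting small accounts that are likely to follow back.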

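② Wait and watch

This stage is fun. It is when you release the bot onto Twitter and watch what happens. Sometimes you will find it amusing, and sometimes a little confusing. When I tried a bot on Instagram, I quickly discovered that there is a lot of pornographic content on Instagram. But that is a story for another time.

After building the first version of the Twitter bot, I learned three things:

ⓐ The way tweets are searched had to be adjusted, because initially I searched only for Python.

ⓑ The frequency at which the bot runs had to be adjusted, and its behavior made less deterministic.

The first version of the bot was blocked quickly, because I commented on and liked other people's tweets like crazy, like the squirrel from Ice Age after an energy drink.

The first application had its access restricted after commenting too much

This time, however, creating a new application and taking a more careful approach was fairly easy.

ⓒ There are a lot of bots on Twitter. I received replies such as: "Hey, thanks for following me. Check out this great service I use all the time: https://xxxbots.xx". Congratulations to them; they are clever and have adopted a viral marketing approach.

Bots reacted to my bot, and the list of messages keeps growing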
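
The second lesson, making the bot less deterministic, comes down to two small ideas: draw each action independently at random, and pause for a random amount of time between steps. The sketch below isolates both. The function names `choose_actions` and `jittered_pause` are made up for illustration; the default probabilities and the pause range mirror the thresholds used in interact_with_user and fetchfollow (the real code additionally requires that a tweet is in English and not already liked before liking it).

```python
import random


def choose_actions(rng=random, p_comment=0.05, p_like=0.5, p_retweet=0.05):
    """Pick a random subset of actions for one tweet.

    Each action is drawn independently, so two runs over the same tweet
    rarely behave identically.
    """
    actions = []
    if rng.random() < p_comment:
        actions.append("comment")
    if rng.random() < p_like:
        actions.append("like")
    if rng.random() < p_retweet:
        actions.append("retweet")
    return actions


def jittered_pause(rng=random, base=2.0, spread=10.0):
    """Return a pause length in seconds in the range [base, base + spread)."""
    return base + rng.random() * spread


print(choose_actions(), round(jittered_pause(), 1))
```

Passing the random source in as `rng` keeps the logic testable: a stub object with a fixed `random()` method makes the outcome predictable without touching the defaults.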

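③ Watch them follow you back

Over the past four weeks my Twitter account gained about 600 followers, and apart from occasionally adding a new entry to the list of Medium posts to publish, I did nothing.

④ Wait a while, then unfollow them

Since I do not want to follow too many people, I have to unfollow some of them from time to time to keep things balanced.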

import datetime
import logging
import random
import time

from dateutil.parser import parse

import bots.utils as _utils
from bots.twitter_api import get_twitter_api

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def unfollow(event=None, context=None):
    """Lambda entry point: on roughly 23% of calls, unfollow the one to three oldest follows."""
    if random.random() > .23:
        logger.info('Doing nothing this time')
    else:
        following_history = _utils.get_s3_data('following.json')
        # users we still follow, oldest follow date first
        sorted_by_following_date = sorted(
            [elem for elem in following_history.items() if 'unfollowed_at' not in elem[1]],
            key=lambda x: parse(x[1]['followed_at'])
        )
        number_to_unfollow = random.randint(1, 3)
        for currently_following in sorted_by_following_date[:number_to_unfollow]:
            _id = currently_following[0]
            try:
                get_twitter_api().destroy_friendship(_id)
                following_history[_id]['unfollowed_at'] = datetime.datetime.now().isoformat()
                logger.info(f'Unfollowing: {_id}')
            except Exception as e:
                logger.error(f'Unfollowing: {_id} did not work with error {e}')
            time.sleep(random.randint(2, 8))
        _utils.sync_s3_data(following_history)

bots.unfollow

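When it runs, the unfollow function first fetches the previously uploaded following_history and sorts all users who have not yet been unfollowed by follow date in ascending order. For the first one to three of them (the number is chosen at random), it calls destroy_friendship, Tweepy's method for unfollowing; the name is the library's, not mine. The function then updates following_history and is ready to be called again. Note that it only acts on roughly 23% of invocations; the rest of the time it does nothing.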
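
The selection step can be looked at on its own, without any network access. The sketch below is a simplified illustration: `pick_unfollow_candidates` is a hypothetical name, the sample history is made up, and it uses the standard library's `datetime.fromisoformat` in place of `dateutil.parser.parse`, which works here because the timestamps are written with `isoformat()`.

```python
from datetime import datetime


def pick_unfollow_candidates(following_history, limit):
    """Return up to `limit` user ids that are still followed, oldest follow first."""
    still_following = [
        (user_id, record)
        for user_id, record in following_history.items()
        if "unfollowed_at" not in record
    ]
    still_following.sort(key=lambda item: datetime.fromisoformat(item[1]["followed_at"]))
    return [user_id for user_id, _ in still_following[:limit]]


# Made-up sample in the same shape as following.json
history = {
    "111": {"followed_at": "2019-11-01T08:00:00"},
    "222": {"followed_at": "2019-10-20T09:30:00", "unfollowed_at": "2019-11-10T10:00:00"},
    "333": {"followed_at": "2019-10-25T12:00:00"},
}
print(pick_unfollow_candidates(history, limit=2))  # ['333', '111']
```

User 222 is skipped because it already carries an unfollowed_at entry, which is also what prevents the bot from following the same account a second time.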

Promoting Medium articles

This part is straightforward and, of course, ethically unproblematic.

import logging
import random
from collections import namedtuple

from bots.config import get_post_data
from bots.twitter_api import get_twitter_api

logger = logging.getLogger()
logger.setLevel(logging.INFO)


class MediumPost(namedtuple('MediumPost', ['id', 'url', 'tags', 'text'])):

    def make_post(self):
        """Build the tweet text from the teaser, a random prefix of the tags, and the URL."""
        used_tags = self.tags[:random.randint(1, len(self.tags))]
        # strip spaces so that multi-word tags such as 'Machine Learning' form a single hashtag
        hashtags = " ".join("#" + tag.replace(" ", "") for tag in used_tags)
        return f'{self.text} {hashtags} {self.url}'

    def post_to_twitter(self):
        """Publish the generated text as a new tweet."""
        api = get_twitter_api()
        res = api.update_status(self.make_post())
        return res


def post_random_medium_article(event=None, context=None):
    """Lambda entry point: tweet one randomly chosen article from post_data.yml."""
    posts = [MediumPost(*v) for k, v in get_post_data().items()]
    random_post = random.choice(posts)
    logger.info(f'Posting: {random_post}')
    random_post.post_to_twitter()


if __name__ == '__main__':
    post_random_medium_article()

bots.post

This script posts a random article from a reference list. The reference list looks like this:

Advanced - Visualize Sales Team:
  - Advanced - Visualize Sales Team
  - https://towardsdatascience.com/how-to-explore-and-visualize-a-dataset-with-python-7da5024900ef
  - - Datascience
    - BigData
    - DataVisualization
  - How to visualize a data set!
....
Advanced - Cat, Dog or Elon Musk:
  - Advanced - Cat, Dog or Elon Musk
  - https://towardsdatascience.com/cat-dog-or-elon-musk-145658489730
  - - Datascience
    - BigData
    - DataAnalytics
    - Python
    - Automation
    - Machine Learning
    - Bots
  - Learn how to build an image-recognizing convolutional neural network with Python and Keras in less than 15 minutes!

Sample tweet
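
To see how an entry from the reference list turns into tweet text, here is a small standalone sketch. It is not the project's make_post: the function name `compose_tweet`, the deterministic tag handling, and the length check are assumptions added for illustration, and plain `len()` is only an approximation of how Twitter counts characters (links, for instance, are counted at a fixed length).

```python
def compose_tweet(text, tags, url, limit=280):
    """Join text, hashtags and URL with spaces; drop trailing tags until it fits.

    If the text and URL alone exceed the limit, the result is returned
    unchanged and may still be too long.
    """
    hashtags = ["#" + tag.replace(" ", "") for tag in tags]
    while True:
        parts = [text] + hashtags + [url]
        tweet = " ".join(part for part in parts if part)
        if len(tweet) <= limit or not hashtags:
            return tweet
        hashtags.pop()


print(compose_tweet(
    "How to visualize a data set!",
    ["Datascience", "BigData", "DataVisualization"],
    "https://towardsdatascience.com/how-to-explore-and-visualize-a-dataset-with-python-7da5024900ef",
))
```

The original script picks a random number of leading tags instead, which has the side benefit that repeated posts of the same article do not look identical.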

Deployment

Image by elCarito, from Unsplash

I used the Serverless framework to deploy the bot to AWS as Lambda functions that run on predefined schedules (specified in serverless.yml).

service: fb-TwitterBot

provider:
  name: aws
  runtime: python3.6
  memorySize: 256
  timeout: 900
  region: ${opt:region, 'eu-central-1'}
  stage: ${opt:stage, 'production'}
  environment:
    PROJECT: ${self:service}-${self:provider.stage}
    ENV: ${self:provider.stage}
  iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "s3:*"
      Resource: 'arn:aws:s3:::fb-twitterbot'
    - Effect: "Allow"
      Action:
        - "s3:*"
      Resource: 'arn:aws:s3:::fb-twitterbot/*'

custom:
  pythonRequirements:
    dockerizePip: non-linux

plugins:
  - serverless-python-requirements

functions:
  run:
    handler: bots/fetchfollow.fetchfollow
    events:
      - schedule:
          rate: cron(15 */3 * * ? *)
  post:
    handler: bots/post.post_random_medium_article
    events:
      - schedule:
          rate: cron(37 7,18 * * ? *)
  unfollow:
    handler: bots/unfollow.unfollow
    events:
      - schedule:
          rate: cron(17,32,45 */2 * * ? *)

serverless.yml
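
The three cron expressions determine how often each function is invoked. As a rough aid for reading them, the sketch below counts invocations per day from the minute and hour fields. It is a deliberately tiny parser written for this article's schedules only: `expand_field` and `runs_per_day` are hypothetical helpers that understand `*`, `*/n`, and comma-separated numbers, and nothing else.

```python
def expand_field(field, low, high):
    """Expand one cron field into the sorted list of values it matches.

    Supports only '*', '*/n' and comma-separated numbers, which is all the
    schedules in this serverless.yml use. Ranges and names are not handled.
    """
    if field == "*":
        return list(range(low, high + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(low, high + 1, step))
    return sorted(int(part) for part in field.split(","))


def runs_per_day(cron_expression):
    """Count daily invocations from the minute and hour fields of a cron expression."""
    minute_field, hour_field = cron_expression.split()[:2]
    minutes = expand_field(minute_field, 0, 59)
    hours = expand_field(hour_field, 0, 23)
    return len(minutes) * len(hours)


for name, expr in [
    ("run", "15 */3 * * ? *"),
    ("post", "37 7,18 * * ? *"),
    ("unfollow", "17,32,45 */2 * * ? *"),
]:
    print(name, runs_per_day(expr))
```

This gives 8 daily runs for run, 2 for post, and 36 for unfollow. Because unfollow returns early unless `random.random()` falls at or below .23, only about a quarter of those 36 invocations actually unfollow anyone.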

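Installing the bot is fairly simple, but I will write a separate article to explain Serverless. To update the bot, make your changes to the scripts and then run serverless deploy.

Conclusion

I will keep the bot running a while longer so that readers of this article have a live reference. Eventually, though, I will shut it down.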
