Sharing a small github-trending tool I wrote

Background

GitHub Trending is a page I browse pretty much every day. It surfaces promising GitHub projects as they pick up momentum; in effect, it is a leaderboard of daily star gains.

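However, GitHub Trending updates continuously, so no matter how often you visit, you will inevitably miss some projects you would have been interested in. Quite a few people have come up with their own solutions, for example josephyzhou, whose github-trending project has become popular. I read through his source code (Go), found the implementation fairly simple, and rewrote it in Python, which turned out to need far less code. See my github-trending for details.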

Steps

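The core is a single Job, which runs in three steps:

  • Create a Markdown file named after the current date (date.md)

  • Fetch the GitHub Trending page for each language of interest, scrape the trending list, and append it to the .md file

  • Git add + commit + push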

Job

import codecs
import datetime
import os

import requests
from pyquery import PyQuery as pq


def job():

    strdate = datetime.datetime.now().strftime('%Y-%m-%d')
    filename = '{date}.md'.format(date=strdate)

    # create markdown file
    createMarkdown(strdate, filename)

    # write markdown
    scrape('python', filename)
    scrape('swift', filename)
    scrape('javascript', filename)
    scrape('go', filename)

    # git add commit push
    git_add_commit_push(strdate, filename)

create markdown

def createMarkdown(date, filename):
    # start a fresh file for the day, with the date as its heading
    with open(filename, 'w') as f:
        f.write("### " + date + "\n")

write markdown

def scrape(language, filename):

    HEADERS = {
        'User-Agent'        : 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:11.0) Gecko/20100101 Firefox/11.0',
        'Accept'            : 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Accept-Encoding'    : 'gzip,deflate,sdch',
        'Accept-Language'    : 'zh-CN,zh;q=0.8'
    }

    url = 'https://github.com/trending/{language}'.format(language=language)
    r = requests.get(url, headers=HEADERS)
    # fail loudly on an HTTP error status (unlike assert, this is not stripped under python -O)
    r.raise_for_status()

    d = pq(r.content)
    items = d('ol.repo-list li')

    # open via codecs so non-ASCII text (e.g. Chinese descriptions) is written as UTF-8
    with codecs.open(filename, "a", "utf-8") as f:
        # the space after the hashes is required for the heading to render
        f.write('\n#### {language}\n'.format(language=language))

        for item in items:
            i = pq(item)
            title = i("h3 a").text()
            owner = i("span.prefix").text()
            description = i("p.col-9").text()
            url = i("h3 a").attr("href")
            url = "https://github.com" + url
            f.write(u"* [{title}]({url}):{description}\n".format(title=title, url=url, description=description))
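The formatting half of scrape can be pulled out into a pure function, which makes it easy to check without touching the network. This is only a sketch under assumptions: render_section and its list-of-dicts input are hypothetical names that do not appear in the original project.

```python
def render_section(language, repos):
    """Build the Markdown section for one language.

    `repos` is assumed to be a list of dicts with 'title', 'url' and
    'description' keys (a hypothetical shape, not part of the original code).
    """
    # heading line, with a space after the hashes so it renders as a heading
    lines = ['\n#### {language}\n'.format(language=language)]
    for repo in repos:
        # same bullet format that scrape writes for each repository
        lines.append(u"* [{title}]({url}):{description}\n".format(**repo))
    return ''.join(lines)
```

With something like this in place, scrape would only need to collect the title, URL and description of each item and write the returned string to the file.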

git operations

def git_add_commit_push(date, filename):
    cmd_git_add = 'git add {filename}'.format(filename=filename)
    cmd_git_commit = 'git commit -m "{date}"'.format(date=date)
    cmd_git_push = 'git push -u origin master'

    os.system(cmd_git_add)
    os.system(cmd_git_commit)
    os.system(cmd_git_push)
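os.system ignores failures, so a failed commit would still be followed by a push. One possible alternative, sketched here with the standard subprocess module, stops at the first command that fails. The names build_git_commands and git_add_commit_push_checked are hypothetical and not part of the original project.

```python
import subprocess


def build_git_commands(date, filename):
    """Return the three git invocations as argument lists (no shell quoting needed)."""
    return [
        ['git', 'add', filename],
        ['git', 'commit', '-m', date],
        ['git', 'push', '-u', 'origin', 'master'],
    ]


def git_add_commit_push_checked(date, filename, run=subprocess.run):
    """Run add, commit and push in order, stopping at the first failure.

    `run` is injectable only so the function can be exercised without a
    real repository; by default it is subprocess.run.
    """
    for cmd in build_git_commands(date, filename):
        # check=True raises CalledProcessError on a non-zero exit status
        run(cmd, check=True)
```

Note that git commit exits with a non-zero status when there is nothing to commit, so with this version a day without changes raises an exception instead of passing silently.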

Deployment

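With the code written, it is ready to deploy. You could of course run it on your own computer, but since this is a daily scheduled job, the machine could never be switched off, which is awkward. A better option is to deploy it to a VPS. I won't recommend a particular hosting provider; there are only a handful of them, so pick whichever you like. Before deploying, remember to add the VPS's SSH key to your GitHub account so that the push is authorized; after that the code should run without trouble.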

$ git clone https://github.com/bonfy/github-trending.git
$ cd github-trending
$ pip install -r requirements.txt
$ python scraper.py
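Because the job is meant to run once a day, something has to call job() repeatedly; the snippets in this post do not show that part. A minimal sketch, assuming a plain sleep loop inside the script itself (run_daily and its parameters are hypothetical names; a cron entry on the VPS would work just as well):

```python
import time


def run_daily(job, interval_seconds=24 * 60 * 60, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, waiting interval_seconds between runs.

    max_runs=None loops forever; a number is handy for a dry run.
    `sleep` is injectable so the loop can be tried without real waiting.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        if max_runs is not None and runs >= max_runs:
            break  # no need to wait after the final run
        sleep(interval_seconds)
    return runs
```

Calling run_daily(job) at the bottom of the script would then keep it committing once every 24 hours for as long as the process stays alive.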

Not something I tell just anyone

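There is one more perk, just between us: because this code runs on a schedule every day, it commits to GitHub every day. Imagine how good the commit history on your GitHub profile will look a year from now! So go star my project and get moving!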

Project: https://github.com/bonfy/github-trending
Stars are welcome.
