Using PyMongo to Access a MongoDB Server That Requires Authentication

Date: 2019-11-13

Environment: Windows 10 Home (Chinese edition), Python 3.6.4, PyMongo 3.7.0, MongoDB 3.6.3, Scrapy 1.5.0

Preface

In Python, MongoDB is accessed through PyMongo. Its author is Mike Dirolf and its maintainer is Bernie Hackett <bernie@mongodb.com>. Related links:

- PyPI page

- GitHub repository

- Documentation for the latest version, 3.7.0

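Note: to get the documentation, you can download PyMongo from GitHub (Sphinx must be installed first) and build the documentation yourself.

Note: PyMongo also has some optional companion packages that provide features matching the MongoDB server, such as TLS / SSL, GSSAPI, srv, and wire protocol compression with snappy. Install them as needed.

This article describes how to use PyMongo to access a MongoDB server that requires authentication, both from IDLE and from a Scrapy spider.

Official reference: PyMongo Authentication Examples (you can read the official document directly and skip the rest of this article).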

The Local MongoDB Server

Start the MongoDB server:

mongod --dbpath d:\p\mdb2dir --logpath d:\p\mdb2dir\log --logappend --auth --directoryperdb

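The server currently holds a database named globalnews, which contains a collection named news. Each document in news has two fields, title and url, and the collection currently holds 33 documents. There is a user named reporter whose password is 222222.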

Accessing from IDLE
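
The IDLE session itself is not reproduced in this text, so the following is only a minimal sketch of what an authenticated connection might look like. It assumes the server listens on localhost at the default port 27017 and that the user reporter was created in the globalnews database (so that database is also used for authentication); neither detail is stated above. The helper names build_uri and count_news are hypothetical.

```python
from urllib.parse import quote_plus


def build_uri(user, password, host="localhost", port=27017, auth_db="globalnews"):
    """Build a MongoDB URI; user and password are percent-encoded."""
    # Assumption: the user was created in auth_db, so it doubles as the
    # authentication database when no authSource option is given.
    return "mongodb://%s:%s@%s:%d/%s" % (
        quote_plus(user), quote_plus(password), host, port, auth_db)


def count_news(uri, db_name="globalnews", coll_name="news"):
    """Connect with the given URI and return the number of documents."""
    # Imported here so build_uri can be tried without PyMongo installed.
    from pymongo import MongoClient

    # Fail after 3 seconds instead of the default 30 if no server answers.
    client = MongoClient(uri, serverSelectionTimeoutMS=3000)
    try:
        # count_documents is available from PyMongo 3.7 onward.
        return client[db_name][coll_name].count_documents({})
    finally:
        client.close()


uri = build_uri("reporter", "222222")
```

Typing count_news(uri) at the IDLE prompt should then return the number of documents in news (33 for the collection described above) if the credentials are accepted. With a wrong password PyMongo would be expected to raise pymongo.errors.OperationFailure on the first operation, since MongoClient connects lazily.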

Using a Scrapy Spider

In the Scrapy project's settings.py, add the MongoDB settings and enable the relevant Item Pipelines:
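
The settings file itself is not reproduced in this text. A minimal sketch of the relevant fragment might look like the following. The setting names MDB_URI and MDB_NAME are the ones the pipeline reads; the package name myproject, the priority 300, and the host, port, and authentication database in the URI are assumptions made for illustration.

```python
# settings.py (fragment)

# Connection URI including the credentials of the user described earlier.
# Assumptions: local server on the default port 27017, and the user
# "reporter" was created in the "globalnews" database.
MDB_URI = "mongodb://reporter:222222@localhost:27017/globalnews"

# Database that the pipeline writes to.
MDB_NAME = "globalnews"

# Enable the pipeline. "myproject" is a hypothetical package name;
# the number (0-1000) only sets the order among enabled pipelines.
ITEM_PIPELINES = {
    "myproject.pipelines.MongoDBPipeline": 300,
}
```

Keeping the password in settings.py is convenient for a local experiment like this one; for anything shared, reading it from an environment variable would be the safer choice.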

The MongoDBPipeline source is as follows:

import pymongo

class MongoDBPipeline(object):
    '''
    Save the data scraped by the project (title, url, response.body) to MongoDB.
    Target database: defined by MDB_URI and MDB_NAME in the settings file
    Target collection: news, defined by this class
    '''

    # Target collection
    coll_name = 'news'

    def __init__(self, mongo_uri, mongo_db):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db

        # debug (note: this prints the full URI, including the password)
        print('mongo_uri = ', self.mongo_uri)
        print('mongo_db = ', self.mongo_db)

    # Read the MDB_URI and MDB_NAME settings from the settings file
    @classmethod
    def from_crawler(cls, crawler):
        return cls(
            mongo_uri = crawler.settings.get('MDB_URI'),
            mongo_db = crawler.settings.get('MDB_NAME', 'news') # falls back to 'news' if unset
        )

    # Open the database connection when the spider starts
    def open_spider(self, spider): # called for whichever spider is running; not tied to one spider
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.db = self.client[self.mongo_db]

    # Close the database connection when the spider closes
    def close_spider(self, spider):
        self.client.close()

    # Store each scraped Item in MongoDB
    def process_item(self, item, spider):
        self.db[self.coll_name].insert_one(dict(item))
        return item
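
One side effect of the debug output in __init__ is that the complete URI, password included, ends up in the console or crawl log. The following is a minimal sketch of a way to mask the password before printing. The helper name mask_password is hypothetical and not part of PyMongo or Scrapy, and the regular expression only targets the usual mongodb://user:password@host form.

```python
import re


def mask_password(uri):
    """Return the URI with the password part replaced by '***'."""
    # Matches "://user:" followed by everything up to the "@" that ends
    # the credentials. URIs without credentials contain no "@" and are
    # returned unchanged.
    return re.sub(r"(://[^:/@]+:)[^@]*@", r"\1***@", uri)


print("mongo_uri = ", mask_password(
    "mongodb://reporter:222222@localhost:27017/globalnews"))
```

In the pipeline, the first debug line could then print mask_password(self.mongo_uri) instead of self.mongo_uri.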