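What is Puppeteer?
Puppeteer is the official headless Chrome tool from the Google Chrome team. Chrome leads the browser market, so Headless Chrome is likely to become the industry benchmark for automated testing of web applications. That makes it well worth getting to know.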
A headless browser is a browser that runs without a window.
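With Puppeteer, we can have the browser carry out many tasks for us automatically, for example taking page screenshots, generating PDFs, and filling in and submitting a login form.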
Installing Puppeteer is simple:
npm i --save puppeteer
# or "yarn add puppeteer"
Note that the examples here use the ES2017 async/await syntax, so Node.js v7.6.0 or later is recommended.
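As a quick sanity check before running the examples, a script can compare the running Node.js version with the v7.6.0 minimum mentioned above. This is only a sketch: isAtLeast is a hypothetical helper written for this illustration and is not part of Puppeteer, while process.versions.node is the standard Node.js version string.

```javascript
// Hypothetical helper (not part of Puppeteer): returns true when the dotted
// version string `version` is greater than or equal to `minimum`.
function isAtLeast(version, minimum) {
  const a = version.split('.').map(Number);
  const b = minimum.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x !== y) {
      return x > y; // the first differing component decides
    }
  }
  return true; // all three components are equal
}

// process.versions.node holds the running version, without a leading "v".
const current = process.versions.node;
if (!isAtLeast(current, '7.6.0')) {
  console.error(`Node.js ${current} lacks async/await; please use v7.6.0 or later.`);
  process.exitCode = 1;
}
```

Comparing the components numerically matters here: a plain string comparison would rank "7.10.0" below "7.6.0".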
const puppeteer = require('puppeteer');

(async () => {
  // Launch a headless browser and open a new tab
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Load the page, then save a screenshot to wanger.png
  await page.goto('http://www.wangyulue.com');
  await page.screenshot({path: 'wanger.png'});
  await browser.close();
})();
Note that Puppeteer sets the page size to 800px x 600px by default, and this size determines the size of the screenshot. We can change it with Page.setViewport().
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // networkidle2: navigation counts as finished once there are no more than 2 network connections for at least 500 ms
  await page.goto('https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md', {waitUntil: 'networkidle2'});
  // Render the loaded page to an A4-sized PDF
  await page.pdf({path: 'api.pdf', format: 'A4'});
  await browser.close();
})();
For more of the configurable options of page.pdf(), interested readers can consult the Puppeteer API documentation.
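As an illustration of a few of those options, the sketch below builds an options object for page.pdf(). The helper name pdfOptions and the specific default values are assumptions chosen for this example; format, printBackground, landscape, and margin are options that page.pdf() accepts.

```javascript
// Hypothetical helper: merges caller overrides into a base set of page.pdf() options.
function pdfOptions(path, overrides) {
  const defaults = {
    path,                  // where the PDF file is written
    format: 'A4',          // paper format, as in the example above
    printBackground: true, // include CSS backgrounds, which page.pdf() omits by default
    landscape: false,      // portrait orientation
    margin: { top: '1cm', right: '1cm', bottom: '1cm', left: '1cm' },
  };
  // Later sources win, so any key in `overrides` replaces the default.
  return Object.assign({}, defaults, overrides);
}

// Example: the same file name as above, but in landscape orientation.
const options = pdfOptions('api.pdf', { landscape: true });
console.log(options);
```

The resulting object can then be passed straight to the call from the example, as in await page.pdf(options).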
/**
 * @desc Logs into GitHub. Provide your username and password as environment
 * variables when running the script, i.e.:
 * `GITHUB_USER=myuser GITHUB_PWD=mypassword node github.js`
 */
const puppeteer = require('puppeteer');
const screenshot = 'github.png';
(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto('https://github.com/login');
  await page.type('#login_field', process.env.GITHUB_USER);
  await page.type('#password', process.env.GITHUB_PWD);
  // Start waiting for the navigation before clicking, so a fast page load is not missed
  await Promise.all([
    page.waitForNavigation(),
    page.click('[name="commit"]'),
  ]);
  await page.screenshot({ path: screenshot });
  await browser.close();
  console.log('See screenshot: ' + screenshot);
})();
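Because the login script reads its credentials from environment variables, it can help to fail early with a clear message when one of them is missing, before any browser is launched. The following is a minimal sketch under that assumption; missingEnv and requireCredentials are hypothetical helpers for this illustration, not Puppeteer functions.

```javascript
// Hypothetical helper: returns the names from `names` that are unset or empty in `env`.
function missingEnv(names, env) {
  return names.filter((name) => !env[name]);
}

// Hypothetical helper: throws if a credential is absent,
// otherwise returns the values the login script needs.
function requireCredentials(env) {
  const missing = missingEnv(['GITHUB_USER', 'GITHUB_PWD'], env);
  if (missing.length > 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '));
  }
  return { user: env.GITHUB_USER, password: env.GITHUB_PWD };
}
```

In the login script, calling requireCredentials(process.env) before puppeteer.launch() would stop the run immediately instead of letting page.type() fail on an undefined value.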
There is also a repository on GitHub dedicated to collecting Puppeteer demos, which interested readers may want to explore.
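Puppeteer API
For the detailed Puppeteer API documentation, interested readers can refer to docs/api.md in the GoogleChrome/puppeteer repository on GitHub.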
Puppeteer and Chrome Headless: From Getting Started to Web Crawling