https://cheerio.js.org/html
Fast, flexible, and lean implementation of core jQuery designed specifically for the server.
Features
❤ Familiar syntax: Cheerio implements a subset of core jQuery. Cheerio removes all the DOM inconsistencies and browser cruft from the jQuery library, revealing its truly gorgeous API.
ϟ Blazingly fast: Cheerio works with a very simple, consistent DOM model. As a result parsing, manipulating, and rendering are incredibly efficient.
❁ Incredibly flexible: Cheerio wraps around @FB55’s forgiving htmlparser2. Cheerio can parse nearly any HTML or XML document.
const cheerio = require('cheerio');
const $ = cheerio.load('<ul id="fruits">...</ul>');

$('.apple', '#fruits').text();
//=> Apple

$('ul .pear').attr('class');
//=> pear

$('li[class=orange]').html();
//=> Orange
Cheerio can be used as a server-side web crawler to parse static front-end pages.
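A site's home page is usually served as static HTML to improve SEO and first-screen responsiveness. A tool like this is a good fit for that case: it parses the static page and extracts the useful data.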
It can also process static pages, adding page elements or scripts to them, so a proxy can modify the pages that pass through it.
Cheerio is not a web browser
Cheerio parses markup and provides an API for traversing/manipulating the resulting data structure. It does not interpret the result as a web browser does. Specifically, it does not produce a visual rendering, apply CSS, load external resources, or execute JavaScript. If your use case requires any of this functionality, you should consider projects like PhantomJS or JSDom.