尚學堂 Senior's Notes: Jsoup Examples for Extracting Images and Metadata from a URL

Date: 2019-11-06



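Today's notes share a technique for extracting images from a URL. This makes it easier to reuse resources from other pages when you build a web page, and to tie related page data together. The examples below were provided by Teacher Li of 尚學堂; I hope you find them useful.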

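In this example we extract and print information about every image at a given URL. To do this, we call select() with the selector "img[src~=(?i)\\.(png|jpe?g|gif)]" (shown here as a Java string literal). The part after ~= is a regular expression that is matched against each img element's src attribute, so only images of type png, jpeg (or jpg), or gif are printed.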

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupPrintImages {
    public static void main(String[] args) throws IOException {
        // Fetch the page and parse it into a Document
        Document doc = Jsoup.connect("http://www.yiibai.com").get();
        // Select <img> elements whose src matches the regex, ignoring case
        Elements images = doc.select("img[src~=(?i)\\.(png|jpe?g|gif)]");
        for (Element image : images) {
            // attr() returns an empty string when the attribute is absent
            System.out.println("src : " + image.attr("src"));
            System.out.println("height : " + image.attr("height"));
            System.out.println("width : " + image.attr("width"));
            System.out.println("alt : " + image.attr("alt"));
        }
    }
}
Output:

... ...
Try writing and running it yourself!
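To see which src values the regular expression in the image selector accepts, it can help to try the pattern on its own, without jsoup or a network connection. The sketch below uses only java.util.regex. The class name ImageSrcPatternDemo, the helper looksLikeImage, and the sample paths are hypothetical, and the sketch assumes that jsoup's ~= operator looks for the pattern anywhere in the attribute value rather than requiring a full match.

```java
import java.util.regex.Pattern;

public class ImageSrcPatternDemo {
    // Same pattern text as in the selector; the backslash is doubled
    // because this is a Java string literal.
    public static final Pattern IMAGE_SRC =
            Pattern.compile("(?i)\\.(png|jpe?g|gif)");

    // Hypothetical helper: true if the pattern occurs anywhere in src.
    // find() is used on the assumption that jsoup's ~= behaves the same way.
    public static boolean looksLikeImage(String src) {
        return IMAGE_SRC.matcher(src).find();
    }

    public static void main(String[] args) {
        // Made-up sample values, not taken from any real page
        String[] samples = {
            "/static/logo.PNG",   // upper-case extension, accepted via (?i)
            "photo.jpeg",         // jpe?g covers both jpg and jpeg
            "banner.jpg?v=2",     // query string after the extension
            "icon.svg",           // not one of the listed types
            "anim.gif.html"       // ".gif" appears mid-string
        };
        for (String src : samples) {
            System.out.println(src + " -> " + looksLikeImage(src));
        }
    }
}
```

Because the pattern is not anchored, a value such as "anim.gif.html" is accepted as well. Appending $ to the pattern would require the extension at the end of the value, but it would then also reject URLs that carry a query string, so which form is preferable depends on the pages being scraped.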

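Next, we look at how to extract metadata from a URL. We will print a page's meta keywords and description. This uses select() on the Document, first() or get() on the Elements it returns, and attr() on the selected Element. Note that first() returns null and get(0) throws an IndexOutOfBoundsException when nothing matches, so the code assumes the page contains both meta tags.
The code is as follows: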

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class JsoupPrintMetadata {
    public static void main(String[] args) throws IOException {
        // Fetch the page and parse it into a Document
        Document doc = Jsoup.connect("http://www.yiibai.com").get();

        // first() takes the first matching <meta> element
        String keywords = doc.select("meta[name=keywords]").first().attr("content");
        System.out.println("Meta keyword : " + keywords);
        // get(0) is the index-based way to take the first match
        String description = doc.select("meta[name=description]").get(0).attr("content");
        System.out.println("Meta description : " + description);
    }
}
Output:

... ...
Please run the code yourself to see the results!
If this post helped you, don't forget to follow. New programming tips are posted every day!