Scraping NetEase Cloud Music comments: which line touches your heart?

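Music is a gift to the soul.

A good pair of headphones seems to be standard issue for programmers. Sometimes, of course, it isn't even for listening to music; it's just a way of telling everyone: I'm busy, leave me alone...

There are plenty of music apps, so why NetEase Cloud Music? Because it's the one I use. There's no other deal behind it. Of course, since I'm about to scrape her, she certainly won't be thrilled, so let's all keep this quiet.

NetEase Cloud Music has a web version, so analyzing its API is fairly easy. Without a web version you would have to capture packets. I recently found a really good capture tool that handles both HTTP and HTTPS:

Proxyman, for Mac, free, and far nicer to use than Charles (青花瓷).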

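I Getting the link

The URL for fetching comments looks like this; the song id has to be appended to the end
http://music.163.com/api/v1/resource/comments/R_SO_4_

To get the song id, click share on the song and copy the link
https://music.163.com/song?id=1350364644

Append the id
http://music.163.com/api/v1/resource/comments/R_SO_4_1350364644
This is the complete URL for fetching comments.

Looking at the requests, there is one more parameter, offset, which is used for paging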

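  • offset = 0 returns 20 hot comments + 10 regular comments
  • offset = 1 returns no hot comments and 10 regular comments starting from the second one, so to get the next page, add 10 each time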
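The paging rule above can be sketched roughly as follows. This is only an illustration: the class `CommentUrls` and its methods are hypothetical names, and it assumes that offset counts how many regular comments to skip and that each response carries 10 of them, as observed above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the original project.
class CommentUrls {

    // Base URL from the article; the song id is appended to it.
    static final String COMMENT_URL = "http://music.163.com/api/v1/resource/comments/R_SO_4_";

    // Assumption: 10 regular comments per response, as observed in the article.
    static final int PAGE_SIZE = 10;

    /** Builds the URL of the page that starts at the given offset. */
    static String pageUrl(String songId, int offset) {
        return COMMENT_URL + songId + "?offset=" + offset;
    }

    /** Lists the offsets to request: 0, 10, 20, ... while below total and not above maxOffset. */
    static List<Integer> offsets(long total, int maxOffset) {
        List<Integer> result = new ArrayList<>();
        for (int offset = 0; offset < total && offset <= maxOffset; offset += PAGE_SIZE) {
            result.add(offset);
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the page URLs for a song that is assumed to have 35 comments.
        for (int offset : offsets(35, 50)) {
            System.out.println(pageUrl("1350364644", offset));
        }
    }
}
```

Capping the offset (the `maxOffset` argument) keeps the number of requests small, which matters when the target is someone else's server.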

With the link worked out, the next step is to implement it in code.

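II Getting the song id

Actually I wrote the code first and only worked out how to get the song id at the end, but let's start with the simple part. Given this link, extract the id
https://music.163.com/song?id=1350364644
My first thought was split(), but that was naive: sometimes the link looks like the one below, with a userid when you are logged in.
https://music.163.com/song?id=1350364644&userid=110
As far as I can tell from careful observation, id always seems to come first.
This is a job for a regular expression.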

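1. Regular expressions

Regular expressions are more than ^([0-9]{0,})$; there are also lookahead and lookbehind assertions.

Specifically:

  • Positive lookahead: must be followed by pattern, (?=pattern)
  • Negative lookahead: must not be followed by pattern, (?!pattern)
  • Positive lookbehind: must be preceded by pattern, (?<=pattern)
  • Negative lookbehind: must not be preceded by pattern, (?<!pattern)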
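To make the four assertions concrete, here is a small sketch that runs each of them against the same sample string. The class name `LookaroundDemo`, the helper `findAll`, and the sample input are made up for illustration; only `java.util.regex` from the JDK is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only; class, helper, and sample input are hypothetical.
class LookaroundDemo {

    /** Returns every match of regex in input, in order of appearance. */
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "a1 b2 a3";
        // positive lookahead: a letter that is followed by 2
        System.out.println(findAll("[ab](?=2)", input));  // [b]
        // negative lookahead: a letter that is not followed by 2
        System.out.println(findAll("[ab](?!2)", input));  // [a, a]
        // positive lookbehind: a digit that is preceded by a
        System.out.println(findAll("(?<=a)\\d", input));  // [1, 3]
        // negative lookbehind: a digit that is not preceded by a
        System.out.println(findAll("(?<!a)\\d", input));  // [2]
    }
}
```

In every case the assertion only checks the neighbouring text; the neighbouring text itself is never part of the match.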

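The expression below is a positive lookbehind: it matches a run of digits that must be preceded by ?id=
(?<=\?id=)(\d+)
So passing in the copied link is enough to get the id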

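// positive lookbehind: digits that directly follow "?id="
Pattern pb = Pattern.compile("(?<=\\?id=)(\\d+)");
Matcher matcher = pb.matcher(songUrl);
// Assert is org.springframework.util.Assert; it throws IllegalArgumentException when nothing matches
Assert.isTrue(matcher.find(), "Song id not found!");
String songId = matcher.group();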
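For trying the extraction outside of Spring, a self-contained variant might look like the sketch below. It is not the author's code: the class `SongIds` and method `extract` are hypothetical, and the pattern is widened to `(?<=[?&]id=)` on the assumption that id could also appear after another parameter. Because the lookbehind requires ? or & directly before id=, a parameter such as userid= is still skipped.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original project.
class SongIds {

    // Assumption: the id parameter may be first (?id=) or later (&id=) in the query string.
    private static final Pattern SONG_ID = Pattern.compile("(?<=[?&]id=)\\d+");

    /** Returns the song id found in the shared link, or empty if there is none. */
    static Optional<String> extract(String songUrl) {
        Matcher matcher = SONG_ID.matcher(songUrl);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("https://music.163.com/song?id=1350364644&userid=110"));
        System.out.println(extract("https://music.163.com/song?userid=110"));
    }
}
```

Returning an Optional instead of asserting leaves it to the caller to decide whether a missing id is an error.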

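III HTTP request utility class

In day-to-day development you sometimes need to send HTTP or HTTPS requests, and there are many ways to write a utility class for that.
Of course, many companies already have one wrapped up. The one below is my own, written with reference to other people's versions; it can send GET and POST requests over both HTTP and HTTPS, and it's simple to use.

Dependencies are as follows (the class also imports fastjson, Lombok, and commons-lang3)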

<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.10</version>
</dependency>

<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpmime</artifactId>
    <version>4.5.7</version>
</dependency>
package com.ler.pai.util;

import com.alibaba.fastjson.JSONObject;
import java.io.IOException;
import java.security.KeyManagementException;
import java.security.NoSuchAlgorithmException;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.apache.http.HttpEntity;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.config.Registry;
import org.apache.http.config.RegistryBuilder;
import org.apache.http.conn.socket.ConnectionSocketFactory;
import org.apache.http.conn.socket.PlainConnectionSocketFactory;
import org.apache.http.conn.ssl.NoopHostnameVerifier;
import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

/**
 * HTTP utility class
 *
 * @author lww
 */
@Slf4j
public class HttpUtils {

	private static final String ENCODE = "UTF-8";

	private HttpUtils() {
	}


	/**
	 * Sends a GET request to the given URL over HTTP.
	 *
	 * @param url     URL to request
	 * @param param   query string in the form {@code name1=value1&name2=value2}; may be blank
	 * @param headers extra request headers; may be null
	 * @return response body of the remote resource, or null if the request failed
	 */
	public static String sendGetHttp(String url, String param, Map<String, String> headers) {
		HttpGet httpGet = new HttpGet(StringUtils.isBlank(param) ? url : url + "?" + param);
		headers = initHeader(headers);
		// set headers
		for (Map.Entry<String, String> entry : headers.entrySet()) {
			httpGet.setHeader(entry.getKey(), entry.getValue());
		}
		String content = null;
		// both the client and the response are closed automatically
		try (CloseableHttpClient closeableHttpClient = HttpClientBuilder.create().build();
				CloseableHttpResponse httpResponse = closeableHttpClient.execute(httpGet)) {
			HttpEntity entity = httpResponse.getEntity();
			content = EntityUtils.toString(entity, ENCODE);
		} catch (IOException e) {
			log.error("HttpUtils_sendGetHttp_e", e);
		}
		return content;
	}

	/**
	 * Sends a GET request to the given URL over HTTPS.
	 *
	 * @param url     URL to request
	 * @param param   query string in the form {@code name1=value1&name2=value2}; may be blank
	 * @param headers extra request headers; may be null
	 * @return response body of the remote resource, or null if the request failed
	 */
	public static String sendGetHttps(String url, String param, Map<String, String> headers) {
		HttpGet httpGet = new HttpGet(StringUtils.isBlank(param) ? url : url + "?" + param);
		headers = initHeader(headers);
		// set headers
		for (Map.Entry<String, String> entry : headers.entrySet()) {
			httpGet.setHeader(entry.getKey(), entry.getValue());
		}
		String content = null;
		// both the client and the response are closed automatically
		try (CloseableHttpClient closeableHttpClient = sslHttpClientBuild();
				CloseableHttpResponse httpResponse = closeableHttpClient.execute(httpGet)) {
			HttpEntity entity = httpResponse.getEntity();
			content = EntityUtils.toString(entity, ENCODE);
		} catch (IOException e) {
			log.error("HttpUtils_sendGetHttps_e", e);
		}
		return content;
	}

	/**
	 * Sends a POST request with form parameters to the given URL over HTTP.
	 *
	 * @param url     URL to request
	 * @param param   form parameters; they may also be appended to the url as {@code ?name1=value1&name2=value2}
	 * @param headers extra request headers; may be null
	 * @return response body of the remote resource, or null if the request failed
	 */
	public static String sendPostFormHttp(String url, Map<String, String> param, Map<String, String> headers) {
		HttpPost httpPost = new HttpPost(url);
		headers = initHeader(headers);
		headers.put("Content-Type", "application/x-www-form-urlencoded");
		for (Map.Entry<String, String> entry : headers.entrySet()) {
			httpPost.setHeader(entry.getKey(), entry.getValue());
		}
		String content = null;
		List<NameValuePair> pairList = new ArrayList<>();
		if (param != null) {
			for (Map.Entry<String, String> entry : param.entrySet()) {
				pairList.add(new BasicNameValuePair(entry.getKey(), entry.getValue()));
			}
		}
		try (CloseableHttpClient closeableHttpClient = HttpClientBuilder.create().build()) {
			httpPost.setEntity(new UrlEncodedFormEntity(pairList, ENCODE));
			// close the response as well as the client
			try (CloseableHttpResponse httpResponse = closeableHttpClient.execute(httpPost)) {
				HttpEntity entity = httpResponse.getEntity();
				content = EntityUtils.toString(entity, ENCODE);
			}
		} catch (IOException e) {
			log.error("HttpUtils_sendPostFormHttp_e", e);
		}
		return content;
	}

	/**
	 * Sends a POST request with form parameters to the given URL over HTTPS.
	 *
	 * @param url     URL to request
	 * @param param   form parameters; they may also be appended to the url as {@code ?name1=value1&name2=value2}
	 * @param headers extra request headers; may be null
	 * @return response body of the remote resource, or null if the request failed
	 */
	public static String sendPostFormHttps(String url, Map<String, String> param, Map<String, String> headers) {
		HttpPost httpPost = new HttpPost(url);
		headers = initHeader(headers);
		headers.put("Content-Type", "application/x-www-form-urlencoded");
		for (Map.Entry<String, String> entry : headers.entrySet()) {
			httpPost.setHeader(entry.getKey(), entry.getValue());
		}
		String content = null;
		List<NameValuePair> pairList = new ArrayList<>();
		if (param != null) {
			for (Map.Entry<String, String> entry : param.entrySet()) {
				pairList.add(new BasicNameValuePair(entry.getKey(), entry.getValue()));
			}
		}
		try (CloseableHttpClient closeableHttpClient = sslHttpClientBuild()) {
			httpPost.setEntity(new UrlEncodedFormEntity(pairList, ENCODE));
			// close the response as well as the client
			try (CloseableHttpResponse httpResponse = closeableHttpClient.execute(httpPost)) {
				HttpEntity entity = httpResponse.getEntity();
				content = EntityUtils.toString(entity, ENCODE);
			}
		} catch (IOException e) {
			log.error("HttpUtils_sendPostFormHttps_e", e);
		}
		return content;
	}

	/**
	 * Sends a POST request over HTTP with a JSON string as the request body.
	 *
	 * @param url     URL to request
	 * @param params  JSON parameters, serialized into the body
	 * @param headers extra request headers; may be null
	 * @return response body, or null if the request failed
	 */
	public static String sendPostJsonHttp(String url, JSONObject params, Map<String, String> headers) {
		HttpPost httpPost = new HttpPost(url);
		headers = initHeader(headers);
		headers.put("Content-Type", "application/json");
		for (Map.Entry<String, String> entry : headers.entrySet()) {
			httpPost.setHeader(entry.getKey(), entry.getValue());
		}
		StringEntity stringEntity = new StringEntity(params.toString(), ENCODE);
		httpPost.setEntity(stringEntity);
		String content = null;
		// try-with-resources closes the client and the response; no explicit close() is needed
		try (CloseableHttpClient closeableHttpClient = HttpClientBuilder.create().build();
				CloseableHttpResponse httpResponse = closeableHttpClient.execute(httpPost)) {
			HttpEntity entity = httpResponse.getEntity();
			content = EntityUtils.toString(entity, ENCODE);
		} catch (IOException e) {
			log.error("HttpUtil_sendPostJsonHttp_e", e);
		}
		return content;
	}

	/**
	 * Sends a POST request over HTTPS with a JSON string as the request body.
	 *
	 * @param url     URL to request
	 * @param params  JSON parameters, serialized into the body
	 * @param headers extra request headers; may be null
	 * @return response body, or null if the request failed
	 */
	public static String sendPostJsonHttps(String url, JSONObject params, Map<String, String> headers) {
		HttpPost httpPost = new HttpPost(url);
		headers = initHeader(headers);
		headers.put("Content-Type", "application/json");
		for (Map.Entry<String, String> entry : headers.entrySet()) {
			httpPost.setHeader(entry.getKey(), entry.getValue());
		}
		StringEntity stringEntity = new StringEntity(params.toString(), ENCODE);
		httpPost.setEntity(stringEntity);
		String content = null;
		// try-with-resources closes the client and the response; no explicit close() is needed
		try (CloseableHttpClient closeableHttpClient = sslHttpClientBuild();
				CloseableHttpResponse httpResponse = closeableHttpClient.execute(httpPost)) {
			HttpEntity entity = httpResponse.getEntity();
			content = EntityUtils.toString(entity, ENCODE);
		} catch (IOException e) {
			log.error("HttpUtil_sendPostJsonHttps_e", e);
		}
		return content;
	}

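	private static Map<String, String> initHeader(Map<String, String> headers) {
		// work on a copy so the caller's map (possibly null or immutable) is never modified
		Map<String, String> merged = new HashMap<>(16);
		merged.put("accept", "*/*");
		merged.put("connection", "Keep-Alive");
		merged.put("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;SV1)");
		if (headers != null) {
			// caller-supplied headers take precedence over the defaults
			merged.putAll(headers);
		}
		return merged;
	}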

	public static CloseableHttpClient sslHttpClientBuild() {
		Registry<ConnectionSocketFactory> socketFactoryRegistry =
				RegistryBuilder.<ConnectionSocketFactory>create()
						.register("http", PlainConnectionSocketFactory.INSTANCE)
						.register("https", trustAllHttpsCertificates()).build();
		// create the ConnectionManager and register the connection configuration
		PoolingHttpClientConnectionManager connectionManager = new PoolingHttpClientConnectionManager(socketFactoryRegistry);
		CloseableHttpClient httpClient = HttpClients.custom().setConnectionManager(connectionManager).build();
		return httpClient;
	}

	/**
	 * Builds a socket factory that accepts every certificate and every host name.
	 * This switches off TLS verification, so it should not be used for sensitive traffic.
	 */
	private static SSLConnectionSocketFactory trustAllHttpsCertificates() {
		SSLConnectionSocketFactory socketFactory = null;
		TrustManager[] trustAllCerts = new TrustManager[]{new Mitm()};
		try {
			SSLContext sc = SSLContext.getInstance("TLS");
			sc.init(null, trustAllCerts, null);
			socketFactory = new SSLConnectionSocketFactory(sc, NoopHostnameVerifier.INSTANCE);
		} catch (NoSuchAlgorithmException | KeyManagementException e) {
			log.error("HttpUtil_trustAllHttpsCertificates_e", e);
		}
		return socketFactory;
	}

	/** Trust manager that performs no checks at all. */
	static class Mitm implements X509TrustManager {

		@Override
		public X509Certificate[] getAcceptedIssuers() {
			// the X509TrustManager contract asks for a non-null, possibly empty array
			return new X509Certificate[0];
		}

		@Override
		public void checkServerTrusted(X509Certificate[] certs, String authType) {
			//don't check
		}

		@Override
		public void checkClientTrusted(X509Certificate[] certs, String authType) {
			//don't check
		}
	}
}

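IV No code, no talk

Here is the code for fetching comments. The code comments explain it clearly. (Posting code feels like being seen naked, how embarrassing.)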

package com.ler.pai.service.impl;

import com.alibaba.fastjson.JSONObject;
import com.ler.pai.service.MusicService;
import com.ler.pai.util.HttpUtils;
import com.ler.pai.vo.CommentsInfoVO;
import com.ler.pai.vo.CommentsVO;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;
import org.springframework.util.Assert;

/**
 * @author lww
 * @date 2020-01-30 18:09
 */
@Service
@Slf4j
public class MusicServiceImpl implements MusicService {

	private Pattern pb = Pattern.compile("(?<=\\?id=)(\\d+)");

	private static final String COMMENT_URL = "http://music.163.com/api/v1/resource/comments/R_SO_4_";

	@Override
	public String getComment(String songUrl) {
		log.info("MusicServiceImpl_getComment_songUrl:{}", songUrl);
		Matcher matcher = pb.matcher(songUrl);
		Assert.isTrue(matcher.find(), "Song id not found!");
		// extract the song id
		String songId = matcher.group();
		// build the base URL
		String url = COMMENT_URL + songId;
		// offset = 0: 20 hot comments + 10 regular comments
		// offset = 1: no hot comments, 10 regular comments starting from the second one
		int offset = 0;
		// request and parse the first page
		String s = HttpUtils.sendPostFormHttps(url + "?offset=" + offset, null, null);
		CommentsInfoVO vo = JSONObject.parseObject(s, CommentsInfoVO.class);
		// total number of comments; vo is null when the request failed
		Long total = vo == null ? null : vo.getTotal();
		Assert.isTrue(total != null, "Resource does not exist!");
		// collects the comment text
		StringBuilder res = new StringBuilder(1024);
		while (vo != null) {
			List<CommentsVO> hotComments = vo.getHotComments();
			// hot comments: append to the result
			if (hotComments != null) {
				for (CommentsVO hotComment : hotComments) {
					res.append("========================").append("\n");
					res./*append(hotComment.getUser().getNickname()).append(" : ").*/append(hotComment.getContent()).append("\n");
				}
			}
			List<CommentsVO> comments = vo.getComments();
			// regular comments: append to the result
			if (comments != null) {
				for (CommentsVO comment : comments) {
					res.append("========================").append("\n");
					res./*append(comment.getUser().getNickname()).append(" : ").*/append(comment.getContent()).append("\n");
				}
			}
			// the next page starts 10 comments further on
			offset += 10;
			if (offset > 50 || offset >= total) {
				// stop after the last page, and avoid scraping too much
				break;
			}
			// request and parse the next page, then go round the loop again
			s = HttpUtils.sendPostFormHttps(url + "?offset=" + offset, null, null);
			vo = JSONObject.parseObject(s, CommentsInfoVO.class);
		}
		return res.toString();
	}
}
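The service relies on two value objects, CommentsInfoVO and CommentsVO, that are not shown in this post. As a rough guess at their shape, inferred only from the getters the service calls (getTotal, getHotComments, getComments, getContent), they might look like the sketch below. The real classes may well carry more fields, and `CommentFormatter` is a hypothetical helper that isolates the text-assembly step so it can be tried without any network access.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: the same text assembly as in the service, without HTTP or JSON.
class CommentFormatter {

    static final String SEPARATOR = "========================";

    /** Appends hot comments first, then regular comments, each under a separator line. */
    static String format(CommentsInfoVO page) {
        StringBuilder res = new StringBuilder();
        appendAll(res, page.getHotComments());
        appendAll(res, page.getComments());
        return res.toString();
    }

    private static void appendAll(StringBuilder res, List<CommentsVO> list) {
        if (list == null) {
            return; // e.g. hot comments are absent on later pages
        }
        for (CommentsVO comment : list) {
            res.append(SEPARATOR).append("\n").append(comment.getContent()).append("\n");
        }
    }

    public static void main(String[] args) {
        CommentsVO first = new CommentsVO();
        first.setContent("first comment");
        CommentsInfoVO page = new CommentsInfoVO();
        page.setTotal(1L);
        page.setComments(Arrays.asList(first));
        System.out.print(format(page));
    }
}

// Assumed shape of one comment; only the field read by the service is modelled.
class CommentsVO {
    private String content;

    public String getContent() { return content; }
    public void setContent(String content) { this.content = content; }
}

// Assumed shape of one response page.
class CommentsInfoVO {
    private Long total;
    private List<CommentsVO> hotComments;
    private List<CommentsVO> comments;

    public Long getTotal() { return total; }
    public void setTotal(Long total) { this.total = total; }
    public List<CommentsVO> getHotComments() { return hotComments; }
    public void setHotComments(List<CommentsVO> hotComments) { this.hotComments = hotComments; }
    public List<CommentsVO> getComments() { return comments; }
    public void setComments(List<CommentsVO> comments) { this.comments = comments; }
}
```

Plain getters and setters are used here so the sketch needs no extra libraries.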

The result of running it on 《和解》 by 陶峻汐:

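========================
So often the pain inside us comes from being unable to let ourselves off the hook. Yet when we keep refusing to reconcile with ourselves, with the past, and with life, then in those countless dark nights another self always seizes the chance to come out and stir things up, bringing us entanglement, struggle, and contradiction
========================
She is my sickness, destined never to be mine
========================
Nietzsche: "There is no need to be sensitive at every moment; dullness is sometimes a virtue"
========================
Those few seconds of being swept up in a flirtation really do feel just like love. Too bad life is a circle: however happy the high, the comedown is just as sad
========================
I want to sleep beside you, but I don't want to sleep with you
========================
A word of advice, everyone: don't get close to someone scarred by years of obsession, and don't imagine yourself as the saint who will save them. Your warmth and love will only earn you a "thank you" and a "sorry". Likewise, don't use someone else to heal before your own wounds have closed. Treasure feelings, for your own sake and for others.
========================
After parting, I wrap myself in an unflattering white coat and throw myself once more into the black embrace; a denial that fools no one
========================
Learn to reconcile with the past; the days you live are the ones ahead. Learn to reconcile with yourself; even if you have to pull yourself up by your own hair, drag yourself out of the mud. Reconcile with life's unfairness; digest the grievances and the malice you've suffered. Make yourself a little stronger, a little more powerful. But most of the time I only reconcile with unhappiness, find excuses for my sadness to talk myself round, and still sometimes get swallowed back in and fall into a bottomless pit
from 日籤晚安
========================
Thank you all for your support [cute]
========================
People have to learn to reconcile with themselves. It's like when I say "forget it": not because I no longer want it, but because no amount of effort makes a difference, so I might as well make peace with myself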

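Closing notes

I've recently been planning an open-source project that mainly collects examples of calling third-party public APIs.
For example Tencent's public APIs, Baidu's APIs, and so on, each with a simple calling example.

Full project address

Since I can't log in to GitHub at the moment, the code is on Gitee (碼雲)

A simple project is already set up, with the following endpoints

  • Amap (高德地圖): search addresses by keyword
  • Amap: weather lookup (to keep development simple, only Beijing and Hangzhou are supported)
  • Amap: look up an address by IP
  • NetEase Cloud Music: fetch song comments
  • Short link generation

Everyone is welcome to help maintain it. For keys and the like, test with ones you applied for personally.
Don't leak keys used by your company, or your own keys that can't be made public.
Take care to protect personal data.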
