Scraping Suning (蘇寧易購) prices with Go

If the product URL is: http://product.suning.com/0070230548/10608983060.html

then the price URL is:

http://pas.suning.com/nspcsale_0_000000010608983060_000000010608983060_0070230548_20_021_0210101_500353_1000267_9264_12113_Z001___R9006849_3.3_1___000278188__.html?callback=pcData&_=1558663936729

 

If the product URL is: http://product.suning.com/0000000000/144016246.html

then the price URL is:

http://pas.suning.com/nspcsale_0_000000000144016246_000000000144016246_0000000000_20_021_0210101_500353_1000267_9264_12113_Z001___R9006850_2.86_0___000278188__.html?callback=pcData&_=1558664442552
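Comparing the two pairs above, the product ID appears to be left-padded with zeros to 18 digits and written twice, followed by the shop ID taken from the first path segment of the product URL. The sketch below encodes only that rule. `buildPricePrefix` is a hypothetical helper name, and the 18-digit width is an assumption inferred from these two examples, not from any documented Suning interface. The trailing segments (for example `R9006849_3.3_1` and the timestamp) differ between the two examples and are deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// padWidth is an assumption: both example price URLs use 18-digit product IDs.
const padWidth = 18

// buildPricePrefix (hypothetical helper) returns the leading part of the
// price URL: the zero-padded product ID twice, then the shop ID.
func buildPricePrefix(shopID, productID string) (string, error) {
	if len(productID) > padWidth {
		return "", fmt.Errorf("product ID %q is longer than %d digits", productID, padWidth)
	}
	padded := strings.Repeat("0", padWidth-len(productID)) + productID
	return "http://pas.suning.com/nspcsale_0_" + padded + "_" + padded + "_" + shopID + "_", nil
}

func main() {
	// IDs taken from the second example product URL above.
	prefix, err := buildPricePrefix("0000000000", "144016246")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(prefix)
}
```

Padding to a fixed width covers the 9-digit and 11-digit IDs with one rule. Whether other ID lengths follow the same pattern is not confirmed by the examples.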

 

References: 3.1 Scraping millions of Suning products: walkthrough of the approach, product scraping

Scraping Suning price URLs with Python

 

Python and Go each scraped the same data set (135 products). Go took 19.457 s; Python (without any crawler framework) took 178.672 s.

 

Go:

package main

import (
    "fmt"
    "io"
    "net/http"
    "regexp"
    "strings"
)

var (
    // productRe captures the shop ID and product ID from a product page URL.
    productRe = regexp.MustCompile(`com/(\d+)/(\d+)\.html`)
    // netPriceRe captures the netPrice field from the pcData(...) response.
    netPriceRe = regexp.MustCompile(`"netPrice":"(.*?)","warrantyList`)
)

// GetGoodPrice builds the price URL for a product page URL, fetches it,
// and returns the netPrice value as a string.
func GetGoodPrice(url string) (string, error) {
    m := productRe.FindStringSubmatch(url)
    if m == nil {
        return "", fmt.Errorf("unrecognized product URL: %s", url)
    }
    key0, key1 := m[1], m[2] // shop ID, product ID
    if len(key1) > 18 {
        return "", fmt.Errorf("product ID too long: %s", key1)
    }
    // Both example price URLs zero-pad the product ID to 18 digits
    // (9 zeros for a 9-digit ID, 7 zeros for an 11-digit ID).
    padded := strings.Repeat("0", 18-len(key1)) + key1
    // The remaining segments and the timestamp are copied from the first example URL.
    priceurl := "http://pas.suning.com/nspcsale_0_" + padded + "_" + padded + "_" + key0 +
        "_20_021_0210101_500353_1000267_9264_12113_Z001___R9006849_3.3_1___000278188__.html?callback=pcData&_=1558663936729"

    resp, err := http.Get(priceurl)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return "", fmt.Errorf("unexpected status: %s", resp.Status)
    }
    s, err := io.ReadAll(resp.Body)
    if err != nil {
        return "", err
    }

    // Return an error instead of panicking when the field is missing.
    price := netPriceRe.FindStringSubmatch(string(s))
    if price == nil {
        return "", fmt.Errorf("netPrice not found in response from %s", priceurl)
    }
    return price[1], nil
}

func main() {
    url := `http://product.suning.com/0000000000/144016246.html`
    price, err := GetGoodPrice(url)
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    fmt.Println(price)
}
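The price extraction step can be tried without touching the network. The sketch below is a minimal illustration, assuming the response is a `pcData(...)` wrapper containing a `"netPrice":"..."` field, as the regular expression in the program above implies. `extractNetPrice` is a hypothetical helper name, and the sample payload is invented for illustration; a real response is much longer and its full structure is not shown in this post. The pattern here matches up to the closing quote rather than relying on `warrantyList` following the field, which is a swapped-in variation, not the original expression.

```go
package main

import (
	"fmt"
	"regexp"
)

// netPriceRe matches the netPrice field regardless of which field follows it.
var netPriceRe = regexp.MustCompile(`"netPrice":"([^"]*)"`)

// extractNetPrice (hypothetical helper) returns the netPrice value and
// whether the field was found in the response body.
func extractNetPrice(body string) (string, bool) {
	m := netPriceRe.FindStringSubmatch(body)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Invented sample payload, not a real Suning response.
	sample := `pcData({"netPrice":"9.90","warrantyList":[]})`
	if price, ok := extractNetPrice(sample); ok {
		fmt.Println(price)
	} else {
		fmt.Println("netPrice not found")
	}
}
```

Returning a boolean alongside the value lets the caller distinguish a missing field from an empty price string, which indexing into the match slice directly cannot do safely.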