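s860134, the URL and code are below. I noticed that if I have not accessed the
page for a while, the fetch comes back fine, but after fetching it several
times in a row the text starts getting cut off.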
https://shopee.tw/viviancloe
import requests

if __name__ == "__main__":
    headers = {
        # Cookie string as sent; split into adjacent literals only for readability.
        'Cookie': (
            'SPC_IA=-1; SPC_EC=-; '
            'SPC_F=b9xBLc7WroUphDkfgTLhUFTbZDQoNbTu; '
            'REC_T_ID=6cf0de6c-a762-11e7-bf41-246e960f6a68; '
            'SPC_T_ID="WAlev0L2X1AMTz1j56adnJD9mpCd0b4dT3kdd1BrRTZD27vuhGveETTogw0AQ1jvsKFZF2chyh4Ut7whluhOn/0MxeAPZwthaoAleA3JmC4="; '
            'SPC_U=-; SPC_T_IV="KnibcL4buEmqFNtMuczz+w=="; '
            '__utma=88845529.924012273.1506942680.1508042090.1508042090.1; '
            '__utmz=88845529.1508042090.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); '
            '_atrk_siteuid=SYQBwFEiz55k_Fw8; csrftoken=pX243QosnAg5tqNlVwAhllB30qL4418F; '
            '__BWfp=c1509458278808x1715baa2b; SPC_SC_TK=; UYOMAPJWEMDGJ=; SPC_SC_UD=; '
            '_ga=GA1.2.924012273.1506942680; _gid=GA1.2.1418713679.1510903500; _gat=1; '
            'SPC_SI=lxbotjnjan1rp46ocb0pkcy8z1qwhc4g'
        ),
        'Referer': 'https://shopee.tw/viviancloe',
        'User-Agent': (
            'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
            'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36'
        ),
        # Same value as the csrftoken entry in the cookie.
        'X-CSRFToken': 'pX243QosnAg5tqNlVwAhllB30qL4418F',
    }
    # Request body: the shop ID to look up (sent as JSON).
    payload = {"shop_ids": [730057]}
    response = requests.post('https://shopee.tw/api/v1/shops/', json=payload,
                             headers=headers)
    print(response.text)
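
To see at which point the text starts getting cut off, a minimal sketch along these lines could log the field lengths over repeated fetches. It assumes the response is a JSON list of shop objects with "place" and "description" keys, as the quoted output suggests. `field_lengths` and `watch` are hypothetical helper names, and `fetch` is any callable that returns the response text, for example `lambda: requests.post(url, json=payload, headers=headers).text`.

```python
import json
import time


def field_lengths(response_text, fields=("place", "description")):
    """Return {field: character count} for the first shop in the response."""
    shops = json.loads(response_text)
    shop = shops[0]  # assumption: the API returns a list of shop objects
    return {name: len(shop.get(name) or "") for name in fields}


def watch(fetch, attempts=5, delay=0.0):
    """Call fetch() repeatedly and collect the field lengths of each reply."""
    history = []
    for attempt in range(attempts):
        history.append(field_lengths(fetch()))
        if delay and attempt < attempts - 1:
            time.sleep(delay)  # pause between requests
    return history
```

Running `watch` with different `delay` values would show whether a longer pause between requests keeps the fields at full length.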
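※ Quoting ntasop (kuli):
: I'm scraping the Shopee site with a requests POST. The length of the "place"
: field shown in Chrome differs from what Python fetches: in Python it is cut
: down to a single character. The "description" field length differs too. How
: can I fix this? Thank you very much. (Everything that gets cut is Chinese text.)
: Python scraper result (too long, so part of it is trimmed here):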
: [......"description": "\ud83d\udc4b
: \ud83c\udf86\u6b61\u8fce\u5149\u81e8\u5154\u5bf6\ud83c\udf86
: \ud83d\udc4b\n\n\ud83d\ude4c\ud83d\ude4c\u6709\u8208\u8da3\u7684\u5546\u54c1\u6b61\u8fce\u5229\u7528\u804a\u804a\u8a62\u554f
: ~~\u8b1d", "place": "\u65b0",..
: Result from Chrome:
: [...."description": "\ud83d\udc4b
: \ud83c\udf86\u6b61\u8fce\u5149\u81e8\u5154\u5bf6\ud83c\udf86
: \ud83d\udc4b\n\n\ud83d\ude4c\ud83d\ude4c\u6709\u8208\u8da3\u7684\u5546\u54c1\u6b61\u8fce\u5229\u7528\u804a\u804a\u8a62\u554f
: ~~\u8b1d\u8b1d\u5149\u81e8 \ud83d\udc07", "place": "\u65b0\u5317\u5e02\u4e09\u91cd\u5340",....]