Hi everyone,
I've recently been pulling data from
http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw
My main goal is to download the data and tidy it up. When I fetch the data (say, for stock 1240) and inspect the request in Firefox, the headers look like this:
http://www.tpex.org.tw/web/emergingstock/single_historical/download.php
Host: www.tpex.org.tw
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:57.0) Gecko/20100101 Firefox/57.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-TW,zh;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Referer: http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw
Content-Type: application/x-www-form-urlencoded
Content-Length: 84
Cookie: _ga=GA1.3.582781261.1509173813; _gid=GA1.3.454443446.1513917119; _gat=1
Connection: keep-alive
Upgrade-Insecure-Requests: 1
year=106&month=12&stkno=1240&stkname=茂生農經&lang=zh-tw
The last line doesn't look like something I can put into the headers the normal way. How should I handle it?
My code is below (Python 3.5):
#!/usr/bin/env python3
# -*- coding: utf8 -*-
import urllib.request
url="http://www.tpex.org.tw/web/emergingstock/single_historical/download.php"
headers={
    "Host":"www.tpex.org.tw",
    "User-Agent":"Mozilla/5.0 (Windows NT 6.1; rv:57.0) Gecko/20100101 Firefox/57.0",
    "Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language":"zh-TW,zh;q=0.8,en-US;q=0.5,en;q=0.3",
    "Accept-Encoding":"gzip, deflate",
    "Referer":"http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw",
    "Content-Type":"application/x-www-form-urlencoded",
    "Content-Length":"84",
    "Cookie":"_ga=GA1.3.582781261.1509173813; _gid=GA1.3.1976747965.1513496313; _gat=1",
    "Connection":"keep-alive",
    "Upgrade-Insecure-Requests":"1",
    # "year=106&month=12&stkno=1240&stkname=茂生農經&lang=zh-tw":""
}
req=urllib.request.Request(url,headers=headers)
response=urllib.request.urlopen(req)
print (str(response))
If I leave that last line out, print outputs:
<http.client.HTTPResponse object at 0x02700B10>
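That output is the default representation of the response object rather than the page content, because str() on an HTTPResponse does not read the body. Below is a minimal sketch of turning a response into text. The helper names decode_body and read_text are made up for illustration, the default charset of utf-8 is only an assumption (the site may well answer in another encoding such as Big5), and only gzip is handled, not deflate.

```python
import gzip


def decode_body(raw, content_encoding=None, charset="utf-8"):
    """Turn raw response bytes into text (hypothetical helper).

    charset="utf-8" is an assumption; pass the encoding the server
    actually uses if it differs.
    """
    # Undo gzip compression only when the server says it applied it.
    if content_encoding and content_encoding.strip().lower() == "gzip":
        raw = gzip.decompress(raw)
    # errors="replace" keeps going even if the charset guess is wrong.
    return raw.decode(charset, errors="replace")


def read_text(response, charset="utf-8"):
    """Read an http.client.HTTPResponse and return its body as text."""
    return decode_body(
        response.read(),
        response.getheader("Content-Encoding"),
        charset,
    )
```

With the script above, print(read_text(response)) would then show the body instead of the object address.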
I've spent a long time searching online and still haven't found a good solution.
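For what it's worth, the line year=106&month=12&... in the Firefox capture looks like the POST form body rather than a header, which would explain why it does not fit into the headers dict. A minimal sketch of sending it as the request body with urllib follows. The helper names build_form_body and build_request are made up for illustration, and encoding the stock name as UTF-8 is an assumption, although it does produce exactly 84 bytes, the same as the Content-Length in the capture. Host and Content-Length are left out because urllib fills them in, and Accept-Encoding is left out so the server is not invited to send a compressed reply.

```python
import urllib.parse
import urllib.request

URL = "http://www.tpex.org.tw/web/emergingstock/single_historical/download.php"
REFERER = "http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw"


def build_form_body(year, month, stkno, stkname, lang="zh-tw"):
    """Percent-encode the form fields into bytes (hypothetical helper)."""
    # A list of pairs keeps the field order the same as in the capture.
    fields = [
        ("year", year),
        ("month", month),
        ("stkno", stkno),
        ("stkname", stkname),
        ("lang", lang),
    ]
    # urlencode percent-encodes non-ASCII text as UTF-8 by default.
    return urllib.parse.urlencode(fields).encode("ascii")


def build_request(body):
    """Build a request carrying the form body (hypothetical helper)."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; rv:57.0) Gecko/20100101 Firefox/57.0",
        "Referer": REFERER,
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # Passing data= makes urllib send a POST and set Content-Length itself.
    return urllib.request.Request(URL, data=body, headers=headers)


body = build_form_body("106", "12", "1240", "茂生農經")
req = build_request(body)

# The network call itself is left commented out in this sketch:
# with urllib.request.urlopen(req) as response:
#     raw = response.read()
```

Whether the server also insists on the cookies from the browser session is something I have not checked; if it does, a Cookie entry can be added to the headers dict in the same way.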