[Question] Computing running totals with groupby() - luenchang - PTT (批踢踢實業坊)

Author: luenchang (luen) 2015-03-25 18:29:45

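Sorry to bother you. I searched the board but could not find an example that uses the groupby function to compute running totals.
My data is a list of lists. Each inner list holds a zipcode, a date, and a revenue figure. For example, sales of 100 in Da'an District on January 1 of this year would be [106,20150101,100]. I want to do two things. The first is to sum revenue per month within each zipcode; that part is done (step 2). The second is to accumulate those monthly sums, and that part (step 3) is where I am stuck. However I change the code, it still accumulates from the first record through the last one, whereas I need the running total to restart within each zipcode.
To make the data easier to read I have broken it across lines, one record per row. Python 3 has accumulate(), which looks handy, but I am on 2.7.9. Perhaps a small change to step 3 would produce the desired output.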
#step1: some mock data
mock=[[106,201501,100],
[106,201501,200],
[106,201502,300],
[106,201502,400],
[220,201502,200],
[220,201502,300],
[220,201503,400],
[220,201503,500]]
#desired output:
[[106,201501,300],
[106,201502,1000],
[220,201502,500],
[220,201503,1400]]
#step2: sum up revenue for each zipcode and month using groupby()
from itertools import groupby
testlist=[]
# key = zipcode + first 6 digits of the date (yearMonth)
for key, group in groupby(mock, lambda x: str(x[0])+ str(x[1])[0:6]):
    summation = sum([x[2] for x in group]) # monthly sum
    testlist.append([key, summation])
print testlist
#step 3: accumulate monthly summed revenue over month for each zipcode
# (current attempt: still accumulates from the first record to the last)
test2=list(zip(*testlist)[1])
print "test2:"
print test2
for key, group in groupby(mock, lambda x: str(x[0])):
    for index, value in enumerate(test2):
        temp=test2[:index+1]
        testlist[index].append(reduce(lambda a,b: a+b, temp))
print "another test2:"
print testlist

Author: luenchang (luen) 2015-03-25 18:37:00

I left the day out of the dates in the mock data; they should be 8 digits, e.g. 20150101. The yearMonth format in the desired output is correct. Sorry.

Author: ccwang002 (亮) 2015-03-26 01:20:00

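Your step 3 accumulates from the beginning every time because, even after grouping, you still compute temp=test2[:index+1], which re-sums from the start of the list. The group itself is never used.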
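
To illustrate the point about actually using the group, here is a minimal sketch rather than a drop-in fix. It assumes the rows are already sorted by zipcode and then by date (groupby only merges adjacent rows), and the function name cumulative_by_zipcode is made up for this example. It avoids accumulate(), so it should also run on Python 2.7.

```python
from itertools import groupby

# Mock rows from the question: [zipcode, yearMonth, revenue].
# Assumption: rows are sorted by zipcode, then by date.
mock = [[106, 201501, 100],
        [106, 201501, 200],
        [106, 201502, 300],
        [106, 201502, 400],
        [220, 201502, 200],
        [220, 201502, 300],
        [220, 201503, 400],
        [220, 201503, 500]]


def year_month(row):
    # Keep the first 6 digits so 8-digit dates (e.g. 20150101) also work.
    return int(str(row[1])[:6])


def cumulative_by_zipcode(rows):
    """Hypothetical helper: monthly sums accumulated within each zipcode."""
    result = []
    # Outer grouping: one group per zipcode.
    for zipcode, zip_rows in groupby(rows, key=lambda r: r[0]):
        running = 0  # the running total restarts for every zipcode
        # Inner grouping: one group per month inside this zipcode.
        for month, month_rows in groupby(zip_rows, key=year_month):
            running += sum(r[2] for r in month_rows)
            result.append([zipcode, month, running])
    return result


output = cumulative_by_zipcode(mock)
print(output)
```

The key difference from step 3 in the question is that the running total is built from the rows of each group and reset when the zipcode changes, instead of slicing the full list of monthly sums.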

Author: Yukirin (いい天気！) 2015-03-26 13:10:00

Why not use pandas?
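
For reference, a pandas version might look roughly like the sketch below. It assumes pandas is installed, and the column names (zipcode, yearMonth, revenue, cumulative) are made up for illustration; the thread does not specify any.

```python
import pandas as pd

# Mock rows from the question: [zipcode, yearMonth, revenue].
mock = [[106, 201501, 100],
        [106, 201501, 200],
        [106, 201502, 300],
        [106, 201502, 400],
        [220, 201502, 200],
        [220, 201502, 300],
        [220, 201503, 400],
        [220, 201503, 500]]

# Column names are assumptions made for this example.
df = pd.DataFrame(mock, columns=["zipcode", "yearMonth", "revenue"])

# Step 2: monthly revenue per zipcode.
monthly = df.groupby(["zipcode", "yearMonth"], as_index=False)["revenue"].sum()

# Step 3: running total of the monthly sums, restarted for each zipcode.
monthly["cumulative"] = monthly.groupby("zipcode")["revenue"].cumsum()

result = monthly[["zipcode", "yearMonth", "cumulative"]].values.tolist()
print(result)
```

Unlike itertools.groupby, the pandas groupby does not require the rows to be pre-sorted, which may matter if the real data arrives in arbitrary order.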
