Re: [Question] Summing specific columns

Author: celestialgod (天)   2017-06-28 19:10:54
※ Quoting directly (天使的圓舞曲):
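: [Question type]:
: Programming help (I want to do something in R but don't know how to write it)
: [Software familiarity]:
: Beginner (I have written other languages but am unfamiliar with the syntax)
: [Problem description]:
: I want to sum specific columns (specific variables) but don't know how to do it
: [Code example]:
: Data:
: id A1 B1 B2 A2 C1 C2 C3
: 1 1 2 0 4 5 6 7
: 2 5 6 8 8 1 3 2
: What I would like is:
: A1+A2 forming a new variable, as a new column called A,
: B1+B2 forming a new variable, as a new column called B,
: C1+C2+C3 forming a new variable, as a new column called C,
: so that I get:
: id A B C
: 1 5 2 18
: 2 13 14 6
: I tried aggregate and colSums but could not get it to work.
: Could someone offer some guidance?
: Thanks!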
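Here are seven approaches in total; pick whichever suits you XD
I haven't managed to get an rlang + dplyr version working yet Orz, I really don't understand this new stuff very well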
library(data.table)
DT <- fread("id A1 B1 B2 A2 C1 C2 C3
1 1 2 0 4 5 6 7
2 5 6 8 8 1 3 2")
DF <- copy(DT)
setDF(DF)
# base, transform + subset
subset(transform(DF, A = A1 + A2, B = B1 + B2, C = C1 + C2 + C3),
       select = c(id, A, B, C))
# id A B C
# 1 1 5 2 18
# 2 2 13 14 6
# base, eval
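res <- sapply(c("A", "B", "C"), function(ch){
  # build a string such as "A1+A2" from the columns starting with ch
  expr <- paste0(names(DF)[grepl(paste0("^", ch), names(DF))], collapse = "+")
  # evaluate that string using the columns of DF
  eval(parse(text = expr), DF, parent.frame())
})
cbind(id = DF$id, as.data.frame(res))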
# id A B C
# 1 1 5 2 18
# 2 2 13 14 6
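If you would rather avoid eval(parse()) in base R, the same prefix matching can be combined with rowSums. This is only a sketch, not one of the seven methods above: the names DF2, prefixes, sums and res are made up for illustration, and it assumes that every column to be summed starts with its group letter and that the data has at least two rows (with a single row, vapply returns a plain vector instead of a matrix).

```r
# same data as in the question, built by hand so the block is self-contained
DF2 <- data.frame(id = 1:2,
                  A1 = c(1, 5), B1 = c(2, 6), B2 = c(0, 8), A2 = c(4, 8),
                  C1 = c(5, 1), C2 = c(6, 3), C3 = c(7, 2))

prefixes <- c("A", "B", "C")  # hypothetical: one letter per output column

# for each prefix, pick the matching columns and sum them row by row
sums <- vapply(prefixes, function(p) {
  rowSums(DF2[startsWith(names(DF2), p)])
}, FUN.VALUE = numeric(nrow(DF2)))

# sums is a matrix with columns A, B, C; attach the id column
res <- data.frame(id = DF2$id, sums)
res
```

Since the column list is derived from the names, adding an A3 column later needs no change to the code.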
# data.table, method 1
DT[ , .(A = A1 + A2, B = B1 + B2, C = C1 + C2 + C3) , by = .(id)]
# id A B C
# 1: 1 5 2 18
# 2: 2 13 14 6
# data.table, method 2
exprs <- lapply(c("A", "B", "C"), function(ch){
  parse(text = paste0(names(DT)[grepl(paste0("^", ch), names(DT))],
                      collapse = "+"))
})
names(exprs) <- c("A", "B", "C")
# evaluate within .SD (the rows of the current group), not the whole DT
DT[ , lapply(exprs, eval, envir = .SD) , by = .(id)]
# id A B C
# 1: 1 5 2 18
# 2: 2 13 14 6
# data.table, method 3 (modifies DT in place, by reference)
for (ch in c("A", "B", "C")) {
  nn <- names(DT)[grepl(paste0("^", ch), names(DT))]
  # add the summed column, then drop the source columns
  DT[ , (ch) := Reduce(`+`, .SD), .SDcols = nn]
  DT[ , (nn) := NULL]
}
DT
# id A B C
# 1: 1 5 2 18
# 2: 2 13 14 6
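Another data.table route is to reshape to long format, derive the group from the column name, and cast back with sum as the aggregate. This is a rough sketch rather than one of the seven methods: DT2, long, grp and res are names invented here, and taking the first character of the column name as the group is an assumption that happens to fit this data.

```r
library(data.table)

# fresh copy of the question's data, since method 3 above alters DT in place
DT2 <- fread("id A1 B1 B2 A2 C1 C2 C3
1 1 2 0 4 5 6 7
2 5 6 8 8 1 3 2")

# long format: one row per (id, original column)
long <- melt(DT2, id.vars = "id", variable.factor = FALSE)

# assumption: the first letter of the column name identifies the group
long[ , grp := substr(variable, 1, 1)]

# back to wide format, summing the values within each id and group
res <- dcast(long, id ~ grp, value.var = "value", fun.aggregate = sum)
res
```

This leaves DT2 untouched, at the cost of an intermediate long table.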
library(dplyr)
# dplyr, method 1
DF %>% group_by(id) %>%
  summarise(A = A1 + A2, B = B1 + B2, C = C1 + C2 + C3)
# # A tibble: 2 x 4
# id A B C
# <int> <int> <int> <int>
# 1 1 5 2 18
# 2 2 13 14 6
# dplyr, method 2
# note: summarise_() has been deprecated since dplyr 0.7.0 in favor of tidy evaluation
exprs <- lapply(c("A", "B", "C"), function(ch){
  paste0(names(DF)[grepl(paste0("^", ch), names(DF))], collapse = "+")
})
names(exprs) <- c("A", "B", "C")
DF %>% group_by(id) %>% summarise_(.dots = exprs)
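For the rlang + dplyr route mentioned at the top, one possible sketch is to parse each string with rlang::parse_expr() and splice the resulting named list into summarise() with !!!. I have not checked this against every dplyr version, so treat it as an assumption that a reasonably recent dplyr and rlang are installed; DF3, prefixes, tidy_exprs and res are names invented for this example.

```r
library(dplyr)
library(rlang)

# same data as in the question, built by hand so the block is self-contained
DF3 <- data.frame(id = 1:2,
                  A1 = c(1L, 5L), B1 = c(2L, 6L), B2 = c(0L, 8L),
                  A2 = c(4L, 8L), C1 = c(5L, 1L), C2 = c(6L, 3L),
                  C3 = c(7L, 2L))

prefixes <- c("A", "B", "C")

# turn strings such as "A1 + A2" into unevaluated expressions
tidy_exprs <- lapply(prefixes, function(ch) {
  parse_expr(paste0(names(DF3)[startsWith(names(DF3), ch)], collapse = " + "))
})
names(tidy_exprs) <- prefixes

# !!! splices the named list in as if A = A1 + A2, B = ..., C = ... were typed
res <- DF3 %>% group_by(id) %>% summarise(!!!tidy_exprs)
res
```

The list names become the output column names, which is why tidy_exprs is named before splicing.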
