Re: [Question] How to tidy amount-and-location data such as: 1胃, 2腸 (胃 = stomach, 腸 = intestine)

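Author: celestialgod (天)   2015-07-10 15:21:28
※ Quoting helixc (@_2;):
: [Software familiarity]: beginner / entry level
: [Problem description]:
: I have dissection data for a frog species and want to analyze its diet.
: The records look like this: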
: ID,Food A,Food B,Food C,Food E
: C146,,,,3腸
: B287,,,,10腸
: C140,,,,4腸
: C133,,,1腸,
: C132,1腸,,,
: B305,,,1腸,
: C112,,2腸,,1腸
: C120,,,,1腸
: C128,,,,1腸
: I would like to reshape it into data like this:
: ID, Food type, Amount, Location
: C146, E, 3, 腸
: B287, E, 10, 腸
: C140, E, 4, 腸
: C133, C, 1, 腸
library(data.table)
library(dplyr)
library(tidyr)
library(magrittr)
library(stringr)
tmp_dt = fread("ID,Food A,Food B,Food C,Food E
C146,,,,3腸
B287,,,,10腸
C140,,,,4腸
C133,,,1腸,
C132,1腸,,,
B305,,,1腸,
C112,,2腸,,1腸
C120,,,,1腸
C128,,,,1腸", colClasses = rep("character", 5))  # class name must be lowercase "character"
## method 1: pull out the digits and the last character with stringr
output_dt = tmp_dt %>% gather(foodType, tmpCol, -ID) %>%  # wide -> long
  filter(tmpCol != "") %>%                                # drop empty cells
  mutate(Amount = str_extract(tmpCol, "\\d+"),            # leading digits
         Location = str_sub(tmpCol, -1)) %>%              # last character
  select(-tmpCol) %>%
  transform(foodType = as.character(foodType)) %>%
  transform(foodType = str_sub(foodType, -1))             # "Food A" -> "A"
## method 2: insert a comma between the digits and the location, then split on it
output_dt2 = tmp_dt %>% gather(foodType, tmpCol, -ID) %>%
  filter(tmpCol != "") %>%
  transform(foodType = as.character(foodType),
            tmpCol = sub("(\\d+)(.)", "\\1,\\2", tmpCol)) %>%  # "10腸" -> "10,腸"
  separate(tmpCol, c("Amount", "Location"), sep = ",") %>%
  transform(foodType = str_sub(foodType, -1))                  # "Food A" -> "A"
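## method 3 (no sub needed; separate's sep argument can instead split by position)
## NOTE: how a negative sep is counted has differed between tidyr releases.
## If Amount/Location do not match the output below, try sep = -1.
output_dt3 = tmp_dt %>% gather(foodType, tmpCol, -ID) %>%
  filter(tmpCol != "") %>%
  transform(foodType = as.character(foodType)) %>%
  separate(tmpCol, c("Amount", "Location"), -2) %>%
  transform(foodType = str_sub(foodType, -1))   # "Food A" -> "A"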
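A side note on the positional split in method 3: all it does is cut each cell just before its last character. Here is a minimal base-R sketch of that cut, which does not depend on how a particular tidyr version counts negative sep values. The vector x and the names amount/location are made up for illustration, "\u8178" is just 腸 written as a Unicode escape, and the sketch assumes the location is always exactly one character.

```r
# Hypothetical sample cells: "3腸" and "10腸" ("\u8178" is 腸).
x <- c("3\u8178", "10\u8178")

n <- nchar(x)                     # number of characters in each cell
amount   <- substr(x, 1, n - 1)   # everything except the last character
location <- substr(x, n, n)       # the last character only

print(amount)
print(location)
```

If a location could ever be longer than one character, a regex split (as in methods 1 and 2) is probably the safer choice.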
output: (identical for all three methods)
# ID foodType Amount Location
# 1: C132 A 1 腸
# 2: C112 B 2 腸
# 3: C133 C 1 腸
# 4: B305 C 1 腸
# 5: C146 E 3 腸
# 6: B287 E 10 腸
# 7: C140 E 4 腸
# 8: C112 E 1 腸
# 9: C120 E 1 腸
# 10: C128 E 1 腸
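For anyone who would rather not load the extra packages, below is a rough base-R sketch that should give the same table for this sample. It assumes a UTF-8 session, one entry per cell, and that every non-empty cell is digits followed by the location. The names raw, wide, long and cell are mine and not from the code above. Amount is also converted to integer here, whereas the three methods above leave it as character.

```r
# Same sample data as in the post.
raw <- "ID,Food A,Food B,Food C,Food E
C146,,,,3腸
B287,,,,10腸
C140,,,,4腸
C133,,,1腸,
C132,1腸,,,
B305,,,1腸,
C112,,2腸,,1腸
C120,,,,1腸
C128,,,,1腸"

# Keep every column as character; check.names = FALSE keeps "Food A" as is.
wide <- read.csv(text = raw, colClasses = "character", check.names = FALSE)

food_cols <- setdiff(names(wide), "ID")

# Stack the food columns: one row per (ID, food column) pair.
long <- data.frame(
  ID       = rep(wide$ID, times = length(food_cols)),
  foodType = rep(sub("^Food ", "", food_cols), each = nrow(wide)),
  cell     = unlist(wide[food_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
long <- long[long$cell != "", ]   # drop empty cells

# Leading digits are the amount; whatever follows is the location.
long$Amount   <- as.integer(sub("^([0-9]+).*$", "\\1", long$cell))
long$Location <- sub("^[0-9]+", "", long$cell)
long$cell     <- NULL
rownames(long) <- NULL

print(long)
```

Unlike the last-character approach, stripping the leading digits keeps the whole remainder as Location, so it would also cope with a location longer than one character.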
Author: helixc (@_2;)   2015-07-10 20:16:00
So many new functions to learn, thanks!