Re: [Question] Regular expressions (regex) in R - cywhale - PTT Bulletin Board System

Author: cywhale (cywhale) 2016-04-30 23:51:48

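※ Quoting celestialgod (天):
: ※ Quoting cywhale (cywhale):
: : [Question type]:
: : Programming help (I want to do something in R, but I don't know how to write it in R)
: : If I want to keep only English letters at the start and end of a string, I write
: : gsub("^[^a-zA-Z]+|[^a-zA-Z]+$", "", x)
: : But if the string ends with "sp." or "spp.", I want to keep the "." so that the expression above does not strip it.
: : For example, "aaa bbb sp." should stay as it is,
: : but in other cases the "." should be stripped, e.g. "aaa bbb22." -> "aaa bbb"
: : I tried some syntax such as ?: and ?! but could not get it to work, so I am asking everyone here~~ Thanks~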
: str <- c("aaa bbb sp.", "aaa bbb sp2.")
: gsub("[^a-zA-Z]*([a-zA-Z. ]+).*", "\\1", str)
: ^ The space inside the character class [a-zA-Z. ] has to stay, or things will break XD
: # [1] "aaa bbb sp." "aaa bbb sp"
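: I forgot to ask: could there be cases like "aa2 bb3 cc." that should become "aa bb cc."?
: If so, I suggest using regmatches to extract "aa", "bb", and "cc." first, and then process them QQ
: Roughly like this (it may not cover every case):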
: str <- c("aaa bbb sp.", "aaa bbb sp2.", "aa2 bb3 cc.")
: sapply(regmatches(str, gregexpr("[a-zA-Z. ]+", str)), function(x){
:   paste0(x[x != "."], collapse = "")
: })
: # [1] "aaa bbb sp." "aaa bbb sp" "aa bb cc."
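To see why the quoted regmatches approach works, it can help to look at the intermediate pieces before they are pasted back together. This is only a small sketch of the quoted code; the names x, pieces, and cleaned are my own, and vapply is swapped in for sapply just to pin the return type.

```r
# Same inputs as in the quoted example.
x <- c("aaa bbb sp.", "aaa bbb sp2.", "aa2 bb3 cc.")

# gregexpr() finds every run of letters, dots, and spaces;
# regmatches() returns those runs as a list, one element per input string.
pieces <- regmatches(x, gregexpr("[a-zA-Z. ]+", x))

pieces[[2]]  # "aaa bbb sp" "."       (the digit 2 splits the run)
pieces[[3]]  # "aa" " bb" " cc."      (the space travels with the next word)

# Drop pieces that are a lone "." and glue the rest back together.
cleaned <- vapply(pieces,
                  function(p) paste0(p[p != "."], collapse = ""),
                  character(1))
cleaned
```

The lone "." in the second string is dropped because it became its own piece, while the "." in "cc." survives because it is attached to letters.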
From the previous post (thanks, celestialgod) I learned about "\\1" and got some ideas,
so I tried writing the following code.
The results are close to my target, which is to simplify some scientific names collected
from the web. Their formats were just a mess. ><
After these trials, I learned a lot about handling regex... ^_^
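# The pattern is one regex; paste0() only splits it so that no newline ends up inside it.
# Groups 2, 4, and 6 capture "sp.", "spp.", and the word before other trailing dots.
pat <- paste0("^[^a-zA-Z]+|(?!\\.)[^a-zA-Z]+$|",
              "\\b((sp\\.)+$)|\\b((spp\\.)+$)|((\\w{0,})\\.+$)")
gsub(pat, "\\2\\4\\6",
     c("33aaa sp.", "aaa sp.bb33", "aaasp.bb 33 de", "aaa w2sp.",
       "aaa www spp. ", "spp.", "bb.", "XXX sp. ",
       "YYY spp.()", "ZZZZ.."), perl = TRUE)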
[1] "aaa sp." "aaa sp.bb" "aaasp.bb 33 de" "aaa w2sp" "aaa www spp."
[6] "spp." "bb" "XXX sp." "YYY spp." "ZZZZ"
If you have any comments or find any bugs, just tell me! Thanks for the help~
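For comparison, here is a two-step variant that may be easier to read. It is only a sketch, not a drop-in replacement: clean_name is a hypothetical helper name, and I have only checked it against the ten test strings above, so other inputs (for example "aaa sp..") can give a different result from the one-pass regex.

```r
# Hypothetical helper: trim non-letters from both ends of a name,
# but keep a final "sp." or "spp." when it starts at a word boundary.
clean_name <- function(x) {
  # Step 1: drop leading non-letters.
  x <- sub("^[^a-zA-Z]+", "", x)
  # Step 2: drop trailing non-letters; the optional group protects
  # "sp." / "spp." and is written back through "\\1".
  sub("(\\bspp?\\.)?[^a-zA-Z]*$", "\\1", x, perl = TRUE)
}

tests <- c("33aaa sp.", "aaa sp.bb33", "aaasp.bb 33 de", "aaa w2sp.",
           "aaa www spp. ", "spp.", "bb.", "XXX sp. ",
           "YYY spp.()", "ZZZZ..")
result <- clean_name(tests)
result
```

Splitting the work into two sub() calls means only one capture group is needed, so the replacement is a single "\\1" instead of "\\2\\4\\6".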

Author: celestialgod (天) 2016-04-30 23:55:00

This regex is really ugly XDD

Author: cywhale (cywhale) 2016-05-01 00:01:00

haha.. really.. @@
