Split rows into columns using the first delimiter

8

I have a data frame like this:

data.frame(text = c("separate1: and: more","another 20: 42"))

How can I split it using only the first : ? Example of the expected output:

data.frame(text1 = c("separate1","another 20"), text2 = c("and: more","42"))

r

— Natalie

1

Does this answer your question? Split data frame string column into multiple columns

— Claudiu Papasteri

2

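@ClaudiuPapasteri No. This is not strictly the same question, and the fact that the accepted solution there happens to work here is a coincidence. Suggesting it as a duplicate could be very misleading.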

— Sotos

@Sotos Yes, you are right. I wasn't paying attention. Two of the solutions there work, but only by coincidence. Sorry about that. I added my 2 cents to the solution pool as an apology for the wrong flag.

— Claudiu Papasteri

4

library(reshape2)

df <- data.frame(text = c("separate1: and: more","another 20: 42"))

colsplit(df$text, ":", c("text1", "text2"))

— Georgery

5

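In base R you can use regexpr to find the position of the first :, use that position to extract the substrings with substr and substring, and use trimws to remove the surrounding whitespace.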

x <- c("separate1: and: more","another 20: 42")

i <- regexpr(":", x)
data.frame(text1 = trimws(substr(x, 1, i-1)), text2 = trimws(substring(x, i+1)))
#       text1     text2
#1  separate1 and: more
#2 another 20        42
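
If some rows might contain no delimiter at all, regexpr returns -1 for them and the substr arithmetic above no longer does what you want. A minimal sketch of one way to guard against that follows; the helper name split_first and the choice to return NA for the missing second part are assumptions for illustration, not part of the approach above.

```r
# Hypothetical helper: split each string at the first occurrence of a
# fixed (non-regex) delimiter. Rows without the delimiter keep the whole
# string in text1 and get NA in text2 (an assumed convention).
split_first <- function(x, delim = ":") {
  # Position of the first match per element, -1 if there is none.
  # as.integer() drops the match.length attribute.
  i <- as.integer(regexpr(delim, x, fixed = TRUE))
  found <- i > 0
  left  <- ifelse(found, substr(x, 1, i - 1), x)
  right <- ifelse(found, substring(x, i + nchar(delim)), NA_character_)
  data.frame(text1 = trimws(left), text2 = trimws(right),
             stringsAsFactors = FALSE)
}

res <- split_first(c("separate1: and: more", "another 20: 42", "no delimiter"))
res
```

Using fixed = TRUE means the delimiter is matched literally, so characters such as . or | would not need escaping.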

— GKi

4

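You can use str_split_fixed from the stringr package. With n = 2 it splits only at the first delimiter and leaves the rest of the string intact (note that the leading space is kept in the second column):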

stringr::str_split_fixed(df$text, ':', 2)

#     [,1]         [,2]        
#[1,] "separate1"  " and: more"
#[2,] "another 20" " 42"

— Sotos

4

df <- data.frame(text = c("separate1: and: more","another 20: 42"))

df$text1 <- gsub(':.*', '', df$text)
df$text2 <- gsub('^[^:]+: ', '', df$text)

df
#                   text      text1     text2
# 1 separate1: and: more  separate1 and: more
# 2       another 20: 42 another 20        42
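
The two patterns above can also be folded into a single anchored pattern with two capture groups. The following is only a sketch of that variation; the variable names pattern and out are illustrative. Since the pattern is anchored with ^ and $, sub (one replacement per string) is sufficient.

```r
x <- c("separate1: and: more", "another 20: 42")

# Group 1: everything before the first colon.
# Group 2: everything after it, skipping any whitespace right after the colon.
pattern <- "^([^:]*):[[:space:]]*(.*)$"

out <- data.frame(text1 = sub(pattern, "\\1", x),
                  text2 = sub(pattern, "\\2", x),
                  stringsAsFactors = FALSE)
out
```

Strings that contain no colon do not match the pattern, so sub returns them unchanged in both columns.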

— IceCreamToucan

4

Using tidyr:

library(dplyr)
library(tidyr)

df %>% 
  separate(text, c("a", "b"), sep = ": ", extra = "merge")
#            a         b
# 1  separate1 and: more
# 2 another 20        42

— zx8754

3

Another base R solution:

df <- do.call(rbind,lapply(as.character(df$text), function(x) {
  k <- head(unlist(gregexpr(":",x)),1)
  data.frame(text1 = substr(x,1,k-1),
             text2 = substr(x,k+1,nchar(x)))
}))

such that

> df
       text1      text2
1  separate1  and: more
2 another 20         42

— ThomasIsCoding

2

Sorry, @Sotos is right. This is not a duplicate. Here is another base R solution that splits at the first occurrence of the delimiter.

df <- data.frame(text = c("separate1: and: more","another 20: 42"))

list <- apply(df, 1, function(x) regmatches(x, regexpr(":", x), invert = TRUE))
df <- data.frame(matrix(unlist(list), nrow = length(list), byrow = TRUE))

df
#>           X1         X2
#> 1  separate1  and: more
#> 2 another 20         42

Created on 2020-02-10 by the reprex package (v0.2.1)
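
Because regexpr and regmatches are vectorised, the same idea can probably be written without apply by working on the character vector directly. A minimal sketch, assuming every element contains at least one colon (elements without one would yield a length-1 piece and break the rbind step); the names parts and out are illustrative.

```r
x <- c("separate1: and: more", "another 20: 42")

# With invert = TRUE and a regexpr match, each list element holds the
# text before and the text after the first match.
parts <- regmatches(x, regexpr(":", x), invert = TRUE)

# Stack the pieces row-wise, name the columns, and trim stray whitespace.
out <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(out) <- c("text1", "text2")
out[] <- lapply(out, trimws)
out
```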

— Claudiu Papasteri

2

Poor old ?utils::strcapture never gets any respect:

strcapture("^(.+?):(.+$)", df$text, proto=list(text1="", text2=""))
#       text1      text2
#1  separate1  and: more
#2 another 20         42

To bind the result back onto the original data frame:

cbind(df, strcapture("^(.+?):(.+$)", df$text, proto=list(text1="", text2="")))
#                  text      text1      text2
#1 separate1: and: more  separate1  and: more
#2       another 20: 42 another 20         42

— thelatemail