I have data in long format:
library(tidyverse)
n <- 3
set.seed(1)
d <- tibble(y  = sample(0:1, n, TRUE),
            x1 = sample(1:5, n, TRUE),
            x2 = sample(1:5, n, TRUE),
            x3 = sample(1:5, n, TRUE))
d_long <- d |> pivot_longer(!y)
# add a new row with the new variable x4
d_long_new <- d_long |> bind_rows(tibble(y = 1, name = "x4", value = 10))
d_long_new
# A tibble: 10 x 3
       y name  value
   <dbl> <chr> <dbl>
 1     0 x1        1
 2     0 x2        3
 3     0 x3        3
 4     1 x1        2
 5     1 x2        2
 6     1 x3        1
 7     0 x1        5
 8     0 x2        3
 9     0 x3        5
10     1 x4       10
How can I convert the data to wide format to get this result?
# A tibble: 4 x 5
      y    x1    x2    x3    x4
  <dbl> <int> <int> <int> <dbl>
1     0     1     3     3    NA
2     1     2     2     1    NA
3     0     5     3     5    NA
4     1    NA    NA    NA    10
When I try to apply pivot_wider(), I get something different from what I expected:
d_long_new |> pivot_wider(names_from = name, values_from = value)
# A tibble: 2 x 5
      y x1        x2        x3        x4
  <dbl> <list>    <list>    <list>    <list>
1     0 <dbl [2]> <dbl [2]> <dbl [2]> <NULL>
2     1 <dbl [1]> <dbl [1]> <dbl [1]> <dbl [1]>
You need to add another column that identifies the unique rows:
d_long_new |>
  mutate(seq = row_number(), .by = name) |>
  pivot_wider(names_from = name, values_from = value) # |> select(-seq)
# A tibble: 4 x 6
      y   seq    x1    x2    x3    x4
  <dbl> <int> <dbl> <dbl> <dbl> <dbl>
1     0     1     1     3     3    NA
2     1     2     2     2     1    NA
3     0     3     5     3     5    NA
4     1     1    NA    NA    NA    10
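To see why the extra column is needed, it may help to count how often each (y, name) pair occurs before pivoting. The sketch below is only an illustration: it rebuilds d_long_new by hand from the output printed in the question (so the values are copied, not regenerated from the seed), and the names dupes and d_wide are made up for this example.

```r
library(dplyr)
library(tidyr)

# Long data copied by hand from the printed output in the question.
d_long_new <- tibble(
  y     = c(0, 0, 0, 1, 1, 1, 0, 0, 0, 1),
  name  = c("x1", "x2", "x3", "x1", "x2", "x3", "x1", "x2", "x3", "x4"),
  value = c(1, 3, 3, 2, 2, 1, 5, 3, 5, 10)
)

# (y, name) pairs that occur more than once: pivot_wider() cannot put two
# values into one cell, so it falls back to list-columns for them.
dupes <- d_long_new |>
  count(y, name) |>
  filter(n > 1)

# Number the occurrences of each name, pivot, then drop the helper column.
d_wide <- d_long_new |>
  mutate(seq = row_number(), .by = name) |>
  pivot_wider(names_from = name, values_from = value) |>
  select(-seq)
```

Here dupes lists the pairs with y = 0, each of which appears twice, and d_wide has one row per original observation plus the row for x4. Note that .by requires dplyr 1.1.0 or later.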
You can use the data.table package as follows:
> library(data.table)
> dcast(setDT(d_long_new)[, rid := rleid(y)], rid + y ~ name)[, -1]
       y    x1    x2    x3    x4
   <num> <num> <num> <num> <num>
1:     0     1     3     3    NA
2:     1     2     2     1    NA
3:     0     5     3     5    NA
4:     1    NA    NA    NA    10
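The rleid(y) key works here because y changes between every pair of neighbouring observations; if two adjacent observations shared the same y, they would fall into the same run and be merged. For comparison, a rough dplyr counterpart of the same run-based idea is sketched below. It is not part of the answer above: it assumes dplyr 1.1.0 or later for consecutive_id(), rebuilds d_long_new by hand from the printed output, and uses rid and d_wide as illustrative names.

```r
library(dplyr)
library(tidyr)

# Long data copied by hand from the printed output in the question.
d_long_new <- tibble(
  y     = c(0, 0, 0, 1, 1, 1, 0, 0, 0, 1),
  name  = c("x1", "x2", "x3", "x1", "x2", "x3", "x1", "x2", "x3", "x4"),
  value = c(1, 3, 3, 2, 2, 1, 5, 3, 5, 10)
)

# consecutive_id(y) gives a new id every time y changes from one row to the
# next, which plays the same role as data.table::rleid(y).
d_wide <- d_long_new |>
  mutate(rid = consecutive_id(y)) |>
  pivot_wider(names_from = name, values_from = value) |>
  select(-rid)
```

With this data the runs of y are 0,0,0 / 1,1,1 / 0,0,0 / 1, so rid takes four values and d_wide has four rows.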