R - Make the `drop` argument in `dcast` only look at the RHS of the formula
The `drop` argument in `dcast` (from "reshape2" or "data.table") can be useful when going from a "long" to a "wide" dataset and you want to create columns even for combinations that do not exist in the long form.
It turns out that using `drop` affects combinations on the left-hand side (LHS) of the formula as well as the right-hand side (RHS). Thus, it also creates extra rows based on the combinations of LHS values.
Is there a way to override this behavior?
Here's some sample data:

    library(data.table)
    dt <- data.table(v1 = c(1.105, 1.105, 1.105, 2.012, 2.012, 2.012),
                     id = c(1L, 1L, 1L, 2L, 2L, 2L),
                     v2 = structure(c(2L, 3L, 5L, 1L, 2L, 6L),
                                    .Label = c("1", "2", "3", "4", "5", "6"),
                                    class = "factor"),
                     v3 = c(3L, 2L, 2L, 5L, 4L, 3L))
Notice that "v2" is a factor column with 6 levels. I want to go from "long" to "wide", but add in columns for any missing factor levels (in this case "4").
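As a quick check of which levels are declared on the factor but never observed, something like the following sketch can be used. The variable name `missing_levels` is only illustrative, and the data is rebuilt with `factor()` so that the block runs on its own.

```r
library(data.table)

# Same sample data as above, with the factor built via factor()
dt <- data.table(
  v1 = c(1.105, 1.105, 1.105, 2.012, 2.012, 2.012),
  id = c(1L, 1L, 1L, 2L, 2L, 2L),
  v2 = factor(c("2", "3", "5", "1", "2", "6"), levels = as.character(1:6)),
  v3 = c(3L, 2L, 2L, 5L, 4L, 3L)
)

# Levels that exist on the factor but have no rows in the long data
missing_levels <- setdiff(levels(dt$v2), as.character(unique(dt$v2)))
missing_levels
```

These are the levels that need a column of `NA` values in the wide result.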
`reshape` handles the shape, but not the missing columns:

    reshape(dt, direction = "wide", idvar = c("id", "v1"), timevar = "v2")
    #       v1 id v3.2 v3.3 v3.5 v3.1 v3.6
    # 1: 1.105  1    3    2    2   NA   NA
    # 2: 2.012  2    4   NA   NA    5    3
`dcast` handles adding the missing columns, but only if there's one value on the LHS:

    dcast(dt, id ~ v2, value.var = "v3", drop = FALSE)
    #    id  1  2  3  4  5  6
    # 1:  1 NA  3  2 NA  2 NA
    # 2:  2  5  4 NA NA NA  3
However, if there are multiple values on the LHS, the combinations of the values on the LHS are also expanded out, as if we had used `CJ` or `expand.grid`, but rows 2 and 3 are not at all of interest to me:

    dcast(dt, ... ~ v2, value.var = "v3", drop = FALSE)
    #       v1 id  1  2  3  4  5  6
    # 1: 1.105  1 NA  3  2 NA  2 NA
    # 2: 1.105  2 NA NA NA NA NA NA
    # 3: 2.012  1 NA NA NA NA NA NA
    # 4: 2.012  2  5  4 NA NA NA  3
This is similar to using `xtabs` in base R: `ftable(xtabs(v3 ~ id + v1 + v2, dt))`.
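To see where the extra rows come from, the LHS expansion can be reproduced by hand. This is only a sketch of the effect, not of how `dcast` works internally; the names `lhs_grid`, `lhs_seen`, and `unseen` are illustrative.

```r
library(data.table)

dt <- data.table(
  v1 = c(1.105, 1.105, 1.105, 2.012, 2.012, 2.012),
  id = c(1L, 1L, 1L, 2L, 2L, 2L),
  v2 = factor(c("2", "3", "5", "1", "2", "6"), levels = as.character(1:6)),
  v3 = c(3L, 2L, 2L, 5L, 4L, 3L)
)

# Every combination of the unique LHS values (the cross join)
lhs_grid <- CJ(v1 = unique(dt$v1), id = unique(dt$id))

# The LHS combinations that actually occur in the data
lhs_seen <- unique(dt[, .(v1, id)])

# Anti-join: grid combinations with no matching rows in the data
unseen <- lhs_grid[!lhs_seen, on = c("v1", "id")]

nrow(lhs_grid)
nrow(lhs_seen)
unseen
```

The combinations in `unseen` are the ones that show up as all-`NA` rows when `drop = FALSE` is applied to both sides of the formula.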
Is there a way to let `dcast` know that essentially, "Hey. The combination of values on the LHS are the IDs. Don't try to fill them in for me."
My current approach is to do three steps: one for collapsing down the LHS values, another for spreading out the RHS values, and then one for merging the result.

    merge(dt[, list(v1 = unique(v1)), .(id)],  ## or unique(dt[, c("id", "v1"), with = FALSE])
          dcast(dt, id ~ v2, value.var = "v3", drop = FALSE),
          by = "id")[]
    #    id    v1  1  2  3  4  5  6
    # 1:  1 1.105 NA  3  2 NA  2 NA
    # 2:  2 2.012  5  4 NA NA NA  3
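If the three steps need to be repeated, they could be wrapped in a small helper along these lines. The function `dcast_keep_lhs` and its argument names are hypothetical and not part of data.table; the sketch assumes a single key column and that the extra LHS columns are constant within each key value.

```r
library(data.table)

dt <- data.table(
  v1 = c(1.105, 1.105, 1.105, 2.012, 2.012, 2.012),
  id = c(1L, 1L, 1L, 2L, 2L, 2L),
  v2 = factor(c("2", "3", "5", "1", "2", "6"), levels = as.character(1:6)),
  v3 = c(3L, 2L, 2L, 5L, 4L, 3L)
)

# Hypothetical helper: cast on one key column with drop = FALSE, then
# merge the remaining LHS columns back in from the observed combinations.
dcast_keep_lhs <- function(x, key, extra, rhs, value.var) {
  # Step 1: collapse the LHS down to the combinations present in the data
  lhs <- unique(x[, c(key, extra), with = FALSE])
  # Step 2: spread the RHS, keeping columns for unused factor levels
  wide <- dcast(x, as.formula(paste(key, "~", rhs)),
                value.var = value.var, drop = FALSE)
  # Step 3: merge the two pieces on the key
  merge(lhs, wide, by = key)
}

res <- dcast_keep_lhs(dt, key = "id", extra = "v1", rhs = "v2", value.var = "v3")
res
```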
Is there a better approach that I'm missing?
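This has just been implemented in the data.table development version v1.9.7, commit 2113, which closes #1512.

    require(data.table) # v1.9.7, commit 2113+
    dcast(dt, ... ~ v2, value.var = "v3", drop = c(TRUE, FALSE))
    #       v1 id  1  2  3  4  5  6
    # 1: 1.105  1 NA  3  2 NA  2 NA
    # 2: 2.012  2  5  4 NA NA NA  3

Here `drop` takes a length-2 logical vector: the first element applies to the LHS of the formula and the second to the RHS, so `c(TRUE, FALSE)` drops missing LHS combinations while keeping a column for every RHS level.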
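On a data.table version that predates this change, a different workaround is to cast with the default `drop` and then add the missing level columns afterwards. This is a sketch of that alternative technique, not the method from the commit above; `wide` and `missing_cols` are illustrative names, and it assumes the value column is integer.

```r
library(data.table)

dt <- data.table(
  v1 = c(1.105, 1.105, 1.105, 2.012, 2.012, 2.012),
  id = c(1L, 1L, 1L, 2L, 2L, 2L),
  v2 = factor(c("2", "3", "5", "1", "2", "6"), levels = as.character(1:6)),
  v3 = c(3L, 2L, 2L, 5L, 4L, 3L)
)

# Default drop: only observed LHS rows and observed RHS columns are created
wide <- dcast(dt, ... ~ v2, value.var = "v3")

# Add an all-NA column for each factor level that did not get a column
missing_cols <- setdiff(levels(dt$v2), names(wide))
if (length(missing_cols) > 0L) {
  wide[, (missing_cols) := NA_integer_]
}

# Put the level columns back into factor-level order
setcolorder(wide, c("v1", "id", levels(dt$v2)))
wide
```

This avoids the merge step, at the cost of handling the column order manually.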