r - Make the `drop` argument in `dcast` only look at the RHS of the formula


The drop argument in dcast (from "reshape2" or "data.table") can be useful when going from a "long" to a "wide" dataset and you want to create columns even for combinations that do not exist in the long form.

It turns out that using drop also affects combinations on the left-hand side (LHS) of the formula, not just the right-hand side (RHS). Thus, it also creates extra rows based on the combinations of LHS values.

Is there a way to override this behavior?


Here's some sample data:

    library(data.table)
    dt <- data.table(v1 = c(1.105, 1.105, 1.105, 2.012, 2.012, 2.012),
                     id = c(1L, 1L, 1L, 2L, 2L, 2L),
                     v2 = structure(c(2L, 3L, 5L, 1L, 2L, 6L),
                                    .Label = c("1", "2", "3", "4", "5", "6"),
                                    class = "factor"),
                     v3 = c(3L, 2L, 2L, 5L, 4L, 3L))

Notice that "v2" is a factor column with 6 levels. I essentially want to go from "long" to "wide", but add in columns for any missing factor levels (in this case "4").
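One way to confirm which levels are declared but never observed is to compare the factor's levels with the levels that remain after `droplevels()`. This is only a sketch in base R; it rebuilds the "v2" column as a standalone vector, and the name `missing_levels` is introduced here for illustration.

```r
# The "v2" column from the sample data, rebuilt as a standalone factor.
v2 <- factor(c("2", "3", "5", "1", "2", "6"), levels = as.character(1:6))

# Levels that are declared on the factor but never appear in the data.
missing_levels <- setdiff(levels(v2), levels(droplevels(v2)))
missing_levels
# [1] "4"
```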

reshape handles the shape, but not the missing columns:

    reshape(dt, direction = "wide", idvar = c("id", "v1"), timevar = "v2")
    #       v1 id v3.2 v3.3 v3.5 v3.1 v3.6
    # 1: 1.105  1    3    2    2   NA   NA
    # 2: 2.012  2    4   NA   NA    5    3

dcast handles adding the missing columns, but only if there's one value on the LHS:

    dcast(dt, id ~ v2, value.var = "v3", drop = FALSE)
    #    id  1 2  3  4  5  6
    # 1:  1 NA 3  2 NA  2 NA
    # 2:  2  5 4 NA NA NA  3

If there are multiple values on the LHS, the combinations of those values are also expanded out, as if we had used CJ or expand.grid, but rows 2 and 3 are not at all of interest to me:

    dcast(dt, ... ~ v2, value.var = "v3", drop = FALSE)
    #       v1 id  1  2  3  4  5  6
    # 1: 1.105  1 NA  3  2 NA  2 NA
    # 2: 1.105  2 NA NA NA NA NA NA
    # 3: 2.012  1 NA NA NA NA NA NA
    # 4: 2.012  2  5  4 NA NA NA  3
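To make the comparison with CJ concrete, the sketch below counts every pairing of the distinct LHS values against the pairings that actually occur in the long data. The names `all_pairs` and `observed_pairs` are introduced here for illustration, and the sample data is rebuilt (using `factor()` rather than `structure()`) so the block runs on its own.

```r
library(data.table)

# Sample data, rebuilt so this block is self-contained.
dt <- data.table(v1 = c(1.105, 1.105, 1.105, 2.012, 2.012, 2.012),
                 id = c(1L, 1L, 1L, 2L, 2L, 2L),
                 v2 = factor(c("2", "3", "5", "1", "2", "6"),
                             levels = as.character(1:6)),
                 v3 = c(3L, 2L, 2L, 5L, 4L, 3L))

# Every pairing of the distinct LHS values (the cross join).
all_pairs <- CJ(v1 = unique(dt$v1), id = unique(dt$id))

# Only the pairings that actually occur in the long data.
observed_pairs <- unique(dt[, .(v1, id)])

nrow(all_pairs)       # 4
nrow(observed_pairs)  # 2

# Anti-join: the pairings that exist only because of the expansion.
extra_pairs <- all_pairs[!observed_pairs, on = c("v1", "id")]
extra_pairs
```

The two rows in `extra_pairs` correspond to the all-NA rows in the cast output.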

This is similar to using xtabs in base R: ftable(xtabs(v3 ~ id + v1 + v2, dt)).


Is there a way to let dcast know that essentially, "Hey. The combinations of values on the LHS are the IDs. Don't try to fill them in for me."

My current approach is to do three steps: one collapsing down the LHS values, another spreading out the RHS values, and then one merging the result.

    merge(dt[, list(v1 = unique(v1)), .(id)],  ## or unique(dt[, c("id", "v1"), with = FALSE])
          dcast(dt, id ~ v2, value.var = "v3", drop = FALSE),
          by = "id")[]
    #    id    v1  1 2  3  4  5  6
    # 1:  1 1.105 NA 3  2 NA  2 NA
    # 2:  2 2.012  5 4 NA NA NA  3

Is there a better approach that I'm missing?

This was just implemented in the data.table development version v1.9.7, commit 2113, which closes #1512.

    require(data.table) # v1.9.7, commit 2113+
    dcast(dt, ... ~ v2, value.var = "v3", drop = c(TRUE, FALSE))
    #       v1 id  1 2  3  4  5  6
    # 1: 1.105  1 NA 3  2 NA  2 NA
    # 2: 2.012  2  5 4 NA NA NA  3
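As I read the data.table documentation for `dcast`, a length-two `drop` is applied per side of the formula: the first element to the LHS and the second to the RHS. The sketch below assumes an installed data.table that includes this change; it rebuilds the sample data and checks that only the RHS is expanded. The name `wide` is introduced here for illustration.

```r
library(data.table)  # assumes a version that includes the change above

# Sample data, rebuilt so this block is self-contained.
dt <- data.table(v1 = c(1.105, 1.105, 1.105, 2.012, 2.012, 2.012),
                 id = c(1L, 1L, 1L, 2L, 2L, 2L),
                 v2 = factor(c("2", "3", "5", "1", "2", "6"),
                             levels = as.character(1:6)),
                 v3 = c(3L, 2L, 2L, 5L, 4L, 3L))

# drop = c(TRUE, FALSE): keep only observed LHS combinations,
# but create a column for every RHS factor level.
wide <- dcast(dt, ... ~ v2, value.var = "v3", drop = c(TRUE, FALSE))

nrow(wide)            # one row per observed (v1, id) pair
"4" %in% names(wide)  # the unobserved level still gets a column
```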
