R - sample column without duplicates
I'm writing a custom function to achieve this, but I'm wondering whether there is a simple, built-in function in R that achieves the same goal.
I have data like:
stringvariable1 stringvariable2
string1         a
string1         b
string1         d
string2         e
string2         a
string3         b

and I want to shuffle the data in stringvariable2, but I don't want duplicates within any single value of stringvariable1.
So this wouldn't be acceptable (as 'b' is duplicated with respect to string1):
stringvariable1 stringvariable2
string1         b
string1         b
string1         d
string2         a
string2         e
string3         d

but this would:

stringvariable1 stringvariable2
string1         b
string1         e
string1         d
string2         a
string2         e
string3         d

So I'm trying to randomise stringvariable2 without replacement with respect to the different values of stringvariable1. Is creating a custom function the way to do this?
Thanks!
Are the values of stringvariable2 duplicated within groups of stringvariable1? If not, a group-wise permutation can be performed (d is the name of the data frame containing the data):
d$perm1 <- as.vector(unlist(tapply(d$stringvariable2, d$stringvariable1, sample)))

This uses tapply() to apply sampling without replacement (sample()) to stringvariable2 inside every group of stringvariable1. The resulting list is then converted to a vector using unlist() and as.vector(). The last function strips off the names of the observations inside the vector. The permuted values are stored in the column perm1 of the original data frame.
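One thing to watch with the tapply() approach: it returns the groups in the order of the factor levels, so assigning the unlisted result back to the data frame lines up correctly only if the rows are already sorted by stringvariable1. A minimal sketch of an alternative that avoids this, using base R's ave(), is shown here. The data frame d, the values in it, and the column name perm2 are illustrative assumptions, not something the original poster supplied.

```r
# Illustrative data; the values are assumed for the sake of the example
d <- data.frame(
  stringvariable1 = c("string1", "string1", "string1",
                      "string2", "string2", "string3"),
  stringvariable2 = c("a", "b", "d", "e", "a", "b"),
  stringsAsFactors = FALSE
)

set.seed(1)  # only so that the sketch is reproducible

# ave() applies FUN within each group of stringvariable1 and puts the
# results back in the original row positions, so no sorting is needed
d$perm2 <- ave(d$stringvariable2, d$stringvariable1, FUN = sample)

# Sanity check: anyDuplicated() returns 0 when a group has no repeats
dup_in_group <- tapply(d$perm2, d$stringvariable1, anyDuplicated)
stopifnot(all(dup_in_group == 0))
```

Like the tapply() version, this only shuffles values within each group, so it preserves the no-duplicates property only if the groups had no duplicates to begin with.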