syntactic sugar - Julia DataFrame multiple values filtering
There are 2 ways to filter a DataFrame in the case below:

1. df = df[((df[:field].==1) | (df[:field].==2)), :]
2. df = df[[in(v, [1, 2]) for v in df[:field]], :]

The second approach is slower, but it is suitable for a mutable set of values in the condition. Is there any syntactic sugar I missed that is as fast as the 1st way but uses an in-like construction?
    julia> using DataFrames

The findin function is one way to do this task:

    julia> function t_findin(df::DataFrames.DataFrame)
               df[findin(df[:a], [1, 2]), :]
           end
    t_findin (generic function with 1 method)

Array comprehension:

    julia> function t_compr(df::DataFrames.DataFrame)
               df[[in(v, [1, 2]) for v in df[:a]], :]
           end
    t_compr (generic function with 1 method)

Multiple conditions:

    julia> function t_mconds(df::DataFrames.DataFrame)
               df[((df[:a].==1) | (df[:a].==2)), :]
           end
    t_mconds (generic function with 1 method)

Test data:
    julia> df = DataFrame();  # empty frame to hold the test columns
    julia> df[:b] = rand(1:30, 10_000_000);
    julia> df[:a] = rand(1:30, 10_000_000);

Test results:
    julia> @time t_findin(df);
      0.489064 seconds (67 allocations: 19.340 MB, 0.49% gc time)

    julia> @time t_mconds(df);
      0.222389 seconds (106 allocations: 78.933 MB, 5.98% gc time)

    julia> @time t_compr(df);
     23.634846 seconds (100.00 M allocations: 2.563 GB, 1.47% gc time)
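On Julia 1.0 and later, findin no longer exists (it was replaced by findall(in(b), a)), and an in-like test is usually written with broadcasting. Below is a minimal sketch of that construction using only Base, with a plain vector standing in for the column df[:a]. The names col, wanted, mask, idx and selected are illustrative and do not come from the original post.

```julia
# Stand-in for the DataFrame column; a plain Vector keeps the sketch Base-only.
col = [1, 5, 2, 30, 1, 7, 2]

# The set of accepted values can be built or changed at run time.
wanted = Set([1, 2])

# Broadcast `in` over the column. `Ref` makes `wanted` behave as a scalar,
# so each element of `col` is tested against the whole set.
mask = in.(col, Ref(wanted))

# Equivalent index form, the Julia >= 1.0 replacement for findin(col, [1, 2]).
idx = findall(in(wanted), col)

# Both forms select the same elements as the explicit
# (col .== 1) .| (col .== 2) mask.
selected = col[mask]
```

With a DataFrame, the same mask would presumably be used as a row selector, for example df[in.(df.a, Ref(wanted)), :]. Timings on current versions would need to be measured again rather than inferred from the numbers above.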