How to implement random sampling of a set of vectors in Java?


I have a huge number of context vectors and want to find their average cosine similarity. However, it's not efficient to calculate it over the whole set, so I want to take a random sample from the set.

The problem is that each context vector expresses the degree of meaning of a word, so I want to make a balanced selection (according to the vector values). I searched and found that I can use the Monte Carlo method. I also found a Gibbs sampler example here: https://darrenjw.wordpress.com/2011/07/16/gibbs-sampler-in-various-languages-revisited/

However, I am a little confused. As I understand it, the method provides a normal distribution and generates double numbers. I did not understand how to implement this method in my case. Could someone explain how I can solve this problem?

Thanks in advance.

You don't want a random sample, you want a representative sample. One relatively efficient way is to sort the elements in "strength" order and then take every nth element, which gives you a representative sample of size/n elements.

Try this:

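    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;
    import java.util.Set;

    // Given Set<Vector> mySet;
    int reductionFactor = 200; // e.g. sample 0.5% of the elements

    List<Vector> list = new ArrayList<>(mySet);
    Collections.sort(list, new Comparator<Vector>() {
        @Override
        public int compare(Vector o1, Vector o2) {
            // Compare "strength"; strength(...) is a hypothetical helper you define
            return Double.compare(strength(o1), strength(o2));
        }
    });

    // Walk the sorted list and keep every reductionFactor-th element
    List<Vector> randomSample = new ArrayList<>(list.size() / reductionFactor);
    for (int i = 0; i < list.size(); i += reductionFactor) {
        randomSample.add(list.get(i));
    }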

The time complexity is O(n log n) due to the sort operation, and the space complexity is O(n).
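For what it's worth, here is a minimal self-contained sketch of the same idea carried through to the average cosine similarity the question asks about. It rests on assumptions that are not in the original post: vectors are stored as double[], "strength" is taken to be the Euclidean norm, and the class and method names (RepresentativeSample, strength, sample, cosine, averageCosine) are made up for illustration. Swap in whatever notion of strength fits your context vectors.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RepresentativeSample {

    // ASSUMPTION: "strength" is the Euclidean norm; the original post leaves it undefined.
    static double strength(double[] v) {
        double sumOfSquares = 0.0;
        for (double x : v) {
            sumOfSquares += x * x;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Sorts a copy of the input by strength and keeps every reductionFactor-th element.
    static List<double[]> sample(List<double[]> vectors, int reductionFactor) {
        if (reductionFactor < 1) {
            throw new IllegalArgumentException("reductionFactor must be >= 1");
        }
        List<double[]> sorted = new ArrayList<>(vectors);
        sorted.sort(Comparator.comparingDouble(RepresentativeSample::strength));
        List<double[]> picked = new ArrayList<>(sorted.size() / reductionFactor + 1);
        for (int i = 0; i < sorted.size(); i += reductionFactor) {
            picked.add(sorted.get(i));
        }
        return picked;
    }

    // Cosine similarity of two equal-length vectors; defined as 0 if either is all zeros.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
        }
        double norms = strength(a) * strength(b);
        return norms == 0.0 ? 0.0 : dot / norms;
    }

    // Mean cosine similarity over all unordered pairs; 0 if there are fewer than two vectors.
    static double averageCosine(List<double[]> vectors) {
        double total = 0.0;
        long pairs = 0;
        for (int i = 0; i < vectors.size(); i++) {
            for (int j = i + 1; j < vectors.size(); j++) {
                total += cosine(vectors.get(i), vectors.get(j));
                pairs++;
            }
        }
        return pairs == 0 ? 0.0 : total / pairs;
    }
}
```

You would call it as averageCosine(sample(allVectors, 200)). The pairwise average is quadratic in the sample size, so the reduction factor is what keeps it affordable; the sort itself is still O(n log n) over the full set.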

