gpu - OpenACC: how to profile -


Hi, I'm using the CAPS OpenACC compiler, and something strange happens when I try to get preliminary profiling results.

At first, I ran the code with HMPPRT_LOG_LEVEL="info" set, which generates a log with time stamps:

[     2.612337] ( 0) info : upload   edgelengths[0:129600] (element_size=8, queue=none, location=gravity_openacc.c:50)
[     2.613485] ( 0) info : call     __hmpp_acc_region__2ha750yb (queue=none, location=gravity_openacc.c:50)
[     2.614367] ( 0) info : free     edgelengths[0:129600] (element_size=8, queue=none, location=gravity_openacc.c:50)

So I guess the kernel execution time can be calculated as 2.614367 - 2.613485 = 0.000882 s.

But when I set CUDA_PROFILE=1, the profile below is shown:

method=[ __hmpp_acc_region__2ha750yb_parallel_region_1 ] gputime=[ 492.480 ] cputime=[ 13.000 ] occupancy=[ 0.250 ]  

So I'm quite confused by these two results. Which one is correct?

Does anyone have an explanation?

Thanks!

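The CUDA profiler shows the time it takes to execute the CUDA kernel itself, while the log obtained with HMPPRT_LOG_LEVEL="info" gives the overall time it takes to execute the region. These are not the same thing, because the region may also include code executed on the host, for example. Note also that the CUDA profiler reports gputime in microseconds, so the kernel took about 0.000492 s of the 0.000882 s measured for the region.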
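To make the comparison concrete, here is a minimal sketch that puts both measurements in the same unit. It assumes the log format is exactly as shown in the question and that gputime is in microseconds. The helper names `parse_events` and `call_to_next_event_us` are made up for illustration and are not part of any CAPS or CUDA tool.

```python
import re

# Matches the "[   seconds] ( n) info : event ..." prefix of a runtime log line
# as it appears in the question (the format is assumed from that excerpt only).
LOG_LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s*\(\s*\d+\)\s*info\s*:\s*(\w+)")


def parse_events(log_text):
    """Return (timestamp_seconds, event_name) pairs in log order."""
    return [(float(t), name) for t, name in LOG_LINE.findall(log_text)]


def call_to_next_event_us(events):
    """Microseconds between the first 'call' line and the line that follows it."""
    for (t0, name), (t1, _) in zip(events, events[1:]):
        if name == "call":
            return (t1 - t0) * 1e6
    raise ValueError("no 'call' event followed by another event")


log = """\
[     2.612337] ( 0) info : upload   edgelengths[0:129600] (element_size=8, queue=none, location=gravity_openacc.c:50)
[     2.613485] ( 0) info : call     __hmpp_acc_region__2ha750yb (queue=none, location=gravity_openacc.c:50)
[     2.614367] ( 0) info : free     edgelengths[0:129600] (element_size=8, queue=none, location=gravity_openacc.c:50)
"""

region_us = call_to_next_event_us(parse_events(log))
gputime_us = 492.480  # from the profiler output; assumed to be in microseconds

print(f"region: {region_us:.0f} us, kernel: {gputime_us:.0f} us, "
      f"other: {region_us - gputime_us:.0f} us")
```

With the numbers from the question, the region takes roughly 882 us and the kernel roughly 492 us. The remainder is time spent in the region outside the kernel itself, such as launch overhead and host-side work.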

