gpu - How to profile OpenACC code
Hi, I'm using the CAPS OpenACC compiler, and something strange happens when I try to get preliminary profile results.
At first, I ran the code with HMPPRT_LOG_LEVEL="info", which generates a log with time stamps:
[ 2.612337] ( 0) info : upload edgelengths[0:129600] (element_size=8, queue=none, location=gravity_openacc.c:50)
[ 2.613485] ( 0) info : call __hmpp_acc_region__2ha750yb (queue=none, location=gravity_openacc.c:50)
[ 2.614367] ( 0) info : free edgelengths[0:129600] (element_size=8, queue=none, location=gravity_openacc.c:50)
So I guess the kernel execution time can be calculated as 2.614367 - 2.613485 = 0.000882 s.
But when I set CUDA_PROFILE=1, the profile below is shown:
method=[ __hmpp_acc_region__2ha750yb_parallel_region_1 ] gputime=[ 492.480 ] cputime=[ 13.000 ] occupancy=[ 0.250 ]
So I'm quite confused by these two results. Which one is true?
Does anyone have a solution?
Thanks!
The CUDA profiler shows the time it takes to execute the CUDA kernel, while the log you obtain with HMPPRT_LOG_LEVEL="info" gives the overall time it takes to execute the region. These are not the same thing, because the region may also include code executed on the host, for example.
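To put the two numbers side by side, here is a rough sketch in Python that reads the time stamps from the log lines in the question and compares the call-to-free interval with the profiler's gputime. It assumes gputime is reported in microseconds, which is how the CUDA command-line profiler normally reports it. The helper name `timestamps` is made up for this illustration and is not part of any CAPS or CUDA tool.

```python
import re

# Log lines copied from the question (HMPPRT_LOG_LEVEL="info" output).
LOG = """\
[ 2.612337] ( 0) info : upload edgelengths[0:129600] (element_size=8, queue=none, location=gravity_openacc.c:50)
[ 2.613485] ( 0) info : call __hmpp_acc_region__2ha750yb (queue=none, location=gravity_openacc.c:50)
[ 2.614367] ( 0) info : free edgelengths[0:129600] (element_size=8, queue=none, location=gravity_openacc.c:50)
"""

# gputime from the profiler line in the question.
# Assumption: the unit is microseconds.
GPUTIME_US = 492.480


def timestamps(log):
    """Hypothetical helper: return (seconds, event_name) for each log line."""
    pattern = re.compile(r"\[\s*([0-9.]+)\]\s*\(\s*\d+\)\s*info\s*:\s*(\w+)")
    return [(float(m.group(1)), m.group(2)) for m in pattern.finditer(log)]


# Map each event name (upload, call, free) to its time stamp in seconds.
times = {name: t for t, name in timestamps(LOG)}

# Interval between the "call" and "free" log entries, in microseconds.
region_us = (times["free"] - times["call"]) * 1e6

# Whatever is left after subtracting the kernel time is not kernel execution.
remainder_us = region_us - GPUTIME_US

print(f"call -> free interval: {region_us:.1f} us")
print(f"kernel (gputime):      {GPUTIME_US:.1f} us")
print(f"remainder:             {remainder_us:.1f} us")
```

Under that assumption the kernel accounts for a bit more than half of the interval between the two log entries. The remainder would be time spent outside the kernel itself, such as launch overhead or host-side work inside the region.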