Utility
Index
GPUInspector.alloc_mem
GPUInspector.clear_all_gpus_memory
GPUInspector.clear_gpu_memory
GPUInspector.functional
GPUInspector.get_cpu_stats
GPUInspector.get_cpu_utilization
GPUInspector.get_cpu_utilizations
GPUInspector.get_cpusocket_temperatures
GPUInspector.hastensorcores
GPUInspector.toggle_tensorcoremath
References
GPUInspector.alloc_mem — Method
alloc_mem(memsize::UnitPrefixedBytes; devs=(CUDA.device(),), dtype=Float32)
Allocates memory on the devices whose IDs are provided via devs. Returns a vector of memory handles (i.e. CuArrays).
Examples:
alloc_mem(MiB(1024)) # allocate on the currently active device
alloc_mem(B(40_000_000); devs=(0,1)) # allocate on GPU0 and GPU1
GPUInspector.clear_all_gpus_memory — Function
Reclaim the unused memory of all available GPUs.
GPUInspector.clear_gpu_memory — Function
Reclaim the unused memory of the currently active GPU (i.e. device()).
GPUInspector.functional — Function
Check if CUDA/GPU is available and functional. If not, print some (hopefully useful) debug information.
GPUInspector.get_cpu_stats — Method
Get information about all CPU cores. Returns a vector of vectors. The outer index corresponds to the CPU cores. Each inner vector contains the following information (in that order):
user nice system idle iowait irq softirq steal guest ?
See proc(5) for more information.
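The field order above matches the per-core lines of /proc/stat on Linux. As a rough illustration of the vector-of-vectors layout, the sketch below parses text in that format. The helper name parse_cpu_stats and the sample numbers are hypothetical; this is not GPUInspector's implementation.

```julia
# Hypothetical helper (not part of GPUInspector): parse the per-core lines of
# /proc/stat-formatted text into a vector of integer vectors.
function parse_cpu_stats(text::AbstractString)
    stats = Vector{Vector{Int}}()
    for line in split(text, '\n')
        fields = split(line)
        # Per-core lines start with "cpu0", "cpu1", ...; the aggregate "cpu" line is skipped.
        if !isempty(fields) && occursin(r"^cpu\d+$", fields[1])
            push!(stats, parse.(Int, fields[2:end]))
        end
    end
    return stats
end

# Sample input in the proc(5) layout; the numbers are made up for illustration.
sample = """
cpu  300 0 120 5000 40 0 10 0 0 0
cpu0 100 0 50 2500 20 0 5 0 0 0
cpu1 200 0 70 2500 20 0 5 0 0 0
intr 12345 0 0
"""

stats = parse_cpu_stats(sample)
# On Linux, live counters could be parsed with parse_cpu_stats(read("/proc/stat", String)).
```

Because Julia arrays are 1-based, the counters of core 0 end up in stats[1] in this sketch.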
GPUInspector.get_cpu_utilization — Function
get_cpu_utilization(core=getcpuid(); Δt=0.01)
Get the utilization (in percent) of the given CPU core over a certain time interval Δt.
GPUInspector.get_cpu_utilizations — Function
get_cpu_utilizations(cores=0:Sys.CPU_THREADS-1; Δt=0.01)
Get the utilization (in percent) of the given CPU cores over a certain time interval Δt.
Based on this.
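A common way to obtain a utilization over an interval is to take two snapshots of a core's /proc/stat counters, one at each end of the interval, and compare the idle time with the total elapsed time. The sketch below shows that calculation. The helper name utilization_percent, the choice to count iowait as idle, and the sample numbers are assumptions, not necessarily what GPUInspector does.

```julia
# Hypothetical helper (not GPUInspector's code): utilization in percent from two
# snapshots of one core's counters, ordered as in /proc/stat
# (user nice system idle iowait irq softirq steal ...).
function utilization_percent(before::AbstractVector{<:Integer}, after::AbstractVector{<:Integer})
    # Assumption: idle and iowait both count as "not busy".
    idle_delta = (after[4] + after[5]) - (before[4] + before[5])
    # Sum only the first eight fields; guest time is already included in user/nice.
    total_delta = sum(after[1:8]) - sum(before[1:8])
    total_delta == 0 && return 0.0
    return 100 * (1 - idle_delta / total_delta)
end

# Made-up snapshots taken at the start and end of the interval.
before = [100, 0, 50, 2500, 20, 0, 5, 0, 0, 0]
after  = [130, 0, 60, 2550, 20, 0, 5, 0, 0, 0]

u = utilization_percent(before, after)
```

With these sample snapshots, 40 of the 90 elapsed ticks were spent outside idle/iowait, which is what the returned percentage reflects.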
GPUInspector.get_cpusocket_temperatures — Method
Tries to get the temperatures of the available CPUs (sockets, not cores) in degrees Celsius.
Based on cat /sys/class/thermal/thermal_zone*/temp.
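The sysfs temp files hold integer values in millidegrees Celsius, so a reading has to be divided by 1000. The sketch below reads all thermal_zone*/temp files under a root directory and converts them. The helper name read_thermal_zones is hypothetical, and which zones correspond to CPU sockets depends on the platform, so this only illustrates the file access and is not GPUInspector's implementation.

```julia
# Hypothetical helper (not GPUInspector's code): read every thermal_zone*/temp
# file below `root` and convert the values to degrees Celsius.
function read_thermal_zones(root::AbstractString="/sys/class/thermal")
    temps = Float64[]
    isdir(root) || return temps
    for entry in sort(readdir(root))
        file = joinpath(root, entry, "temp")
        if startswith(entry, "thermal_zone") && isfile(file)
            # The kernel reports millidegrees Celsius, e.g. "45000" for 45 degrees.
            value = tryparse(Int, strip(read(file, String)))
            value === nothing || push!(temps, value / 1000)
        end
    end
    return temps
end

# Demonstration against a temporary directory that mimics the sysfs layout.
demo_temps = mktempdir() do dir
    for (zone, raw) in (("thermal_zone0", "45000\n"), ("thermal_zone1", "52500\n"))
        mkpath(joinpath(dir, zone))
        write(joinpath(dir, zone, "temp"), raw)
    end
    read_thermal_zones(dir)
end
```

On a machine without /sys/class/thermal (for example, non-Linux systems), read_thermal_zones() returns an empty vector instead of throwing.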
GPUInspector.hastensorcores — Function
Checks whether the given CuDevice has Tensor Cores.
GPUInspector.toggle_tensorcoremath — Function
toggle_tensorcoremath([enable::Bool; verbose=true])
Switches the CUDA.math_mode between CUDA.FAST_MATH (enable=true) and CUDA.DEFAULT_MATH (enable=false). For matmuls of CuArray{Float32}s, this should turn the use of tensor cores on and off, respectively. Of course, this only works on supported devices and CUDA versions.
If no arguments are provided, this function toggles between the two math modes.
GPUInspector.@unroll — Macro
@unroll N expr
Takes a for loop as expr and informs the LLVM unroller to unroll it N times, if it is safe to do so.
GPUInspector.@unroll — Macro
@unroll expr
Takes a for loop as expr and informs the LLVM unroller to fully unroll it, if it is safe to do so and the loop count is known.