Utility

References

GPUInspector.alloc_mem (Method)
alloc_mem(memsize::UnitPrefixedBytes; devs=(CUDA.device(),), dtype=Float32)

Allocates memory on the devices whose IDs are provided via devs. Returns a vector of memory handles (i.e. CuArrays).

Examples:

alloc_mem(MiB(1024)) # allocate on the currently active device
alloc_mem(B(40_000_000); devs=(0,1)) # allocate on GPU0 and GPU1
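
To get a feeling for the sizes involved, the following plain-Julia sketch computes how many elements of a given dtype fit into a requested number of bytes. It assumes that the allocated array holds roughly memsize / sizeof(dtype) elements, which the docstring does not state explicitly; elements_for and MIB_BYTES are hypothetical names and not part of GPUInspector.

```julia
# Hypothetical helper (not part of GPUInspector): number of elements of type
# `dtype` that fit into `nbytes` bytes, rounding down.
elements_for(nbytes::Integer, dtype::Type=Float32) = nbytes ÷ sizeof(dtype)

const MIB_BYTES = 2^20  # one mebibyte in bytes

n_f32 = elements_for(1024 * MIB_BYTES)       # Float32 elements in 1024 MiB
n_f64 = elements_for(40_000_000, Float64)    # Float64 elements in 40 MB

println("Float32 elements in 1024 MiB: ", n_f32)
println("Float64 elements in 40_000_000 B: ", n_f64)
```

With the default dtype=Float32, a request of MiB(1024) would then correspond to about 268 million elements per device.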

GPUInspector.functional (Function)

Check if CUDA/GPU is available and functional. If not, print some (hopefully useful) debug information.

GPUInspector.get_cpu_stats (Method)

Get information about all CPU cores. Returns a vector of vectors. The outer index corresponds to the CPU cores. Each inner vector contains the following values (in that order):

user nice system idle iowait irq softirq steal guest guest_nice

See proc(5) for more information.
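
As an illustration of the data layout, here is a minimal sketch that parses a single per-core line in the /proc/stat format described in proc(5). The line content is made up, the variable names are hypothetical, and this is not necessarily how get_cpu_stats is implemented.

```julia
# One per-core line in the format described in proc(5); the numbers are made up.
line = "cpu3 4705 150 1120 16250 520 30 45 0 0 0"

fields = split(line)
core = parse(Int, fields[1][4:end])   # "cpu3" -> 3
stats = parse.(Int, fields[2:end])    # counters in the order listed above

# Column names as given in proc(5).
labels = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")

println("core ", core)
for (label, value) in zip(labels, stats)
    println(rpad(label, 12), value)
end
```

Each value is a cumulative counter since boot, so a single snapshot says little on its own; differences between two snapshots are what matter.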

GPUInspector.get_cpu_utilization (Function)
get_cpu_utilization(core=getcpuid(); Δt=0.01)

Get the utilization (in percent) of the given CPU core over the time interval Δt.
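
A common way to derive such a percentage is to take two snapshots of the per-core counters from /proc/stat, separated by the time interval, and compare the busy time to the total time elapsed between them. The sketch below shows that calculation on made-up numbers. utilization_percent is a hypothetical helper, and treating idle plus iowait as idle time is an assumption; the package's own formula may differ.

```julia
# Hypothetical helper, not GPUInspector's implementation.
# `before` and `after` are counter vectors in /proc/stat order:
# user nice system idle iowait irq softirq steal guest guest_nice
function utilization_percent(before::AbstractVector{<:Integer},
                             after::AbstractVector{<:Integer})
    delta = after .- before
    # Sum user through steal. guest and guest_nice are skipped because Linux
    # already accounts for them in user and nice.
    total = sum(delta[1:8])
    idle = delta[4] + delta[5]   # idle + iowait (assumption: both count as idle)
    total == 0 && return 0.0     # no time elapsed between the snapshots
    return 100 * (total - idle) / total
end

# Two made-up snapshots of one core.
before = [100, 0, 50, 800, 50, 0, 0, 0, 0, 0]
after  = [150, 0, 70, 860, 60, 0, 0, 0, 0, 0]

println(utilization_percent(before, after), " %")
```

A very short interval yields few clock ticks between the snapshots, so the result becomes coarse; a longer interval trades responsiveness for a smoother estimate.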

GPUInspector.get_cpu_utilizations (Function)
get_cpu_utilizations(cores=0:Sys.CPU_THREADS-1; Δt=0.01)

Get the utilization (in percent) of each of the given CPU cores over the time interval Δt.

Based on this.

GPUInspector.toggle_tensorcoremath (Function)
toggle_tensorcoremath([enable::Bool; verbose=true])

Switches CUDA.math_mode between CUDA.FAST_MATH (enable=true) and CUDA.DEFAULT_MATH (enable=false). For matrix multiplications of CuArray{Float32}s, FAST_MATH should allow tensor cores to be used, while DEFAULT_MATH should prevent it. This only works on supported devices and CUDA versions.

If no arguments are provided, this function toggles between the two math modes.
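
A possible usage sketch, assuming CUDA.jl and GPUInspector are installed and a CUDA-capable device is present. It calls the function through its qualified name, since whether it is exported is not stated here, and uses CUDA.math_mode() from CUDA.jl to query the current mode.

```julia
using CUDA
using GPUInspector

if CUDA.functional()
    # Request FAST_MATH, which may enable tensor cores for Float32 matmuls.
    GPUInspector.toggle_tensorcoremath(true; verbose=false)
    println("math mode: ", CUDA.math_mode())

    # Switch back to DEFAULT_MATH.
    GPUInspector.toggle_tensorcoremath(false; verbose=false)
    println("math mode: ", CUDA.math_mode())

    # Without arguments, flip to whichever of the two modes is not active.
    GPUInspector.toggle_tensorcoremath()
else
    println("No functional CUDA device found; skipping.")
end
```

Benchmarking the same Float32 matrix multiplication under both modes is one way to check whether tensor cores actually make a difference on a given device.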

GPUInspector.@unroll (Macro)
@unroll N expr

Takes a for loop as expr and informs the LLVM unroller to unroll it N times, if it is safe to do so.
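
A minimal usage sketch, assuming GPUInspector is installed and that the macro accepts a literal integer as N followed by an ordinary for loop, as the signature above suggests. sum_unrolled is a hypothetical function used only for illustration.

```julia
using GPUInspector: @unroll

# Sum a vector while hinting to LLVM that the loop may be unrolled 4 times.
function sum_unrolled(x::AbstractVector{Float32})
    s = 0f0
    @unroll 4 for i in 1:length(x)
        s += x[i]
    end
    return s
end

println(sum_unrolled(Float32[1, 2, 3, 4, 5, 6, 7, 8]))
```

The macro only passes a hint to the compiler, so the result of the loop is the same with or without it; only the generated code may differ.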

GPUInspector.@unroll (Macro)
@unroll expr

Takes a for loop as expr and informs the LLVM unroller to fully unroll it, if it is safe to do so and the loop count is known.