hws - Hardware Sampling for GPUs and CPUs 1.1.1
Hardware sampling (e.g., clock frequencies, memory consumption, temperatures, or energy draw) for CPUs and GPUs.
The Hardware Sampling (hws) library can be used to track hardware performance like clock frequency, memory usage, temperatures, or power draw. It currently supports CPUs as well as GPUs from NVIDIA, AMD, and Intel.
General dependencies:

- … (… `find_package` call)
- … (… `find_package` call)

Dependencies based on the hardware to sample:

- CPUs: `turbostat` (may require root privileges), `lscpu`, or `free`, and the `subprocess.h` library (automatically built during the CMake configuration if it could not be found using the respective `find_package` call)
- NVIDIA GPUs: `NVML`
- AMD GPUs: `rocm_smi_lib`
- Intel GPUs: Level Zero library
To download the hardware sampling library, use:
Building the library can be done using the normal CMake approach:
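The exact commands are not reproduced here, so the following is only a sketch of the conventional out-of-source CMake sequence (configure, then build), assembled in Python. The `-S`, `-B`, `--build`, and `-D` flags are standard CMake; the `build` directory name and the helper `cmake_commands` are assumptions made for illustration, and any option names passed in should be taken from the library's list of supported options.

```python
import shlex


def cmake_commands(options=None, source_dir=".", build_dir="build"):
    """Return the configure and build commands as argument lists.

    `options` maps CMake cache variables (e.g. "HWS_ENABLE_CPU_SAMPLING")
    to their values; `build_dir` is an assumed directory name.
    """
    configure = ["cmake", "-S", source_dir, "-B", build_dir]
    # Each option becomes a -DNAME=VALUE cache definition (sorted for stable output).
    for name, value in sorted((options or {}).items()):
        configure.append(f"-D{name}={value}")
    build = ["cmake", "--build", build_dir]
    return [configure, build]


if __name__ == "__main__":
    commands = cmake_commands({
        "HWS_ENABLE_CPU_SAMPLING": "AUTO",
        "HWS_ENABLE_PYTHON_BINDINGS": "ON",
    })
    for command in commands:
        # Print shell-ready lines; they could be run with subprocess.run(command, check=True).
        print(shlex.join(command))
```

Returning argument lists instead of a single string keeps the commands safe to pass to `subprocess.run` without shell quoting issues.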
The `[optional_options]` can be one or multiple of:

- `HWS_ENABLE_CPU_SAMPLING=ON|OFF|AUTO` (default: `AUTO`):
  - `ON`: check whether CPU information can be sampled and fail if this is not the case
  - `AUTO`: check whether CPU information can be sampled but do not fail if this is not the case
  - `OFF`: do not check whether CPU information can be sampled
- `HWS_ENABLE_GPU_NVIDIA_SAMPLING=ON|OFF|AUTO` (default: `AUTO`):
  - `ON`: check whether NVIDIA GPU information can be sampled and fail if this is not the case
  - `AUTO`: check whether NVIDIA GPU information can be sampled but do not fail if this is not the case
  - `OFF`: do not check whether NVIDIA GPU information can be sampled
- `HWS_ENABLE_GPU_AMD_SAMPLING=ON|OFF|AUTO` (default: `AUTO`):
  - `ON`: check whether AMD GPU information can be sampled and fail if this is not the case
  - `AUTO`: check whether AMD GPU information can be sampled but do not fail if this is not the case
  - `OFF`: do not check whether AMD GPU information can be sampled
- `HWS_ENABLE_GPU_INTEL_SAMPLING=ON|OFF|AUTO` (default: `AUTO`):
  - `ON`: check whether Intel GPU information can be sampled and fail if this is not the case
  - `AUTO`: check whether Intel GPU information can be sampled but do not fail if this is not the case
  - `OFF`: do not check whether Intel GPU information can be sampled
- `HWS_ENABLE_ERROR_CHECKS=ON|OFF` (default: `OFF`): enable sanity checks during hardware sampling; may be problematic with smaller sampling intervals
- `HWS_SAMPLING_INTERVAL=100ms` (default: `100ms`): set the sampling interval in milliseconds
- `HWS_ENABLE_PYTHON_BINDINGS=ON|OFF` (default: `ON`): enable Python bindings

The library supports the `install` target:
Afterward, the necessary exports should be performed:
Note: when using Intel GPUs, the `CMAKE_MODULE_PATH` should be updated to point to our `cmake` directory containing the `Findlevel_zero.cmake` file, and `export ZES_ENABLE_SYSMAN=1` should be set.
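For scripts that cannot rely on the shell environment, the same variable can also be set from within Python. This is a sketch only: it assumes the variable merely has to be present in the process environment before Level Zero is initialized, and the helper name `enable_sysman` is hypothetical.

```python
import os


def enable_sysman(env=os.environ):
    """Set ZES_ENABLE_SYSMAN=1 in the given environment mapping and return it."""
    # Assumption: this must run before any module that initializes Level Zero
    # is imported, since the variable is read at initialization time.
    env["ZES_ENABLE_SYSMAN"] = "1"
    return env


if __name__ == "__main__":
    enable_sysman()
    print(os.environ["ZES_ENABLE_SYSMAN"])
```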
The library is also available via pip (`pip install hardware-sampling`).
This pip install behaves as if no additional CMake options were provided. This means that support is limited to the hardware whose vendor libraries were available when `pip install hardware-sampling` was invoked.
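Whether the pip installation is present can be checked from Python using only the standard library. The sketch below relies on the distribution name `hardware-sampling` from the pip command; the helper name `installed_version` is an assumption. It does not reveal which hardware backends were compiled in.

```python
from importlib import metadata


def installed_version(distribution="hardware-sampling"):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print("not installed" if version is None else f"installed: {version}")
```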
The sampling type `fixed` denotes samples that are gathered only once, such as maximum clock frequencies or temperatures, or the total available memory. The sampling type `sampled` denotes samples that are gathered repeatedly during the whole hardware sampling process, such as the current clock frequencies, temperatures, or memory consumption.
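To make the distinction concrete, here is a minimal sketch of a polling loop that queries `fixed` values once and `sampled` values at every step. It is not the library's implementation or API: `SampleLog`, `run_sampling`, and the reader callbacks are hypothetical, and the default interval of 0.1 s simply mirrors the documented 100ms default.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SampleLog:
    """Hypothetical container separating the two sampling types."""
    fixed: dict = field(default_factory=dict)        # gathered once
    sampled: dict = field(default_factory=dict)      # one value per sampling step
    time_points: list = field(default_factory=list)  # one timestamp per step


def run_sampling(read_fixed, read_sampled, steps, interval_s=0.1):
    """Query fixed values once, then poll the sampled values `steps` times."""
    log = SampleLog(fixed=read_fixed())
    for _ in range(steps):
        log.time_points.append(time.monotonic())
        for name, value in read_sampled().items():
            log.sampled.setdefault(name, []).append(value)
        time.sleep(interval_s)
    return log


if __name__ == "__main__":
    readings = iter([1200, 1800, 2400])  # made-up clock frequencies in MHz
    log = run_sampling(
        read_fixed=lambda: {"clock_frequency_max": 3600},
        read_sampled=lambda: {"clock_frequency": next(readings)},
        steps=3,
        interval_s=0.0,
    )
    print(log.fixed)
    print(log.sampled)
```

Each `fixed` entry ends up as a single value, while each `sampled` entry grows into a list with one element per recorded time point.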
sample | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
---|---|---|---|---|---|
architecture | fixed | str | str | str | - |
byte_order | fixed | str | str (fix) | str (fix) | str (fix) |
num_cores | fixed | int | int | - | - |
num_threads | fixed | int | - | - | - |
threads_per_core | fixed | int | - | - | - |
cores_per_socket | fixed | int | - | - | - |
num_sockets | fixed | int | - | - | - |
numa_nodes | fixed | int | - | - | - |
vendor_id | fixed | str | str (fix) | str | str (PCIe ID) |
name | fixed | str | str | str | str |
flags | fixed | list of str | - | - | list of str |
persistence_mode | fixed | - | bool | - | - |
standby_mode | fixed | - | - | - | str |
num_threads_per_eu | fixed | - | - | - | int |
eu_simd_width | fixed | - | - | - | int |
compute_utilization | sampled | % | % | % | - |
memory_utilization | sampled | - | % | % | - |
ipc | sampled | float | - | - | - |
irq | sampled | int | - | - | - |
smi | sampled | int | - | - | - |
poll | sampled | int | - | - | - |
poll_percent | sampled | % | - | - | - |
performance_level | sampled | - | int | str | - |
sample | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
---|---|---|---|---|---|
auto_boosted_clock_enabled | fixed | bool | bool | - | - |
clock_frequency_min | fixed | MHz | MHz | MHz | MHz |
clock_frequency_max | fixed | MHz | MHz | MHz | MHz |
memory_clock_frequency_min | fixed | - | MHz | MHz | MHz |
memory_clock_frequency_max | fixed | - | MHz | MHz | MHz |
socket_clock_frequency_min | fixed | - | - | MHz | - |
socket_clock_frequency_max | fixed | - | - | MHz | - |
sm_clock_frequency_max | fixed | - | MHz | - | - |
available_clock_frequencies | fixed | - | map of MHz | list of MHz | list of MHz |
available_memory_clock_frequencies | fixed | - | list of MHz | list of MHz | list of MHz |
clock_frequency | sampled | MHz | MHz | MHz | MHz |
average_non_idle_clock_frequency | sampled | MHz | - | - | - |
time_stamp_counter | sampled | MHz | - | - | - |
memory_clock_frequency | sampled | - | MHz | MHz | MHz |
socket_clock_frequency | sampled | - | - | MHz | - |
sm_clock_frequency | sampled | - | MHz | - | - |
overdrive_level | sampled | - | - | % | - |
memory_overdrive_level | sampled | - | - | % | - |
throttle_reason | sampled | - | bitmask | - | bitmask |
throttle_reason_string | sampled | - | str | - | str |
memory_throttle_reason | sampled | - | - | - | bitmask |
memory_throttle_reason_string | sampled | - | - | - | str |
auto_boosted_clock | sampled | - | bool | - | - |
frequency_limit_tdp | sampled | - | - | - | MHz |
memory_frequency_limit_tdp | sampled | - | - | - | MHz |
sample | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
---|---|---|---|---|---|
power_management_limit | fixed | - | W | W | - |
power_enforced_limit | fixed | - | W | W | W |
power_measurement_type | fixed | str (fix) | str | str | str |
power_management_mode | fixed | - | bool | - | bool |
available_power_profiles | fixed | - | list of int | list of str | - |
power_usage | sampled | W | W | W | W<br>(calculated via power_total_energy_consumption) |
core_watt | sampled | W | - | - | - |
dram_watt | sampled | W | - | - | - |
package_rapl_throttling | sampled | % | - | - | - |
dram_rapl_throttling | sampled | % | - | - | - |
power_total_energy_consumption | sampled | J<br>(calculated via power_usage) | J | J (calculated via power_usage if power_total_energy_consumption isn't available) | J |
power_profile | sampled | - | int | str | - |
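The table notes that some power values are derived from others: `power_usage` from `power_total_energy_consumption` on Intel GPUs, and `power_total_energy_consumption` from `power_usage` on CPUs. How the library performs these conversions is not stated here; the sketch below assumes finite differences in one direction and a rectangle-rule sum in the other, with strictly increasing timestamps in seconds. Both function names are hypothetical.

```python
def power_from_energy(energy_j, time_s):
    """Average power (W) per interval from cumulative energy readings (J)."""
    power = [0.0]  # the first sample has no preceding interval
    for i in range(1, len(energy_j)):
        dt = time_s[i] - time_s[i - 1]  # assumed > 0
        power.append((energy_j[i] - energy_j[i - 1]) / dt)
    return power


def energy_from_power(power_w, time_s):
    """Cumulative energy (J) from power readings (W).

    Each reading is assumed to hold over the interval that precedes it.
    """
    energy = [0.0]
    for i in range(1, len(power_w)):
        dt = time_s[i] - time_s[i - 1]
        energy.append(energy[-1] + power_w[i] * dt)
    return energy


if __name__ == "__main__":
    times = [0.0, 0.5, 1.0]      # seconds
    energy = [0.0, 10.0, 30.0]   # joules, cumulative (made-up values)
    power = power_from_energy(energy, times)
    print(power)
    print(energy_from_power(power, times))
```

Under these assumptions the two functions are inverses of each other, which makes the round trip a convenient sanity check.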
sample | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
---|---|---|---|---|---|
cache_size_L1d | fixed | str | - | - | - |
cache_size_L1i | fixed | str | - | - | - |
cache_size_L2 | fixed | str | - | - | - |
cache_size_L3 | fixed | str | - | - | - |
memory_total | fixed | B | B | B | B<br>(map of memory modules) |
visible_memory_total | fixed | - | - | B | B<br>(map of memory modules) |
swap_memory_total | fixed | B | - | - | - |
num_pcie_lanes_min | fixed | - | - | int | - |
num_pcie_lanes_max | fixed | - | int | int | int |
pcie_link_generation_max | fixed | - | int | - | int |
pcie_link_speed_max | fixed | - | MBPS | - | MBPS |
pcie_link_transfer_rate_min | fixed | - | - | MT/s | - |
pcie_link_transfer_rate_max | fixed | - | - | MT/s | - |
memory_bus_width | fixed | - | Bit | - | Bit<br>(map of memory modules) |
memory_num_channels | fixed | - | - | - | int<br>(map of memory modules) |
memory_used | sampled | B | B | B | B<br>(map of memory modules) |
memory_free | sampled | B | B | B | B<br>(map of memory modules) |
swap_memory_used | sampled | B | - | - | - |
swap_memory_free | sampled | B | - | - | - |
num_pcie_lanes | sampled | - | int | int | int |
pcie_link_generation | sampled | - | int | - | int |
pcie_link_speed | sampled | - | MBPS | - | MBPS |
pcie_link_transfer_rate | sampled | - | - | MT/s | - |
sample | sample type | CPUs | NVIDIA GPUs | AMD GPUs | Intel GPUs |
---|---|---|---|---|---|
num_fans | fixed | - | int | int | int |
fan_speed_min | fixed | - | % | - | - |
fan_speed_max | fixed | - | % | RPM | RPM |
temperature_min | fixed | - | - | °C | - |
temperature_max | fixed | - | °C | °C | °C |
memory_temperature_min | fixed | - | - | °C | - |
memory_temperature_max | fixed | - | °C | °C | °C |
hotspot_temperature_min | fixed | - | - | °C | - |
hotspot_temperature_max | fixed | - | - | °C | - |
hbm_0_temperature_min | fixed | - | - | °C | - |
hbm_0_temperature_max | fixed | - | - | °C | - |
hbm_1_temperature_min | fixed | - | - | °C | - |
hbm_1_temperature_max | fixed | - | - | °C | - |
hbm_2_temperature_min | fixed | - | - | °C | - |
hbm_2_temperature_max | fixed | - | - | °C | - |
hbm_3_temperature_min | fixed | - | - | °C | - |
hbm_3_temperature_max | fixed | - | - | °C | - |
global_temperature_max | fixed | - | - | °C | °C |
fan_speed_percentage | sampled | - | % | % | % |
temperature | sampled | °C | °C | °C | °C |
memory_temperature | sampled | - | - | °C | °C |
hotspot_temperature | sampled | - | - | °C | - |
hbm_0_temperature | sampled | - | - | °C | - |
hbm_1_temperature | sampled | - | - | °C | - |
hbm_2_temperature | sampled | - | - | °C | - |
hbm_3_temperature | sampled | - | - | °C | - |
global_temperature | sampled | - | - | - | °C |
psu_temperature | sampled | - | - | - | °C |
core_temperature | sampled | °C | - | - | - |
core_throttle_percent | sampled | % | - | - | - |
sample | sample type | CPUs |
---|---|---|
gfx_render_state_percent | sampled | % |
gfx_frequency | sampled | MHz |
average_gfx_frequency | sampled | MHz |
gfx_state_c0_percent | sampled | % |
cpu_works_for_gpu_percent | sampled | % |
gfx_watt | sampled | W |
sample | sample type | CPUs |
---|---|---|
idle_states | fixed | map of values |
all_cpus_state_c0_percent | sampled | % |
any_cpu_state_c0_percent | sampled | % |
low_power_idle_state_percent | sampled | % |
system_low_power_idle_state_percent | sampled | % |
package_low_power_idle_state_percent | sampled | % |
The hws library is distributed under the MIT license.