Ricks-Lab GPU Utilities

rickslab-gpu-utils

A set of utilities for monitoring GPU performance and modifying control settings.

To get the most out of these utilities, you should be running a kernel that supports the GPUs you have installed. If you use AMD GPUs, installing the latest amdgpu driver package or the latest ROCm release may provide additional capabilities. If you have Nvidia GPUs installed, nvidia-smi must be installed for the utilities to be able to read the cards. Writing to GPUs is currently possible only for AMD GPUs, and only with compatible cards and with the AMD ppfeaturemask set to 0xfffd7fff. This can be accomplished by adding amdgpu.ppfeaturemask=0xfffd7fff to the GRUB_CMDLINE_LINUX_DEFAULT value in /etc/default/grub and executing sudo update-grub. As with any kernel parameter, the new value takes effect after a reboot.
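The GRUB change described above can be sketched in Python. This is a minimal illustration, not part of rickslab-gpu-utils: the helper names add_ppfeaturemask and read_running_mask are hypothetical, and the sysfs path /sys/module/amdgpu/parameters/ppfeaturemask is an assumption about where the running amdgpu module exposes the value. The first helper only transforms a line of text; writing /etc/default/grub and running sudo update-grub are left to you.

```python
import re

# Kernel argument taken from the text above.
PPFEATUREMASK_ARG = "amdgpu.ppfeaturemask=0xfffd7fff"

# Assumed location of the running value; not confirmed by this README.
SYSFS_MASK_PATH = "/sys/module/amdgpu/parameters/ppfeaturemask"


def add_ppfeaturemask(grub_line):
    """Return a GRUB_CMDLINE_LINUX_DEFAULT line with the mask argument set."""
    match = re.match(r'^(\s*GRUB_CMDLINE_LINUX_DEFAULT=)"(.*)"\s*$', grub_line)
    if match is None:
        raise ValueError("not a quoted GRUB_CMDLINE_LINUX_DEFAULT line")
    prefix, value = match.groups()
    # Drop any existing ppfeaturemask argument so it is not duplicated.
    args = [a for a in value.split() if not a.startswith("amdgpu.ppfeaturemask=")]
    args.append(PPFEATUREMASK_ARG)
    joined = " ".join(args)
    return f'{prefix}"{joined}"'


def read_running_mask(path=SYSFS_MASK_PATH):
    """Return the mask reported at `path` as an int, or None if unavailable."""
    try:
        with open(path, "r", encoding="ascii") as handle:
            text = handle.read().strip()
    except OSError:
        return None
    # Base 0 accepts both decimal and 0x-prefixed hexadecimal text.
    return int(text, 0)
```

For example, passing the line GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" to add_ppfeaturemask yields the same line with the amdgpu argument appended, and after a reboot read_running_mask() == 0xfffd7fff is one way to confirm the setting, if the assumed sysfs path exists on your system.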

Check out the User Guide!

Install the latest package from PyPI with the following commands:

pip3 uninstall rickslab-gpu-utils
pip3 install rickslab-gpu-utils

gpu-chk

This utility checks whether the environment is compatible with rickslab-gpu-utils.

gpu-ls

This utility displays the most relevant parameters for installed and compatible GPUs. The default behavior is to list relevant parameters by GPU. OpenCL platform information is added when the --clinfo option is used. A brief listing of key parameters is available with the --short command line option. A simplified table of current GPU state is displayed with the --table option. The --no_fan option can be used to ignore fan settings. The --pstate option can be used to output the p-state table for each GPU instead of the list of basic parameters. The --ppm option is used to output the table of available power/performance modes instead of basic parameters.

gpu-mon

A utility to give the current state of all compatible GPUs. The default behavior is to continuously update a text-based table in the current window until Ctrl-C is pressed. With the --gui option, a table of relevant parameters is updated in a Gtk window. You can specify the delay between updates with the --sleep N option, where N is an integer greater than zero that specifies the number of seconds to sleep between updates. The --no_fan option can be used to disable the reading and display of fan information. The --log option is used to write all monitor data to a psv log file. When writing to a log file, the utility indicates this in red at the top of the window with a message that includes the log file name. The --plot option displays a plot of critical GPU parameters, which updates at the specified --sleep N interval. If you need both the plot and monitor displays, using the --plot option is preferred over running both tools, because a single read of the GPUs is then used to update both displays. The --ltz option results in the use of local time instead of UTC.
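If you launch gpu-mon from a script, the options above map naturally onto an argument list. The following is a rough sketch only: gpu_mon_command is a hypothetical helper, not something shipped with the package, and it uses only the flags named in this section. It also enforces the stated rule that N for --sleep must be an integer greater than zero. It does not check whether a given combination of options is supported by gpu-mon.

```python
def gpu_mon_command(sleep=None, gui=False, plot=False, log=False,
                    no_fan=False, ltz=False):
    """Build an argv list for gpu-mon from the options described above."""
    args = ["gpu-mon"]
    if sleep is not None:
        # --sleep N requires an integer greater than zero (bool is rejected).
        if isinstance(sleep, bool) or not isinstance(sleep, int) or sleep <= 0:
            raise ValueError("sleep must be an integer greater than zero")
        args += ["--sleep", str(sleep)]
    # Append each boolean flag only when it was requested.
    for enabled, flag in ((gui, "--gui"), (plot, "--plot"), (log, "--log"),
                          (no_fan, "--no_fan"), (ltz, "--ltz")):
        if enabled:
            args.append(flag)
    return args
```

The resulting list can be passed to subprocess.run, for example subprocess.run(gpu_mon_command(sleep=5, plot=True, log=True)), which would show the monitor and the plot from a single read of the GPUs while logging the data.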

gpu-plot

A utility to continuously plot the trend of critical GPU parameters for all compatible GPUs. The --sleep N option can be used to specify the update interval. The gpu-plot utility has two modes of operation. The default mode is to read the GPU driver details directly, which is useful as a standalone utility. The --stdin option causes gpu-plot to read GPU data from stdin. This is how gpu-mon produces the plot, and it can also be used to pipe your own data into the process. The --simlog option can be used with the --stdin option when a monitor log file is piped as stdin. This is useful for troubleshooting and can be used to display saved log results. The --ltz option results in the use of local time instead of UTC. If you plan to run both gpu-plot and gpu-mon, the --plot option of the gpu-mon utility should be used instead of running both utilities, in order to reduce data reads by a factor of 2.
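Replaying a saved monitor log, as described above, amounts to feeding the file to gpu-plot on stdin with the --stdin and --simlog options. A minimal sketch, assuming a log previously written by gpu-mon --log: replay_log is a hypothetical helper name and the log path is whatever file name gpu-mon reported when logging. The command is a parameter so that the piping itself can be tried with any program that reads stdin.

```python
import subprocess


def replay_log(log_path, command=("gpu-plot", "--stdin", "--simlog")):
    """Pipe a saved monitor log into `command` on stdin; return its exit code."""
    # Open in binary mode so the bytes reach the child process unchanged.
    with open(log_path, "rb") as log_file:
        completed = subprocess.run(list(command), stdin=log_file, check=False)
    return completed.returncode
```

This is the scripted equivalent of piping the log file into gpu-plot --stdin --simlog from a shell.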

gpu-pac

Program and control compatible GPUs with this utility. By default, the commands to be written to a GPU are written to a bash file for the user to inspect and run. If you are confident in the changes, the --execute_pac option can be used to execute and then delete the saved bash file. Since the GPU device files are writable only by root, sudo is used to execute commands in the bash file. As a result, you will be prompted for credentials in the terminal where you executed gpu-pac. The --no_fan option can be used to eliminate fan details from the utility. The --force_write option can be used to force all configuration parameters to be written to the GPU. The default behavior is to write only changes.

New in this Release - v3.5.0

Development Plans

Known Issues

References

History

New in Previous Release - v3.3.14

New in Previous Release - v3.2.0

New in Previous Release - v3.0.0

New in Previous Release - v2.7.0

New in Previous Release - v2.6.0

New in Previous Release - v2.5.2

New in Previous Release - v2.5.1

New in Previous Release - v2.5.0

New in Previous Release - v2.4.0

New in Previous Release - v2.3.1

New in Previous Release - v2.3.0

New in Previous Release - v2.2.0

New in Previous Release - v2.1.0

New in Previous Release - v2.0.0

New in Previous Release - v1.1.0

New in Previous Release - v1.0.0