All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Handle exceptions for function getpass.getuser() by @XuehaiPan in #130. Issued by @landgraf.
- Refactor setup scripts by @XuehaiPan.
- Fix documentation for the ResourceMetricCollector.clear() method by @MyGodItsFull0fStars in #132.
- Gracefully ignore UTF-8 decoding errors by @XuehaiPan.
1.3.2 - 2023-10-17
- Add separate implementation for GpuStatsLogger callback for lightning by @XuehaiPan in #114.
- Remove metrics if process is gone in nvitop-exporter by @XuehaiPan in #107.
1.3.1 - 2023-10-05
- Add Python 3.12 classifiers by @XuehaiPan in #101.
- Fix libcuda.cuDeviceGetUuid() when the UUID contains 0x00 by @XuehaiPan in #100.
1.3.0 - 2023-08-27
- Add Prometheus exporter by @XuehaiPan in #92.
- Add device APIs to query PCIe and NVLink throughput by @XuehaiPan in #87.
- Use recent timestamp for GPU process utilization query for more accurate per-process GPU usage by @XuehaiPan in #85. We extend our heartfelt gratitude to @2581543189 for their invaluable assistance. Their timely comments and comprehensive feedback have greatly contributed to the improvement of this project.
- Fix upstream changes for process info v3 APIs on 535.104.05 driver by @XuehaiPan in #94.
- Fix removal for process info v3 APIs on the upstream 535.98 driver by @XuehaiPan in #89.
1.2.0 - 2023-07-24
- Include last snapshot metrics in the log results for ResourceMetricCollector by @XuehaiPan in #80.
- Add mypy integration and update type annotations by @XuehaiPan in #73.
- Fix process info support for NVIDIA R535 driver (CUDA 12.2+) by @XuehaiPan in #79.
- Fix inappropriate exception catching in function libcuda.cuDeviceGetUuid by @XuehaiPan.
1.1.2 - 2023-04-11
- Further isolate the CUDA_VISIBLE_DEVICES parser in a subprocess by @XuehaiPan in #70.
1.1.1 - 2023-04-07
- Fix MIG device support by @XuehaiPan.
1.1.0 - 2023-04-07
- Support floating-point snapshot intervals (>= 0.25s) by @XuehaiPan in #67.
- Show more host metrics (e.g., used virtual memory, uptime) in CLI by @XuehaiPan in #59.
- Move TTLCache usage to CLI-only by @XuehaiPan in #66.
- Respect FORCE_COLOR and NO_COLOR environment variables by @XuehaiPan.
- Drop Python 3.6 support by @XuehaiPan in #56.
1.0.0 - 2023-02-01
- The first stable release of nvitop by @XuehaiPan.