BenchmarkGPU is a community-driven PyTorch benchmark for evaluating whether a GPU is actually delivering its advertised level of performance.
In practice, real-world throughput can fall short for many reasons:
- Missing or outdated drivers
- Incorrect runtime installation
- Power or thermal limits
- Background system activity
- Misconfigured environment variables
- Silent fallbacks to slower execution paths
This project focuses on repeatable matrix-multiplication benchmarking, stability sampling, and lightweight system-signal checks so you can investigate whether your machine is underperforming.
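The stability-sampling idea described above can be sketched in plain Python: time a workload repeatedly, then flag samples that deviate sharply from the median. This is an illustrative sketch using the standard library, not this project's actual implementation; the function names are made up for the example.

```python
import statistics
import time

def sample_timings(fn, n_samples=20):
    """Run fn repeatedly and return per-run wall-clock times in seconds."""
    times = []
    for _ in range(n_samples):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return times

def flag_outliers(times, k=3.0):
    """Flag samples more than k median-absolute-deviations from the median.

    A large number of flagged samples suggests unstable throughput, e.g.
    thermal throttling or background interference.
    """
    med = statistics.median(times)
    mad = statistics.median(abs(t - med) for t in times) or 1e-12
    return [t for t in times if abs(t - med) / mad > k]
```

A real GPU benchmark would additionally need to synchronize the device before reading the clock, since GPU kernels launch asynchronously.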
- Make sure your system has the correct drivers and runtime stack installed before benchmarking.
- For NVIDIA GPUs, install a PyTorch build that matches your CUDA environment.
- For AMD GPUs, install a PyTorch build with ROCm support and the required ROCm drivers.
- For Intel GPUs, install a PyTorch build with `torch.xpu` support and the required Intel GPU drivers/runtime.
- For Apple Silicon, make sure you are using a PyTorch build with Apple MPS support on a compatible macOS version.
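A quick way to confirm your PyTorch build actually sees an accelerator is to query the availability flags directly. The sketch below is defensive (it works even when torch is not installed); the helper name is illustrative.

```python
import importlib.util

def detect_accelerators():
    """Return (backend_name, available) pairs for common PyTorch backends."""
    if importlib.util.find_spec("torch") is None:
        return [("torch not installed", False)]
    import torch
    return [
        # torch.cuda covers both CUDA and ROCm builds of PyTorch
        ("cuda/rocm", torch.cuda.is_available()),
        # MPS backend exists on Apple Silicon builds
        ("mps", getattr(torch.backends, "mps", None) is not None
                and torch.backends.mps.is_available()),
        # torch.xpu is only present in builds with Intel GPU support
        ("xpu", hasattr(torch, "xpu") and torch.xpu.is_available()),
    ]

if __name__ == "__main__":
    for name, available in detect_accelerators():
        print(f"{name}: {'available' if available else 'not available'}")
```

If every backend reports "not available" on a machine with a GPU, the driver or runtime stack is a likely culprit.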
- This project reports interference indicators only. It is not a full malware scanner.
- This codebase has not been tested extensively across the full hardware matrix.
- The Apple MPS and NVIDIA CUDA paths are well-tested to a useful extent and are currently better validated than the other accelerator paths.
- The AMD ROCm and Intel GPU paths have received less development attention.
- Even so, you should still treat all results with engineering caution and verify suspicious behavior on your own hardware.
Run the benchmark through either entrypoint:

```bash
python3 main.py
# or
python3 -m benchmark_gpu
```

Examples:

```bash
python3 main.py --device auto
python3 main.py --device cuda --device-index 0
python3 main.py --device rocm --device-index 0
python3 main.py --device xpu --device-index 0
python3 main.py --device mps
python3 main.py --device cpu
```

The benchmark collects repeated samples, looks for anomalous measurements, and writes a plain-text report to the `results/` directory by default.
You can browse shared benchmark submissions from me and the community in docs/results.md.
If you would like to add your own result, please open a pull request and append a new row to the table after running the benchmark on your hardware.
The codebase is intentionally modular so contributors can work on one subsystem without creating backend-specific spaghetti:
- `benchmark_gpu/backends/`: backend adapters for CUDA, ROCm, Intel XPU, Apple MPS, and CPU
- `benchmark_gpu/benchmark/`: benchmark execution and stability logic
- `benchmark_gpu/diagnostics/`: lightweight interference checks
- `benchmark_gpu/reports/`: plain-text reporting
- `benchmark_gpu/cli.py`: CLI parsing and validation
- `benchmark_gpu/app.py`: application orchestration
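One way a backend-adapter layout like this avoids backend-specific spaghetti is by having every adapter satisfy a small common interface. The sketch below shows that pattern with a structural protocol; the class and method names are hypothetical, not this project's actual API.

```python
from typing import Protocol

class BackendAdapter(Protocol):
    """Hypothetical adapter interface; names are illustrative only."""
    name: str

    def is_available(self) -> bool:
        """Report whether this backend can be used on the current machine."""
        ...

    def synchronize(self) -> None:
        """Block until all queued device work has finished."""
        ...

class CPUAdapter:
    """Trivial adapter: the CPU is always available and synchronous."""
    name = "cpu"

    def is_available(self) -> bool:
        return True

    def synchronize(self) -> None:
        pass  # CPU execution is synchronous; nothing to flush

def pick_backend(adapters: list[BackendAdapter]) -> BackendAdapter:
    """Return the first usable backend, falling back through the list."""
    for adapter in adapters:
        if adapter.is_available():
            return adapter
    raise RuntimeError("no usable backend found")
```

With this shape, the benchmark and reporting code can call `synchronize()` without knowing which device it is talking to.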
If you see any issues, please feel free to:
- Open an issue describing the problem; I will use it for testing.
- Or, even better, send a pull request with the fix, and I will review it as soon as possible.
Hardware diversity is the hardest part of a GPU benchmarking project, so real-world bug reports and fixes are extremely valuable.
Thank you for being part of this community-driven project.