BinaryOutlook/BenchmarkGPU

BenchmarkGPU

BenchmarkGPU is a community-driven PyTorch benchmark for evaluating whether a GPU is actually delivering the level of performance it was marketed to provide.

In practice, real-world throughput can fall short for many reasons:

  • Missing or outdated drivers
  • Incorrect runtime installation
  • Power or thermal limits
  • Background system activity
  • Misconfigured environment variables
  • Silent fallbacks to slower execution paths

This project focuses on repeatable matrix-multiplication benchmarking, stability sampling, and lightweight system-signal checks so you can investigate whether your machine is underperforming.
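As a rough illustration of what repeatable matrix-multiplication timing can look like, here is a minimal sketch. It is not taken from this project's code: the names `collect_samples` and `matmul_tflops`, the warm-up and repeat counts, and the convention of counting a square n x n matmul as 2 * n^3 floating-point operations are all assumptions made for the example.

```python
import time
from typing import Callable, List, Optional


def collect_samples(
    work: Callable[[], object],
    sync: Optional[Callable[[], None]] = None,
    warmup: int = 3,
    repeats: int = 10,
) -> List[float]:
    """Return wall-clock seconds for each of `repeats` timed runs of `work`.

    Hypothetical helper: `sync` should block until queued device work is done,
    otherwise asynchronous GPU launches make the timings look too fast.
    """

    def run_once() -> float:
        start = time.perf_counter()
        work()
        if sync is not None:
            sync()  # wait for the device before stopping the clock
        return time.perf_counter() - start

    for _ in range(warmup):
        run_once()  # discard warm-up runs (lazy init, caches, clock ramp-up)
    return [run_once() for _ in range(repeats)]


def matmul_tflops(n: int, seconds: float) -> float:
    """Achieved TFLOP/s for one n x n by n x n matmul, counted as 2 * n**3 FLOPs."""
    return (2 * n ** 3) / seconds / 1e12
```

With PyTorch on a CUDA device, `work` could be something like `lambda: a @ b` and `sync` could be `torch.cuda.synchronize`. Comparing the resulting TFLOP/s against the vendor's published figure for the same data type then gives a first hint of whether the GPU is underperforming.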

Important Notes

  • Make sure your system has the correct drivers and runtime stack installed before benchmarking.
  • For NVIDIA GPUs, install a PyTorch build that matches your CUDA environment.
  • For AMD GPUs, install a PyTorch build with ROCm support and the required ROCm drivers.
  • For Intel GPUs, install a PyTorch build with torch.xpu support and the required Intel GPU drivers/runtime.
  • For Apple Silicon, make sure you are using a PyTorch build with Apple MPS support on a compatible macOS version.
  • This project reports interference indicators only. It is not a full malware scanner.
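Before benchmarking, it can help to confirm which accelerator paths your PyTorch build actually exposes. The sketch below is one way to check and is not this project's detection logic; the function name `detect_backends` is hypothetical. It relies on the fact that ROCm builds of PyTorch expose AMD GPUs through the `torch.cuda` API and set `torch.version.hip`, and it guards `torch.xpu` and `torch.backends.mps` because older builds may not provide them.

```python
from typing import Dict


def detect_backends() -> Dict[str, bool]:
    """Report which device types the installed PyTorch build can see (hypothetical helper)."""
    available = {"cuda": False, "rocm": False, "xpu": False, "mps": False, "cpu": True}
    try:
        import torch
    except (ImportError, OSError):
        return available  # PyTorch missing or its native libraries failed to load

    gpu_visible = torch.cuda.is_available()
    # ROCm builds report AMD GPUs via torch.cuda and set torch.version.hip.
    is_rocm_build = getattr(torch.version, "hip", None) is not None
    available["rocm"] = gpu_visible and is_rocm_build
    available["cuda"] = gpu_visible and not is_rocm_build

    # torch.xpu and torch.backends.mps may be absent on older builds.
    xpu = getattr(torch, "xpu", None)
    available["xpu"] = bool(xpu is not None and xpu.is_available())
    mps = getattr(torch.backends, "mps", None)
    available["mps"] = bool(mps is not None and mps.is_available())
    return available


if __name__ == "__main__":
    for name, ok in detect_backends().items():
        print(f"{name}: {'available' if ok else 'not available'}")
```

If the backend you expect shows as not available, the likely cause is a driver or runtime mismatch, which is worth fixing before you trust any benchmark numbers.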

Testing Status

  • This codebase has not been tested extensively across the full hardware matrix.
  • The Apple MPS and NVIDIA CUDA paths have been tested to a reasonable extent and are currently better validated than the other accelerator paths.
  • The AMD ROCm and Intel GPU paths have received less development attention.
  • Even so, you should still treat all results with engineering caution and verify suspicious behavior on your own hardware.

Usage

Run the benchmark through either entrypoint:

python3 main.py

or

python3 -m benchmark_gpu

Examples:

python3 main.py --device auto
python3 main.py --device cuda --device-index 0
python3 main.py --device rocm --device-index 0
python3 main.py --device xpu --device-index 0
python3 main.py --device mps
python3 main.py --device cpu
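For contributors curious how flags like these could be parsed and validated, here is a minimal `argparse` sketch. It only mirrors the two flags shown in the examples above. The defaults (`auto` and `0`), the non-negative index check, and the helper names are assumptions for illustration, not the actual contents of benchmark_gpu/cli.py.

```python
import argparse
from typing import List, Optional

# Device names taken from the usage examples; the tuple name is illustrative.
DEVICE_CHOICES = ("auto", "cuda", "rocm", "xpu", "mps", "cpu")


def non_negative_int(text: str) -> int:
    """Parse a device index and reject negative values (assumed validation rule)."""
    value = int(text)
    if value < 0:
        raise argparse.ArgumentTypeError("device index must be >= 0")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="benchmark_gpu")
    # Defaults below are assumptions, not confirmed project behavior.
    parser.add_argument("--device", choices=DEVICE_CHOICES, default="auto")
    parser.add_argument("--device-index", type=non_negative_int, default=0)
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    return build_parser().parse_args(argv)
```

Restricting `--device` to a fixed set of choices means a typo such as `--device cuad` fails immediately with a usage message instead of silently falling back to a slower path.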

The benchmark collects repeated samples, looks for anomalous measurements, and writes a plain-text report to the results/ directory by default.
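One common way to look for anomalous measurements in a set of repeated samples is the modified z-score, which uses the median and the median absolute deviation (MAD) so that a single slow run does not distort the baseline. The sketch below illustrates that technique only; the function name `flag_anomalies` and the 3.5 threshold are assumptions, and the project may use a different method.

```python
import statistics
from typing import List


def flag_anomalies(samples: List[float], threshold: float = 3.5) -> List[int]:
    """Return indices of samples whose modified z-score exceeds `threshold`."""
    median = statistics.median(samples)
    # Median absolute deviation: a spread estimate that is robust to outliers.
    mad = statistics.median([abs(s - median) for s in samples])
    if mad == 0:
        # At least half the samples equal the median; flag anything that differs.
        return [i for i, s in enumerate(samples) if s != median]
    return [
        i
        for i, s in enumerate(samples)
        if 0.6745 * abs(s - median) / mad > threshold
    ]


if __name__ == "__main__":
    timings_ms = [10.0, 10.1, 9.9, 10.2, 10.0, 25.0]
    print(flag_anomalies(timings_ms))  # index of the 25.0 ms sample
```

A flagged sample is a prompt to investigate (thermal throttling, background activity, a driver hiccup), not proof that something is wrong.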

Community Results

You can browse shared benchmark submissions from me and the community in docs/results.md.

If you would like to add your own result, please open a pull request and append a new row to the table after running the benchmark on your hardware.

Project Layout

The codebase is intentionally modular so contributors can work on one subsystem without creating backend-specific spaghetti:

  • benchmark_gpu/backends/: backend adapters for CUDA, ROCm, Intel XPU, Apple MPS, and CPU
  • benchmark_gpu/benchmark/: benchmark execution and stability logic
  • benchmark_gpu/diagnostics/: lightweight interference checks
  • benchmark_gpu/reports/: plain-text reporting
  • benchmark_gpu/cli.py: CLI parsing and validation
  • benchmark_gpu/app.py: application orchestration

Contributing

If you see any issues, please feel free to:

  1. Open an issue, which I will use for testing.
  2. Or even better, send me a pull request with the fix, and I will review it ASAP.

Hardware diversity is the hardest part of a GPU benchmarking project, so real-world bug reports and fixes are extremely valuable.

Thank you for being part of this community-driven project.
