Add builtin data structure benchmark framework for booster #4139

Draft
Stevengre wants to merge 3 commits into master from benchmark-framework
@Stevengre

Adds a benchmark framework for profiling booster's builtin K data structure operations (KMap, KSet, KList), including a pipeline benchmark for matchMaps.

What's included:

  • Booster.Benchmark.Data / Booster.Benchmark.Ops: generate synthetic K terms and benchmark operations (lookup, insert, update, remove, union, intersection, etc.) at various sizes (10–50000)
  • booster-bench executable using tasty-bench with CSV output
  • Pipeline benchmarks: matchMaps, term ordering (Ord vs hash-first), substitution application, and a full single-rule rewrite pipeline
  • Unit tests validating generated data structures
  • scripts/run-bench-profile.sh for profiling with GHC RTS options
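
As an illustration of the "generate synthetic data at several sizes, then validate it" idea, here is a minimal sketch. It does not use booster's actual types: `Term`, `mkSyntheticMap`, and `isWellFormed` are hypothetical names, and `Data.Map.Strict` from `containers` stands in for booster's own KMap representation.

```haskell
module Main (main) where

import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for a K term; booster's real Term type is richer.
data Term = DomainValue Int | App String [Term]
  deriving (Eq, Ord, Show)

-- Build a synthetic map with n distinct keys.
-- Key i maps to a small application term wrapping i.
mkSyntheticMap :: Int -> Map.Map Term Term
mkSyntheticMap n =
  Map.fromList [(DomainValue i, App "val" [DomainValue i]) | i <- [0 .. n - 1]]

-- The kind of property a unit test on generated data might assert:
-- exactly n entries, every expected key present, no key past the end.
isWellFormed :: Int -> Map.Map Term Term -> Bool
isWellFormed n m =
  Map.size m == n
    && all (\i -> Map.member (DomainValue i) m) [0 .. n - 1]
    && not (Map.member (DomainValue n) m)

-- Sizes taken from the KMap results table in this PR.
sizes :: [Int]
sizes = [10, 100, 1000, 5000, 10000]

main :: IO ()
main = mapM_ report sizes
  where
    report n =
      putStrLn (show n ++ ": " ++ show (isWellFormed n (mkSyntheticMap n)))
```

Validating the generator separately from the timing code keeps a bug in the synthetic data from silently skewing the measurements.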

@Stevengre
Author

Stevengre commented Feb 25, 2026

Benchmark Results (GHC 9.6.5)

KMap Operations

| Size | lookup-existing | lookup-missing | size | insert | update | remove | keys | values | in_keys | sortAndDedup |
|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 85.9 ns | 79.7 ns | 151 ns | 1.47 μs | 1.43 μs | 1.30 μs | 758 ns | 734 ns | 159 ns | 2.65 μs |
| 100 | 340 ns | 522 ns | 231 ns | 12.7 μs | 13.1 μs | 12.7 μs | 6.73 μs | 7.22 μs | 1.13 μs | 28.2 μs |
| 1000 | 3.85 μs | 4.85 μs | 1.32 μs | 346 μs | 145 μs | 150 μs | 69.0 μs | 79.2 μs | 15.0 μs | 488 μs |
| 5000 | 13.9 μs | 29.7 μs | 12.6 μs | 2.12 ms | 798 μs | 759 μs | 359 μs | 370 μs | 56.7 μs | 2.23 ms |
| 10000 | 20.2 μs | 51.4 μs | 34.0 μs | 7.88 ms | 2.18 ms | 1.95 ms | 818 μs | 1.13 ms | 93.3 μs | 7.53 ms |

KSet Operations

| Size | in | size | difference | union | intersection | sortAndDedup |
|---|---|---|---|---|---|---|
| 10 | 40.8 ns | 13.6 ns | 1.25 μs | 1.82 μs | 1.13 μs | 1.79 μs |
| 100 | 286 ns | 82.2 ns | 10.5 μs | 14.3 μs | 10.5 μs | 20.8 μs |
| 1000 | 1.55 μs | 2.07 μs | 122 μs | 154 μs | 110 μs | 270 μs |
| 5000 | 19.7 μs | 8.69 μs | 596 μs | 966 μs | 578 μs | 1.59 ms |
| 10000 | 37.1 μs | 20.3 μs | 1.58 ms | 2.80 ms | 1.88 ms | 3.94 ms |

KList Operations

| Size | get-0 | get-middle | get-last | size | range | concat |
|---|---|---|---|---|---|---|
| 10 | 635 ns | 670 ns | 597 ns | 133 ns | 1.71 μs | 1.33 μs |
| 100 | 708 ns | 827 ns | 873 ns | 216 ns | 4.73 μs | 10.4 μs |
| 1000 | 1.70 μs | 2.30 μs | 3.23 μs | 1.21 μs | 31.7 μs | 108 μs |
| 5000 | 5.44 μs | 8.20 μs | 10.5 μs | 5.01 μs | 154 μs | 572 μs |
| 10000 | 10.8 μs | 14.8 μs | 21.2 μs | 11.3 μs | 361 μs | 1.41 ms |
| 50000 | 48.4 μs | 72.6 μs | 123 μs | 59.3 μs | | |

Pipeline Benchmarks

| Benchmark | Mean |
|---|---|
| matchMaps (size 10) | 6.99 μs |
| matchMaps (size 100) | 175 μs |
| matchMaps (size 1000) | 24.3 ms |
| ord-term derived (all sizes) | ~14.5 ns |
| ord-term hash-first (all sizes) | ~6.6 ns |
| substitution unchanged-keys (size 1000) | 708 μs |
| substitution changed-keys (size 1000) | 881 μs |
| substitution unchanged-keys (size 5000) | 4.97 ms |
| substitution changed-keys (size 5000) | 7.55 ms |
| full-single-rule-pipeline | 10.9 μs |
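
One way to read the matchMaps rows is to estimate an empirical scaling exponent between consecutive sizes. The sketch below assumes a power-law model t ≈ c·n^k, which is only an approximation; `scalingExponent` and `consecutiveExponents` are helper names introduced here for illustration and are not part of the PR.

```haskell
module Main (main) where

-- Empirical scaling exponent k between two measurements,
-- under the assumed model t ~ c * n^k.
scalingExponent :: (Double, Double) -> (Double, Double) -> Double
scalingExponent (n1, t1) (n2, t2) = log (t2 / t1) / log (n2 / n1)

-- matchMaps means from the pipeline results, converted to microseconds.
matchMapsTimings :: [(Double, Double)]
matchMapsTimings = [(10, 6.99), (100, 175), (1000, 24300)]

-- Exponent for each pair of consecutive (size, time) measurements.
consecutiveExponents :: [(Double, Double)] -> [Double]
consecutiveExponents ps = zipWith scalingExponent ps (drop 1 ps)

main :: IO ()
main = mapM_ print (consecutiveExponents matchMapsTimings)
```

With these numbers the exponent comes out at roughly 1.4 between sizes 10 and 100 and roughly 2.1 between sizes 100 and 1000, so the cost looks about quadratic at the larger sizes. Three data points are too few to say more than that.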

Key observations:

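  • matchMaps at size 1000 takes 24.3 ms, about 139x the size-100 time (175 μs) for a 10x larger input, which makes it a potential optimization target
  • Hash-first term comparison (~6.6 ns) is about 2x faster than derived Ord (~14.5 ns) across all sizes
  • KMap insert shows super-linear growth (likely due to normalization): 1.47 μs at size 10 versus 7.88 ms at size 10000, roughly 5400x for a 1000x larger map. Lookup scales well by comparison: lookup-existing grows only about 235x over the same range
  • Substitution with changed keys is ~1.5x slower than with unchanged keys at size 5000 (7.55 ms vs 4.97 ms)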
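
To make the "hash-first" comparison concrete, here is a minimal sketch of the general technique: cache a hash in each term and compare hashes before falling back to a structural comparison. This is not booster's implementation. The `Term` and `TermF` types, the smart constructors, and the toy hash function are all assumptions made for this example.

```haskell
module Main (main) where

import Data.List (foldl')
import Data.Ord (comparing)

-- Hypothetical term shape; booster's actual Term representation differs.
data TermF = Var String | App String [Term]
  deriving (Eq, Ord, Show)

-- A term paired with a hash computed once at construction time.
data Term = Term {termHash :: !Int, termBody :: TermF}
  deriving (Eq, Show)

-- Hash-first ordering: compare the cached hashes, and only walk the
-- term structure when the hashes are equal.
instance Ord Term where
  compare a b = comparing termHash a b <> comparing termBody a b

-- Toy polynomial string hash, for illustration only.
hashString :: String -> Int
hashString = foldl' (\h c -> h * 31 + fromEnum c) 7

-- Smart constructors keep the cached hash in sync with the body.
mkVar :: String -> Term
mkVar x = Term (hashString x) (Var x)

mkApp :: String -> [Term] -> Term
mkApp f args =
  Term (foldl' (\h a -> h * 31 + termHash a) (hashString f) args) (App f args)

main :: IO ()
main = do
  let a = mkApp "f" [mkVar "x"]
      b = mkApp "f" [mkVar "y"]
  print (compare a a) -- EQ
  print (a == b)      -- False
```

The resulting order is total and consistent with equality, which is what ordered sets and maps need, but it is not the structural order that a derived instance gives. Any code that depends on a particular sort order would observe the difference.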

@Stevengre Stevengre marked this pull request as draft February 25, 2026 13:37
@ehildenb
Member

@jberthold what do you think of this benchmark? It looks good to me, any changes/additions you would make to it? At least, having it as a test in the repo seems like a good thing.
