Add Builder for fast sorted fingerprint insertion #20

Open
lukesandberg wants to merge 1 commit into arthurprs:master from lukesandberg:lukesandberg/insert_sorted

@lukesandberg commented Feb 27, 2026

Summary

  • Adds a Builder type for constructing filters from fingerprints in sorted (non-decreasing) order. Each insertion is O(1) amortized via sequential append — no run boundary lookups, linear scans, or element shifting.
  • grow() and shrink_to_fit() internal paths now use Builder for reconstruction, improving their performance.
  • At roughly 95% occupancy and above, when slots wrap around, the builder delegates to Filter::insert_impl for correctness and simplicity.
  • Out-of-order insertion panics unconditionally, making the sorted contract explicit.
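
To illustrate why sorted input makes each insertion a plain append, here is a minimal toy sketch. It is not the PR's implementation: `ToyBuilder`, `push_sorted`, and the quotient/remainder split are hypothetical names and simplifications, and the metadata bits a real quotient filter keeps are omitted. The assumption is that a fingerprint's high bits pick a home slot and that sorted input lets a single cursor track the next free slot.

```rust
/// Toy model of sorted, append-only construction of a quotient-filter-like
/// table. Each slot stores only a remainder; a real filter also tracks
/// per-slot metadata bits, which this sketch leaves out.
pub struct ToyBuilder {
    rbits: u32,
    slots: Vec<Option<u64>>,
    next_free: usize,      // first slot that has not been written yet
    last_fp: Option<u64>,  // last accepted fingerprint, for the order check
}

impl ToyBuilder {
    /// `qbits` selects the table size (2^qbits slots); `rbits` is the
    /// remainder width. Both are assumed to be small (0 < rbits < 64).
    pub fn new(qbits: u32, rbits: u32) -> Self {
        ToyBuilder {
            rbits,
            slots: vec![None; 1usize << qbits],
            next_free: 0,
            last_fp: None,
        }
    }

    /// Appends one fingerprint. Panics if the input is out of order.
    /// Returns the slot written, or `None` when the write would run past
    /// the end of the table (the point where a fallback would take over).
    pub fn push_sorted(&mut self, fp: u64) -> Option<usize> {
        if let Some(prev) = self.last_fp {
            assert!(fp >= prev, "fingerprints must be inserted in non-decreasing order");
        }
        let quotient = (fp >> self.rbits) as usize;
        assert!(quotient < self.slots.len(), "fingerprint has too many bits");
        let remainder = fp & ((1u64 << self.rbits) - 1);

        // Sorted input means the target is either the home slot or the
        // slot right after the previous write: no scan, no shifting.
        let slot = quotient.max(self.next_free);
        if slot >= self.slots.len() {
            return None;
        }
        self.slots[slot] = Some(remainder);
        self.next_free = slot + 1;
        self.last_fp = Some(fp);
        Some(slot)
    }

    /// Read-only view of the table, for inspection.
    pub fn slots(&self) -> &[Option<u64>] {
        &self.slots
    }
}
```

With `ToyBuilder::new(3, 4)`, pushing `0x12`, `0x15`, `0x1f` fills slots 1, 2, 3 (all share home slot 1), and `0x20` then lands in slot 4 because its home slot 2 is already behind the cursor. Returning `None` at the end of the table only marks where the wrap-around case begins; how the PR handles it is described in the bullet above, not modelled here.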

API

let mut builder = qfilter::Builder::new(1000, 0.01).unwrap();
// Also: Builder::new_resizeable(), Builder::with_fingerprint_size()

let fp_size = builder.fingerprint_size();
let mut hashes: Vec<u64> = items.iter()
    .map(|i| qfilter::compute_fingerprint(i, fp_size))
    .collect();
hashes.sort();

for h in hashes {
    builder.insert_fingerprint(false, h).unwrap();
}
let filter = builder.into_filter();
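
The preparation step above (hash, truncate to the fingerprint size, sort) can be tried without the crate. The sketch below is a stand-in only: `toy_fingerprint` and `sorted_fingerprints` are hypothetical helpers, `DefaultHasher` is an assumption and not necessarily the hash qfilter uses, and the `u8` type for the fingerprint size is a guess.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for `qfilter::compute_fingerprint`: hash the item
/// with the standard library hasher and keep the low `fp_size` bits.
fn toy_fingerprint<T: Hash>(item: &T, fp_size: u8) -> u64 {
    assert!((1..=64).contains(&fp_size), "fp_size must be in 1..=64");
    let mut hasher = DefaultHasher::new();
    item.hash(&mut hasher);
    let h = hasher.finish();
    if fp_size == 64 {
        h
    } else {
        h & ((1u64 << fp_size) - 1)
    }
}

/// Computes all fingerprints up front and sorts them, so they can be fed
/// to a sorted-insertion API in non-decreasing order.
fn sorted_fingerprints<T: Hash>(items: &[T], fp_size: u8) -> Vec<u64> {
    let mut fps: Vec<u64> = items
        .iter()
        .map(|item| toy_fingerprint(item, fp_size))
        .collect();
    // Stability is irrelevant for plain integers, so the unstable sort is fine.
    fps.sort_unstable();
    fps
}
```

Duplicates are kept on purpose: the summary describes the required order as non-decreasing, so equal fingerprints may follow one another.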

Benchmarks (10k items, criterion change% vs baseline)

| Benchmark | Change | Notes |
|---|---|---|
| sorted_insert | ~92 µs | New; ~7x faster than grow (regular insert to fill) |
| grow_from_90pct | -11.5% | Internal rebuild now uses Builder |
| grow_resizeable | -16.7% | Multiple growth cycles, each rebuild faster |
| shrink | -11.5% | shrink_to_fit rebuild uses Builder |
| shrink_10pct | -11.8% | Same mechanism |
| grow | neutral | No growth triggered in this bench |

Test plan

  • cargo test — 35 unit tests + 6 doc-tests pass
  • cargo clippy — clean
  • cargo +nightly fuzz run fuzz_sorted_insert -- -max_total_time=120 — 636K iterations, 0 crashes (includes resizable growth paths)
  • cargo bench — no regressions, improvements in grow/shrink paths
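
The fuzz target itself is not shown in this description. As a rough analogue of the kind of differential check such a target could perform, the sketch below compares a general insert path (which shifts elements) against a sorted append path on random input. Everything here is hypothetical and crate-free: it uses a plain sorted `Vec<u64>` in place of a filter and a small xorshift generator in place of a fuzzer.

```rust
/// Small xorshift64 generator so the sketch needs no external crates.
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// General path: insert each value at its sorted position, shifting the tail.
fn build_by_insert(values: &[u64]) -> Vec<u64> {
    let mut out: Vec<u64> = Vec::new();
    for &v in values {
        let pos = out.partition_point(|&x| x <= v);
        out.insert(pos, v);
    }
    out
}

/// Fast path: the input is already sorted, so every value is appended.
/// Panics on out-of-order input, mirroring the sorted contract.
fn build_by_append(sorted: &[u64]) -> Vec<u64> {
    let mut out: Vec<u64> = Vec::with_capacity(sorted.len());
    for &v in sorted {
        if let Some(&last) = out.last() {
            assert!(v >= last, "input must be non-decreasing");
        }
        out.push(v);
    }
    out
}

/// Runs `rounds` random trials and reports whether both paths always agree.
fn differential_check(seed: u64, rounds: usize) -> bool {
    let mut state = seed.max(1); // xorshift must not start at zero
    for _ in 0..rounds {
        let len = (xorshift64(&mut state) % 64) as usize;
        let values: Vec<u64> = (0..len).map(|_| xorshift64(&mut state) % 256).collect();
        let mut sorted = values.clone();
        sorted.sort_unstable();
        if build_by_insert(&values) != build_by_append(&sorted) {
            return false;
        }
    }
    true
}
```

Values are drawn from a small range (0 to 255) so that duplicates occur often, which exercises the non-decreasing rather than strictly increasing case.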

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@lukesandberg force-pushed the lukesandberg/insert_sorted branch from e3faa51 to d7b2ab3 (February 27, 2026 18:09)
@lukesandberg marked this pull request as ready for review (February 27, 2026 23:11)
@arthurprs (Owner)

Thank you for your contributions. I skimmed through it and it looks good. I'll find more time in the next few days to review and merge.

@lukesandberg (Author)

Thanks! We are excited about getting these changes in, along with some of your recent perf improvements. They appear to deliver some pretty dramatic perf wins for our use case in turbopack.
