fix(fzd): deduplicate batch inputs before calling fzr #61

Open
yannrichet-tmp wants to merge 1 commit into Funz:main from yannrichet-tmp:fix/fzd-deduplicate-batch-inputs

@yannrichet-tmp
Summary

  • Deduplicate points in fzd batches before passing them to fzr: when a design-of-experiments algorithm proposes duplicate points in a single batch, each unique point is now evaluated only once and results are reused for duplicates. This avoids redundant computations and prevents silent result overwrites (since duplicate rows would map to the same temp directory).
  • Reject duplicate rows in fzr input: fzr now raises a clear ValueError if its input_variables DataFrame contains duplicate rows, catching the issue early with an informative message.
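As an illustration of the second bullet, here is a minimal sketch of a duplicate-row check on a pandas DataFrame. The helper name `check_no_duplicate_rows`, the error wording, and the example bounds are assumptions made for this example; the check in fzr itself may be written differently.

```python
import pandas as pd


def check_no_duplicate_rows(input_variables: pd.DataFrame) -> None:
    """Raise ValueError if any row of input_variables appears more than once.

    Hypothetical helper for illustration; not taken from the fzr source.
    """
    # keep=False flags every member of a duplicated group, not only the repeats,
    # so the error message can show all offending rows.
    dup_mask = input_variables.duplicated(keep=False)
    if dup_mask.any():
        dup_rows = input_variables[dup_mask]
        raise ValueError(
            f"input_variables contains {len(dup_rows)} rows that are not unique:\n"
            f"{dup_rows.to_string()}"
        )


# A [min, max, max] style design with made-up bounds 0.0 and 1.0.
design = pd.DataFrame({"x": [0.0, 1.0, 1.0]})
try:
    check_no_duplicate_rows(design)
    rejected = False
except ValueError:
    rejected = True
```

Failing before any case is launched means the caller sees the offending rows directly, rather than discovering later that two cases shared a directory.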

Test plan

  • Verify fzr raises ValueError when given a DataFrame with duplicate rows
  • Verify fzd correctly deduplicates batch points and re-maps results back to the full design
  • Run existing test suite to confirm no regressions

🤖 Generated with Claude Code

Algorithms like brent submit batches containing duplicate parameter
values (e.g. the initial [min, max, max] design).  fzr would fail
because two entries map to the same temp-directory name.

Before calling fzr, build a unique_design list and an index_map that
records which slot in unique_design each original entry corresponds to.
After fzr returns, reconstruct result_df to the full original length by
re-indexing with iloc[index_map], so the algorithm sees the expected
number of results including the duplicated ones.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
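
The deduplicate-then-expand step described in the commit message can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code from the commit: `deduplicate_design` and `fake_evaluate` are hypothetical names, points are modelled as dicts with hashable values, and `fake_evaluate` stands in for the real fzr call.

```python
from typing import Any, Dict, List, Tuple

Point = Dict[str, Any]


def deduplicate_design(design: List[Point]) -> Tuple[List[Point], List[int]]:
    """Return (unique_design, index_map) for a batch of points.

    index_map[i] is the position in unique_design of the i-th original point.
    Hypothetical helper; assumes every parameter value is hashable.
    """
    unique_design: List[Point] = []
    index_map: List[int] = []
    seen: Dict[tuple, int] = {}
    for point in design:
        # Sort items so that key order inside the dict does not matter.
        key = tuple(sorted(point.items()))
        if key not in seen:
            seen[key] = len(unique_design)
            unique_design.append(point)
        index_map.append(seen[key])
    return unique_design, index_map


def fake_evaluate(points: List[Point]) -> List[float]:
    """Stand-in for fzr: one result per unique point (here, x squared)."""
    return [p["x"] ** 2 for p in points]


# A [min, max, max] batch with made-up bounds 0.0 and 2.0.
design = [{"x": 0.0}, {"x": 2.0}, {"x": 2.0}]
unique_design, index_map = deduplicate_design(design)
unique_results = fake_evaluate(unique_design)

# Expand back to the original batch length, duplicates included.
full_results = [unique_results[i] for i in index_map]
```

When the results come back as a DataFrame, the last step corresponds to the `iloc[index_map]` re-indexing mentioned in the commit message.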

2 participants