Hi,
I’m currently testing Qwen2.5-Coder-32B-Instruct and trying to compute pass@10, but the evaluation stage appears to run almost indefinitely, particularly during the test execution phase.
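For reference, this assumes pass@10 means the standard unbiased pass@k estimator introduced with HumanEval. Below is a minimal sketch of that estimator. The function name `pass_at_k` is illustrative and not taken from this repository's evaluator.

```python
def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated for the problem
    c: number of those samples that passed all tests
    k: the k in pass@k (k <= n)
    """
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    # 1 - C(n - c, k) / C(n, k), computed as a running product for stability.
    fail_prob = 1.0
    for i in range(n - c + 1, n + 1):
        fail_prob *= 1.0 - k / i
    return 1.0 - fail_prob


# Example: 20 samples, 1 correct, k = 10
print(pass_at_k(20, 1, 10))
```

Under this definition every problem needs at least 10 executed samples, so the number of test executions is at least 10 times the number of problems. That may account for part of the runtime even without any worker issue.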
In addition, I’m encountering OOM issues even on fairly large CPU clusters. From the behavior, it seems possible that worker processes are recursively creating more workers, which may be contributing to both the extremely long runtime and the memory blow-up.
I’m wondering:
- Is this a known issue with the evaluator?
- Are there recommended settings for pass@10 evaluation to avoid runaway execution?
- Should the number of workers be restricted manually, or is there a safer evaluation mode for large-scale runs?
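To make the last question concrete, here is a rough sketch of the kind of bounded execution I have in mind: a fixed worker cap plus a hard per-sample timeout. It is not the evaluator's actual code. The names `run_candidate` and `evaluate` and the default values are hypothetical, and it uses only the Python standard library.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4   # hypothetical cap; tune to the machine
TIMEOUT_S = 5.0   # hypothetical per-sample wall-clock limit


def run_candidate(source: str, timeout_s: float = TIMEOUT_S) -> str:
    """Run one candidate in a fresh interpreter; return 'pass', 'fail', or 'timeout'."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "pass" if proc.returncode == 0 else "fail"


def evaluate(sources, max_workers=MAX_WORKERS, timeout_s=TIMEOUT_S):
    """Evaluate all candidates with at most max_workers running at once."""
    # Threads only wait on child processes, so the cap bounds live subprocesses.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda s: run_candidate(s, timeout_s), sources))


if __name__ == "__main__":
    samples = ["assert 1 + 1 == 2", "assert 1 + 1 == 3", "while True: pass"]
    print(evaluate(samples, max_workers=2, timeout_s=1.0))
```

This only bounds direct children. If generated code spawns its own processes, those grandchildren are not covered by the timeout, and no memory limit is applied here. Both would need extra handling.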
Any advice would be appreciated. Thanks.