Test Qwen2.5-Coder-32B-Instruct pass@10 #120

@hyperplane2021-hub

Hi,

I’m currently testing Qwen2.5-Coder-32B-Instruct and trying to compute pass@10, but the evaluation stage runs for so long that it effectively never finishes, with most of the time spent in the test execution phase.
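
For reference, by pass@10 I mean the usual unbiased estimator computed from n ≥ 10 generated samples per problem, which is why the execution workload is at least ten times that of pass@1. The sketch below shows the calculation I have in mind; I am assuming (not asserting) that the evaluator reports the same quantity, and the function name `pass_at_k` is my own.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: 1 - C(n - c, k) / C(n, k).

    n: number of samples generated for the problem
    c: number of those samples that passed all tests
    k: the k in pass@k (must satisfy 1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # math.comb returns 0 when k > n - c, so the estimate becomes 1.0
    # whenever fewer than k samples failed.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

The benchmark score would then be the mean of this value over all problems.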

In addition, I’m encountering OOM issues even on fairly large CPU clusters. From the behavior, it seems possible that worker processes are recursively creating more workers, which may be contributing to both the extremely long runtime and the memory blow-up.

I’m wondering:

  • Is this a known issue with the evaluator?
  • Are there recommended settings for pass@10 evaluation to avoid runaway execution?
  • Should the number of workers be restricted manually, or is there a safer evaluation mode for large-scale runs?
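
To make the last question concrete, here is a minimal sketch of the kind of manual restriction I mean. It is not the evaluator's code: `run_sample`, `run_all`, `MAX_WORKERS`, and `TIMEOUT_S` are hypothetical names and values, and it assumes each sample can be run as a standalone Python program. A fixed-size thread pool launches one fresh interpreter per sample, so the number of concurrent child processes is capped and each one is killed after a timeout.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4    # hypothetical cap on concurrent test processes
TIMEOUT_S = 10.0   # hypothetical per-sample wall-clock limit, in seconds


def run_sample(code: str, timeout_s: float = TIMEOUT_S) -> bool:
    """Run one candidate program in a fresh interpreter.

    Returns True only if the process exits with status 0 within the timeout.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,  # the child is killed if it exceeds this
        )
    except subprocess.TimeoutExpired:
        return False
    return proc.returncode == 0


def run_all(samples: list[str], max_workers: int = MAX_WORKERS) -> list[bool]:
    """Run every sample with at most max_workers child processes alive at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_sample, samples))
```

This only bounds concurrency and wall-clock time. It does not limit memory per process, and a sample that spawns its own children could still outlive the timeout, so it would not fully address the OOM behavior on its own.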

Any advice would be appreciated. Thanks.
