Evaluation of custom datasets with sb-cli #19

I was following the docs here to evaluate a custom dataset by swapping out the subset:

> As long as it follows the SWE-bench format, you can use `--subset /path/to/your/dataset` to run on a custom dataset. The dataset needs to be loadable as `datasets.load_dataset(path, split=split)`.

e.g. running something like:

`sb-cli submit my-dataset-path test --predictions_path preds.json --run_id some-id-for-your-run`

The command above fails with an invalid subset error. Does evaluation on custom datasets only work locally?
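
For reference, here is a minimal sketch of how the loadability requirement quoted from the docs could be checked locally before submitting. It only exercises `datasets.load_dataset(path, split=split)` and looks for a few columns; it says nothing about whether sb-cli accepts the path as a subset. The field list in `ASSUMED_FIELDS` is an assumption based on the public SWE-bench datasets, and `missing_fields` and `check_dataset` are hypothetical helper names, not part of sb-cli or mini-swe-agent.

```python
from typing import Dict, Iterable, List, Mapping, Sequence

# Assumption: a small subset of the columns found in the public SWE-bench
# datasets. The full SWE-bench format has more columns than this.
ASSUMED_FIELDS = ("instance_id", "repo", "base_commit", "problem_statement")


def missing_fields(
    rows: Iterable[Mapping],
    required: Sequence[str] = ASSUMED_FIELDS,
) -> Dict[int, List[str]]:
    """Return {row index: missing field names} for rows lacking any field."""
    problems: Dict[int, List[str]] = {}
    for index, row in enumerate(rows):
        missing = [name for name in required if name not in row]
        if missing:
            problems[index] = missing
    return problems


def check_dataset(path: str, split: str = "test") -> Dict[int, List[str]]:
    """Load the dataset the way the docs describe, then check its columns."""
    # Imported lazily so missing_fields() works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset(path, split=split)
    return missing_fields(dataset)
```

Calling `check_dataset("my-dataset-path", "test")` returns an empty dict when every instance has the assumed fields, which would at least confirm the dataset itself loads as the docs require.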

Relevant docs: https://mini-swe-agent.com/latest/usage/swebench/#__tabbed_2_2

Thanks for your help!
