Open
Description
I was following the docs here to evaluate a custom dataset by swapping out the subset:
As long as it follows the SWE-bench format, you can use --subset /path/to/your/dataset to run on a custom dataset. The dataset needs to be loadable as datasets.load_dataset(path, split=split).
e.g. running something like:
sb-cli submit my-dataset-path test --predictions_path preds.json --run_id some-id-for-your-run
The command above fails with an invalid subset error. Does evaluation on a custom dataset only work locally?
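To rule out a problem with the dataset itself, a local check along these lines might help. This is only a minimal sketch: the `REQUIRED_FIELDS` list is my assumption based on the columns of the public SWE-bench dataset, and `missing_fields` and `check_dataset` are hypothetical helper names, not part of sb-cli or mini-swe-agent. The only library call used is `datasets.load_dataset(path, split=split)`, which is the one the docs mention.

```python
# Assumed required columns, based on the public SWE-bench dataset.
# Adjust this tuple if your evaluation needs more (or fewer) fields.
REQUIRED_FIELDS = ("instance_id", "repo", "base_commit", "problem_statement")


def missing_fields(rows, required=REQUIRED_FIELDS):
    """Map each offending row index to the list of required fields it lacks."""
    problems = {}
    for i, row in enumerate(rows):
        missing = [field for field in required if field not in row]
        if missing:
            problems[i] = missing
    return problems


def check_dataset(path, split="test"):
    """Load the dataset the way the docs describe and report missing fields."""
    # Imported lazily so missing_fields() works without `datasets` installed.
    from datasets import load_dataset

    ds = load_dataset(path, split=split)
    return missing_fields(ds)
```

Calling `check_dataset("my-dataset-path", "test")` returning an empty dict would suggest the dataset loads and has the assumed columns, so the invalid subset error would be coming from the submit side rather than from the data.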
Relevant docs: https://mini-swe-agent.com/latest/usage/swebench/#__tabbed_2_2
Thanks for your help!