Fix GRPO Reasoning Advanced Reward Tutorial by pramodith · Pull Request #331 · huggingface/cookbook

pramodith · 2025-09-23T16:56:47Z

What does this PR do?

Fixes the notebook by setting `bf16=False`, because the model and LoRA weights are configured to load in bf16.
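For readers without the notebook open, here is a minimal sketch of what the change amounts to. Only the `bf16` flag comes from this PR. The other keys and values are hypothetical placeholders, and the use of TRL's `GRPOConfig` is an assumption based on the notebook's name, not something confirmed in this thread.

```python
# Hypothetical training arguments; only the `bf16` flag comes from this PR.
# The other keys and values are placeholders, not the notebook's real settings.
training_kwargs = {
    "output_dir": "grpo-output",  # placeholder path
    "learning_rate": 1e-5,        # placeholder value
    "bf16": False,                # the fix: explicitly disable bf16 in the training config
}

# With TRL installed, these would typically be unpacked into the config
# (assumption: the notebook builds its config this way):
#   from trl import GRPOConfig
#   config = GRPOConfig(**training_kwargs)
print(training_kwargs["bf16"])
```

Passing the flag explicitly keeps the setting visible in the notebook rather than relying on whatever default the installed library version uses.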

Fixes # (issue)

Who can review?

Feel free to tag members/contributors who may be interested in your PR.
@stevhliu

review-notebook-app · 2025-09-23T16:56:52Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

stevhliu · 2025-09-23T17:28:18Z

Pinging @behroozazarkhalili who contributed this notebook

behroozazarkhalili · 2025-09-24T15:12:02Z

> Pinging @behroozazarkhalili who contributed this notebook

Hi,
I'll correct it and send a new pull request this weekend, as I'm approaching the ACL deadline.

pramodith added 3 commits September 23, 2025 17:53

Update trl_grpo_reasoning_advanced_reward.ipynb

d430359

Update trl_grpo_reasoning_advanced_reward.ipynb

1bec3c8

Set bf16=False

Update trl_grpo_reasoning_advanced_reward.ipynb

984de4a

add comma

pramodith changed the title ~~Pramodith patch 1~~ Fix GRPO Reasoning Advanced Reward Tutorial Sep 23, 2025

pramodith wants to merge 3 commits into huggingface:main from pramodith:pramodith-patch-1