Curriculum Group Policy Optimization: Adaptive Sampling for Unleashing the Potential of Text-to-Image Generation
The paper will be released later.
Clone this repository and install the packages:

```bash
git clone https://github.com/baoteng-li/CGPO.git
cd CGPO
conda create -n cgpo python=3.10.16
conda activate cgpo
pip install -e .
```

We adopt the same reward-model handling approach as Flow-GRPO. Since each reward model may depend on different package versions, installing them all in a single Conda environment can cause version conflicts. To avoid this, we use a remote-server setup inspired by ddpo-pytorch: you only need to install the specific reward models you plan to use. For details, please refer to Flow-GRPO.
Please create a new Conda virtual environment and install the corresponding dependencies according to the instructions in reward-server.
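As an illustration of how training code might query such a remote reward server, here is a minimal sketch. The port, endpoint path, payload keys, and response schema below are assumptions for illustration only; the actual interface is defined by the reward-server code.

```python
# Minimal sketch of querying a remote reward server over HTTP.
# The port, endpoint path, payload format, and response schema are
# illustrative assumptions; see reward-server for the real interface.
import base64
import io

import requests
from PIL import Image


def query_reward(image: Image.Image, prompt: str,
                 url: str = "http://127.0.0.1:18085/reward") -> float:
    # Encode the generated image as base64 PNG for transport.
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    payload = {
        "image": base64.b64encode(buf.getvalue()).decode("utf-8"),
        "prompt": prompt,
    }
    resp = requests.post(url, json=payload, timeout=60)
    resp.raise_for_status()
    return float(resp.json()["reward"])  # assumed response field
```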
Please install PaddleOCR:

```bash
pip install paddlepaddle-gpu==2.6.2
pip install paddleocr==2.9.1
pip install python-Levenshtein
```

Then, pre-download the model weights from the Python command line:

```python
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=False, lang="en", use_gpu=False, show_log=False)
```

PickScore requires no additional installation.
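For reference, here is a minimal sketch of an OCR-based text reward built on the PaddleOCR setup above: it scores an image by one minus the normalized Levenshtein distance between the target text and what PaddleOCR reads back. The function name and exact reward formula are illustrative assumptions; see the Flow-GRPO reward code for the actual implementation.

```python
# Illustrative OCR text reward (an assumption, not the shipped implementation):
# 1 minus the normalized Levenshtein distance between the target text and
# the text PaddleOCR recognizes in the generated image.
import Levenshtein
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=False, lang="en", use_gpu=False, show_log=False)


def ocr_text_reward(image_path: str, target_text: str) -> float:
    result = ocr.ocr(image_path, cls=False)
    lines = result[0] or []  # each entry: [box, (text, confidence)]
    recognized = " ".join(entry[1][0] for entry in lines)
    dist = Levenshtein.distance(recognized.lower(), target_text.lower())
    return max(0.0, 1.0 - dist / max(len(target_text), 1))
```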
Single-node training:

```bash
bash scripts/single_node/cgpo.sh
```

You can adjust the hyperparameters in config/grpo.py. An empirical finding is that `config.sample.train_batch_size * num_gpu / config.sample.num_image_per_prompt * config.sample.num_batches_per_epoch = 48`, i.e., group_number = 48 with group_size = 24.

Additionally, setting `config.train.gradient_accumulation_steps = config.sample.num_batches_per_epoch // 2` also yields good performance.
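To make the two relations above concrete, the snippet below plugs in one illustrative configuration (the values are examples, not the shipped defaults) and checks that it satisfies group_number = 48 with group_size = 24:

```python
# Illustrative values only; they mirror the fields in config/grpo.py.
num_gpu = 8
train_batch_size = 12        # config.sample.train_batch_size
num_image_per_prompt = 24    # config.sample.num_image_per_prompt (group_size)
num_batches_per_epoch = 12   # config.sample.num_batches_per_epoch

# group_number = train_batch_size * num_gpu / num_image_per_prompt
#                * num_batches_per_epoch
group_number = train_batch_size * num_gpu // num_image_per_prompt * num_batches_per_epoch
assert group_number == 48  # 12 * 8 / 24 * 12 = 48

# Gradient-accumulation tip from above:
gradient_accumulation_steps = num_batches_per_epoch // 2  # = 6
```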
This project is built on Flow-GRPO; we thank its authors for their outstanding contributions to the community.