CarperAI is doing great work lowering the barrier to RLHF training (i.e., training ChatGPT-like models). The latest release of their trlX library includes this great example showing how to train RLHF models at scale with an open-source dataset!