Welcome to trlX’s documentation!

trlX is a library for training large language models with reinforcement learning. It supports two RL algorithms: PPO (Schulman et al., 2017) for online training and ILQL (Snell et al., 2022) for offline training. For distributed training, two backends are supported: Hugging Face 🤗 Accelerate and NVIDIA NeMo.
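To make the online/offline distinction concrete, here is a minimal sketch of how the two modes might be invoked. It assumes the `trlx.train` entry point accepts `reward_fn` for online training and `samples` with `rewards` for offline training, as in the project's README examples; the exact signature may differ between versions. The helper names (`reward_fn`, `train_online`, `train_offline`), the toy reward, and the sample data are illustrative and not part of trlX.

```python
from typing import List


def reward_fn(samples: List[str], **kwargs) -> List[float]:
    """Toy reward (illustrative): count occurrences of 'cats' in each sample."""
    return [float(sample.count("cats")) for sample in samples]


# Hypothetical offline dataset: text samples paired with scalar rewards.
OFFLINE_SAMPLES = ["dolphins", "geese"]
OFFLINE_REWARDS = [1.0, 100.0]


def train_online(model_name: str = "gpt2"):
    """Online training: the model generates samples, which reward_fn scores."""
    import trlx  # imported lazily so this file loads without trlX installed

    # Assumed call shape, based on the trlX README.
    return trlx.train(model_name, reward_fn=reward_fn)


def train_offline(model_name: str = "gpt2"):
    """Offline training: learn from a fixed dataset of already-rewarded samples."""
    import trlx  # imported lazily so this file loads without trlX installed

    # Assumed call shape, based on the trlX README.
    return trlx.train(model_name, samples=OFFLINE_SAMPLES, rewards=OFFLINE_REWARDS)
```

In the online case the only training signal is the reward function, so no dataset is passed; in the offline case there is no reward function, only samples and their precomputed rewards.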