Orchestrators
Orchestrators manage reading data from a pipeline and creating RL data elements (i.e. trlx.data.RLElement) to push to a model’s rollout storage. Use the trlx.orchestrator.register_orchestrator decorator when creating new orchestrators.
General
- class trlx.orchestrator.Orchestrator(pipeline: trlx.pipeline.BasePipeline, rl_model: trlx.model.BaseRLModel)
PPO
- class trlx.orchestrator.ppo_orchestrator.PPOOrchestrator(model: trlx.model.BaseRLModel, pipeline: trlx.pipeline.BasePipeline, reward_fn: Callable, metric_fn: Optional[Callable] = None, chunk_size: int = 512)
Orchestrator that prepares data for PPO training: it transforms samples from the pipeline into PPOBatch objects and pushes them into the model’s store.
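The overall data flow (read prompts from a pipeline, generate responses, score them with a reward function, push the results into a store) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the PPOOrchestrator implementation: `ElementSketch`, `StoreSketch`, `make_experience_sketch`, and the `generate` callable are hypothetical, the reward function is assumed to take a list of text samples and return one score per sample, and `chunk_size` is assumed to mean the number of prompts processed per step. The sketch also omits any PPO-specific quantities that the real PPOBatch carries.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence


@dataclass
class ElementSketch:
    """Hypothetical stand-in for one rollout element."""
    query: str
    response: str
    reward: float


class StoreSketch:
    """Hypothetical stand-in for a model's rollout storage."""

    def __init__(self) -> None:
        self.history: List[ElementSketch] = []

    def push(self, elements: Iterable[ElementSketch]) -> None:
        self.history.extend(elements)


def make_experience_sketch(
    prompts: Iterable[str],
    generate: Callable[[str], str],
    reward_fn: Callable[[List[str]], Sequence[float]],
    store: StoreSketch,
    chunk_size: int = 512,
) -> int:
    """Process prompts in chunks, score each sample, and push elements to the store.

    Returns the total number of elements held by the store afterwards.
    """
    prompts = list(prompts)
    for start in range(0, len(prompts), chunk_size):
        chunk = prompts[start:start + chunk_size]
        responses = [generate(p) for p in chunk]
        # Assumption: the reward function scores the full sample (prompt + response).
        rewards = reward_fn([p + r for p, r in zip(chunk, responses)])
        store.push(
            ElementSketch(query=q, response=r, reward=float(s))
            for q, r, s in zip(chunk, responses, rewards)
        )
    return len(store.history)


# Toy usage: "generation" uppercases the prompt, "reward" is the sample length.
store = StoreSketch()
total = make_experience_sketch(
    ["a", "bb", "ccc"],
    generate=lambda p: p.upper(),
    reward_fn=lambda samples: [len(s) for s in samples],
    store=store,
    chunk_size=2,
)
```

Processing in chunks keeps each generation and scoring step bounded in size, which is one plausible reason for exposing a `chunk_size` parameter.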
ILQL