Pipelines

Pipelines are how trlX reads from a dataset. Rollout stores hold the experiences that the orchestrator generates for a model, and the model is trained on the experiences in its rollout store.

General

class trlx.pipeline.BasePipeline(path: str = 'dataset')

abstract create_loader(batch_size: int, shuffle: bool, prep_fn: Optional[Callable] = None, num_workers: int = 0) → torch.utils.data.dataloader.DataLoader

Create a dataloader for the pipeline

Parameters: prep_fn – Typically a tokenizer. Applied to GeneralElement after collation.

class trlx.pipeline.BaseRolloutStore(capacity=-1)

abstract create_loader(batch_size: int, shuffle: bool, prep_fn: Optional[Callable] = None, num_workers: int = 0) → torch.utils.data.dataloader.DataLoader

Create a dataloader for the rollout store

Parameters: prep_fn (Callable) – Applied to RLElement after collation (typically tokenizer)

abstract push(exps: Iterable[Any]): Push experiences to rollout storage
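The push/create_loader contract above can be illustrated with a small, library-free sketch. `ToyBaseStore` and `ToyRolloutStore` are hypothetical names, not trlX classes, and the list of batches returned here is a simplification: the documented methods return a `torch.utils.data.DataLoader`. The `capacity` and `num_workers` arguments are left out on purpose.

```python
import random
from abc import ABC, abstractmethod
from typing import Any, Callable, Iterable, List, Optional


class ToyBaseStore(ABC):
    """Hypothetical stand-in mirroring the two abstract methods documented above."""

    def __init__(self) -> None:
        self.history: List[Any] = []

    @abstractmethod
    def push(self, exps: Iterable[Any]) -> None:
        """Add experiences to the store."""

    @abstractmethod
    def create_loader(
        self, batch_size: int, shuffle: bool, prep_fn: Optional[Callable] = None
    ) -> List[Any]:
        """Return the stored experiences grouped into batches."""


class ToyRolloutStore(ToyBaseStore):
    def push(self, exps: Iterable[Any]) -> None:
        # Append the new experiences to whatever is already stored.
        self.history.extend(exps)

    def create_loader(self, batch_size, shuffle, prep_fn=None):
        items = list(self.history)
        if shuffle:
            random.shuffle(items)
        # "Collation" here is just slicing the list into fixed-size chunks.
        batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
        # Assumption: prep_fn runs on each batch after collation, as the
        # parameter description above suggests.
        if prep_fn is not None:
            batches = [prep_fn(batch) for batch in batches]
        return batches


store = ToyRolloutStore()
store.push(["exp-a", "exp-b", "exp-c"])
store.push(["exp-d", "exp-e"])
batches = store.create_loader(batch_size=2, shuffle=False)
```

With five stored experiences and `batch_size=2`, the last batch holds a single element. A real `DataLoader` exposes the same behaviour unless `drop_last=True` is set.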

PPO

class trlx.pipeline.ppo_pipeline.PPORolloutStorage(pad_token_id)

Rollout storage for training PPO

create_loader(batch_size: int, shuffle: bool) → torch.utils.data.dataloader.DataLoader

Create a dataloader for the rollout store

Parameters: prep_fn (Callable) – Applied to RLElement after collation (typically tokenizer). This description comes from the base class; prep_fn does not appear in this method's signature.

push(exps: Iterable[trlx.data.ppo_types.PPORLElement]): Push experiences to rollout storage

ILQL

class trlx.pipeline.offline_pipeline.PromptPipeline(prompts, tokenizer=None)

Tokenizes texts, and then pads them into batches

create_loader(batch_size: int, shuffle=False) → torch.utils.data.dataloader.DataLoader

Create a dataloader for the pipeline

Parameters: prep_fn – Typically a tokenizer. Applied to GeneralElement after collation. This description comes from the base class; prep_fn does not appear in this method's signature.
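The "tokenize, then pad into batches" behaviour described for PromptPipeline can be sketched without any dependencies. Everything below is illustrative: `toy_tokenize`, `pad_batch`, and `make_batches` are hypothetical helpers, the whitespace tokenizer stands in for a real one, and right-padding with id 0 is an assumption rather than a statement about what trlX does.

```python
from typing import Dict, List


def toy_tokenize(text: str, vocab: Dict[str, int]) -> List[int]:
    """Whitespace tokenizer that assigns ids on first sight (0 is reserved for padding)."""
    ids = []
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab) + 1
        ids.append(vocab[word])
    return ids


def pad_batch(seqs: List[List[int]], pad_id: int = 0) -> Dict[str, List[List[int]]]:
    """Right-pad every sequence to the longest one in the batch and build a mask."""
    max_len = max(len(s) for s in seqs)
    input_ids = [s + [pad_id] * (max_len - len(s)) for s in seqs]
    # 1 marks a real token, 0 marks padding.
    attention_mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in seqs]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


def make_batches(prompts: List[str], batch_size: int) -> List[Dict[str, List[List[int]]]]:
    """Tokenize all prompts, then pad them batch by batch."""
    vocab: Dict[str, int] = {}
    tokenized = [toy_tokenize(p, vocab) for p in prompts]
    return [
        pad_batch(tokenized[i:i + batch_size])
        for i in range(0, len(tokenized), batch_size)
    ]


prompts = ["hello world", "hello", "the quick brown fox"]
batches = make_batches(prompts, batch_size=2)
```

Padding per batch rather than to a global maximum keeps short batches short, which is the usual reason to pad inside the collation step.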

class trlx.pipeline.offline_pipeline.ILQLRolloutStorage(input_ids, attention_mask, rewards, states_ixs, actions_ixs, dones)

Rollout storage for training ILQL

create_loader(batch_size: int)

Create a dataloader for the rollout store

Parameters: prep_fn (Callable) – Applied to RLElement after collation (typically tokenizer). This description comes from the base class; prep_fn does not appear in this method's signature.
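The ILQLRolloutStorage constructor takes six fields. One way to picture such a store is as six parallel per-sample sequences, sketched below. This is an assumption for illustration only: `ToyILQLElement` and `ToyILQLStore` are hypothetical, the sample values are made up, and the real class returns a PyTorch dataloader rather than a list of batches.

```python
from typing import List, NamedTuple, Sequence


class ToyILQLElement(NamedTuple):
    """One sample, with a field for each constructor argument."""
    input_ids: Sequence[int]
    attention_mask: Sequence[int]
    rewards: Sequence[float]
    states_ixs: Sequence[int]
    actions_ixs: Sequence[int]
    dones: Sequence[int]


class ToyILQLStore:
    def __init__(self, input_ids, attention_mask, rewards, states_ixs, actions_ixs, dones):
        fields = [input_ids, attention_mask, rewards, states_ixs, actions_ixs, dones]
        # Assumption: each field carries exactly one entry per sample.
        if len({len(f) for f in fields}) != 1:
            raise ValueError("all fields must contain one entry per sample")
        # Regroup the six parallel lists into one element per sample.
        self.elements: List[ToyILQLElement] = [ToyILQLElement(*row) for row in zip(*fields)]

    def __len__(self) -> int:
        return len(self.elements)

    def __getitem__(self, ix: int) -> ToyILQLElement:
        return self.elements[ix]

    def create_loader(self, batch_size: int) -> List[List[ToyILQLElement]]:
        # Simplified stand-in for a dataloader: fixed-size slices, no shuffling.
        return [self.elements[i:i + batch_size] for i in range(0, len(self), batch_size)]


# Made-up values for two samples; shapes are illustrative only.
store = ToyILQLStore(
    input_ids=[[5, 6, 7], [8, 9]],
    attention_mask=[[1, 1, 1], [1, 1]],
    rewards=[[0.0, 1.0], [0.5]],
    states_ixs=[[0, 1, 2], [0, 1]],
    actions_ixs=[[0, 1], [0]],
    dones=[[1, 1, 0], [1, 0]],
)
loader = store.create_loader(batch_size=1)
```

Validating the field lengths up front turns a silent misalignment between, say, rewards and input ids into an immediate error.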