tinker_cookbook.rl.EnvGroupBuilder
class tinker_cookbook.rl.EnvGroupBuilder(ABC)
Abstract base class that builds a group of environments. The group is used through the methods described below.
make_envs()
Create the environments for this group.
Returns: Sequence[Env]: The environments to run rollouts in.
compute_group_rewards(trajectory_group, env_group)
Compute a final reward for each trajectory that depends on the whole group.
Parameters:
- trajectory_group (list[Trajectory]) – The completed trajectories, one per environment in the group.
- env_group (Sequence[Env]) – The corresponding environments, in the same order as trajectory_group.
Returns: list[tuple[float, Metrics]]: A list of (reward, metrics) pairs, one per trajectory. The reward is added to the per-timestep total; the metrics dict is merged into training logs.
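As a rough illustration of a reward that depends on the whole group, the sketch below gives 1.0 to the best-scoring trajectories and 0.0 to the rest. It does not import tinker_cookbook: the Trajectory dataclass, its score field, and the Metrics alias are stand-ins invented for this example, and the real method may be declared differently (for instance as a coroutine on the builder).

```python
from dataclasses import dataclass
from typing import Any, Sequence

# Stand-in alias; the library's real Metrics type may differ.
Metrics = dict[str, float]


@dataclass
class Trajectory:
    """Stand-in for the library's Trajectory; `score` is a hypothetical field."""
    score: float


def compute_group_rewards(
    trajectory_group: list[Trajectory],
    env_group: Sequence[Any],
) -> list[tuple[float, Metrics]]:
    """Give 1.0 to the best-scoring trajectories in the group, 0.0 to the rest."""
    if not trajectory_group:
        return []
    # The reward depends on the whole group: each score is compared to the group max.
    best = max(t.score for t in trajectory_group)
    results: list[tuple[float, Metrics]] = []
    for traj in trajectory_group:
        is_best = 1.0 if traj.score == best else 0.0
        metrics = {"is_best": is_best, "score_gap": best - traj.score}
        results.append((is_best, metrics))
    return results


group = [Trajectory(0.2), Trajectory(0.9), Trajectory(0.5)]
rewards = compute_group_rewards(group, env_group=[None, None, None])
print([reward for reward, _ in rewards])  # [0.0, 1.0, 0.0]
```

The output has one (reward, metrics) pair per trajectory, matching the return shape documented above.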
cleanup()
Clean up resources created by make_envs().
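One way a caller might pair make_envs() with cleanup() is a try/finally block, so resources are released even if a rollout raises. This is a minimal sketch under assumptions: ResourceEnv and SketchGroupBuilder are hypothetical stand-ins rather than library classes, the methods are written synchronously for simplicity, and the source does not state when the training loop actually calls cleanup().

```python
from typing import Sequence


class ResourceEnv:
    """Hypothetical environment that holds an open resource."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.open = True

    def close(self) -> None:
        self.open = False


class SketchGroupBuilder:
    """Stand-in mirroring the make_envs()/cleanup() pair described above."""

    def __init__(self, group_size: int) -> None:
        self.group_size = group_size
        self._envs: list[ResourceEnv] = []

    def make_envs(self) -> Sequence[ResourceEnv]:
        # Remember what was created so cleanup() can release it later.
        self._envs = [ResourceEnv(f"env-{i}") for i in range(self.group_size)]
        return self._envs

    def cleanup(self) -> None:
        # Release every resource created by make_envs().
        for env in self._envs:
            env.close()
        self._envs = []


builder = SketchGroupBuilder(group_size=3)
envs = builder.make_envs()
try:
    pass  # rollouts in `envs` would run here
finally:
    builder.cleanup()  # runs whether or not the rollouts raised
```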
logging_tags()
Return tags used to aggregate metrics in training logs.
Returns: list[str]: Tag strings for this environment group. Default is an empty list.
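The sketch below shows how a subclass might override logging_tags() and how tags could be used to group a metric. Everything here is illustrative: TaggedBuilderBase, ArithmeticGroupBuilder, and aggregate_by_tag are hypothetical names, and the aggregation shown (mean reward per tag) is an assumption, not the library's actual logging behavior.

```python
from collections import defaultdict


class TaggedBuilderBase:
    """Stand-in base class; mirrors the documented default of an empty list."""

    def logging_tags(self) -> list[str]:
        return []


class ArithmeticGroupBuilder(TaggedBuilderBase):
    """Hypothetical builder that tags its metrics by task and difficulty."""

    def logging_tags(self) -> list[str]:
        return ["arithmetic", "easy"]


def aggregate_by_tag(
    results: list[tuple[TaggedBuilderBase, float]],
) -> dict[str, float]:
    """Average rewards per tag; builders without tags contribute to no bucket."""
    buckets: dict[str, list[float]] = defaultdict(list)
    for builder, reward in results:
        for tag in builder.logging_tags():
            buckets[tag].append(reward)
    return {tag: sum(values) / len(values) for tag, values in buckets.items()}


results = [
    (ArithmeticGroupBuilder(), 1.0),
    (ArithmeticGroupBuilder(), 0.0),
    (TaggedBuilderBase(), 0.5),  # untagged: ignored by the aggregation
]
print(aggregate_by_tag(results))  # {'arithmetic': 0.5, 'easy': 0.5}
```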