tinker_cookbook.rl.assemble_training_data

tinker_cookbook.rl.assemble_training_data(trajectory_groups_P, advantages_P)

Convert trajectory groups and their per-group advantages into a flat list of training datums.

Parameters:

  • trajectory_groups_P (list[TrajectoryGroup]) – Groups of trajectories to convert into training datums.
  • advantages_P (list[torch.Tensor]) – Per-group advantage tensors, one per trajectory group, as returned by compute_advantages.

Returns: tuple[list[tinker.Datum], list[dict[str, int]]]: A flat list of training datums and a parallel list of metadata dicts mapping each datum back to its group_idx and traj_idx.
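
The flattening and the parallel metadata list described above can be illustrated with a small standalone sketch. It does not call tinker_cookbook and is not the library's implementation: FakeTrajectory, FakeGroup, and flatten_groups are hypothetical stand-ins, plain dicts stand in for tinker.Datum, and lists of floats stand in for the torch.Tensor advantages. It also assumes one datum per trajectory for simplicity; the real function may emit a different number.

```python
from dataclasses import dataclass


@dataclass
class FakeTrajectory:
    """Hypothetical stand-in for a single trajectory."""
    tokens: list[int]


@dataclass
class FakeGroup:
    """Hypothetical stand-in for a TrajectoryGroup."""
    trajectories: list[FakeTrajectory]


def flatten_groups(groups, advantages):
    """Flatten groups into datums plus parallel metadata (illustrative only)."""
    data = []
    metadata = []
    for group_idx, (group, group_adv) in enumerate(zip(groups, advantages)):
        for traj_idx, (traj, adv) in enumerate(zip(group.trajectories, group_adv)):
            # A plain dict stands in for tinker.Datum in this sketch.
            data.append({"tokens": traj.tokens, "advantage": float(adv)})
            # The metadata entry at the same index records where the datum came from.
            metadata.append({"group_idx": group_idx, "traj_idx": traj_idx})
    return data, metadata


groups = [
    FakeGroup([FakeTrajectory([1, 2]), FakeTrajectory([3])]),
    FakeGroup([FakeTrajectory([4, 5, 6])]),
]
advantages = [[0.5, -0.5], [0.0]]  # one advantage sequence per group

data, metadata = flatten_groups(groups, advantages)

# Map each flat datum back to its source trajectory via the metadata.
for datum, meta in zip(data, metadata):
    source = groups[meta["group_idx"]].trajectories[meta["traj_idx"]]
    print(meta, datum["advantage"], source.tokens)
```

Because the two returned lists share indices, metadata[i] always describes data[i], which is what makes it possible to trace a datum back to its group and trajectory after flattening.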