tinker_cookbook.rl.StepResult

class tinker_cookbook.rl.StepResult(**)

Result returned by Env.step.

Fields:

  • reward (float) – Immediate reward for this step.
  • episode_done (bool) – Whether the episode has ended.
  • next_observation (Observation) – Observation for the next step (or final observation if episode_done).
  • next_stop_condition (StopCondition) – Stop condition for the next generation.
  • metrics (Metrics) – Numeric values aggregated and reported in training logs (e.g., timing, counts). Default: an empty dict, created per instance via field(default_factory=dict).
  • logs (Logs) – Diagnostic info for display/debugging tools (not aggregated like metrics). Default: an empty dict, created per instance via field(default_factory=dict).
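
Example (illustrative sketch):

The following is a minimal, self-contained sketch of how a result with these fields might be built and returned from a step function. It does not import tinker_cookbook. StepResultSketch is a hypothetical stand-in dataclass that mirrors the field names listed above. The type aliases for Observation, StopCondition, Metrics, and Logs are placeholder assumptions, not the library's actual definitions. toy_step is likewise a hypothetical helper, not part of the library.

```python
from dataclasses import dataclass, field
from typing import Any

# Placeholder aliases (assumptions): the real library defines its own types.
Observation = list[int]      # e.g. token ids for the next prompt
StopCondition = list[str]    # e.g. stop strings for the next generation
Metrics = dict[str, float]   # numeric values, aggregated in training logs
Logs = dict[str, Any]        # free-form diagnostics, not aggregated


@dataclass
class StepResultSketch:
    """Hypothetical stand-in mirroring the documented StepResult fields."""

    reward: float
    episode_done: bool
    next_observation: Observation
    next_stop_condition: StopCondition
    # default_factory gives every instance its own fresh dict.
    metrics: Metrics = field(default_factory=dict)
    logs: Logs = field(default_factory=dict)


def toy_step(answer: str, expected: str) -> StepResultSketch:
    """Hypothetical single-turn step: reward 1.0 on an exact match, then end."""
    correct = answer.strip() == expected
    return StepResultSketch(
        reward=1.0 if correct else 0.0,
        episode_done=True,          # single-turn episode ends after one step
        next_observation=[],        # nothing further to observe
        next_stop_condition=[],     # no further generation expected
        metrics={"correct": float(correct)},
        logs={"answer": answer},
    )
```

In this sketch, numeric values that should be averaged across episodes go in metrics, while raw strings useful for inspection go in logs. Because both defaults use default_factory, two results created without explicit metrics or logs do not share the same dict.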