tinker_cookbook.rl.StepResult
class tinker_cookbook.rl.StepResult(**)
Result returned by Env.step.
Fields:
- reward (float) – Immediate reward for this step.
- episode_done (bool) – Whether the episode has ended.
- next_observation (Observation) – Observation for the next step (or final observation if episode_done).
- next_stop_condition (StopCondition) – Stop condition for the next generation.
- metrics (Metrics) – Numeric values aggregated and reported in training logs (e.g., timing, counts). Default: field(default_factory=dict).
- logs (Logs) – Diagnostic info for display/debugging tools (not aggregated like metrics). Default: field(default_factory=dict).
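
Example (illustrative sketch, not the library's code): the block below uses a stand-in dataclass that mirrors the fields listed above, so it runs without tinker_cookbook installed. The names StepResultSketch, toy_step, and run_episode are hypothetical, and the type aliases are placeholders for the library's Observation, StopCondition, Metrics, and Logs types. The step function here is synchronous for simplicity; the real Env.step may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable

# Placeholder aliases (assumptions): the real types are defined by the library.
Observation = Any
StopCondition = Any
Metrics = dict[str, float]
Logs = dict[str, Any]


@dataclass
class StepResultSketch:
    """Stand-in with the same field names as the documented StepResult."""

    reward: float
    episode_done: bool
    next_observation: Observation
    next_stop_condition: StopCondition
    # default_factory gives every instance its own fresh dict.
    metrics: Metrics = field(default_factory=dict)
    logs: Logs = field(default_factory=dict)


def toy_step(action: str) -> StepResultSketch:
    """Hypothetical step function: the episode ends when the action is 'stop'."""
    done = action == "stop"
    return StepResultSketch(
        reward=1.0 if done else 0.0,
        episode_done=done,
        next_observation=[],        # placeholder observation
        next_stop_condition=["\n"],  # placeholder stop condition
        metrics={"action_chars": float(len(action))},
    )


def run_episode(step_fn: Callable[[str], StepResultSketch],
                actions: Iterable[str]) -> float:
    """Sum rewards until a step reports episode_done."""
    total = 0.0
    for action in actions:
        result = step_fn(action)
        total += result.reward
        if result.episode_done:
            break
    return total
```

Because metrics and logs default to a fresh dict per instance, a step that has nothing to report can omit them, and writing to one result's logs does not affect another's.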