
tinker_cookbook.distillation.Config

class tinker_cookbook.distillation.Config(**)

Configuration for off-policy distillation with soft teacher targets.

Parameters:

  • learning_rate – Optimizer learning rate.
  • dataset_configs – One per domain. Each pairs a dataset with a teacher.
  • model_name – Student model name.
  • n_teacher_targets – Number of highest-probability teacher tokens per position used as soft targets.
  • teacher_concurrency – Max concurrent teacher forward passes per batch.
  • batch_size – Number of examples per training step.
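
To illustrate what n_teacher_targets controls, here is a minimal sketch of building soft targets from the highest-probability teacher tokens at one position. It is not the library's implementation: the helper names top_k_soft_targets and soft_cross_entropy are hypothetical, and renormalizing the kept probabilities so they sum to 1 is an assumption about how the truncated distribution might be handled.

```python
import math


def top_k_soft_targets(teacher_logprobs: dict[str, float], k: int) -> dict[str, float]:
    """Keep the k highest-probability teacher tokens and renormalize them.

    Hypothetical helper: renormalizing to sum to 1 is an assumption,
    not confirmed behavior of tinker_cookbook.
    """
    top = sorted(teacher_logprobs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(math.exp(lp) for _, lp in top)
    return {tok: math.exp(lp) / total for tok, lp in top}


def soft_cross_entropy(targets: dict[str, float], student_logprobs: dict[str, float]) -> float:
    """Cross-entropy of the student's log-probabilities against the soft targets."""
    return -sum(p * student_logprobs[tok] for tok, p in targets.items())


# Toy teacher distribution over three tokens at a single position.
teacher = {"a": math.log(0.5), "b": math.log(0.3), "c": math.log(0.2)}
targets = top_k_soft_targets(teacher, k=2)  # keeps "a" and "b" only

# A student that matches the teacher exactly on this position.
student = dict(teacher)
loss = soft_cross_entropy(targets, student)
```

A larger n_teacher_targets keeps more of the teacher's distribution at each position, at the cost of more target data per token.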

Fields:

  • learning_rate (float)
  • dataset_configs (list[DatasetWithTeacher])
  • model_name (str)
  • renderer_name (str | None) – Default: None.
  • lora_rank (int) – Default: 32.
  • n_teacher_targets (int) – Number of highest-probability teacher tokens per position used as soft targets. Default: 20.
  • teacher_concurrency (int) – Max concurrent teacher forward passes per batch. Default: 64.
  • batch_size (int) – Number of examples per training step. Default: 64.
  • save_every (int) – Checkpoint save interval, in training steps. Default: 10.
  • eval_every (int) – Default: 20.
  • max_steps (int | None) – Default: None.
  • load_checkpoint_path (str | None) – Default: None.
  • log_path (str)
  • wandb_project (str | None) – Default: None.
  • wandb_name (str | None) – Default: None.
  • base_url (str | None) – Default: None.
  • ttl_seconds (int | None) – Server-side checkpoint retention, in seconds. None keeps checkpoints indefinitely. Default: 604800 (7 days).
  • enable_trace (bool) – Default: False.
  • evaluator_builders (list[SamplingClientEvaluatorBuilder]) – Default: [].
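
The fields and defaults above can be sketched as a plain dataclass. ConfigSketch is a hypothetical stand-in that mirrors this page so the example runs without the library installed; it is not the real tinker_cookbook.distillation.Config. The learning rate, model name, and log path are placeholder values, and the element types of dataset_configs and evaluator_builders are left as Any because their definitions are not shown here.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ConfigSketch:
    """Hypothetical mirror of the documented fields; not the real Config class."""

    # Required fields (no default listed on this page).
    learning_rate: float
    dataset_configs: list[Any]  # stands in for list[DatasetWithTeacher]
    model_name: str
    log_path: str

    # Optional fields with the documented defaults.
    renderer_name: Optional[str] = None
    lora_rank: int = 32
    n_teacher_targets: int = 20
    teacher_concurrency: int = 64
    batch_size: int = 64
    save_every: int = 10
    eval_every: int = 20
    max_steps: Optional[int] = None
    load_checkpoint_path: Optional[str] = None
    wandb_project: Optional[str] = None
    wandb_name: Optional[str] = None
    base_url: Optional[str] = None
    ttl_seconds: Optional[int] = 604800
    enable_trace: bool = False
    evaluator_builders: list[Any] = field(default_factory=list)


# Placeholder values, chosen only for illustration.
cfg = ConfigSketch(
    learning_rate=1e-4,
    dataset_configs=[],
    model_name="example-student-model",
    log_path="/tmp/distillation-run",
    max_steps=100,
)

# Quick arithmetic on the defaults.
examples_seen = cfg.batch_size * cfg.max_steps  # examples consumed over the run
retention_days = cfg.ttl_seconds / 86400        # seconds converted to days
```

With the default batch_size of 64, a run capped at 100 steps consumes 6400 examples, and the default ttl_seconds corresponds to 7 days of retention.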