# Chat SFT
Supervised fine-tuning on conversational datasets to turn a base model into a chat assistant.
## What you'll build
A chat-capable model fine-tuned on NoRobots or Tulu3 using LoRA. Training uses standard next-token prediction, with the loss computed only on assistant turns.
## Prerequisites
## Key concepts
- Supervised fine-tuning (SFT) — train on human-written assistant responses using next-token prediction loss
- LoRA — parameter-efficient fine-tuning that trains low-rank adapter weights instead of the full model
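The two sketches below illustrate these ideas in PyTorch. They are minimal illustrations, not the cookbook's actual implementation, and every name in them (`sft_loss`, `assistant_mask`, `LoRALinear`) is invented for the example. The first computes next-token prediction loss on assistant tokens only:

```python
import torch.nn.functional as F

def sft_loss(logits, input_ids, assistant_mask):
    """Next-token prediction loss, averaged over assistant tokens only.

    logits:         (batch, seq, vocab) model outputs
    input_ids:      (batch, seq) token ids for the rendered conversation
    assistant_mask: (batch, seq) 1 for tokens in assistant turns, else 0
    """
    # Shift so that the logits at position t predict token t + 1.
    shift_logits = logits[:, :-1, :]
    shift_labels = input_ids[:, 1:]
    shift_mask = assistant_mask[:, 1:].float()

    per_token_nll = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        reduction="none",
    ).reshape(shift_labels.shape)

    # Zero out loss on system/user tokens; average over assistant tokens.
    return (per_token_nll * shift_mask).sum() / shift_mask.sum().clamp(min=1)
```

The second shows the shape of a LoRA layer: a low-rank update trained on top of a frozen linear layer, so with `lora_rank=64` only `rank * (in_features + out_features)` parameters per adapted layer are trainable:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pretrained linear layer plus a trainable low-rank update.

    Effective weight: W + (alpha / rank) * B @ A, where A is (rank, in)
    and B is (out, rank).
    """

    def __init__(self, base: nn.Linear, rank: int = 64, alpha: float = 64.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weights
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank  # B starts at zero, so training starts from W

    def forward(self, x):
        return self.base(x) + self.scale * ((x @ self.lora_a.T) @ self.lora_b.T)
```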
## Run it
### NoRobots dataset
```bash
python -m tinker_cookbook.recipes.chat_sl.train \
    model_name=Qwen/Qwen3-8B-Base \
    dataset=no_robots \
    learning_rate=5e-4 \
    batch_size=64 \
    lora_rank=64 \
    eval_every=20 \
    save_every=20 \
    wandb_project=cookbook_sl
```
### Tulu3 dataset
```bash
python -m tinker_cookbook.recipes.chat_sl.train \
    model_name=Qwen/Qwen3-8B-Base \
    dataset=tulu3 \
    learning_rate=5e-4 \
    batch_size=128 \
    lora_rank=64 \
    eval_every=500 \
    save_every=500 \
    wandb_project=cookbook_sl
```
## Expected results
| Dataset | Steps | test/nll |
|---|---|---|
| NoRobots | 140 | 1.788 |
| Tulu3 | 1740 | 0.50 |
Tulu3 performance can be improved further by training longer with a higher `lora_rank` and a lower `batch_size`.
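For example (the hyperparameter values below are illustrative, not a tuned configuration):

```bash
python -m tinker_cookbook.recipes.chat_sl.train \
    model_name=Qwen/Qwen3-8B-Base \
    dataset=tulu3 \
    learning_rate=5e-4 \
    batch_size=64 \
    lora_rank=128 \
    eval_every=500 \
    save_every=500 \
    wandb_project=cookbook_sl
```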
## Adding your own dataset
The base classes in `tinker_cookbook/supervised/data.py` support loading new data in the following ways:
- `SupervisedDatasetFromHFDataset` — loads a dataset from the HuggingFace Hub with a postprocessing function
- `StreamingSupervisedDatasetFromHFDataset` — works similarly, but supports streaming for large datasets
- `FromConversationFileBuilder` — supports data loading from a JSONL file
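As a sketch of the kind of postprocessing function `SupervisedDatasetFromHFDataset` consumes, the snippet below maps raw HuggingFace rows into a messages-style conversation. The dataset repo id and the raw field names are made up for illustration; the exact constructor signatures live in `tinker_cookbook/supervised/data.py`.

```python
from datasets import load_dataset

def to_conversation(row):
    # Hypothetical raw columns ("prompt", "completion"); substitute the
    # actual columns of your dataset.
    return {
        "messages": [
            {"role": "user", "content": row["prompt"]},
            {"role": "assistant", "content": row["completion"]},
        ]
    }

raw = load_dataset("your-org/your-dataset", split="train")  # placeholder repo id
conversations = raw.map(to_conversation, remove_columns=raw.column_names)
```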
You can also pass a path to a JSONL file directly with `dataset=path/to/file.jsonl`.
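If you go the JSONL route, each line holds one conversation. Here is a minimal sketch that writes such a file, assuming a messages-style record schema; check `FromConversationFileBuilder` in `tinker_cookbook/supervised/data.py` for the exact fields it expects.

```python
import json

# Assumed schema: one conversation per line, each with a "messages" list
# of {"role", "content"} turns.
conversations = [
    {
        "messages": [
            {"role": "user", "content": "What is LoRA?"},
            {"role": "assistant", "content": "A parameter-efficient fine-tuning method ..."},
        ]
    },
]

with open("my_dataset.jsonl", "w") as f:
    for conversation in conversations:
        f.write(json.dumps(conversation) + "\n")
```

You can then point training at the file with `dataset=my_dataset.jsonl`.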
## Learn more
- Hyperparameter sweep results for learning rate and LoRA rank across models
- Source code