Comparison |
A pair of completions (A and B) generated from the same prompt conversation. |
LabeledComparison |
A Comparison annotated with a human preference label (A, B, or Tie). |
ComparisonRenderer |
Abstract renderer for converting Comparisons to model inputs for preference trai |
ComparisonRendererFromChatRenderer |
Wraps a chat Renderer to render Comparisons for preference training. |
PreferenceModel |
Abstract base class for models that score a Comparison and return a preference f |
PreferenceModelBuilder |
Abstract builder that creates PreferenceModel instances. |
PreferenceModelFromChatRenderer |
A PreferenceModel that uses a chat renderer and Tinker sampling client. |
PreferenceModelBuilderFromChatRenderer |
Builds a PreferenceModel that uses a chat renderer and a Tinker sampling client. |
Config |
Configuration for Direct Preference Optimization (DPO) training. |