Wandb Run: https://wandb.ai/eleutherai/pythia-rlhf/runs/lvb8b3md
Eval Results
| Tasks | Version | Filter | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | Yaml | none | acc | 0.3200 | ± | 0.0136 |
| | | none | acc_norm | 0.3532 | ± | 0.0140 |
| arc_easy | Yaml | none | acc | 0.6721 | ± | 0.0096 |
| | | none | acc_norm | 0.5955 | ± | 0.0101 |
| lambada_openai | Yaml | none | perplexity | 4.8557 | ± | 0.1146 |
| | | none | acc | 0.6575 | ± | 0.0066 |
| logiqa | Yaml | none | acc | 0.2335 | ± | 0.0166 |
| | | none | acc_norm | 0.2750 | ± | 0.0175 |
| piqa | Yaml | none | acc | 0.7557 | ± | 0.0100 |
| | | none | acc_norm | 0.7666 | ± | 0.0099 |
| sciq | Yaml | none | acc | 0.8980 | ± | 0.0096 |
| | | none | acc_norm | 0.8370 | ± | 0.0117 |
| winogrande | Yaml | none | acc | 0.6172 | ± | 0.0137 |
| wsc | Yaml | none | acc | 0.3654 | ± | 0.0474 |
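
To read the Stderr column, it can help to convert each value into an approximate 95% confidence interval. The sketch below assumes a normal approximation (value ± 1.96 × stderr), which is a common convention and not something the eval output itself states. The helper name `confidence_interval` is hypothetical, and only a few rows from the table are copied in for illustration.

```python
# Values copied from the results table: (task, metric) -> (value, stderr).
results = {
    ("arc_challenge", "acc_norm"): (0.3532, 0.0140),
    ("piqa", "acc"): (0.7557, 0.0100),
    ("wsc", "acc"): (0.3654, 0.0474),
}


def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) for value +/- z * stderr.

    Assumption: the sampling distribution is roughly normal, so z = 1.96
    gives an approximate 95% interval.
    """
    half_width = z * stderr
    return value - half_width, value + half_width


for (task, metric), (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task} {metric}: {value:.4f} [{low:.4f}, {high:.4f}]")
```

Under this approximation, the wsc interval is much wider than the others because its standard error (0.0474) is several times larger than that of the remaining tasks.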