Wandb Run: https://wandb.ai/eleutherai/pythia-rlhf/runs/gy2g8jj1
Model Evals:
| Tasks | Version | Filter | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | Yaml | none | acc | 0.2253 | ± | 0.0122 |
| | | none | acc_norm | 0.2278 | ± | 0.0123 |
| arc_easy | Yaml | none | acc | 0.2551 | ± | 0.0089 |
| | | none | acc_norm | 0.2567 | ± | 0.0090 |
| lambada_openai | Yaml | none | perplexity | NaN | ± | NaN |
| | | none | acc | 0.0016 | ± | 0.0005 |
| logiqa | Yaml | none | acc | 0.2028 | ± | 0.0158 |
| | | none | acc_norm | 0.2028 | ± | 0.0158 |
| piqa | Yaml | none | acc | 0.4946 | ± | 0.0117 |
| | | none | acc_norm | 0.4924 | ± | 0.0117 |
| sciq | Yaml | none | acc | 0.0140 | ± | 0.0037 |
| | | none | acc_norm | 0.0140 | ± | 0.0037 |
| winogrande | Yaml | none | acc | 0.5036 | ± | 0.0141 |
| wsc | Yaml | none | acc | 0.6346 | ± | 0.0474 |