Wandb Run: https://wandb.ai/eleutherai/pythia-rlhf/runs/31gbxj2w
Eval Results:
| Tasks | Version | Filter | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | Yaml | none | acc | 0.2159 | ± | 0.0120 |
| | | none | acc_norm | 0.2295 | ± | 0.0123 |
| arc_easy | Yaml | none | acc | 0.3266 | ± | 0.0096 |
| | | none | acc_norm | 0.3287 | ± | 0.0096 |
| lambada_openai | Yaml | none | perplexity | NaN | ± | NaN |
| | | none | acc | 0.1750 | ± | 0.0053 |
| logiqa | Yaml | none | acc | 0.2028 | ± | 0.0158 |
| | | none | acc_norm | 0.2028 | ± | 0.0158 |
| piqa | Yaml | none | acc | 0.5441 | ± | 0.0116 |
| | | none | acc_norm | 0.5446 | ± | 0.0116 |
| sciq | Yaml | none | acc | 0.2050 | ± | 0.0128 |
| | | none | acc_norm | 0.1940 | ± | 0.0125 |
| winogrande | Yaml | none | acc | 0.5043 | ± | 0.0141 |
| wsc | Yaml | none | acc | 0.6154 | ± | 0.0479 |