Wandb run: https://wandb.ai/eleutherai/pythia-rlhf/runs/rh4mnzmr
Eval Results:

| Tasks | Version | Filter | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | Yaml | none | acc | 0.2884 | ± | 0.0132 |
| | | none | acc_norm | 0.3183 | ± | 0.0136 |
| arc_easy | Yaml | none | acc | 0.6124 | ± | 0.0100 |
| | | none | acc_norm | 0.5328 | ± | 0.0102 |
| lambada_openai | Yaml | none | perplexity | 8.7783 | ± | 0.2341 |
| | | none | acc | 0.5783 | ± | 0.0069 |
| logiqa | Yaml | none | acc | 0.2151 | ± | 0.0161 |
| | | none | acc_norm | 0.2826 | ± | 0.0177 |
| piqa | Yaml | none | acc | 0.7176 | ± | 0.0105 |
| | | none | acc_norm | 0.7176 | ± | 0.0105 |
| sciq | Yaml | none | acc | 0.8590 | ± | 0.0110 |
| | | none | acc_norm | 0.7790 | ± | 0.0131 |
| winogrande | Yaml | none | acc | 0.5959 | ± | 0.0138 |
| wsc | Yaml | none | acc | 0.3654 | ± | 0.0474 |