Wandb Runs: https://wandb.ai/eleutherai/pythia-rlhf/runs/644tyaq0?workspace=user-yongzx
Model Evals:

| Task | Version | Filter | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | Yaml | none | acc | 0.2287 | ± | 0.0123 |
| | | none | acc_norm | 0.2619 | ± | 0.0128 |
| arc_easy | Yaml | none | acc | 0.5248 | ± | 0.0102 |
| | | none | acc_norm | 0.4533 | ± | 0.0102 |
| logiqa | Yaml | none | acc | 0.2089 | ± | 0.0159 |
| | | none | acc_norm | 0.2765 | ± | 0.0175 |
| piqa | Yaml | none | acc | 0.6855 | ± | 0.0108 |
| | | none | acc_norm | 0.6823 | ± | 0.0109 |
| sciq | Yaml | none | acc | 0.8050 | ± | 0.0125 |
| | | none | acc_norm | 0.7080 | ± | 0.0144 |
| winogrande | Yaml | none | acc | 0.5335 | ± | 0.0140 |
| wsc | Yaml | none | acc | 0.3654 | ± | 0.0474 |
| lambada_openai | Yaml | none | perplexity | 9.8265 | ± | 0.3139 |
| | | none | acc | 0.5135 | ± | 0.0070 |