WebGPT regression reward model

Reward model trained on the openai/webgpt_comparisons dataset.

The reward model is fine-tuned from an existing pretrained language model.

Things that align with the original papers

Differences from the papers

Other models I tried

Performance on the validation split

| model         | val acc (%) | val loss (rank loss) |
|---------------|-------------|----------------------|
| roberta-base  | 56.21       | 0.71                 |
| roberta-large | 57.89       | 0.67                 |
| electra-base  | 57.02       | 0.70                 |
| electra-large | 58.75       | 0.69                 |
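For reference, the rank loss reported above is presumably the standard pairwise comparison loss used for reward models, −log σ(r_chosen − r_rejected), and val acc the fraction of pairs where the chosen answer outscores the rejected one. A minimal sketch in plain Python (function names are illustrative, not from this repo):

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def pairwise_rank_loss(r_chosen: float, r_rejected: float) -> float:
    """-log sigma(r_chosen - r_rejected): small when the chosen answer scores higher."""
    return -math.log(sigmoid(r_chosen - r_rejected))

def rank_accuracy(pairs) -> float:
    """Fraction of (r_chosen, r_rejected) pairs ranked correctly."""
    return sum(rc > rr for rc, rr in pairs) / len(pairs)

# A model that scores both answers equally has loss ln 2 ~= 0.693,
# which is roughly where the ~0.7 values in the table sit.
print(round(pairwise_rank_loss(0.0, 0.0), 3))   # → 0.693
print(rank_accuracy([(1.2, 0.4), (0.1, 0.5)]))  # → 0.5
```

This also explains why a loss near 0.69 accompanies accuracies barely above chance: the score margins between chosen and rejected answers are small.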

TensorBoard logs are located under runs/.

Note: