AI Model Zoo
Reward model for RLHF, trained on 3,000 examples from the Anthropic/hh-rlhf dataset.
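
The page does not state the training objective. Reward models trained on preference pairs, such as the `chosen`/`rejected` records in Anthropic/hh-rlhf, commonly use a pairwise logistic loss, and the sketch below assumes that objective. The function name `pairwise_reward_loss` and the plain-float scores are illustrative and are not taken from this model.

```python
import math


def pairwise_reward_loss(chosen_scores, rejected_scores):
    """Mean of -log(sigmoid(r_chosen - r_rejected)) over preference pairs.

    Assumption: each pair holds the scalar reward the model assigned to the
    preferred ("chosen") response and to the "rejected" response.
    """
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("need one rejected score per chosen score")
    if not chosen_scores:
        raise ValueError("need at least one pair")

    total = 0.0
    for r_chosen, r_rejected in zip(chosen_scores, rejected_scores):
        margin = r_chosen - r_rejected
        # -log(sigmoid(m)) equals log(1 + exp(-m)); this form avoids overflow.
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(chosen_scores)
```

The loss shrinks as the model scores the chosen response further above the rejected one, and it equals log 2 when the two scores are tied.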