Introduction:

We trained two Chinese reward models on top of the Baichuan-13B-Base base model, following the Llama 2 reward-model training recipe. The rw_helpful_13b_wpack model was trained to score responses by helpfulness; rw_safe_13b_wpack was trained to score responses by safety.

Loading and usage:

On a GPU runtime, run scoring.py in the src folder. The prompt_list and good_ans_list variables in the script can be modified or imported to fit your use case; the model scores each answer in good_ans_list paired with its corresponding prompt. For example:

python scoring.py \
	--model_name_or_path PATH_TO_THE_REWARD_MODEL

in which PATH_TO_THE_REWARD_MODEL can either be the path to rw_helpful_13b_wpack_exported or rw_safe_13b_wpack_exported.
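The scoring loop conceptually pairs each prompt with its candidate answer and asks the reward model for a scalar score. The following is a rough, hypothetical sketch of that pairing only (the repository's actual scoring.py is not reproduced here; `reward_fn` is a placeholder standing in for a forward pass through rw_helpful_13b_wpack or rw_safe_13b_wpack):

```python
# Illustrative sketch only: scoring.py's internals aren't shown in this README,
# so the reward model is replaced by a placeholder reward_fn.

def score_pairs(prompt_list, good_ans_list, reward_fn):
    """Score each answer in good_ans_list against its corresponding prompt."""
    assert len(prompt_list) == len(good_ans_list)
    return [reward_fn(p, a) for p, a in zip(prompt_list, good_ans_list)]

# Placeholder reward: in the real script this would be a forward pass
# through the loaded 13B reward model.
def dummy_reward(prompt, answer):
    return float(len(answer))

prompt_list = ["什么是奖励模型?", "How do I stay safe online?"]
good_ans_list = ["奖励模型用于给回答打分。", "Use strong passwords and enable 2FA."]
print(score_pairs(prompt_list, good_ans_list, dummy_reward))
```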

Multiple GPUs are expected, for example 8×A100.

Testing results

We evaluated the safety reward model and the helpfulness reward model on their respective test datasets.

# rw_safe_13b_wpack evaluation results
{

    "eval_accuracy": 0.8876339025592757,

    "eval_loss": 0.197021484375,

    "eval_runtime": 1131.4606,

    "eval_samples_per_second": 19.719,

    "eval_steps_per_second": 2.465

}

# rw_helpful_13b_wpack evaluation results
{

    "eval_accuracy": 0.6387571848594814,

    "eval_loss": 0.63916015625,

    "eval_runtime": 2188.8722,

    "eval_samples_per_second": 17.248,

    "eval_steps_per_second": 2.156

}
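If eval_accuracy here is the standard preference-pair accuracy used for Llama 2-style reward models (an assumption, since the test-set format is not shown), it is the fraction of chosen/rejected response pairs where the chosen response receives the higher score:

```python
def pairwise_accuracy(chosen_scores, rejected_scores):
    """Fraction of pairs where the chosen response outscores the rejected one."""
    assert len(chosen_scores) == len(rejected_scores)
    correct = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return correct / len(chosen_scores)

# Toy example: 3 of 4 pairs are ranked correctly.
chosen = [2.1, 0.5, 1.3, -0.2]
rejected = [1.0, 0.9, 0.4, -1.5]
print(pairwise_accuracy(chosen, rejected))  # 0.75
```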



Citation


@Misc{Baichuan-13B-Base,
  title = {Baichuan-13B-Base},
  author = {Baichuan Intelligent Technology},
  howpublished = {\url{https://huggingface.co/baichuan-inc/}},
  year = {2023}
}
@Misc{llama-efficient-tuning,
  title = {LLaMA Efficient Tuning},
  author = {hiyouga},
  howpublished = {\url{https://github.com/hiyouga/LLaMA-Efficient-Tuning}},
  year = {2023}
}