Model Card for BELLE-7B-0.6M
Model description
BELLE is based on Bloomz-7b1-mt and finetuned on 0.6M Chinese data combined with 50,000 pieces of English data from the open-source Stanford Alpaca project, giving it good Chinese instruction understanding and response generation capabilities.
The code for Chinese data generation and other details can be found in our GitHub project repository: https://github.com/LianjiaTech/BELLE.
We trained models on instruction-learning datasets of different sizes (200,000, 600,000, and 1,000,000 samples), obtaining the model versions shown below:
Datasize | 200,000 | 600,000 | 1,000,000 |
---|---|---|---|
Finetuned Model | BELLE-7B-0.2M | BELLE-7B-0.6M | BELLE-7B-1M |
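Any of these checkpoints can also be loaded by name from the Hugging Face Hub instead of from a local copy; for example (the repository id below is an assumption based on the naming above, so substitute the repository you actually use):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hypothetical Hub repository id following the naming above; adjust as needed.
repo_id = "BelleGroup/BELLE-7B-0.6M"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
```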
Training hyper-parameters
Parameter | Value |
---|---|
Batch size | 64 |
Learning rate | 3e-6 |
Epochs | 3 |
Weight_decay | 0.001 |
Warmup_rate | 0.1 |
LR_scheduler | linear |
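For orientation, the hyper-parameters above map onto a standard `transformers` fine-tuning configuration roughly as shown below. This is only an illustrative sketch, not the actual training script (see the GitHub repository for that); in particular, the split of the global batch size of 64 into per-device batch size and gradient-accumulation steps is an assumption.

```python
from transformers import TrainingArguments

# Illustrative mapping of the reported hyper-parameters onto TrainingArguments.
# The 8 x 8 = 64 batch-size split below is an assumption, not taken from the BELLE repo.
training_args = TrainingArguments(
    output_dir="./belle-7b-finetune",  # hypothetical output directory
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,
    learning_rate=3e-6,
    num_train_epochs=3,
    weight_decay=0.001,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
)
```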
Using the model
Please note that the input should be formatted as follows in both training and inference.
Human: {input} \n\nAssistant:
BELLE can be easily loaded with AutoModelForCausalLM.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "./"  # Path to the local model directory; modify as needed
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

print("Human:")
line = input()
while line:
    # Wrap the user input in the prompt format used during training.
    inputs = 'Human: ' + line.strip() + '\n\nAssistant:'
    input_ids = tokenizer(inputs, return_tensors="pt").input_ids
    outputs = model.generate(input_ids, max_new_tokens=200, do_sample=True, top_k=30,
                             top_p=0.85, temperature=0.35, repetition_penalty=1.2)
    rets = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    # Strip the prompt from the decoded output so only the answer is printed.
    print("Assistant:\n" + rets[0].strip().replace(inputs, ""))
    print("\n------------------------------------------------\nHuman:")
    line = input()
```
The example above does not use context from previous turns of the conversation. If you want multi-turn dialogue understanding, concatenate the previous turns in the format Human:{input}\n\nAssistant:{output} and feed the result to the generate function, as sketched below.
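A minimal sketch of that concatenation, reusing the `model` and `tokenizer` loaded in the snippet above; the separator inserted between turns is an assumption, since the card only specifies the per-turn format:

```python
def build_prompt(history, new_input):
    """Concatenate previous (user, assistant) turns in the training format
    and append the new user input as the final, unanswered turn."""
    prompt = ""
    for user_msg, assistant_msg in history:
        # Per-turn format from the card; the "\n" between turns is an assumption.
        prompt += "Human: " + user_msg + "\n\nAssistant:" + assistant_msg + "\n"
    prompt += "Human: " + new_input + "\n\nAssistant:"
    return prompt

# Example: one previous exchange plus a follow-up question.
history = [("给一个北京三日游的旅行攻略", "Day 1: 参观故宫博物院...")]
prompt = build_prompt(history, "第二天的行程再详细一点")
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_new_tokens=200, do_sample=True, top_k=30,
                         top_p=0.85, temperature=0.35, repetition_penalty=1.2)
rets = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(rets[0].strip().replace(prompt, ""))
```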
Limitations and use restrictions
The SFT model trained on the current base model and data still has a few issues:

- The model may produce factually incorrect answers when asked to follow instructions involving facts.
- It occasionally generates harmful responses, since it still struggles to identify potentially harmful instructions.
- Its reasoning and coding abilities still need improvement.
Given these limitations, we require that developers use the open-sourced code, data, models, and any other artifacts generated by this project for research purposes only. Commercial use and other potentially harmful use cases are not allowed.
Some use cases
Human | Assistant |
---|---|
使用python写字符串翻转的函数 | 可以使用Python内置方法 reversed() 来实现字符串翻转,代码如下:<br/><br/>def reverse_string(s):<br/> return s[::-1]<br/> <br/>print(reverse_string("Hello, world!"))<br/># 输出:'!dlrow,olleH'<br/> |
做难而正确的事,翻译成英语 | Do difficult and correct things. |
The current Alpaca model is fine-tuned from a 7B LLaMA model on 52K instruction-following data generated by the techniques in the Self-Instruct paper, with some modifications that we discuss in the next section. In a preliminary human evaluation, we found that the Alpaca 7B model behaves similarly to the text-davinci-003 model on the Self-Instruct instruction-following evaluation suite. How many instruction-following data generated? | 52K |
给一个北京三日游的旅行攻略 | Day 1:<br/>- 参观故宫博物院,了解中国历史和文化。<br/>- 在天安门广场观看升旗仪式。 <br/>- 到颐和园游览美丽的园林景观。<br/>- 品尝烤鸭和其他地道美食。<br/>- 去王府井购物中心逛街购物。 <br/><br/>Day 2:<br/>- 前往长城游玩,感受古代中国的壮丽景色。<br/>- 乘坐索道缆车到达八达岭长城,欣赏美景。<br/>- 晚上去簋街体验当地的小吃文化。<br/><br/>Day 3:<br/>- 参观天坛公园,观赏古建筑群和祈年殿。<br/>- 到鸟巢和水立方观看奥运会比赛。<br/>- 午餐后到798艺术区参观当代艺术家的工作室。 |
来一个温馨的睡前故事 | 从前有一个小兔子,它住在森林里。每天晚上,小兔子都会在树下打盹儿。但是有一天晚上,一只大灰狼来了!他悄悄地溜进小兔子的洞穴里,想要偷走它的胡萝卜。<br/>然而,小兔子并没有被吓到。相反,它用自己的智慧和勇气保护了自己。它在洞口放了一堆胡萝卜来吸引大灰狼,然后趁机逃脱了。从那天起,小兔子学会了在危险时保持镇静并采取适当的行动 |
如何快速提升自己的写作能力 | 1. 阅读大量不同类型的文学作品,包括小说、散文和诗歌等。<br/>2. 练习写作,每天写一些东西并不断修改和完善。<br/>3. 参加写作班或请教专业作家以获得反馈和建议。<br/>4. 利用各种工具和技术来提高写作效率和质量,例如语法检查器、拼写检查器和自动润色功能等。 |
北京大学是哪一年成立的? | 北京大学成立于1898年 |