
MindGLM: A Fine-tuned Language Model for Chinese Psychological Counseling

  1. Introduction MindGLM is a large language model fine-tuned and aligned for Chinese psychological counseling. Built on the foundational model ChatGLM2-6B, MindGLM is aligned with human preferences on psychological inquiries, offering a reliable and safe tool for digital psychological counseling.

  2. Key Features

  3. Usage To use MindGLM with the Hugging Face Transformers library:

```python

from transformers import AutoTokenizer, AutoModelForCausalLM

# ChatGLM2-based checkpoints ship custom modeling code, so trust_remote_code is required.
tokenizer = AutoTokenizer.from_pretrained("ZhangCNN/MindGLM", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("ZhangCNN/MindGLM", trust_remote_code=True)
model = model.eval()

history = []
end_command = "结束对话"  # typing this message ends the conversation
max_length = 600          # character budget for the retained history
max_turns = 12            # maximum number of retained dialogue turns

instruction = "假设你是友善的心理辅导师,避免透露任何AI或技术的信息。请主动引导咨询者交流并帮助他们解决心理问题。"

# Read the first user message
prompt = input("求助者:")

# Prepend the system instruction and the role prefixes
prompt = instruction + "求助者:" + prompt + "。支持者:"
response, history = model.chat(tokenizer, prompt, history=[])
print("支持者: " + response)

while True:
    # Read the next user message
    user_input = input("求助者:")

    # Stop when the end command is received
    if user_input == end_command:
        break

    # Prepend the role prefixes
    prompt = "求助者:" + user_input + "。支持者:"
    response, history = model.chat(tokenizer, prompt, history=history)
    print("支持者: " + response)

    # Total character length of the history, including the new turn
    total_length = len(instruction) + sum(len(f'{item[0]} {item[1]}') for item in history)

    # If the history exceeds the character or turn budget, drop old turns
    while (total_length > max_length or len(history) > max_turns) and len(history) > 1:
        # history[0] carries the instruction, so drop the second entry (the oldest plain turn)
        removed_item = history.pop(1)
        total_length -= len(f'{removed_item[0]} {removed_item[1]}')

```
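
ChatGLM2-style models also expose a `stream_chat` method that yields the response incrementally. The following is a minimal streaming sketch, reusing the `model`, `tokenizer`, and `instruction` defined above, and assuming the MindGLM checkpoint preserves ChatGLM2's `stream_chat` interface; the example user message is purely illustrative.

```python
# Minimal streaming sketch (assumes the ChatGLM2-style stream_chat method is available).
prompt = instruction + "求助者:" + "最近总是睡不着,压力很大" + "。支持者:"

printed = 0
for response, history in model.stream_chat(tokenizer, prompt, history=[]):
    # stream_chat yields the full response generated so far; print only the new suffix.
    print(response[printed:], end="", flush=True)
    printed = len(response)
print()
```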

  4. Training Data MindGLM was trained using a combination of open-source datasets and self-constructed datasets, ensuring a comprehensive understanding of psychological counseling scenarios. The datasets include SmileConv, comparison_data_v1, psychology-RLAIF, rm_labelled_180, and rm_gpt_375.
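
To illustrate how comparison-style datasets such as comparison_data_v1 and rm_labelled_180 are typically organized for preference alignment, here is a hypothetical record layout for a pairwise preference example; the field names (`prompt`, `chosen`, `rejected`) and the sample text are illustrative assumptions, not the actual schema of these datasets.

```python
# Hypothetical pairwise preference record for reward-model training
# (illustrative only; the real datasets may use different field names and formatting).
preference_example = {
    "prompt": "求助者:最近工作压力很大,晚上经常失眠,该怎么办?。支持者:",
    "chosen": "听起来你最近承受了很多压力,失眠也让你更疲惫。可以先说说压力主要来自哪些方面吗?",
    "rejected": "多喝热水,早点睡觉就好了。",
}
```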

  5. Training Process The model underwent a three-phase training approach, with LoRA used in all phases.
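
Since all phases reportedly used LoRA, a fine-tuning phase could be set up roughly as follows with the PEFT library. This is a minimal sketch under stated assumptions: the rank, alpha, dropout, and target module names are illustrative defaults for ChatGLM2-style models, not the authors' actual hyperparameters.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Illustrative LoRA setup for a ChatGLM2-style base model (hyperparameters are assumptions).
base = AutoModelForCausalLM.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                 # low-rank dimension
    lora_alpha=32,                       # scaling factor
    lora_dropout=0.1,
    target_modules=["query_key_value"],  # fused attention projection in ChatGLM2
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```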

  6. Limitations While MindGLM is a powerful tool, users should be aware of its limitations:

It is designed for psychological counseling but should not replace professional medical advice or interventions.

The model's responses are grounded in its training data; although it has been aligned with human preferences, it may not always provide the most appropriate response.

  7. License MindGLM is released under the Apache-2.0 license. Please also refer to the licensing terms of the datasets used for training; usage of MindGLM should comply with these licenses.

  8. Contact Information For any queries, feedback, or collaboration opportunities, please reach out to:
