# CoLLaMA: A Multilingual Instruction Dataset and Large Language Model for Code

<p align="center" width="100%"> <img src="https://i.postimg.cc/J7Ds1tw6/CoLLaMA.jpg" width="40%" height="20%"> </p>

## Model details

Trained in June 2023.

CoLLaMA is a multilingual (Chinese and English) instruction-tuning dataset and large language model for coding tasks; CoLLaMA-7b is fine-tuned from the LLaMA base model on this dataset.
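Since the model is fine-tuned from LLaMA, it should load with the standard Hugging Face `transformers` LLaMA classes. Below is a minimal usage sketch under that assumption; the Hub repository ID `CoLLaMA/CoLLaMA-7b` is a placeholder, not a confirmed model ID — check the GitHub repository for the actual release location.

```python
# Minimal inference sketch for a LLaMA-based checkpoint.
# "CoLLaMA/CoLLaMA-7b" is a placeholder Hub ID, not confirmed by this card.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

MODEL_ID = "CoLLaMA/CoLLaMA-7b"  # placeholder; replace with the released weights

tokenizer = LlamaTokenizer.from_pretrained(MODEL_ID)
model = LlamaForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision to fit a 7B model on one GPU
    device_map="auto",          # requires the `accelerate` package
)

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```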

Please refer to the README of the GitHub repository for detailed information.

## Training dataset

The model was trained on an instruction-following dataset of 480k rows, which is released in the GitHub repository.
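This card does not spell out the record schema. A common layout for instruction-following data is the Alpaca-style `instruction`/`input`/`output` triple; the sketch below iterates over such a file under that assumption. The file name and field names are illustrative only — consult the GitHub repository for the actual format.

```python
# Sketch of reading an Alpaca-style instruction-following file.
# The file name "collama_instructions.json" and the field names
# ("instruction", "input", "output") are assumptions, not confirmed
# by this card; check the released dataset for the real schema.
import json

with open("collama_instructions.json", encoding="utf-8") as f:
    records = json.load(f)

for rec in records[:3]:
    print("Instruction:", rec["instruction"])
    if rec.get("input"):          # the "input" field is often empty
        print("Input:", rec["input"])
    print("Output:", rec["output"])
```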

## Citation

<div> <div align="center"> <a target='_blank'>Gang Hu<sup>1</sup></span>  <a target='_blank'>Xi Wen<sup>1</sup></span>  <a target='_blank'>Xin Liu<sup>1</sup></a>  <a href='https://jimin.chancefocus.com/' target='_blank'>Jimin Huang<sup>2</sup></a> 
<a target='_blank'>Qianqian Xie*<sup>3</sup></a> 

</div> <div> <div align="center"> <sup>1</sup>School of Information Science & Engineering, Yunnan University  <sup>2</sup>ChanceFocus AMC  <sup>3</sup>School of Computer Science, Wuhan University  </div>

```bibtex
@misc{Hu2023CoLLaMA,
      title={CoLLaMA: A Multilingual Instruction Dataset and Large Language Model for Code},
      author={Gang Hu and Xi Wen and Xin Liu and Jimin Huang and Qianqian Xie},
      year={2023},
}
```