BLOOM-3B-Instruct
BLOOM-3B-Instruct is a model for short-form instruction following. It is built by finetuning BLOOM-3B-zh on instruction-following data.
Model Details
BLOOM-zh is a language model with enhanced Traditional Chinese capability. It is derived from BLOOMZ and further pretrained on a large amount of Traditional Chinese text data.
Basics
- Developed by: CKIP lab at Academia Sinica, MediaTek Research, and National Academy for Educational Research
- Model Type: Transformer-based Language Model
- Version: 1.0.0
- Languages: Multiple; see training data
- License: RAIL License v1.0
- Release Date Estimate: 8 August 2023
- Paper: https://arxiv.org/abs/2303.04715
- Cite as:
@article{ennen2023extending,
  title={Extending the Pre-Training of BLOOM for Improved Support of Traditional Chinese: Models, Methods and Results},
  author={Ennen, Philipp and Hsu, Po-Chun and Hsu, Chan-Jan and Liu, Chang-Le and Wu, Yen-Chen and Liao, Yin-Hsiang and Lin, Chin-Tung and Shiu, Da-Shan and Ma, Wei-Yun},
  journal={arXiv preprint arXiv:2303.04715},
  year={2023}
}
- Organizations of contributors:
- Academia Sinica
- MediaTek Research
- National Academy for Educational Research
- Send Questions to: cindylin@iis.sinica.edu.tw
Technical Specifications
This section provides information for people who work on model development.
For technical specifications, please refer to BLOOM.
Environmental Impact
For environmental impact, please refer to BLOOM.
Uses
This section addresses questions around how the model is intended to be used, discusses the foreseeable users of the model (including those affected by the model), and describes uses that are considered out of scope or misuse of the model. It provides information for anyone considering using the model or who is affected by the model.
For the uses of the model, please refer to BLOOM.
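As a minimal usage sketch, the snippet below shows how a causal language model of this kind can be loaded and prompted with the Hugging Face transformers library. The repository ID, prompt, and generation settings are illustrative assumptions and are not specified by this card.

```python
# Minimal inference sketch. Assumptions: the checkpoint is published on the
# Hugging Face Hub, and "ckip-joint/bloom-3b-zh-instruct" is a placeholder
# repository ID, not one confirmed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ckip-joint/bloom-3b-zh-instruct"  # placeholder; replace with the actual Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 3B model fits on a single GPU
    device_map="auto",
)

prompt = "請用一句話介紹台北。"  # "Introduce Taipei in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,  # small budget to match the short-form focus
    do_sample=False,    # greedy decoding for a deterministic example
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Greedy decoding and the small max_new_tokens budget are chosen here only to reflect the model's short-form instruction-following focus; adjust both to your use case.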
Risks and Limitations
This section identifies foreseeable harms and misunderstandings.
For risks and limitations, please refer to BLOOM.
Factors
This section lists some different aspects of BLOOM models. Its focus is on those aspects that are likely to give rise to high variance in model behavior.
- The model is trained on Traditional Chinese; however, the pretrained weights capture more than 40 different languages.
- The model is trained on web-crawled data, news articles, novels, knowledge sources (encyclopedias, the education sector) and instructions.
Recommendations
This section provides information on warnings and potential mitigations.
For recommendations, please refer to BLOOM.
Model Card Authors
Ordered roughly chronologically and by amount of time spent.
Philipp Ennen, Po-Chun Hsu, Chan-Jan Hsu, Chang-Le Liu, Yen-Chen Wu, Yin-Hsiang Liao, Chin-Tung Lin, Chi-Ming Chung, Yi-Chang Chen, Da-Shan Shiu, Wei-Yun Ma