
<h4 align="center"> <p> <b>English</b> | <a href="">简体中文</a> </p> </h4>

The Aquila Language Model is the first open-source language model that combines bilingual Chinese and English knowledge, a commercial license agreement, and compliance with domestic data regulations.

Additional details of the Aquila model will be presented in the official technical report. Please stay tuned for updates on official channels, including the FlagAI GitHub repository, FlagAI's Zhihu account, and FlagAI's official technical communication group.

| Model | Model Type | Description | Status | GPUs Used |
| :--- | :--- | :--- | :--- | :--- |
| Aquila-7B | Base model, 7 billion parameters | The Aquila base model inherits the architectural design advantages of GPT-3 and LLaMA. It swaps in a set of more efficient low-level operator implementations, redesigns the bilingual tokenizer, upgrades the BMTrain parallel training method, and achieves nearly 8 times the training efficiency of Megatron+DeepSpeed ZeRO-2. | Released | Nvidia-A100 |
| Aquila-33B | Base model, 33 billion parameters | Same as above | Coming soon | Nvidia-A100 |
| AquilaChat-7B | SFT model, fine-tuned and RL based on Aquila-7B | The AquilaChat dialog model supports fluent text dialogue and multiple language generation tasks. It can call other models and tools through an extensible special-instruction specification. For example, calling the open source AltDiffusion multimodal text-to-image generation model of Flagship Intelligence gives it smooth image generation capability. Combined with Flagship Intelligence's InstructFace multi-step controllable text-to-image model, it can perform multi-step controllable editing of human face images. | Released | Nvidia-A100 |
| AquilaChat-33B | SFT model, fine-tuned and RL based on Aquila-33B | Same as above | Coming soon | Nvidia-A100 |
| AquilaCode-7B-NV | Base model, "text-code" generation model, further pre-trained based on Aquila-7B, trained on Nvidia | AquilaCode-7B achieves high performance with small data sets and few parameters, and is currently the best open source code model that supports both Chinese and English. It is trained on code data with compliant open source licenses after high-quality filtering. AquilaCode-7B has been trained on both Nvidia and domestic chips. | Released on GitHub | Nvidia-A100 |
| AquilaCode-7B-TS | Base model, "text-code" generation model, further pre-trained based on Aquila-7B, trained on Horizon Robotics chips | Same as above | Released on GitHub | Tianshu-BI-V100 |

We will continue to release improved versions of the Aquila models as open source.

In the FlagEval large model evaluation ("Subjective + Objective"), AquilaChat-7B v1.0 shows a slight overall improvement over the previous version. It improved by around 12.46% on C-Eval, 10.88% on MMLU, and 9.93% on BoolQ. For detailed evaluation results, please refer to the website. For the detailed version change history, see the Change Log.


Quick Start: AquilaChat-7B (Chat model)

1. Inference

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
device = torch.device("cuda")  # requires a CUDA-capable GPU
model_info = "BAAI/AquilaChat-7B"
tokenizer = AutoTokenizer.from_pretrained(model_info, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_info, trust_remote_code=True)
model.eval()  # inference mode (disables dropout)
model.to(device)  # the model must be on the same device as the input tokens
text = "请给出10个要到北京旅游的理由。"  # "Give 10 reasons to travel to Beijing."
tokens = tokenizer.encode_plus(text)['input_ids'][:-1]  # drop the last token id
tokens = torch.tensor(tokens)[None,].to(device)  # add a batch dimension
stop_tokens = ["###", "[UNK]", "</s>"]
with torch.no_grad():
    out = model.generate(tokens, do_sample=True, max_length=512, eos_token_id=100007, bad_words_ids=[[tokenizer.encode(token)[0] for token in stop_tokens]])[0]
    out = tokenizer.decode(out.cpu().numpy().tolist())
print(out)
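
For causal language models, the sequence returned by `generate` normally includes the prompt tokens, so the decoded string can echo the prompt and may still contain markers such as `###` or `</s>`. The following is a minimal post-processing sketch, not part of the Aquila or FlagAI API: `strip_prompt` and `truncate_at_stop` are hypothetical helper names, and the sample string is made up for illustration. It assumes the decoded text reproduces the prompt verbatim, which may not hold for every tokenizer.

```python
def truncate_at_stop(text, stop_markers=("###", "[UNK]", "</s>")):
    """Return text up to the earliest stop marker (hypothetical helper)."""
    cut = len(text)
    for marker in stop_markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


def strip_prompt(decoded, prompt):
    """Drop the echoed prompt if the decoded string starts with it."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].lstrip()
    return decoded


# Illustrative input only; real model output will differ.
prompt = "请给出10个要到北京旅游的理由。"
sample = prompt + " 1. 故宫 ### extra</s>"
reply = truncate_at_stop(strip_prompt(sample, prompt))
print(reply)
```

In the Quick Start snippet, the same helpers could be applied to `out` and `text` after decoding.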


The AquilaChat-7B and AquilaChat-33B open-source models are licensed under the BAAI Aquila Model Licence Agreement.