Aria Code Light is based on Llama 2 Chat.
Training procedure
Aria Code Light is a fine-tune of Llama 2 Chat (HF) on a Python dataset with over 18,000 tokens of coding prompts and answers. Our goal was to create a model that can run on a single GPU, with more language skills and better coding performance than the original Llama 2, especially in Python.
GPU used for training: NVIDIA A100.
Timing: less than 24 hours.
Method: LoRA + PEFT
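As an illustration of the LoRA + PEFT method named above, here is a minimal sketch of attaching LoRA adapters to an 8-bit base model. The hyperparameters in `LORA_KWARGS` (rank, alpha, dropout, target modules) are placeholders chosen for the example and are not the values used to train Aria Code Light, which this card does not state. The `base_model_id` argument is likewise left to the caller. The helper `lora_param_count` only shows why the method fits on a single GPU: each adapted weight matrix gains two small low-rank factors instead of being updated in full.

```python
# Hypothetical LoRA hyperparameters: the card does not state the real ones.
LORA_KWARGS = {
    "r": 8,                                  # rank of the low-rank update (assumed)
    "lora_alpha": 16,                        # scaling factor (assumed)
    "lora_dropout": 0.05,                    # dropout on the adapter path (assumed)
    "target_modules": ["q_proj", "v_proj"],  # attention projections (assumed)
    "bias": "none",
    "task_type": "CAUSAL_LM",
}


def lora_param_count(rank, d_in, d_out):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two factors, A (rank x d_in) and B (d_out x rank),
    so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


def attach_lora(base_model_id):
    """Load a causal LM in 8-bit and wrap it with LoRA adapters.

    Imports are local so the constants above can be inspected without
    transformers, peft, or a GPU installed.
    """
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    model = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",
    )
    # Prepares the quantized model for training (e.g. casts norm layers to fp32).
    model = prepare_model_for_kbit_training(model)
    return get_peft_model(model, LoraConfig(**LORA_KWARGS))
```

Calling `attach_lora(...)` with the identifier of a Llama 2 Chat HF checkpoint returns a PEFT model whose only trainable weights are the adapters; `print_trainable_parameters()` on the result reports how many that is.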
Update following the Code Llama 7B release
Meta has just released Code Llama 7B, and even a Code Llama - Python variant, which is trained on a larger Python dataset than Aria Code Light. We still believe Aria Code Light offers a more user-friendly approach by adding coding skills to a chat model. Many community users have noticed that models specialized in coding often lose "non-coding" and natural language performance. That being said, we encourage you to try both and use the model that fits your needs better; everything done for the open source community is always useful. Congratulations to the Meta team for achieving this new milestone in the "coding LLMs" area.
Contact
Support: contact@faradaylab.fr
The following bitsandbytes quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
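For readers who want to reproduce this setup, the listed values map onto the keyword arguments of `BitsAndBytesConfig` in the transformers library. The sketch below is one possible way to express them, not the authors' training script; the names `BNB_KWARGS` and `build_quantization_config` are introduced here for illustration. `quant_method` is omitted because the class sets it itself.

```python
# Values copied from the quantization config listed above.
BNB_KWARGS = {
    "load_in_8bit": True,
    "load_in_4bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
    # A dtype name given as a string is resolved to the torch dtype by the class.
    "bnb_4bit_compute_dtype": "float32",
}


def build_quantization_config():
    """Return a BitsAndBytesConfig mirroring the values above.

    The import is local so the dictionary can be inspected without
    transformers installed.
    """
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**BNB_KWARGS)
```

The resulting object can be passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained`. Since `load_in_8bit` is true and `load_in_4bit` is false, the `bnb_4bit_*` entries are simply the library defaults and have no effect on 8-bit loading.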
Framework versions
- PEFT 0.6.0.dev0