[To be released soon]
# BHASHA-7B-8K-HI
A 7B foundation language model pre-trained on Hindi text with an 8K context length. Weights are initialised from the MPT-7B-8K model, and the vocabulary is extended, with knowledge transferred to the new token embeddings within the embedding space.
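The exact transfer recipe has not been published yet. As a minimal sketch, the snippet below assumes one common approach: initialise each newly added token's embedding as the mean of the base model's embeddings for that token's sub-word pieces under the original vocabulary. The model id, token list, and transfer scheme here are illustrative assumptions, not the authors' method.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Start from the base model and tokenizer that BHASHA is initialised from.
tokenizer = AutoTokenizer.from_pretrained("mosaicml/mpt-7b-8k")
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b-8k",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # MPT repos ship custom modelling code
)

new_tokens = ["नमस्ते", "भारत"]  # placeholder Hindi tokens

# Decompose each new token under the ORIGINAL vocabulary before extending it.
pieces = [tokenizer.encode(t, add_special_tokens=False) for t in new_tokens]

tokenizer.add_tokens(new_tokens)
new_ids = tokenizer.convert_tokens_to_ids(new_tokens)

# Grow the embedding table if the new ids fall outside it.
if max(new_ids) >= model.get_input_embeddings().weight.shape[0]:
    model.resize_token_embeddings(max(new_ids) + 1)

emb = model.get_input_embeddings().weight.data
with torch.no_grad():
    for tok_id, piece_ids in zip(new_ids, pieces):
        # Transfer knowledge from the base embedding space: initialise the
        # new row as the mean of its sub-token embeddings.
        emb[tok_id] = emb[piece_ids].mean(dim=0)
```

Mean-of-sub-token initialisation keeps the new rows inside the distribution the transformer layers already expect, which typically makes continued pre-training converge faster than random initialisation.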
## Model Description
| Hyperparameter | Value |
|---|---|
| n_parameters | 6695735296 (6.69B) |
| n_layers | 32 |
| n_heads | 32 |
| d_model | 4096 |
| vocab size | 61772 |
| sequence length | 8192 |
This model is still being pre-trained. Updated weights, along with more details, will be available soon.
Follow us for updates on its progress.
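## Usage

Since the weights are not yet released, the example below is hypothetical: the repo id is a placeholder, and loading is assumed to follow the standard Hugging Face pattern for MPT-derived models (which require `trust_remote_code=True`).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<org>/BHASHA-7B-8K-HI"  # placeholder, not the final repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # assumes MPT-style custom modelling code
)

prompt = "भारत की राजधानी"  # "The capital of India"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```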