[To be released soon]
BHASHA-7B-2K-HI
A 7B foundation language model pre-trained on Hindi text with a 2048-token context size. Weights are initialised from the bhasha-7b-256-hi model. It uses an extended vocabulary, with knowledge transferred to the new tokens within the embedding space.
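One common way to transfer knowledge into an extended vocabulary is to keep the original embedding rows and initialise each new token's row from the embeddings of the old-tokenizer subwords it decomposes into, rather than from random noise. The sketch below illustrates that idea with a mean-of-subwords initialisation; the `decompose` mapping and function names are illustrative assumptions, not the exact method used for this model.

```python
import numpy as np

def init_extended_embeddings(old_emb, new_vocab_size, decompose):
    """Initialise an embedding matrix for an extended vocabulary.

    old_emb:        (old_vocab_size, d_model) trained embedding matrix
    new_vocab_size: size of the extended vocabulary (>= old_vocab_size)
    decompose:      maps a new token id to the list of old-vocab subword
                    ids it splits into under the original tokenizer
                    (illustrative assumption, not the model's actual API)
    """
    old_vocab_size, d_model = old_emb.shape
    new_emb = np.empty((new_vocab_size, d_model), dtype=old_emb.dtype)
    # Original tokens keep their trained embeddings unchanged.
    new_emb[:old_vocab_size] = old_emb
    # Each new token starts at the mean of its old-subword embeddings,
    # placing it in a meaningful region of the embedding space.
    for tok_id in range(old_vocab_size, new_vocab_size):
        pieces = decompose(tok_id)
        new_emb[tok_id] = old_emb[pieces].mean(axis=0)
    return new_emb

# Toy usage: 4 old tokens, 2 new tokens, d_model = 2.
old_emb = np.array([[0., 1.], [2., 3.], [4., 5.], [6., 7.]])
decompose = lambda tid: {4: [0, 2], 5: [1, 3]}[tid]
new_emb = init_extended_embeddings(old_emb, 6, decompose)
```

After this initialisation the extended model produces sensible logits for the new tokens from the first pre-training step, which typically speeds up convergence compared with random initialisation.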
Model Description
| Hyperparameter | Value |
|---|---|
| n_parameters | 6,695,735,296 (6.69B) |
| n_layers | 32 |
| n_heads | 32 |
| d_model | 4096 |
| vocab size | 61,772 |
| sequence length | 2048 |
This model is still being pre-trained. Updated weights, along with more details, will be available soon.
Follow us for updates on the progress.