Training procedure

The following bitsandbytes quantization config was used during training:

Framework versions

Training and validation loss

Epoch | Training Loss | Validation Loss
1 | 0.573400 | 0.595536
2 | 0.476500 | 0.506768
3 | 0.421000 | 0.472346
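
The figures above can be read more easily as per-epoch differences. The following is a minimal sketch, not part of the original training code: it copies the three rows from the table and computes the gap between validation and training loss, plus the drop in validation loss from one epoch to the next. The names `history` and `summarize` are illustrative assumptions.

```python
# Loss values copied from the table: (epoch, training loss, validation loss).
history = [
    (1, 0.573400, 0.595536),
    (2, 0.476500, 0.506768),
    (3, 0.421000, 0.472346),
]


def summarize(rows):
    """Return, per epoch, the validation-minus-training gap and the
    decrease in validation loss relative to the previous epoch."""
    summary = []
    prev_val = None
    for epoch, train, val in rows:
        gap = val - train
        # No previous epoch exists for the first row, so the drop is None.
        val_drop = None if prev_val is None else prev_val - val
        summary.append({"epoch": epoch, "gap": gap, "val_drop": val_drop})
        prev_val = val
    return summary


for row in summarize(history):
    drop = "n/a" if row["val_drop"] is None else f"{row['val_drop']:.6f}"
    print(f"epoch {row['epoch']}: gap={row['gap']:.6f}, val_drop={drop}")
```

On these numbers, validation loss falls every epoch (by about 0.089, then 0.034), while the gap between validation and training loss widens from about 0.022 to 0.051.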