The aim is to compress the mT5-small model so that it retains only Ukrainian and some basic English.
This reproduces a similar result (for a different language) from this Medium article.
Results:
- Parameters: 300M -> 75M (75% reduction)
- Vocabulary: 250K tokens -> 8,900 tokens
- Model size: 1.1 GB -> 0.3 GB
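
The drop in token count suggests that most of the savings come from shrinking the vocabulary. A minimal sketch of that idea, assuming the usual approach of keeping only the embedding rows for retained tokens and remapping token ids, is shown below. The function names, the toy matrix, and the choice of kept ids are all hypothetical and are not taken from the actual conversion script.

```python
# Illustrative sketch only: toy vocabulary pruning on a plain-Python
# "embedding matrix" (one row per token id). Names here are hypothetical.

def prune_embeddings(embeddings, kept_ids):
    """Keep only the rows listed in kept_ids, in that order.

    Returns the smaller matrix and a mapping old_id -> new_id.
    """
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    pruned = [list(embeddings[old]) for old in kept_ids]
    return pruned, old_to_new


def remap_ids(token_ids, old_to_new, unk_new_id):
    """Translate old token ids to new ones; dropped tokens become <unk>."""
    return [old_to_new.get(t, unk_new_id) for t in token_ids]


# Toy example: 6 tokens with 2-dimensional embeddings.
full_matrix = [
    [0.0, 0.1],  # id 0: <pad>
    [1.0, 1.1],  # id 1: </s>
    [2.0, 2.1],  # id 2: <unk>
    [3.0, 3.1],  # id 3: token from a language we drop
    [4.0, 4.1],  # id 4: token from a language we drop
    [5.0, 5.1],  # id 5: token we keep
]
kept = [0, 1, 2, 5]  # assumption: special tokens are always kept

small_matrix, id_map = prune_embeddings(full_matrix, kept)
new_ids = remap_ids([5, 3, 1], id_map, unk_new_id=id_map[2])
```

In a real model the same row selection would be applied to both the input embeddings and the output projection, and the tokenizer would have to be rebuilt so that it emits the new ids.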