This is the mt5-base-romanian base model (390M parameters), fine-tuned to restore Romanian diacritics.
The model was fine-tuned on the Romanian diacritics dataset for 150k steps with a batch size of 8. The encoder and decoder sequence lengths are both 256. It was trained with the following scripts.
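To make the task concrete, here is a minimal sketch of how an (input, target) pair for diacritics restoration could be built from correctly written Romanian text. This is an illustration only, not necessarily how the dataset above was prepared; the helper names `strip_diacritics` and `make_pair` are hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks, e.g. 'ă' -> 'a', 'ș' -> 's', 'ț' -> 't'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters whose combining class is 0 (the base characters).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def make_pair(sentence: str) -> tuple:
    """Return (model input without diacritics, target with diacritics)."""
    return strip_diacritics(sentence), sentence


if __name__ == "__main__":
    source, target = make_pair("A început să îi taie un fir de păr")
    print(source)  # A inceput sa ii taie un fir de par
    print(target)  # A început să îi taie un fir de păr
```

Stripping is lossy: both "fata" and "față" become "fata", which is why the model needs sentence context to choose the right form.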
How to load the fine-tuned mt5x model
from transformers import MT5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = MT5ForConditionalGeneration.from_pretrained('iliemihai/mt5-base-romanian-diacritics')
tokenizer = T5Tokenizer.from_pretrained('iliemihai/mt5-base-romanian-diacritics')

# Romanian text written without diacritics
input_text = "A inceput sa ii taie un fir de par, iar fata sta in fata, tine camasa de in in mana si canta nota SI."
# Truncate the input to the 256-token encoder length used during fine-tuning
inputs = tokenizer(input_text, max_length=256, truncation=True, return_tensors="pt")
# max_length=256 matches the decoder length; without it, generate() may stop at a shorter default
outputs = model.generate(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"], max_length=256)
output = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(output)
# Expected output: "A început să îi taie un fir de păr, iar fata stă în față, ține cămașa de in în mână și cântă nota SI"
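Because inputs are truncated to 256 tokens, longer documents would lose their tail if passed in one call. One possible workaround, sketched below under stated assumptions, is to split the text into word-based chunks and restore each chunk separately. The helpers `split_into_chunks` and `restore_long_text` and the 60-word budget are hypothetical choices, not part of the model or the transformers library; the word budget is only a rough proxy for the token limit.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int = 60) -> List[str]:
    """Group whitespace-separated words into chunks of at most max_words words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def restore_long_text(text: str, restore: Callable[[str], str], max_words: int = 60) -> str:
    """Apply `restore` to each chunk and join the results with single spaces."""
    return " ".join(restore(chunk) for chunk in split_into_chunks(text, max_words))
```

Here `restore` would be any function that maps a short string to its diacritized form, for example a small wrapper around the tokenizer, `generate`, and `decode` calls shown above. Note that this sketch collapses all whitespace (including line breaks) to single spaces, and that splitting mid-sentence removes context the model may need at chunk boundaries.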
Evaluation
Evaluation results will be added here soon.
Acknowledgements
We'd like to thank TPU Research Cloud for providing the TPUv3 cores we used to train these models!
Authors
Yours truly,