Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
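The names above match keyword arguments of `Seq2SeqTrainingArguments` in Hugging Face Transformers (`predict_with_generate` is a field of that class), so the run was presumably configured through it. The following is a minimal sketch under that assumption, not the script actually used for this model. The helper `build_training_args` and its `output_dir` parameter are hypothetical names introduced for illustration.

```python
# Hyperparameters listed in this card, collected as keyword arguments.
hyperparameters = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,
    "num_train_epochs": 3,
    "predict_with_generate": True,
    "fp16": True,
}


def build_training_args(output_dir):
    """Hypothetical helper: build Seq2SeqTrainingArguments from the values above.

    Assumes the `transformers` package is installed. fp16=True typically
    needs a supported GPU, so this call may fail on a CPU-only machine.
    """
    # Imported lazily so the dictionary above is usable without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **hyperparameters)
```

Any setting not listed in this card (evaluation schedule, logging, output directory) would have to be supplied separately.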
Training Output
global_step=3003,
training_loss=2.0113779983241042,
metrics={'train_runtime': 12268.4376,
'train_samples_per_second': 3.427,
'train_steps_per_second': 0.245,
'total_flos': 1.2147019450889011e+17,
'train_loss': 2.0113779983241042,
'epoch': 3.0}
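The reported figures can be cross-checked against each other with a little arithmetic. The sketch below recomputes the step rate from `global_step` and `train_runtime`, and estimates the number of training samples per epoch. It assumes a single device and no gradient accumulation, neither of which is stated in this card, and `train_samples_per_second` is a rounded value, so the sample counts are approximate.

```python
# Values copied from the training output above.
global_step = 3003
train_runtime = 12268.4376        # seconds
train_samples_per_second = 3.427  # rounded in the log
num_train_epochs = 3
per_device_train_batch_size = 14

# Step rate recomputed from the raw counters.
steps_per_second = global_step / train_runtime
steps_per_epoch = global_step // num_train_epochs
runtime_hours = train_runtime / 3600

# Approximate number of samples seen per epoch, from the logged throughput.
samples_per_epoch = train_samples_per_second * train_runtime / num_train_epochs

# Assumption: one device, no gradient accumulation, so each step
# consumes at most one batch of 14 samples.
max_samples_per_epoch = steps_per_epoch * per_device_train_batch_size

print(f"steps/s: {steps_per_second:.3f}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"runtime: {runtime_hours:.2f} h")
print(f"samples per epoch (estimated): {samples_per_epoch:.0f}")
print(f"samples per epoch (upper bound from steps): {max_samples_per_epoch}")
```

Under these assumptions the two per-epoch sample estimates agree to within rounding, which is consistent with the logged `train_steps_per_second` of 0.245.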
Training Results
| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1 | 2.035800 | 1.906599 | 0.365400 | 0.150500 | 0.243200 | 0.243500 | 0.366300 | 227.230300 |
| 2 | 1.976100 | 1.878923 | 0.393700 | 0.167800 | 0.263500 | 0.263800 | 0.423600 | 193.114200 |
| 3 | 1.956800 | 1.871454 | 0.409300 | 0.175100 | 0.273400 | 0.273600 | 0.457000 | 172.294500 |
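To make the trend across epochs easier to read, the sketch below computes the absolute and relative change of selected metrics between the first and last epoch. The numbers are copied from the results table above. The dictionary keys and the `change` helper are names chosen for this illustration and are not part of any library.

```python
# Per-epoch evaluation metrics copied from the results table.
results = [
    {"epoch": 1, "val_loss": 1.906599, "rouge1": 0.3654, "rouge2": 0.1505,
     "rougeL": 0.2432, "bleu": 0.3663, "gen_len": 227.2303},
    {"epoch": 2, "val_loss": 1.878923, "rouge1": 0.3937, "rouge2": 0.1678,
     "rougeL": 0.2635, "bleu": 0.4236, "gen_len": 193.1142},
    {"epoch": 3, "val_loss": 1.871454, "rouge1": 0.4093, "rouge2": 0.1751,
     "rougeL": 0.2734, "bleu": 0.4570, "gen_len": 172.2945},
]


def change(metric):
    """Return (absolute, relative) change of a metric from first to last epoch."""
    first = results[0][metric]
    last = results[-1][metric]
    absolute = last - first
    return absolute, absolute / first


for metric in ("val_loss", "rouge1", "rouge2", "rougeL", "bleu", "gen_len"):
    absolute, relative = change(metric)
    print(f"{metric:>8}: {absolute:+.4f} ({relative:+.1%})")
```

Validation loss and generation length fall over the three epochs while the ROUGE and BLEU scores rise, so the signs of the printed changes differ by metric.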