Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
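
The settings above can be sketched as keyword arguments for the Hugging Face `transformers` library. This is a minimal sketch, not the original training script: the use of `Seq2SeqTrainingArguments` is inferred from the `predict_with_generate` flag, and `output_dir` is a hypothetical placeholder that does not appear in this card.

```python
# Hyperparameters exactly as listed in this card.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,      # keep at most three checkpoints on disk
    "num_train_epochs": 3,
    "predict_with_generate": True,  # run generation during evaluation
    "fp16": True,               # mixed precision; expects a CUDA-capable GPU
}


def build_training_args(output_dir="./results"):
    """Build Seq2SeqTrainingArguments from the listed hyperparameters.

    `output_dir` is a hypothetical placeholder, not a value from the card.
    The import is deferred so the dictionary can be inspected without
    `transformers` installed.
    """
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Any option not listed here (scheduler, warmup, evaluation strategy, and so on) is left at the library default in this sketch, which may differ from what the original run used.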
Training Output
global_step=7710,
training_loss=2.8554159399445727,
metrics={'train_runtime': 21924.7566,
'train_samples_per_second': 4.923,
'train_steps_per_second': 0.352,
'total_flos': 2.3807388210639667e+17,
'train_loss': 2.8554159399445727,
'epoch': 3.0}
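
As a quick sanity check, the reported throughput figures can be cross-checked against the batch size with a little arithmetic. The sketch below assumes a single device and no gradient accumulation; neither is stated in this card, although the ratio of samples per second to steps per second is consistent with that assumption.

```python
# Values copied from the training output and hyperparameters in this card.
global_step = 7710
train_runtime_s = 21924.7566
samples_per_s = 4.923
steps_per_s = 0.352
num_epochs = 3
per_device_batch = 14

# Optimizer steps taken in each epoch.
steps_per_epoch = global_step / num_epochs

# Samples consumed per optimizer step, recovered from the two throughput rates.
samples_per_step = samples_per_s / steps_per_s

# Wall-clock training time in hours.
runtime_hours = train_runtime_s / 3600

# Upper bound on training-set size (the last batch of an epoch may be partial).
# Assumes one device and no gradient accumulation.
max_samples_per_epoch = steps_per_epoch * per_device_batch

print(f"steps per epoch:       {steps_per_epoch:.0f}")
print(f"samples per step:      {samples_per_step:.2f}")
print(f"runtime (hours):       {runtime_hours:.2f}")
print(f"max samples per epoch: {max_samples_per_epoch:.0f}")
```

The recovered samples-per-step value rounds to the configured batch size of 14, and the run took roughly six hours.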
Training Results
Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Gen Len
1 | 2.981200 | 2.831641 | 0.414500 | 0.147000 | 0.230700 | 0.230600 | 0.512800 | 140.734900
2 | 2.800900 | 2.789402 | 0.417300 | 0.148400 | 0.231800 | 0.231700 | 0.516000 | 141.158200
3 | 2.680300 | 2.780862 | 0.418300 | 0.148400 | 0.232200 | 0.232100 | 0.516800 | 140.872300
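
The per-epoch results can be summarized with a few derived quantities. The sketch below copies the losses and two of the metrics from the results above and computes the change from epoch 1 to epoch 3. The perplexity figure is an assumption-laden interpretation: it treats the validation loss as a mean token-level cross-entropy in nats, which this card does not state explicitly.

```python
import math

# Per-epoch values copied from the training results in this card.
results = [
    {"epoch": 1, "train_loss": 2.9812, "val_loss": 2.831641, "rouge1": 0.4145, "bleu": 0.5128},
    {"epoch": 2, "train_loss": 2.8009, "val_loss": 2.789402, "rouge1": 0.4173, "bleu": 0.5160},
    {"epoch": 3, "train_loss": 2.6803, "val_loss": 2.780862, "rouge1": 0.4183, "bleu": 0.5168},
]

first, last = results[0], results[-1]

# Improvement between the first and last epoch.
val_loss_drop = first["val_loss"] - last["val_loss"]
rouge1_gain = last["rouge1"] - first["rouge1"]
bleu_gain = last["bleu"] - first["bleu"]

# Gap between validation and training loss at the final epoch.
final_gap = last["val_loss"] - last["train_loss"]

# Assumption: the loss is mean token-level cross-entropy in nats.
val_perplexity = math.exp(last["val_loss"])

print(f"validation loss drop: {val_loss_drop:.6f}")
print(f"Rouge1 gain:          {rouge1_gain:.4f}")
print(f"Bleu gain:            {bleu_gain:.4f}")
print(f"final val-train gap:  {final_gap:.6f}")
print(f"validation perplexity (assumed): {val_perplexity:.2f}")
```

Under these numbers, validation loss and the overlap metrics improve only slightly after the first epoch, while training loss keeps falling, so the training loss ends below the validation loss by epoch 3.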