Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
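These names match keyword arguments of Hugging Face `Seq2SeqTrainingArguments`. The sketch below shows one way they might be passed. The original training script is not shown here, so the helper `build_training_args` and its `output_dir` argument are hypothetical. Note that `fp16=True` generally needs a CUDA-capable GPU.

```python
# Hyperparameters as listed above, collected in a plain dict.
HPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,
    "num_train_epochs": 3,
    "predict_with_generate": True,
    "fp16": True,
}


def build_training_args(output_dir: str):
    """Hypothetical helper: wrap HPARAMS in Seq2SeqTrainingArguments."""
    # Imported lazily so the dict above stays usable without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HPARAMS)
```

The resulting object would then be handed to a `Seq2SeqTrainer` together with the model, tokenizer, and datasets, none of which are specified in this section.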
Training Output
global_step=3003,
training_loss=2.5178213735600132,
metrics={'train_runtime': 8703.174,
'train_samples_per_second': 4.83,
'train_steps_per_second': 0.345,
'total_flos': 9.272950245870797e+16,
'train_loss': 2.5178213735600132,
'epoch': 3.0}
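The reported figures can be cross-checked with a little arithmetic. The sketch below uses only numbers from the output and hyperparameters above. It assumes a single device and no gradient accumulation, neither of which is stated in the source.

```python
import math

# Values copied from the training output above.
global_step = 3003
train_runtime = 8703.174        # seconds
samples_per_second = 4.83
steps_per_second = 0.345
num_epochs = 3
per_device_batch_size = 14      # from the hyperparameters

# Optimizer steps per epoch and the throughput they imply.
steps_per_epoch = global_step // num_epochs
derived_steps_per_second = global_step / train_runtime

# Approximate training-set size recovered from the sample throughput.
approx_samples_per_epoch = samples_per_second * train_runtime / num_epochs

# Assumption: one device, no gradient accumulation, so one step = one batch of 14.
implied_steps_per_epoch = math.ceil(approx_samples_per_epoch / per_device_batch_size)

runtime_hours = train_runtime / 3600

print(f"steps per epoch:          {steps_per_epoch}")
print(f"derived steps/second:     {derived_steps_per_second:.3f}")
print(f"approx samples per epoch: {approx_samples_per_epoch:.0f}")
print(f"implied steps per epoch:  {implied_steps_per_epoch}")
print(f"runtime in hours:         {runtime_hours:.2f}")
```

Under these assumptions, the step count implied by the sample throughput agrees with the reported `global_step`, which suggests the log is internally consistent.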
Training Results
| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1 | 2.661100 | 2.469111 | 0.451300 | 0.185200 | 0.279000 | 0.278900 | 0.553300 | 141.720300 |
| 2 | 2.434100 | 2.403647 | 0.456900 | 0.192800 | 0.284500 | 0.284500 | 0.556800 | 141.763100 |
| 3 | 2.313700 | 2.393932 | 0.459500 | 0.194400 | 0.286300 | 0.286200 | 0.559200 | 141.571600 |
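The per-epoch metrics above can be summarized programmatically. The sketch below picks the epoch with the lowest validation loss and converts each validation loss to perplexity. That conversion assumes the loss is a mean per-token cross-entropy in nats, which is typical for this kind of training but is not stated in the source.

```python
import math

# Per-epoch values copied from the results table above.
results = [
    {"epoch": 1, "train_loss": 2.6611, "val_loss": 2.469111, "rouge1": 0.4513, "rouge2": 0.1852},
    {"epoch": 2, "train_loss": 2.4341, "val_loss": 2.403647, "rouge1": 0.4569, "rouge2": 0.1928},
    {"epoch": 3, "train_loss": 2.3137, "val_loss": 2.393932, "rouge1": 0.4595, "rouge2": 0.1944},
]

# Epoch with the lowest validation loss.
best = min(results, key=lambda r: r["val_loss"])

# Assumption: val_loss is mean token-level cross-entropy in nats, so perplexity = exp(loss).
for r in results:
    r["val_perplexity"] = math.exp(r["val_loss"])
    print(f"epoch {r['epoch']}: val_loss={r['val_loss']:.4f}, perplexity={r['val_perplexity']:.2f}")

# Absolute ROUGE-1 gain from the first to the last epoch.
rouge1_gain = results[-1]["rouge1"] - results[0]["rouge1"]
print(f"best epoch by validation loss: {best['epoch']}")
print(f"ROUGE-1 gain over training: {rouge1_gain:.4f}")
```

Validation loss and the ROUGE scores both improve at every epoch, with smaller gains from epoch 2 to epoch 3 than from epoch 1 to epoch 2.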