Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
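As a minimal sketch, the settings above can be collected into a plain dictionary and handed to a trainer configuration. The values are copied from the list; the name `training_kwargs` and the `output_dir` placeholder are illustrative assumptions, and the use of `Seq2SeqTrainingArguments` is inferred from `predict_with_generate`, not stated in the source.

```python
# Hyperparameters as listed above, gathered into a plain dict.
# `training_kwargs` is a hypothetical name chosen for this sketch.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,
    "num_train_epochs": 3,
    "predict_with_generate": True,
    "fp16": True,  # mixed precision; typically needs a CUDA-capable GPU
}

# Assumed usage (requires the `transformers` package; `output_dir` is a
# placeholder, since the source does not give one):
# from transformers import Seq2SeqTrainingArguments
# args = Seq2SeqTrainingArguments(output_dir="output", **training_kwargs)
```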
Training Output
global_step=3003,
training_loss=1.8524150695953217,
metrics={'train_runtime': 2319.7329,
'train_samples_per_second': 18.122,
'train_steps_per_second': 1.295,
'total_flos': 9.110291036818637e+16,
'train_loss': 1.8524150695953217,
'epoch': 3.0}
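The reported numbers can be cross-checked against each other with a little arithmetic. The sketch below assumes a single device and no gradient accumulation (neither is stated in the source), so the training-set size it derives is an estimate rather than a reported figure.

```python
import math

# Values copied from the training output and hyperparameters above.
global_step = 3003
num_train_epochs = 3
train_runtime = 2319.7329          # seconds
train_samples_per_second = 18.122
batch_size = 14                    # per_device_train_batch_size

# Optimizer steps per epoch.
steps_per_epoch = global_step // num_train_epochs  # 1001

# Should match the reported train_steps_per_second after rounding.
steps_per_second = global_step / train_runtime

# Estimated training-set size (assumption: one device, no accumulation).
samples_per_epoch = train_samples_per_second * train_runtime / num_train_epochs

# Steps per epoch implied by that estimate and the batch size.
implied_steps = math.ceil(samples_per_epoch / batch_size)

print(steps_per_epoch, round(steps_per_second, 3), round(samples_per_epoch), implied_steps)
```

Under these assumptions the implied step count agrees with `global_step / num_train_epochs`, which suggests the training set holds roughly 14,000 examples.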
Training Results
| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Gen Len |
|-------|---------------|-----------------|--------|--------|--------|-----------|------|---------|
| 1 | 1.969100 | 1.756651 | 0.159100 | 0.088300 | 0.138800 | 0.138900 | 0.001600 | 20.000000 |
| 2 | 1.794000 | 1.699691 | 0.158500 | 0.090300 | 0.139500 | 0.139600 | 0.001400 | 20.000000 |
| 3 | 1.713700 | 1.687554 | 0.162700 | 0.091900 | 0.141800 | 0.141900 | 0.001660 | 20.000000 |