Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
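The names above match keyword arguments of `Seq2SeqTrainingArguments` in Hugging Face `transformers`, which is presumably how they were passed, although the source does not show the actual call. A minimal sketch under that assumption; `build_training_args` and the `./results` output path are hypothetical and not taken from the source:

```python
# Hyperparameters exactly as listed above.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,        # keep at most 3 checkpoints on disk
    "num_train_epochs": 3,
    "predict_with_generate": True,  # run generate() during evaluation
    "fp16": True,                 # mixed precision; typically requires a GPU
}


def build_training_args(output_dir="./results"):
    """Build Seq2SeqTrainingArguments from the listed values.

    `output_dir` is a hypothetical path, not taken from the source.
    The import is deferred so the dict can be inspected without
    transformers installed.
    """
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Keeping the values in a plain dict makes them easy to log or diff between runs before they are handed to the trainer.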
Training Output
global_step=7710
training_loss=2.1297076629757417
metrics={'train_runtime': 6059.0418,
'train_samples_per_second': 17.813,
'train_steps_per_second': 1.272,
'total_flos': 2.3389776681055027e+17,
'train_loss': 2.1297076629757417,
'epoch': 3.0}
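The reported numbers can be cross-checked against each other. The sketch below assumes a single device and no gradient accumulation (neither is stated in the source), so that one optimizer step consumes one batch of 14 samples:

```python
import math

# Values copied from the hyperparameters and training output above.
global_step = 7710
num_train_epochs = 3
per_device_batch_size = 14
train_runtime = 6059.0418      # seconds
samples_per_second = 17.813

# Optimizer steps per epoch, from the step counter.
steps_per_epoch = global_step // num_train_epochs

# Approximate training-set size, from throughput and runtime.
total_samples = samples_per_second * train_runtime
samples_per_epoch = total_samples / num_train_epochs

# Steps per epoch implied by that size (the last batch may be partial).
implied_steps = math.ceil(samples_per_epoch / per_device_batch_size)

# Throughput in steps, to compare with train_steps_per_second.
steps_per_second = global_step / train_runtime

print(steps_per_epoch)             # 2570
print(round(samples_per_epoch))    # 35977
print(implied_steps)               # 2570
print(round(steps_per_second, 3))  # 1.272
```

Under these assumptions the figures agree: roughly 35,977 training samples per epoch give 2,570 steps per epoch, and the total runtime of about 101 minutes gives the reported 1.272 steps per second.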
Training Results
Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Gen Len |
1 | 2.223100 | 2.038599 | 0.147400 | 0.054800 | 0.113500 | 0.113500 | 0.001400 | 20.000000 |
2 | 2.078100 | 2.009619 | 0.152900 | 0.057800 | 0.117000 | 0.117000 | 0.001600 | 20.000000 |
3 | 1.989000 | 2.006006 | 0.152900 | 0.057300 | 0.116700 | 0.116700 | 0.001700 | 20.000000 |
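The per-epoch results can also be compared programmatically. In the sketch below the values are transcribed from the results above; the dictionary keys and the two selection criteria (lowest validation loss, highest ROUGE-2) are illustrative choices, not something the source prescribes:

```python
# Per-epoch results transcribed from the training results above.
results = [
    {"epoch": 1, "train_loss": 2.2231, "val_loss": 2.038599,
     "rouge1": 0.1474, "rouge2": 0.0548, "rougeL": 0.1135,
     "bleu": 0.0014, "gen_len": 20.0},
    {"epoch": 2, "train_loss": 2.0781, "val_loss": 2.009619,
     "rouge1": 0.1529, "rouge2": 0.0578, "rougeL": 0.1170,
     "bleu": 0.0016, "gen_len": 20.0},
    {"epoch": 3, "train_loss": 1.9890, "val_loss": 2.006006,
     "rouge1": 0.1529, "rouge2": 0.0573, "rougeL": 0.1167,
     "bleu": 0.0017, "gen_len": 20.0},
]

# Two possible checkpoint-selection criteria.
best_by_val_loss = min(results, key=lambda r: r["val_loss"])
best_by_rouge2 = max(results, key=lambda r: r["rouge2"])

# True when every epoch reports the same generated length.
gen_len_constant = len({r["gen_len"] for r in results}) == 1

print(best_by_val_loss["epoch"])  # 3
print(best_by_rouge2["epoch"])    # 2
print(gen_len_constant)           # True
```

The two criteria disagree by one epoch, but the differences between epochs 2 and 3 are small on every metric. A generated length of exactly 20 in every epoch may indicate that generation stopped at a length cap (20 is the default `max_length` for generation in `transformers`); the source does not state the generation settings, so this is only a possibility.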