Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
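
For reference, here is a minimal sketch of how the values above might be passed to `Seq2SeqTrainingArguments` from the `transformers` library. The use of that class is inferred from `predict_with_generate`, which is one of its parameters. The `output_dir` value and the helper name `build_training_args` are hypothetical and not taken from the original run.

```python
# Hyperparameters exactly as listed above.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,
    "num_train_epochs": 3,
    "predict_with_generate": True,
    "fp16": True,  # mixed precision; typically needs a supported GPU
}


def build_training_args(output_dir="./results"):
    """Build Seq2SeqTrainingArguments from HYPERPARAMS.

    `output_dir` is a hypothetical path chosen for illustration only.
    """
    # Imported lazily so the dict can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Calling `build_training_args()` on a machine without a supported accelerator is likely to fail because of `fp16=True`; setting that entry to `False` should allow a CPU-only dry run.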
Training Output
global_step=4248,
training_loss=2.172659089111788,
metrics={'train_runtime': 3371.7912,
'train_samples_per_second': 17.633,
'train_steps_per_second': 1.26,
'total_flos': 1.2884303701396685e+17,
'train_loss': 2.172659089111788,
'epoch': 3.0}
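
The reported numbers can be cross-checked against each other with a little arithmetic. The sketch below assumes a single training device, no gradient accumulation, and that the last partial batch of each epoch is kept. None of these is stated in the output above, so the dataset-size estimate is only an inference.

```python
import math

# Values copied from the training output and hyperparameters above.
global_step = 4248
num_train_epochs = 3
train_runtime = 3371.7912          # seconds
train_samples_per_second = 17.633
per_device_train_batch_size = 14

# Optimizer steps per epoch (the division is exact for these numbers).
steps_per_epoch = global_step // num_train_epochs

# Should agree with the reported train_steps_per_second of 1.26.
steps_per_second = global_step / train_runtime

# Estimated training samples per epoch, recovered from throughput and runtime.
samples_per_epoch = train_samples_per_second * train_runtime / num_train_epochs

# Under the single-device assumption, this should match steps_per_epoch.
implied_steps_per_epoch = math.ceil(samples_per_epoch / per_device_train_batch_size)

print(f"steps per epoch:          {steps_per_epoch}")
print(f"steps per second:         {steps_per_second:.3f}")
print(f"est. samples per epoch:   {samples_per_epoch:.0f}")
print(f"implied steps per epoch:  {implied_steps_per_epoch}")
```

If the two step counts agree, the logged throughput, runtime, and step count are mutually consistent under the stated assumptions.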
Training Results
| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu | Gen Len |
|-------|---------------|-----------------|--------|--------|--------|-----------|------|---------|
| 1 | 2.318000 | 2.079500 | 0.128100 | 0.046700 | 0.104200 | 0.104200 | 0.001100 | 20.000000 |
| 2 | 2.130000 | 2.043523 | 0.130200 | 0.047400 | 0.105400 | 0.105300 | 0.001300 | 20.000000 |
| 3 | 2.047100 | 2.034664 | 0.130700 | 0.047800 | 0.105900 | 0.105900 | 0.001300 | 20.000000 |
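
To see how much each epoch moved the metrics, the per-epoch values above can be differenced. This is a small illustrative sketch; the helper name `deltas` is hypothetical, and only the validation loss and Rouge1 columns are included.

```python
# Per-epoch values copied from the results table above.
val_loss = [2.079500, 2.043523, 2.034664]
rouge1 = [0.128100, 0.130200, 0.130700]


def deltas(values):
    """Return the change between consecutive epochs, rounded to 6 decimals."""
    return [round(b - a, 6) for a, b in zip(values, values[1:])]


val_loss_deltas = deltas(val_loss)
rouge1_deltas = deltas(rouge1)

# Relative drop in validation loss from epoch 1 to epoch 3.
relative_val_loss_drop = (val_loss[0] - val_loss[-1]) / val_loss[0]

print("validation loss deltas:", val_loss_deltas)
print("Rouge1 deltas:         ", rouge1_deltas)
print(f"relative validation loss drop: {relative_val_loss_drop:.2%}")
```

Both columns change less between epochs 2 and 3 than between epochs 1 and 2, which suggests the gains were tapering off by the end of training.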