Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
Training Output
global_step=4248,
training_loss=2.930363613782405,
metrics={'train_runtime': 11857.8062,
'train_samples_per_second': 5.014,
'train_steps_per_second': 0.358,
'total_flos': 1.3114345819786445e+17,
'train_loss': 2.930363613782405,
'epoch': 3.0}
Training Results
Epoch |
Training Loss |
Validation Loss |
Rouge1 |
Rouge2 |
Rougel |
Rougelsum |
Bleu |
Gen Len |
1 |
3.095400 |
2.864138 |
0.425500 |
0.139000 |
0.246300 |
0.246300 |
0.541400 |
141.540900 |
2 |
2.876500 |
2.811244 |
0.425600 |
0.139100 |
0.246500 |
0.246400 |
0.541600 |
141.619000 |
3 |
2.748300 |
2.797923 |
0.425800 |
0.138700 |
0.246400 |
0.246300 |
0.541800 |
141.597000 |