# flan-t5-large-da-multiwoz2.1_800-loss-ep50
This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unspecified dataset (the Trainer did not record it; the model name suggests MultiWOZ 2.1 dialogue-act data). It achieves the following results on the evaluation set:
- Loss: 0.3385
- Accuracy: 43.6109
- Num: 7365 (number of evaluation examples)
- Gen Len: 15.5064
## Model description
More information needed
## Intended uses & limitations
More information needed
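No usage example is documented in this card. As a minimal sketch, the checkpoint could be loaded like any Flan-T5 seq2seq model; the `MODEL_ID` below is an assumption (the card does not say where the checkpoint is published), and the expected input/output format of the fine-tuned model is likewise undocumented.

```python
def generate_dialogue_acts(text, tokenizer, model, max_new_tokens=32):
    """Run one seq2seq generation step and decode the result to a string."""
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Heavy part: downloads ~3 GB of weights. MODEL_ID is hypothetical --
    # substitute the actual hub id or a local checkpoint directory.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    MODEL_ID = "flan-t5-large-da-multiwoz2.1_800-loss-ep50"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    print(generate_dialogue_acts("I need a cheap hotel in the centre.", tokenizer, model))
```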
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 24
- eval_batch_size: 192
- seed: 1799
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
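The hyperparameters above could be reproduced with `Seq2SeqTrainingArguments`; this is a sketch only. The `output_dir` is assumed, and the card does not say whether the reported batch sizes are per device or totals across devices.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration listed above; assumes the reported
# batch sizes are per-device values on a single-GPU setup.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-da-multiwoz2.1_800-loss-ep50",
    learning_rate=2e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=192,
    seed=1799,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    predict_with_generate=True,  # needed for the generation-length metric
)
```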
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Num | Gen Len |
|---|---|---|---|---|---|---|
| 1.1303 | 0.89 | 200 | 0.4894 | 28.91 | 7365 | 13.4421 |
| 0.5193 | 1.79 | 400 | 0.4112 | 32.6525 | 7365 | 14.586 |
| 0.4567 | 2.68 | 600 | 0.3846 | 35.2014 | 7365 | 15.2893 |
| 0.4231 | 3.57 | 800 | 0.3655 | 36.9293 | 7365 | 15.5405 |
| 0.3955 | 4.46 | 1000 | 0.3582 | 38.2589 | 7365 | 14.8105 |
| 0.3796 | 5.36 | 1200 | 0.3540 | 38.9391 | 7365 | 15.7322 |
| 0.3709 | 6.25 | 1400 | 0.3481 | 40.4378 | 7365 | 14.8407 |
| 0.3561 | 7.14 | 1600 | 0.3451 | 40.4357 | 7365 | 15.2813 |
| 0.3471 | 8.04 | 1800 | 0.3424 | 41.2701 | 7365 | 15.6809 |
| 0.3379 | 8.93 | 2000 | 0.3390 | 41.742 | 7365 | 15.2439 |
| 0.3319 | 9.82 | 2200 | 0.3389 | 41.7948 | 7365 | 15.4334 |
| 0.321 | 10.71 | 2400 | 0.3431 | 41.8954 | 7365 | 15.5404 |
| 0.3153 | 11.61 | 2600 | 0.3402 | 42.7186 | 7365 | 15.7551 |
| 0.3075 | 12.5 | 2800 | 0.3398 | 42.3081 | 7365 | 14.991 |
| 0.3043 | 13.39 | 3000 | 0.3405 | 42.7562 | 7365 | 15.7135 |
| 0.2981 | 14.29 | 3200 | 0.3385 | 43.6109 | 7365 | 15.5064 |
| 0.2912 | 15.18 | 3400 | 0.3430 | 43.2042 | 7365 | 15.4578 |
| 0.2864 | 16.07 | 3600 | 0.3449 | 43.3342 | 7365 | 15.6775 |
| 0.2797 | 16.96 | 3800 | 0.3441 | 43.8748 | 7365 | 15.6739 |
| 0.2738 | 17.86 | 4000 | 0.3477 | 43.7114 | 7365 | 15.7276 |
| 0.268 | 18.75 | 4200 | 0.3464 | 43.4139 | 7365 | 15.3885 |
### Framework versions
- Transformers 4.18.0
- Pytorch 1.10.0+cu111
- Datasets 2.5.1
- Tokenizers 0.12.1