<!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->
flan-t5-large-da-multiwoz2.1_400-ep20-nonstop
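This model is a fine-tuned version of google/flan-t5-large. The Trainer did not record the name of the training dataset (the generated card lists it as None); the model name suggests MultiWOZ 2.1, but that is not confirmed here. It achieves the following results on the evaluation set: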
- Loss: 0.3688
- Accuracy: 41.0146
- Num: 7365
- Gen Len: 15.7041
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 64
- seed: 1799
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
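
The settings above can be sketched as keyword arguments, together with the learning rate that a linear schedule would produce at a given step. This is an illustration rather than the original training script. The key names follow `transformers.TrainingArguments`, the mapping of `train_batch_size` to a per-device value assumes a single device, the step count per epoch is an estimate taken from the training results table, and zero warmup steps is an assumption because the card does not list a warmup setting.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 2e-05
NUM_EPOCHS = 20

# Assumption: about 172 optimizer steps per epoch, estimated from the
# training results table (step 3400 is logged at epoch 19.77).
STEPS_PER_EPOCH = 172
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 3440

# Key names follow transformers.TrainingArguments. Treating the reported
# batch sizes as per-device values assumes single-device training.
training_kwargs = {
    "learning_rate": LEARNING_RATE,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 64,
    "seed": 1799,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": NUM_EPOCHS,
}


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=LEARNING_RATE, warmup_steps=0):
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 1000, 2200, 3400):
        print(step, linear_lr(step))
```

With `transformers` installed, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir=..., **training_kwargs)`, where the output directory is left for the reader to choose.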
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Num | Gen Len |
---|---|---|---|---|---|---|
1.182 | 1.16 | 200 | 0.5155 | 29.0366 | 7365 | 13.7817 |
0.5435 | 2.33 | 400 | 0.4347 | 32.9943 | 7365 | 15.3553 |
0.4654 | 3.49 | 600 | 0.4011 | 35.0331 | 7365 | 15.9184 |
0.4345 | 4.65 | 800 | 0.3842 | 37.1184 | 7365 | 15.9117 |
0.4054 | 5.81 | 1000 | 0.3758 | 37.8867 | 7365 | 15.1696 |
0.3883 | 6.98 | 1200 | 0.3717 | 38.2739 | 7365 | 15.8155 |
0.3694 | 8.14 | 1400 | 0.3719 | 39.4091 | 7365 | 16.389 |
0.3573 | 9.3 | 1600 | 0.3670 | 39.8173 | 7365 | 15.5985 |
0.3465 | 10.47 | 1800 | 0.3664 | 40.0755 | 7365 | 15.9217 |
0.3404 | 11.63 | 2000 | 0.3662 | 39.9384 | 7365 | 16.0745 |
0.329 | 12.79 | 2200 | 0.3644 | 41.1361 | 7365 | 15.4292 |
0.3176 | 13.95 | 2400 | 0.3691 | 40.7146 | 7365 | 15.7916 |
0.3129 | 15.12 | 2600 | 0.3701 | 40.647 | 7365 | 15.524 |
0.3153 | 16.28 | 2800 | 0.3679 | 40.7364 | 7365 | 15.6447 |
0.3073 | 17.44 | 3000 | 0.3681 | 40.9193 | 7365 | 15.6842 |
0.3003 | 18.6 | 3200 | 0.3676 | 41.0059 | 7365 | 15.693 |
0.2988 | 19.77 | 3400 | 0.3688 | 41.0041 | 7365 | 15.7045 |
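
To compare checkpoints, the table can be loaded into a small script. The sketch below copies the step, epoch, validation loss and accuracy columns from the rows above, then finds the evaluation step with the lowest validation loss and the one with the highest accuracy. It also estimates the number of optimizer steps per epoch from the last logged row. The variable names are illustrative and are not part of the original training code.

```python
# (step, epoch, validation_loss, accuracy) copied from the table above.
ROWS = [
    (200, 1.16, 0.5155, 29.0366),
    (400, 2.33, 0.4347, 32.9943),
    (600, 3.49, 0.4011, 35.0331),
    (800, 4.65, 0.3842, 37.1184),
    (1000, 5.81, 0.3758, 37.8867),
    (1200, 6.98, 0.3717, 38.2739),
    (1400, 8.14, 0.3719, 39.4091),
    (1600, 9.3, 0.3670, 39.8173),
    (1800, 10.47, 0.3664, 40.0755),
    (2000, 11.63, 0.3662, 39.9384),
    (2200, 12.79, 0.3644, 41.1361),
    (2400, 13.95, 0.3691, 40.7146),
    (2600, 15.12, 0.3701, 40.647),
    (2800, 16.28, 0.3679, 40.7364),
    (3000, 17.44, 0.3681, 40.9193),
    (3200, 18.6, 0.3676, 41.0059),
    (3400, 19.77, 0.3688, 41.0041),
]

# Row with the lowest validation loss and row with the highest accuracy.
best_loss = min(ROWS, key=lambda row: row[2])
best_acc = max(ROWS, key=lambda row: row[3])

# Rough estimate of optimizer steps per epoch from the last logged row.
last_step, last_epoch = ROWS[-1][0], ROWS[-1][1]
steps_per_epoch = round(last_step / last_epoch)

if __name__ == "__main__":
    print("lowest validation loss:", best_loss)
    print("highest accuracy:", best_acc)
    print("estimated steps per epoch:", steps_per_epoch)
```

In these rows both criteria pick step 2200, and validation loss stays within a narrow band afterwards while training loss keeps falling.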
Framework versions
- Transformers 4.18.0
- Pytorch 1.10.0+cu111
- Datasets 2.5.1
- Tokenizers 0.12.1
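
When reproducing the run, it can help to check the local environment against the versions listed above. The following is a minimal sketch using only the standard library. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are assumed here, and `version_mismatches` is a hypothetical helper, not part of any library.

```python
from importlib import metadata

# Versions copied from the list above; keys are the assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.18.0",
    "torch": "1.10.0+cu111",
    "datasets": "2.5.1",
    "tokenizers": "0.12.1",
}


def version_mismatches(expected=EXPECTED):
    """Return {name: (wanted, installed)} for packages that differ or are missing.

    `installed` is None when the package is not installed.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in version_mismatches().items():
        print(f"{name}: expected {wanted}, found {installed}")
```

An exact string comparison is deliberately strict. A newer release may still work, but results could differ from the numbers reported in this card.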