evjvqa_mt5_vit_16
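This model is a fine-tuned version of google/mt5-base. The training dataset was not recorded when this card was generated. It achieves the following results on the evaluation set (these match the last logged step, 2660, in the training results table):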
- Loss: 1.2997
- F1: 0.4194
- Bleu4: 0.3783
- Mean Pred Len: 14.85
- Mean Label Len: 15.25
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 40
- eval_batch_size: 40
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 80
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 10
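The sketch below shows how the listed values fit together: the effective batch size (40 × 2 = 80, which matches total_train_batch_size on a single device) and the shape of a linear schedule with a 0.2 warmup ratio. It is an illustration only, not the training code. `steps_per_epoch = 267` is an assumption inferred from the step and epoch columns of the results table, and the helper names `effective_batch_size` and `linear_schedule_lr` are hypothetical.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 40
gradient_accumulation_steps = 2
learning_rate = 5e-05
warmup_ratio = 0.2
num_epochs = 10

# ASSUMPTION: roughly 267 optimizer steps per epoch, inferred from the
# results table (step 2660 is logged at epoch 9.96). Not stated in the card.
steps_per_epoch = 267


def effective_batch_size(per_step, accum_steps, num_devices=1):
    """Samples contributing to one optimizer update (hypothetical helper)."""
    return per_step * accum_steps * num_devices


def linear_schedule_lr(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(train_batch_size, gradient_accumulation_steps))
    for step in (0, warmup_steps // 2, warmup_steps, total_steps):
        lr = linear_schedule_lr(step, total_steps, warmup_steps, learning_rate)
        print(f"step {step:>4}: lr = {lr:.2e}")
```

Under these assumptions the learning rate would peak roughly two epochs into training, which is about where the validation loss in the results table stops falling steeply.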
Training results
Training Loss | Epoch | Step | Validation Loss | F1 | Bleu4 | Mean Pred Len | Mean Label Len |
---|---|---|---|---|---|---|---|
15.7375 | 0.07 | 20 | 9.6637 | 0.0771 | 0.0567 | 10.75 | 15.25 |
15.7459 | 0.15 | 40 | 10.0761 | 0.0784 | 0.0754 | 11.5 | 15.25 |
15.456 | 0.22 | 60 | 9.5077 | 0.0574 | 0.0595 | 11.35 | 15.25 |
15.3725 | 0.3 | 80 | 9.5230 | 0.0589 | 0.0436 | 11.45 | 15.25 |
14.9377 | 0.37 | 100 | 8.6082 | 0.079 | 0.0725 | 12.2 | 15.25 |
14.5629 | 0.45 | 120 | 9.3522 | 0.0851 | 0.0704 | 12.35 | 15.25 |
14.2505 | 0.52 | 140 | 8.0656 | 0.0666 | 0.0473 | 11.85 | 15.25 |
13.4648 | 0.6 | 160 | 7.5456 | 0.0783 | 0.054 | 10.4 | 15.25 |
13.055 | 0.67 | 180 | 7.0022 | 0.0607 | 0.0529 | 10.2 | 15.25 |
12.2861 | 0.75 | 200 | 6.6263 | 0.0704 | 0.0677 | 10.4 | 15.25 |
11.8459 | 0.82 | 220 | 6.1817 | 0.0849 | 0.0802 | 11.15 | 15.25 |
10.9808 | 0.9 | 240 | 5.6607 | 0.0779 | 0.053 | 11.65 | 15.25 |
10.0039 | 0.97 | 260 | 5.3278 | 0.0867 | 0.0619 | 10.55 | 15.25 |
8.819 | 1.05 | 280 | 4.5316 | 0.1154 | 0.1346 | 9.45 | 15.25 |
7.5032 | 1.12 | 300 | 3.7815 | 0.1355 | 0.1159 | 9.75 | 15.25 |
6.1347 | 1.2 | 320 | 3.0172 | 0.1807 | 0.1546 | 9.85 | 15.25 |
4.8126 | 1.27 | 340 | 2.6729 | 0.2177 | 0.1978 | 9.35 | 15.25 |
4.1824 | 1.35 | 360 | 2.3100 | 0.3017 | 0.3567 | 11.3 | 15.25 |
3.6456 | 1.42 | 380 | 2.2327 | 0.3029 | 0.3605 | 11.4 | 15.25 |
3.3865 | 1.5 | 400 | 2.0704 | 0.316 | 0.3167 | 13.15 | 15.25 |
3.2078 | 1.57 | 420 | 2.0376 | 0.3027 | 0.2856 | 13.5 | 15.25 |
3.0357 | 1.65 | 440 | 1.9508 | 0.3207 | 0.3404 | 13.1 | 15.25 |
2.9388 | 1.72 | 460 | 1.9042 | 0.3872 | 0.3665 | 13.5 | 15.25 |
2.7807 | 1.8 | 480 | 1.8595 | 0.3954 | 0.3692 | 13.65 | 15.25 |
2.7234 | 1.87 | 500 | 1.8956 | 0.3871 | 0.3484 | 14.2 | 15.25 |
2.6417 | 1.95 | 520 | 1.7809 | 0.4406 | 0.3592 | 15.85 | 15.25 |
2.5189 | 2.02 | 540 | 1.7255 | 0.4242 | 0.3844 | 14.8 | 15.25 |
2.4075 | 2.1 | 560 | 1.7226 | 0.4378 | 0.4022 | 14.55 | 15.25 |
2.3158 | 2.17 | 580 | 1.6749 | 0.46 | 0.4313 | 14.7 | 15.25 |
2.3145 | 2.25 | 600 | 1.6850 | 0.4229 | 0.3525 | 15.75 | 15.25 |
2.2615 | 2.32 | 620 | 1.6651 | 0.4618 | 0.3666 | 16.65 | 15.25 |
2.1983 | 2.4 | 640 | 1.6409 | 0.4101 | 0.3297 | 15.1 | 15.25 |
2.1365 | 2.47 | 660 | 1.6350 | 0.4317 | 0.3728 | 15.4 | 15.25 |
2.1286 | 2.55 | 680 | 1.6045 | 0.389 | 0.3352 | 14.95 | 15.25 |
2.1301 | 2.62 | 700 | 1.5884 | 0.4391 | 0.3679 | 15.55 | 15.25 |
2.1368 | 2.7 | 720 | 1.5702 | 0.415 | 0.3352 | 15.4 | 15.25 |
2.0449 | 2.77 | 740 | 1.5415 | 0.4215 | 0.366 | 14.7 | 15.25 |
2.0286 | 2.85 | 760 | 1.5434 | 0.406 | 0.3291 | 15.35 | 15.25 |
2.0126 | 2.92 | 780 | 1.5358 | 0.389 | 0.3033 | 15.0 | 15.25 |
1.9923 | 3.0 | 800 | 1.4857 | 0.4471 | 0.3605 | 15.85 | 15.25 |
1.8807 | 3.07 | 820 | 1.4665 | 0.4743 | 0.3717 | 15.95 | 15.25 |
1.8989 | 3.15 | 840 | 1.4760 | 0.3996 | 0.3502 | 14.8 | 15.25 |
1.8745 | 3.22 | 860 | 1.4294 | 0.3815 | 0.3258 | 15.2 | 15.25 |
1.9292 | 3.3 | 880 | 1.4454 | 0.4366 | 0.3694 | 15.6 | 15.25 |
1.8473 | 3.37 | 900 | 1.4205 | 0.4032 | 0.3523 | 15.65 | 15.25 |
1.8723 | 3.45 | 920 | 1.4080 | 0.4167 | 0.3609 | 15.5 | 15.25 |
1.8272 | 3.52 | 940 | 1.4069 | 0.3944 | 0.3734 | 14.45 | 15.25 |
1.8443 | 3.6 | 960 | 1.4088 | 0.409 | 0.3712 | 14.65 | 15.25 |
1.7956 | 3.67 | 980 | 1.3970 | 0.3848 | 0.3573 | 14.6 | 15.25 |
1.802 | 3.75 | 1000 | 1.3971 | 0.4116 | 0.3856 | 14.75 | 15.25 |
1.8154 | 3.82 | 1020 | 1.4013 | 0.4382 | 0.3731 | 14.85 | 15.25 |
1.7599 | 3.9 | 1040 | 1.4035 | 0.4106 | 0.3566 | 15.25 | 15.25 |
1.8375 | 3.97 | 1060 | 1.3992 | 0.4286 | 0.3594 | 15.6 | 15.25 |
1.739 | 4.04 | 1080 | 1.3955 | 0.4218 | 0.3686 | 15.1 | 15.25 |
1.7291 | 4.12 | 1100 | 1.3968 | 0.4702 | 0.4011 | 15.65 | 15.25 |
1.7279 | 4.19 | 1120 | 1.3743 | 0.4328 | 0.3668 | 15.5 | 15.25 |
1.7092 | 4.27 | 1140 | 1.3650 | 0.4321 | 0.3721 | 15.55 | 15.25 |
1.7002 | 4.34 | 1160 | 1.3413 | 0.3999 | 0.3669 | 15.25 | 15.25 |
1.7333 | 4.42 | 1180 | 1.3715 | 0.4459 | 0.3758 | 16.15 | 15.25 |
1.707 | 4.49 | 1200 | 1.3630 | 0.4173 | 0.3686 | 15.0 | 15.25 |
1.6815 | 4.57 | 1220 | 1.3326 | 0.4344 | 0.3755 | 15.1 | 15.25 |
1.7045 | 4.64 | 1240 | 1.3440 | 0.4083 | 0.3801 | 14.7 | 15.25 |
1.6511 | 4.72 | 1260 | 1.3361 | 0.3976 | 0.3722 | 14.7 | 15.25 |
1.682 | 4.79 | 1280 | 1.3314 | 0.3964 | 0.3707 | 14.85 | 15.25 |
1.6511 | 4.87 | 1300 | 1.3461 | 0.4081 | 0.3704 | 15.0 | 15.25 |
1.5936 | 4.94 | 1320 | 1.3362 | 0.4185 | 0.3667 | 15.15 | 15.25 |
1.6287 | 5.02 | 1340 | 1.3312 | 0.4296 | 0.374 | 14.85 | 15.25 |
1.6401 | 5.09 | 1360 | 1.3152 | 0.403 | 0.366 | 14.95 | 15.25 |
1.6093 | 5.17 | 1380 | 1.3316 | 0.3931 | 0.3689 | 14.75 | 15.25 |
1.6002 | 5.24 | 1400 | 1.3506 | 0.3948 | 0.3702 | 14.8 | 15.25 |
1.6245 | 5.32 | 1420 | 1.3344 | 0.401 | 0.3605 | 15.1 | 15.25 |
1.6005 | 5.39 | 1440 | 1.3310 | 0.4174 | 0.3698 | 15.1 | 15.25 |
1.5903 | 5.47 | 1460 | 1.3218 | 0.4156 | 0.3716 | 14.85 | 15.25 |
1.6016 | 5.54 | 1480 | 1.3219 | 0.4368 | 0.3984 | 14.8 | 15.25 |
1.6143 | 5.62 | 1500 | 1.3157 | 0.4094 | 0.3729 | 14.55 | 15.25 |
1.6082 | 5.69 | 1520 | 1.3109 | 0.4068 | 0.3778 | 14.9 | 15.25 |
1.5451 | 5.77 | 1540 | 1.3057 | 0.4056 | 0.3703 | 14.95 | 15.25 |
1.6312 | 5.84 | 1560 | 1.3055 | 0.4032 | 0.3656 | 14.85 | 15.25 |
1.5476 | 5.92 | 1580 | 1.3282 | 0.4154 | 0.3662 | 15.2 | 15.25 |
1.5758 | 5.99 | 1600 | 1.3205 | 0.4136 | 0.3623 | 15.2 | 15.25 |
1.598 | 6.07 | 1620 | 1.3200 | 0.4159 | 0.3675 | 14.9 | 15.25 |
1.567 | 6.14 | 1640 | 1.3359 | 0.4153 | 0.3699 | 14.7 | 15.25 |
1.5349 | 6.22 | 1660 | 1.3378 | 0.4036 | 0.3649 | 14.8 | 15.25 |
1.5536 | 6.29 | 1680 | 1.3374 | 0.4143 | 0.3691 | 14.85 | 15.25 |
1.5382 | 6.37 | 1700 | 1.3274 | 0.4052 | 0.38 | 14.65 | 15.25 |
1.5238 | 6.44 | 1720 | 1.3217 | 0.406 | 0.3674 | 14.9 | 15.25 |
1.5434 | 6.52 | 1740 | 1.3174 | 0.4096 | 0.3759 | 14.85 | 15.25 |
1.5326 | 6.59 | 1760 | 1.3134 | 0.4096 | 0.3759 | 14.85 | 15.25 |
1.5263 | 6.67 | 1780 | 1.3157 | 0.4104 | 0.3635 | 15.05 | 15.25 |
1.4775 | 6.74 | 1800 | 1.3197 | 0.4096 | 0.3759 | 14.85 | 15.25 |
1.5173 | 6.82 | 1820 | 1.3121 | 0.4167 | 0.3722 | 14.9 | 15.25 |
1.5304 | 6.89 | 1840 | 1.3240 | 0.4198 | 0.3818 | 14.7 | 15.25 |
1.5344 | 6.97 | 1860 | 1.3250 | 0.4135 | 0.3793 | 14.7 | 15.25 |
1.5392 | 7.04 | 1880 | 1.3187 | 0.4135 | 0.3793 | 14.7 | 15.25 |
1.5201 | 7.12 | 1900 | 1.3128 | 0.4143 | 0.3681 | 14.8 | 15.25 |
1.5139 | 7.19 | 1920 | 1.3072 | 0.4143 | 0.3654 | 14.95 | 15.25 |
1.4878 | 7.27 | 1940 | 1.3021 | 0.4143 | 0.3654 | 14.95 | 15.25 |
1.5123 | 7.34 | 1960 | 1.3041 | 0.4143 | 0.3681 | 14.8 | 15.25 |
1.4569 | 7.42 | 1980 | 1.3203 | 0.417 | 0.3712 | 14.8 | 15.25 |
1.4984 | 7.49 | 2000 | 1.3149 | 0.4198 | 0.3832 | 14.65 | 15.25 |
1.5187 | 7.57 | 2020 | 1.3102 | 0.4076 | 0.3818 | 14.7 | 15.25 |
1.5394 | 7.64 | 2040 | 1.3223 | 0.4176 | 0.3907 | 14.65 | 15.25 |
1.4602 | 7.72 | 2060 | 1.3102 | 0.4101 | 0.3686 | 14.9 | 15.25 |
1.4959 | 7.79 | 2080 | 1.3123 | 0.4178 | 0.3688 | 15.05 | 15.25 |
1.5462 | 7.87 | 2100 | 1.3083 | 0.4262 | 0.3692 | 15.1 | 15.25 |
1.4951 | 7.94 | 2120 | 1.2964 | 0.4301 | 0.3816 | 14.95 | 15.25 |
1.5016 | 8.01 | 2140 | 1.3078 | 0.4274 | 0.3784 | 14.9 | 15.25 |
1.4464 | 8.09 | 2160 | 1.3154 | 0.4178 | 0.3654 | 15.1 | 15.25 |
1.4654 | 8.16 | 2180 | 1.3070 | 0.4243 | 0.3702 | 15.0 | 15.25 |
1.4519 | 8.24 | 2200 | 1.2995 | 0.4339 | 0.3708 | 15.05 | 15.25 |
1.5098 | 8.31 | 2220 | 1.3051 | 0.4395 | 0.3903 | 14.75 | 15.25 |
1.4601 | 8.39 | 2240 | 1.3013 | 0.4376 | 0.3881 | 14.8 | 15.25 |
1.4693 | 8.46 | 2260 | 1.2981 | 0.4278 | 0.3871 | 14.8 | 15.25 |
1.5386 | 8.54 | 2280 | 1.3002 | 0.4112 | 0.3781 | 14.8 | 15.25 |
1.5115 | 8.61 | 2300 | 1.2994 | 0.4153 | 0.3806 | 14.9 | 15.25 |
1.5133 | 8.69 | 2320 | 1.2971 | 0.4236 | 0.385 | 14.85 | 15.25 |
1.4691 | 8.76 | 2340 | 1.2979 | 0.4321 | 0.3896 | 14.75 | 15.25 |
1.4548 | 8.84 | 2360 | 1.3054 | 0.4276 | 0.385 | 14.75 | 15.25 |
1.4816 | 8.91 | 2380 | 1.3029 | 0.4259 | 0.3857 | 14.7 | 15.25 |
1.4386 | 8.99 | 2400 | 1.2983 | 0.4196 | 0.3826 | 14.75 | 15.25 |
1.5242 | 9.06 | 2420 | 1.2958 | 0.421 | 0.3739 | 14.95 | 15.25 |
1.4824 | 9.14 | 2440 | 1.2939 | 0.4292 | 0.3827 | 14.9 | 15.25 |
1.5137 | 9.21 | 2460 | 1.2896 | 0.4213 | 0.3796 | 14.8 | 15.25 |
1.4634 | 9.29 | 2480 | 1.2934 | 0.4191 | 0.3855 | 14.85 | 15.25 |
1.4881 | 9.36 | 2500 | 1.2982 | 0.4134 | 0.3838 | 14.65 | 15.25 |
1.4185 | 9.44 | 2520 | 1.2995 | 0.4117 | 0.3795 | 14.65 | 15.25 |
1.3843 | 9.51 | 2540 | 1.3013 | 0.4217 | 0.3826 | 14.65 | 15.25 |
1.4563 | 9.59 | 2560 | 1.3005 | 0.4117 | 0.3795 | 14.65 | 15.25 |
1.461 | 9.66 | 2580 | 1.3008 | 0.4194 | 0.3783 | 14.85 | 15.25 |
1.47 | 9.74 | 2600 | 1.2999 | 0.4194 | 0.3783 | 14.85 | 15.25 |
1.4892 | 9.81 | 2620 | 1.2994 | 0.4196 | 0.3826 | 14.75 | 15.25 |
1.4503 | 9.89 | 2640 | 1.2992 | 0.4196 | 0.3826 | 14.75 | 15.25 |
1.4216 | 9.96 | 2660 | 1.2997 | 0.4194 | 0.3783 | 14.85 | 15.25 |
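The metrics in the table do not all peak at the same step, so the choice of checkpoint depends on which metric matters. As a minimal sketch, the snippet below parses a handful of rows copied verbatim from the table and picks the best one per metric. The helper names `parse_rows` and `best_row` are hypothetical, and the card does not say how the published checkpoint was actually selected.

```python
# Column names as they appear in the table header above.
COLUMNS = [
    "Training Loss", "Epoch", "Step", "Validation Loss",
    "F1", "Bleu4", "Mean Pred Len", "Mean Label Len",
]

# A few rows copied verbatim from the table (steps 580, 820, 2460, 2660).
ROWS_TEXT = """\
2.3158 | 2.17 | 580 | 1.6749 | 0.46 | 0.4313 | 14.7 | 15.25 |
1.8807 | 3.07 | 820 | 1.4665 | 0.4743 | 0.3717 | 15.95 | 15.25 |
1.5137 | 9.21 | 2460 | 1.2896 | 0.4213 | 0.3796 | 14.8 | 15.25 |
1.4216 | 9.96 | 2660 | 1.2997 | 0.4194 | 0.3783 | 14.85 | 15.25 |
"""


def parse_rows(text):
    """Turn pipe-separated table rows into a list of {column: float} dicts."""
    rows = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        rows.append(dict(zip(COLUMNS, (float(cell) for cell in cells))))
    return rows


def best_row(rows, column, higher_is_better):
    """Return the row with the best value in the given column."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


rows = parse_rows(ROWS_TEXT)

if __name__ == "__main__":
    for column, higher in (("Validation Loss", False), ("F1", True), ("Bleu4", True)):
        row = best_row(rows, column, higher)
        print(f"best {column}: {row[column]} at step {int(row['Step'])}")
```

With only 15 to 16 tokens per label on average and F1 swinging by several points between adjacent evaluations, single-step peaks such as the F1 at step 820 may reflect evaluation noise rather than a genuinely better checkpoint.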
Framework versions
- Transformers 4.22.2
- Pytorch 1.12.1+cu113
- Datasets 2.6.1
- Tokenizers 0.12.1