find_first_sent_train_30_eval_10_flan-t5-xl

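This model is a fine-tuned version of google/flan-t5-xl on the tyzhu/find_first_sent_train_30_eval_10 dataset. At the end of training (epoch 100, step 500) it reaches a validation loss of 5.9172, a Bleu score of 21.6414, and a Gen Len of 38.2 on the evaluation set, as listed in the last row of the training results table.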

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

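The full hyperparameter list is not included here. From the training results table, training ran for 100 epochs at 5 steps per epoch (500 steps in total), with evaluation after every epoch and the training loss logged every 50 steps.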

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| No log | 1.0 | 5 | 2.5051 | 1.241 | 43.0 |
| No log | 2.0 | 10 | 2.4681 | 0.8804 | 35.0 |
| No log | 3.0 | 15 | 2.4068 | 2.5934 | 24.3 |
| No log | 4.0 | 20 | 2.3519 | 1.9387 | 25.8 |
| No log | 5.0 | 25 | 2.2914 | 2.182 | 28.1 |
| No log | 6.0 | 30 | 2.2810 | 2.2794 | 28.8 |
| No log | 7.0 | 35 | 2.2634 | 3.0649 | 30.2 |
| No log | 8.0 | 40 | 2.3489 | 3.6397 | 35.5 |
| No log | 9.0 | 45 | 2.4796 | 2.1984 | 34.8 |
| 2.0773 | 10.0 | 50 | 2.6235 | 2.3859 | 28.5 |
| 2.0773 | 11.0 | 55 | 2.7206 | 3.1068 | 29.3 |
| 2.0773 | 12.0 | 60 | 2.8255 | 2.9542 | 29.2 |
| 2.0773 | 13.0 | 65 | 3.3399 | 3.5205 | 28.6 |
| 2.0773 | 14.0 | 70 | 3.3261 | 2.6726 | 28.4 |
| 2.0773 | 15.0 | 75 | 4.0954 | 2.6483 | 28.7 |
| 2.0773 | 16.0 | 80 | 4.6468 | 1.5483 | 30.9 |
| 2.0773 | 17.0 | 85 | 4.1352 | 1.9426 | 32.5 |
| 2.0773 | 18.0 | 90 | 4.5193 | 2.0072 | 31.5 |
| 2.0773 | 19.0 | 95 | 5.0365 | 5.7223 | 35.4 |
| 0.4306 | 20.0 | 100 | 4.9830 | 6.0764 | 33.7 |
| 0.4306 | 21.0 | 105 | 5.1218 | 5.6436 | 35.2 |
| 0.4306 | 22.0 | 110 | 5.4091 | 5.3174 | 40.4 |
| 0.4306 | 23.0 | 115 | 5.3755 | 5.4611 | 38.8 |
| 0.4306 | 24.0 | 120 | 5.2219 | 5.9493 | 33.6 |
| 0.4306 | 25.0 | 125 | 5.2747 | 5.3679 | 36.9 |
| 0.4306 | 26.0 | 130 | 5.3279 | 5.0396 | 41.2 |
| 0.4306 | 27.0 | 135 | 5.4788 | 5.287 | 39.1 |
| 0.4306 | 28.0 | 140 | 5.5710 | 5.1812 | 40.4 |
| 0.4306 | 29.0 | 145 | 5.6488 | 5.3043 | 39.4 |
| 0.0867 | 30.0 | 150 | 5.5148 | 5.2983 | 37.9 |
| 0.0867 | 31.0 | 155 | 5.4655 | 20.8944 | 39.5 |
| 0.0867 | 32.0 | 160 | 5.6512 | 5.8527 | 34.4 |
| 0.0867 | 33.0 | 165 | 5.6764 | 15.876 | 50.8 |
| 0.0867 | 34.0 | 170 | 5.6538 | 15.876 | 50.8 |
| 0.0867 | 35.0 | 175 | 5.6921 | 20.4813 | 39.8 |
| 0.0867 | 36.0 | 180 | 5.6782 | 20.2634 | 40.8 |
| 0.0867 | 37.0 | 185 | 5.6104 | 21.0798 | 38.8 |
| 0.0867 | 38.0 | 190 | 5.3899 | 21.8155 | 37.8 |
| 0.0867 | 39.0 | 195 | 5.2651 | 21.8952 | 37.7 |
| 0.0376 | 40.0 | 200 | 5.4012 | 21.8952 | 37.7 |
| 0.0376 | 41.0 | 205 | 5.3592 | 21.8952 | 37.7 |
| 0.0376 | 42.0 | 210 | 5.2308 | 21.8952 | 37.7 |
| 0.0376 | 43.0 | 215 | 5.2728 | 21.3782 | 38.4 |
| 0.0376 | 44.0 | 220 | 5.3208 | 22.1008 | 37.0 |
| 0.0376 | 45.0 | 225 | 5.3982 | 21.8952 | 37.7 |
| 0.0376 | 46.0 | 230 | 5.3998 | 21.8952 | 37.7 |
| 0.0376 | 47.0 | 235 | 5.3946 | 21.7985 | 37.9 |
| 0.0376 | 48.0 | 240 | 5.5448 | 21.9756 | 37.8 |
| 0.0376 | 49.0 | 245 | 5.6623 | 21.9756 | 37.8 |
| 0.0248 | 50.0 | 250 | 5.6704 | 15.6207 | 52.4 |
| 0.0248 | 51.0 | 255 | 5.7137 | 15.6207 | 52.4 |
| 0.0248 | 52.0 | 260 | 5.7186 | 16.1671 | 49.9 |
| 0.0248 | 53.0 | 265 | 5.7098 | 16.0377 | 50.3 |
| 0.0248 | 54.0 | 270 | 5.6003 | 15.9103 | 50.6 |
| 0.0248 | 55.0 | 275 | 5.5697 | 15.9103 | 50.6 |
| 0.0248 | 56.0 | 280 | 5.5331 | 16.0377 | 50.2 |
| 0.0248 | 57.0 | 285 | 5.5400 | 15.8265 | 50.9 |
| 0.0248 | 58.0 | 290 | 5.6258 | 13.2365 | 61.1 |
| 0.0248 | 59.0 | 295 | 5.6516 | 13.2365 | 61.1 |
| 0.0147 | 60.0 | 300 | 5.6560 | 13.2073 | 61.6 |
| 0.0147 | 61.0 | 305 | 5.7258 | 13.1459 | 61.9 |
| 0.0147 | 62.0 | 310 | 5.7615 | 13.1459 | 61.9 |
| 0.0147 | 63.0 | 315 | 5.7989 | 13.1459 | 61.9 |
| 0.0147 | 64.0 | 320 | 5.8839 | 13.1459 | 61.9 |
| 0.0147 | 65.0 | 325 | 5.9621 | 13.1459 | 61.9 |
| 0.0147 | 66.0 | 330 | 6.0142 | 13.1459 | 61.9 |
| 0.0147 | 67.0 | 335 | 6.0231 | 13.1459 | 61.9 |
| 0.0147 | 68.0 | 340 | 5.9970 | 21.1381 | 38.6 |
| 0.0147 | 69.0 | 345 | 5.9133 | 21.1381 | 38.6 |
| 0.0107 | 70.0 | 350 | 5.8522 | 20.9916 | 39.2 |
| 0.0107 | 71.0 | 355 | 5.7963 | 20.9916 | 39.2 |
| 0.0107 | 72.0 | 360 | 5.7927 | 20.9916 | 39.2 |
| 0.0107 | 73.0 | 365 | 5.7878 | 20.9916 | 39.2 |
| 0.0107 | 74.0 | 370 | 5.7743 | 20.9916 | 39.2 |
| 0.0107 | 75.0 | 375 | 5.7927 | 20.9916 | 39.2 |
| 0.0107 | 76.0 | 380 | 5.8188 | 20.9916 | 39.2 |
| 0.0107 | 77.0 | 385 | 5.8431 | 20.9916 | 39.2 |
| 0.0107 | 78.0 | 390 | 5.8821 | 20.9916 | 39.2 |
| 0.0107 | 79.0 | 395 | 5.9117 | 20.9916 | 39.2 |
| 0.0089 | 80.0 | 400 | 5.9405 | 20.9916 | 39.2 |
| 0.0089 | 81.0 | 405 | 5.9583 | 21.6414 | 38.2 |
| 0.0089 | 82.0 | 410 | 5.9502 | 21.6414 | 38.2 |
| 0.0089 | 83.0 | 415 | 5.9410 | 21.6414 | 38.2 |
| 0.0089 | 84.0 | 420 | 5.9362 | 21.6414 | 38.2 |
| 0.0089 | 85.0 | 425 | 5.9252 | 21.6414 | 38.2 |
| 0.0089 | 86.0 | 430 | 5.9187 | 21.6414 | 38.2 |
| 0.0089 | 87.0 | 435 | 5.9201 | 21.6414 | 38.2 |
| 0.0089 | 88.0 | 440 | 5.9235 | 21.6414 | 38.2 |
| 0.0089 | 89.0 | 445 | 5.9023 | 21.6414 | 38.2 |
| 0.0074 | 90.0 | 450 | 5.8876 | 21.6414 | 38.2 |
| 0.0074 | 91.0 | 455 | 5.8896 | 21.6414 | 38.2 |
| 0.0074 | 92.0 | 460 | 5.8949 | 21.6414 | 38.2 |
| 0.0074 | 93.0 | 465 | 5.8910 | 21.6414 | 38.2 |
| 0.0074 | 94.0 | 470 | 5.8899 | 21.6414 | 38.2 |
| 0.0074 | 95.0 | 475 | 5.8902 | 21.6414 | 38.2 |
| 0.0074 | 96.0 | 480 | 5.8955 | 21.6414 | 38.2 |
| 0.0074 | 97.0 | 485 | 5.9038 | 21.6414 | 38.2 |
| 0.0074 | 98.0 | 490 | 5.9107 | 21.6414 | 38.2 |
| 0.0074 | 99.0 | 495 | 5.9156 | 21.6414 | 38.2 |
| 0.0067 | 100.0 | 500 | 5.9172 | 21.6414 | 38.2 |
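
The results above show validation loss and Bleu pointing at different checkpoints: validation loss is lowest at epoch 7 (2.2634) and rises afterwards while training loss keeps falling, whereas Bleu peaks at epoch 44 (22.1008). The following is a minimal sketch of how one might compare the two selection criteria. It is not part of the original training run; the `rows` list is a hand-copied subset of the table, and `best_by` is a hypothetical helper introduced only for illustration.

```python
# (epoch, validation_loss, bleu) copied by hand from selected rows of the
# training results table; this is an illustrative subset, not the full log.
rows = [
    (1, 2.5051, 1.241),
    (7, 2.2634, 3.0649),
    (20, 4.9830, 6.0764),
    (31, 5.4655, 20.8944),
    (44, 5.3208, 22.1008),
    (60, 5.6560, 13.2073),
    (100, 5.9172, 21.6414),
]


def best_by(rows, column, higher_is_better):
    """Return the row that is best on the given column index.

    Hypothetical helper: column 1 is validation loss, column 2 is Bleu.
    """
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


lowest_loss = best_by(rows, column=1, higher_is_better=False)
highest_bleu = best_by(rows, column=2, higher_is_better=True)

print(f"lowest validation loss: epoch {lowest_loss[0]} ({lowest_loss[1]})")
print(f"highest Bleu: epoch {highest_bleu[0]} ({highest_bleu[2]})")
```

Which criterion to use depends on the goal: the loss-based choice favors the checkpoint before the validation loss starts to climb, while the Bleu-based choice favors the generated-text metric reported in this card.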

Framework versions