# finetuned-baseline-phase-0.1
This model is a fine-tuned version of ishwarbb23/finetuned-baseline-phase-0.0 on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 3.0837
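For reference, if this loss is a mean token-level cross-entropy in nats (the usual `Trainer` convention for language models, though the task is not documented here), it corresponds to a perplexity of exp(3.0837) ≈ 21.8.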
## Model description
More information needed
## Intended uses & limitations
More information needed
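Until the card is completed, the following is a minimal loading sketch. It assumes the checkpoint is hosted on the Hugging Face Hub under the repo id matching this card's title; the `AutoModelForCausalLM` head and the generation call are assumptions, since the architecture and task are not documented here.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, following the naming of the base model. The causal-LM
# head is an assumption -- swap in the Auto class matching the real task.
model_id = "ishwarbb23/finetuned-baseline-phase-0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```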
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
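For reproducibility, here is a minimal sketch of how these values map onto `transformers.TrainingArguments`; the output directory and any setting not listed above are assumptions. Note that the total train batch size of 64 is derived (4 per device × 16 accumulation steps) rather than passed directly.

```python
from transformers import TrainingArguments

# Sketch reconstructing the reported hyperparameters; output_dir is assumed.
training_args = TrainingArguments(
    output_dir="finetuned-baseline-phase-0.1",  # assumption
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=16,  # effective train batch size: 4 * 16 = 64
    lr_scheduler_type="linear",
    num_train_epochs=20,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 is the Trainer default,
    # so no optimizer arguments need to be overridden.
)
```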
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
3.9044 | 0.14 | 5 | 3.5150 |
3.6543 | 0.29 | 10 | 3.4614 |
3.6345 | 0.43 | 15 | 3.4248 |
3.6121 | 0.57 | 20 | 3.3929 |
3.5874 | 0.72 | 25 | 3.3687 |
3.5709 | 0.86 | 30 | 3.3519 |
3.5185 | 1.01 | 35 | 3.3363 |
3.484 | 1.15 | 40 | 3.3250 |
3.4515 | 1.29 | 45 | 3.3153 |
3.4944 | 1.44 | 50 | 3.3051 |
3.4387 | 1.58 | 55 | 3.2956 |
3.4965 | 1.72 | 60 | 3.2867 |
3.4745 | 1.87 | 65 | 3.2791 |
3.4252 | 2.01 | 70 | 3.2736 |
3.499 | 2.15 | 75 | 3.2668 |
3.4885 | 2.3 | 80 | 3.2614 |
3.3934 | 2.44 | 85 | 3.2578 |
3.4112 | 2.59 | 90 | 3.2539 |
3.3843 | 2.73 | 95 | 3.2487 |
3.3753 | 2.87 | 100 | 3.2421 |
3.3824 | 3.02 | 105 | 3.2344 |
3.3801 | 3.16 | 110 | 3.2293 |
3.3943 | 3.3 | 115 | 3.2258 |
3.3946 | 3.45 | 120 | 3.2230 |
3.3178 | 3.59 | 125 | 3.2212 |
3.3325 | 3.73 | 130 | 3.2184 |
3.3925 | 3.88 | 135 | 3.2140 |
3.3453 | 4.02 | 140 | 3.2086 |
3.346 | 4.17 | 145 | 3.2048 |
3.3575 | 4.31 | 150 | 3.2019 |
3.4051 | 4.45 | 155 | 3.1983 |
3.3307 | 4.6 | 160 | 3.1959 |
3.3328 | 4.74 | 165 | 3.1932 |
3.2993 | 4.88 | 170 | 3.1910 |
3.3636 | 5.03 | 175 | 3.1885 |
3.3118 | 5.17 | 180 | 3.1874 |
3.3351 | 5.31 | 185 | 3.1844 |
3.2868 | 5.46 | 190 | 3.1798 |
3.3262 | 5.6 | 195 | 3.1757 |
3.3524 | 5.75 | 200 | 3.1728 |
3.3378 | 5.89 | 205 | 3.1706 |
3.2928 | 6.03 | 210 | 3.1694 |
3.2715 | 6.18 | 215 | 3.1681 |
3.2448 | 6.32 | 220 | 3.1650 |
3.3084 | 6.46 | 225 | 3.1620 |
3.3209 | 6.61 | 230 | 3.1597 |
3.2942 | 6.75 | 235 | 3.1573 |
3.3388 | 6.89 | 240 | 3.1555 |
3.273 | 7.04 | 245 | 3.1544 |
3.3283 | 7.18 | 250 | 3.1520 |
3.1891 | 7.32 | 255 | 3.1514 |
3.2671 | 7.47 | 260 | 3.1504 |
3.2802 | 7.61 | 265 | 3.1486 |
3.316 | 7.76 | 270 | 3.1462 |
3.2761 | 7.9 | 275 | 3.1445 |
3.2772 | 8.04 | 280 | 3.1436 |
3.2263 | 8.19 | 285 | 3.1429 |
3.2242 | 8.33 | 290 | 3.1389 |
3.256 | 8.47 | 295 | 3.1376 |
3.3119 | 8.62 | 300 | 3.1370 |
3.2445 | 8.76 | 305 | 3.1336 |
3.2314 | 8.9 | 310 | 3.1311 |
3.2631 | 9.05 | 315 | 3.1298 |
3.2825 | 9.19 | 320 | 3.1313 |
3.1922 | 9.34 | 325 | 3.1324 |
3.2144 | 9.48 | 330 | 3.1289 |
3.2273 | 9.62 | 335 | 3.1246 |
3.1995 | 9.77 | 340 | 3.1223 |
3.2356 | 9.91 | 345 | 3.1216 |
3.2254 | 10.05 | 350 | 3.1224 |
3.2555 | 10.2 | 355 | 3.1230 |
3.1581 | 10.34 | 360 | 3.1221 |
3.2334 | 10.48 | 365 | 3.1177 |
3.2064 | 10.63 | 370 | 3.1162 |
3.277 | 10.77 | 375 | 3.1153 |
3.2614 | 10.92 | 380 | 3.1115 |
3.2386 | 11.06 | 385 | 3.1105 |
3.2357 | 11.2 | 390 | 3.1100 |
3.2005 | 11.35 | 395 | 3.1099 |
3.2146 | 11.49 | 400 | 3.1104 |
3.19 | 11.63 | 405 | 3.1110 |
3.1835 | 11.78 | 410 | 3.1109 |
3.2247 | 11.92 | 415 | 3.1100 |
3.2138 | 12.06 | 420 | 3.1082 |
3.2105 | 12.21 | 425 | 3.1079 |
3.2074 | 12.35 | 430 | 3.1077 |
3.1758 | 12.5 | 435 | 3.1057 |
3.2357 | 12.64 | 440 | 3.1034 |
3.1556 | 12.78 | 445 | 3.1018 |
3.2014 | 12.93 | 450 | 3.1007 |
3.1641 | 13.07 | 455 | 3.1000 |
3.2082 | 13.21 | 460 | 3.1000 |
3.1841 | 13.36 | 465 | 3.1003 |
3.2168 | 13.5 | 470 | 3.1003 |
3.202 | 13.64 | 475 | 3.0995 |
3.253 | 13.79 | 480 | 3.0975 |
3.1916 | 13.93 | 485 | 3.0966 |
3.2383 | 14.08 | 490 | 3.0949 |
3.2758 | 14.22 | 495 | 3.0938 |
3.1513 | 14.36 | 500 | 3.0934 |
3.1907 | 14.51 | 505 | 3.0929 |
3.1482 | 14.65 | 510 | 3.0926 |
3.1781 | 14.79 | 515 | 3.0927 |
3.167 | 14.94 | 520 | 3.0917 |
3.209 | 15.08 | 525 | 3.0909 |
3.1433 | 15.22 | 530 | 3.0900 |
3.1615 | 15.37 | 535 | 3.0896 |
3.1727 | 15.51 | 540 | 3.0895 |
3.1608 | 15.66 | 545 | 3.0897 |
3.2079 | 15.8 | 550 | 3.0895 |
3.1996 | 15.94 | 555 | 3.0888 |
3.2229 | 16.09 | 560 | 3.0874 |
3.2007 | 16.23 | 565 | 3.0864 |
3.1452 | 16.37 | 570 | 3.0860 |
3.1491 | 16.52 | 575 | 3.0858 |
3.1616 | 16.66 | 580 | 3.0862 |
3.1639 | 16.8 | 585 | 3.0862 |
3.1946 | 16.95 | 590 | 3.0856 |
3.1553 | 17.09 | 595 | 3.0854 |
3.1203 | 17.24 | 600 | 3.0851 |
3.2122 | 17.38 | 605 | 3.0849 |
3.2104 | 17.52 | 610 | 3.0843 |
3.2037 | 17.67 | 615 | 3.0844 |
3.1389 | 17.81 | 620 | 3.0843 |
3.1264 | 17.95 | 625 | 3.0845 |
3.1723 | 18.1 | 630 | 3.0845 |
3.1485 | 18.24 | 635 | 3.0848 |
3.1838 | 18.38 | 640 | 3.0850 |
3.2078 | 18.53 | 645 | 3.0848 |
3.1725 | 18.67 | 650 | 3.0845 |
3.1422 | 18.82 | 655 | 3.0843 |
3.128 | 18.96 | 660 | 3.0841 |
3.2523 | 19.1 | 665 | 3.0839 |
3.2098 | 19.25 | 670 | 3.0838 |
3.1384 | 19.39 | 675 | 3.0837 |
3.1944 | 19.53 | 680 | 3.0837 |
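
Validation loss decreases steadily from 3.5150 at step 5 to 3.0837 at step 680, with only minor upticks around epochs 9–10, and is essentially flat over the final two epochs.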
### Framework versions
- Transformers 4.34.1
- Pytorch 2.1.0+cu118
- Datasets 2.14.6
- Tokenizers 0.14.1