# neuroscience-to-dev-bio-jsv4

This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0695

## Model description

More information needed

## Intended uses & limitations

More information needed
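
Pending that information, a minimal usage sketch is given below. It assumes the checkpoint is loaded by the name `neuroscience-to-dev-bio-jsv4` (the full Hub id, including namespace, is not stated in this card), and the input text is purely illustrative; since the base model is BART, it is loaded as a sequence-to-sequence model.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder checkpoint path: replace with the full Hub id (namespace
# not stated in this card) or a local checkpoint directory.
checkpoint = "neuroscience-to-dev-bio-jsv4"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Illustrative input only; the intended task of this fine-tune is not
# documented in the card.
inputs = tokenizer("Example input text.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```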

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged code sketch reproducing them follows the list):
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 128
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 100000
- mixed_precision_training: Native AMP
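
These values map directly onto Transformers `Seq2SeqTrainingArguments`. The sketch below is a reconstruction from the list above rather than the author's actual training script; the output directory is a placeholder, and anything not listed in the card is left at its library default (the stated Adam betas and epsilon are already the defaults).

```python
from transformers import Seq2SeqTrainingArguments

# Reconstructed from the hyperparameter list above; "output" is a
# placeholder directory.
training_args = Seq2SeqTrainingArguments(
    output_dir="output",
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=128,  # total train batch size: 1 x 128 = 128
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=100000,
    fp16=True,  # "Native AMP" mixed-precision training
)
```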

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 24.3287 | 0.76 | 3 | 18.8652 |
| 18.1915 | 1.78 | 7 | 18.8556 |
| 18.1215 | 2.79 | 11 | 18.8221 |
| 17.8507 | 3.81 | 15 | 18.6982 |
| 17.0492 | 4.83 | 19 | 18.2237 |
| 15.9062 | 5.84 | 23 | 17.5422 |
| 14.8768 | 6.86 | 27 | 16.4438 |
| 14.0759 | 7.87 | 31 | 15.5454 |
| 13.5381 | 8.89 | 35 | 14.7087 |
| 13.0031 | 9.9 | 39 | 13.8568 |
| 12.5788 | 10.92 | 43 | 13.2354 |
| 12.2285 | 11.94 | 47 | 12.7442 |
| 11.9333 | 12.95 | 51 | 12.3690 |
| 11.6802 | 13.97 | 55 | 12.0725 |
| 11.4449 | 14.98 | 59 | 11.7544 |
| 11.1385 | 16.0 | 63 | 11.3739 |
| 14.4029 | 16.76 | 66 | 11.0449 |
| 10.5022 | 17.78 | 70 | 10.6118 |
| 10.1749 | 18.79 | 74 | 10.1711 |
| 9.8214 | 19.81 | 78 | 9.6969 |
| 9.4196 | 20.83 | 82 | 9.1333 |
| 8.9308 | 21.84 | 86 | 8.3086 |
| 8.3288 | 22.86 | 90 | 7.4233 |
| 7.7657 | 23.87 | 94 | 6.8660 |
| 7.3611 | 24.89 | 98 | 6.5694 |
| 7.0561 | 25.9 | 102 | 6.3261 |
| 6.8051 | 26.92 | 106 | 6.1499 |
| 6.5919 | 27.94 | 110 | 5.9731 |
| 6.4156 | 28.95 | 114 | 5.8249 |
| 6.2572 | 29.97 | 118 | 5.6809 |
| 6.1213 | 30.98 | 122 | 5.5631 |
| 5.993 | 32.0 | 126 | 5.4450 |
| 7.8389 | 32.76 | 129 | 5.3694 |
| 5.7759 | 33.78 | 133 | 5.2713 |
| 5.6726 | 34.79 | 137 | 5.1745 |
| 5.5731 | 35.81 | 141 | 5.0841 |
| 5.4773 | 36.83 | 145 | 4.9938 |
| 5.3928 | 37.84 | 149 | 4.9069 |
| 5.2863 | 38.86 | 153 | 4.8186 |
| 5.1979 | 39.87 | 157 | 4.7304 |
| 5.1039 | 40.89 | 161 | 4.6431 |
| 5.0214 | 41.9 | 165 | 4.5553 |
| 4.9208 | 42.92 | 169 | 4.4656 |
| 4.8369 | 43.94 | 173 | 4.3771 |
| 4.7415 | 44.95 | 177 | 4.2874 |
| 4.6508 | 45.97 | 181 | 4.1996 |
| 4.5609 | 46.98 | 185 | 4.1074 |
| 4.4651 | 48.0 | 189 | 4.0151 |
| 5.8328 | 48.76 | 192 | 3.9468 |
| 4.2769 | 49.78 | 196 | 3.8541 |
| 4.19 | 50.79 | 200 | 3.7615 |
| 4.0956 | 51.81 | 204 | 3.6687 |
| 3.9945 | 52.83 | 208 | 3.5726 |
| 3.904 | 53.84 | 212 | 3.4752 |
| 3.8107 | 54.86 | 216 | 3.3814 |
| 3.7135 | 55.87 | 220 | 3.2854 |
| 3.6174 | 56.89 | 224 | 3.1882 |
| 3.5205 | 57.9 | 228 | 3.0898 |
| 3.4233 | 58.92 | 232 | 2.9924 |
| 3.333 | 59.94 | 236 | 2.8947 |
| 3.2258 | 60.95 | 240 | 2.7939 |
| 3.1279 | 61.97 | 244 | 2.6950 |
| 3.0278 | 62.98 | 248 | 2.5971 |
| 2.9348 | 64.0 | 252 | 2.4995 |
| 3.7766 | 64.76 | 255 | 2.4249 |
| 2.7327 | 65.78 | 259 | 2.3260 |
| 2.6385 | 66.79 | 263 | 2.2277 |
| 2.5418 | 67.81 | 267 | 2.1310 |
| 2.4482 | 68.83 | 271 | 2.0338 |
| 2.3636 | 69.84 | 275 | 1.9374 |
| 2.2525 | 70.86 | 279 | 1.8416 |
| 2.1472 | 71.87 | 283 | 1.7484 |
| 2.0559 | 72.89 | 287 | 1.6557 |
| 1.9571 | 73.9 | 291 | 1.5656 |
| 1.8667 | 74.92 | 295 | 1.4765 |
| 1.7716 | 75.94 | 299 | 1.3897 |
| 1.6853 | 76.95 | 303 | 1.3049 |
| 1.5902 | 77.97 | 307 | 1.2224 |
| 1.5023 | 78.98 | 311 | 1.1420 |
| 1.4194 | 80.0 | 315 | 1.0651 |
| 1.7788 | 80.76 | 318 | 1.0091 |
| 1.2485 | 81.78 | 322 | 0.9366 |
| 1.1745 | 82.79 | 326 | 0.8685 |
| 1.0937 | 83.81 | 330 | 0.8023 |
| 1.0191 | 84.83 | 334 | 0.7405 |
| 0.9504 | 85.84 | 338 | 0.6814 |
| 0.8818 | 86.86 | 342 | 0.6264 |
| 0.8132 | 87.87 | 346 | 0.5749 |
| 0.7557 | 88.89 | 350 | 0.5265 |
| 0.6961 | 89.9 | 354 | 0.4816 |
| 0.6379 | 90.92 | 358 | 0.4399 |
| 0.5835 | 91.94 | 362 | 0.4015 |
| 0.5349 | 92.95 | 366 | 0.3657 |
| 0.4894 | 93.97 | 370 | 0.3336 |
| 0.448 | 94.98 | 374 | 0.3042 |
| 0.4121 | 96.0 | 378 | 0.2773 |
| 0.4961 | 96.76 | 381 | 0.2595 |
| 0.3365 | 97.78 | 385 | 0.2378 |
| 0.3104 | 98.79 | 389 | 0.2192 |
| 0.2786 | 99.81 | 393 | 0.2009 |
| 0.2543 | 100.83 | 397 | 0.1836 |
| 0.2335 | 101.84 | 401 | 0.1681 |
| 0.2148 | 102.86 | 405 | 0.1549 |
| 0.1932 | 103.87 | 409 | 0.1435 |
| 0.1766 | 104.89 | 413 | 0.1328 |
| 0.1607 | 105.9 | 417 | 0.1236 |
| 0.1451 | 106.92 | 421 | 0.1170 |
| 0.1314 | 107.94 | 425 | 0.1103 |
| 0.1196 | 108.95 | 429 | 0.1075 |
| 0.1094 | 109.97 | 433 | 0.1015 |
| 0.102 | 110.98 | 437 | 0.0897 |
| 0.0916 | 112.0 | 441 | 0.0855 |
| 0.1094 | 112.76 | 444 | 0.0823 |
| 0.0754 | 113.78 | 448 | 0.0790 |
| 0.073 | 114.79 | 452 | 0.0775 |
| 0.0656 | 115.81 | 456 | 0.0734 |
| 0.0575 | 116.83 | 460 | 0.0712 |
| 0.0525 | 117.84 | 464 | 0.0680 |
| 0.0492 | 118.86 | 468 | 0.0667 |
| 0.0467 | 119.87 | 472 | 0.0650 |
| 0.042 | 120.89 | 476 | 0.0639 |
| 0.0385 | 121.9 | 480 | 0.0648 |
| 0.0355 | 122.92 | 484 | 0.0643 |
| 0.0338 | 123.94 | 488 | 0.0674 |
| 0.033 | 124.95 | 492 | 0.0635 |
| 0.0288 | 125.97 | 496 | 0.0643 |
| 0.0279 | 126.98 | 500 | 0.0634 |
| 0.0288 | 128.0 | 504 | 0.0630 |
| 0.034 | 128.76 | 507 | 0.0640 |
| 0.0225 | 129.78 | 511 | 0.0641 |
| 0.0212 | 130.79 | 515 | 0.0683 |
| 0.022 | 131.81 | 519 | 0.0655 |
| 0.0197 | 132.83 | 523 | 0.0692 |
| 0.0192 | 133.84 | 527 | 0.0687 |
| 0.0188 | 134.86 | 531 | 0.0672 |
| 0.0184 | 135.87 | 535 | 0.0695 |

### Framework versions

- Transformers 4.27.3
- Pytorch 1.13.1+cu116
- Datasets 2.10.1
- Tokenizers 0.13.2