# distilroberta-base-finetuned-wikitextepoch_150

This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base). The fine-tuning dataset was not recorded by the Trainer (the model name suggests WikiText). It achieves the following results on the evaluation set:
- Loss: 1.8929
## Model description

distilroberta-base is a distilled version of RoBERTa-base (6 Transformer layers, roughly 82M parameters) pretrained with a masked-language-modeling objective. This checkpoint continues that objective for 150 epochs on the fine-tuning corpus.

## Intended uses & limitations

Like the base model, this checkpoint can be used for masked-token prediction or as a backbone for downstream fine-tuning. It inherits the biases of its pretraining data, and no evaluation beyond the validation loss below has been reported.

## Training and evaluation data

Not documented. The model name suggests a WikiText language-modeling corpus, but the dataset field was not recorded.
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 150
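With `lr_scheduler_type: linear`, the learning rate decays from its 2e-05 peak to zero over the total number of optimizer steps (1121 steps/epoch × 150 epochs = 168,150 steps, per the table below). As a rough sketch of what that schedule looks like (the function name `linear_lr` is illustrative, and zero warmup steps is an assumption since the card does not record a warmup setting):

```python
def linear_lr(step, peak_lr=2e-05, total_steps=1121 * 150, warmup_steps=0):
    """Linear schedule: ramp up over warmup_steps, then decay to zero.

    warmup_steps=0 is an assumption; the card does not record it.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at selected points in training:
lr_start = linear_lr(0)            # 2e-05 (peak, no warmup assumed)
lr_halfway = linear_lr(1121 * 75)  # 1e-05 (half the peak at epoch 75)
lr_end = linear_lr(1121 * 150)     # 0.0
```

In the actual run this schedule would have been applied per optimizer step by the Trainer, alongside the Adam settings listed above.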
### Training results

Training Loss | Epoch | Step | Validation Loss |
:---:|:---:|:---:|:---:|
2.2428 | 1.0 | 1121 | 2.0500 |
2.1209 | 2.0 | 2242 | 1.9996 |
2.0665 | 3.0 | 3363 | 1.9501 |
2.0179 | 4.0 | 4484 | 1.9311 |
1.9759 | 5.0 | 5605 | 1.9255 |
1.9089 | 6.0 | 6726 | 1.8805 |
1.9143 | 7.0 | 7847 | 1.8715 |
1.8744 | 8.0 | 8968 | 1.8671 |
1.858 | 9.0 | 10089 | 1.8592 |
1.8141 | 10.0 | 11210 | 1.8578 |
1.7917 | 11.0 | 12331 | 1.8574 |
1.7752 | 12.0 | 13452 | 1.8423 |
1.7722 | 13.0 | 14573 | 1.8287 |
1.7354 | 14.0 | 15694 | 1.8396 |
1.7217 | 15.0 | 16815 | 1.8244 |
1.6968 | 16.0 | 17936 | 1.8278 |
1.659 | 17.0 | 19057 | 1.8412 |
1.6442 | 18.0 | 20178 | 1.8328 |
1.6441 | 19.0 | 21299 | 1.8460 |
1.6267 | 20.0 | 22420 | 1.8343 |
1.612 | 21.0 | 23541 | 1.8249 |
1.5963 | 22.0 | 24662 | 1.8253 |
1.6101 | 23.0 | 25783 | 1.7843 |
1.5747 | 24.0 | 26904 | 1.8047 |
1.5559 | 25.0 | 28025 | 1.8618 |
1.5484 | 26.0 | 29146 | 1.8660 |
1.5411 | 27.0 | 30267 | 1.8318 |
1.5247 | 28.0 | 31388 | 1.8216 |
1.5278 | 29.0 | 32509 | 1.8075 |
1.4954 | 30.0 | 33630 | 1.8073 |
1.4863 | 31.0 | 34751 | 1.7958 |
1.4821 | 32.0 | 35872 | 1.8080 |
1.4357 | 33.0 | 36993 | 1.8373 |
1.4602 | 34.0 | 38114 | 1.8199 |
1.447 | 35.0 | 39235 | 1.8325 |
1.4292 | 36.0 | 40356 | 1.8075 |
1.4174 | 37.0 | 41477 | 1.8168 |
1.4103 | 38.0 | 42598 | 1.8095 |
1.4168 | 39.0 | 43719 | 1.8233 |
1.4005 | 40.0 | 44840 | 1.8388 |
1.3799 | 41.0 | 45961 | 1.8235 |
1.3657 | 42.0 | 47082 | 1.8298 |
1.3559 | 43.0 | 48203 | 1.8165 |
1.3723 | 44.0 | 49324 | 1.8059 |
1.3535 | 45.0 | 50445 | 1.8451 |
1.3533 | 46.0 | 51566 | 1.8458 |
1.3469 | 47.0 | 52687 | 1.8237 |
1.3247 | 48.0 | 53808 | 1.8264 |
1.3142 | 49.0 | 54929 | 1.8209 |
1.2958 | 50.0 | 56050 | 1.8244 |
1.293 | 51.0 | 57171 | 1.8311 |
1.2784 | 52.0 | 58292 | 1.8287 |
1.2731 | 53.0 | 59413 | 1.8600 |
1.2961 | 54.0 | 60534 | 1.8086 |
1.2739 | 55.0 | 61655 | 1.8303 |
1.2716 | 56.0 | 62776 | 1.8214 |
1.2459 | 57.0 | 63897 | 1.8440 |
1.2492 | 58.0 | 65018 | 1.8503 |
1.2393 | 59.0 | 66139 | 1.8316 |
1.2077 | 60.0 | 67260 | 1.8283 |
1.2426 | 61.0 | 68381 | 1.8413 |
1.2032 | 62.0 | 69502 | 1.8461 |
1.2123 | 63.0 | 70623 | 1.8469 |
1.2069 | 64.0 | 71744 | 1.8478 |
1.198 | 65.0 | 72865 | 1.8479 |
1.1972 | 66.0 | 73986 | 1.8516 |
1.1885 | 67.0 | 75107 | 1.8341 |
1.1784 | 68.0 | 76228 | 1.8322 |
1.1866 | 69.0 | 77349 | 1.8559 |
1.1648 | 70.0 | 78470 | 1.8758 |
1.1595 | 71.0 | 79591 | 1.8684 |
1.1661 | 72.0 | 80712 | 1.8553 |
1.1478 | 73.0 | 81833 | 1.8658 |
1.1488 | 74.0 | 82954 | 1.8452 |
1.1538 | 75.0 | 84075 | 1.8505 |
1.1267 | 76.0 | 85196 | 1.8430 |
1.1339 | 77.0 | 86317 | 1.8333 |
1.118 | 78.0 | 87438 | 1.8419 |
1.12 | 79.0 | 88559 | 1.8669 |
1.1144 | 80.0 | 89680 | 1.8647 |
1.104 | 81.0 | 90801 | 1.8643 |
1.0864 | 82.0 | 91922 | 1.8528 |
1.0863 | 83.0 | 93043 | 1.8456 |
1.0912 | 84.0 | 94164 | 1.8509 |
1.0873 | 85.0 | 95285 | 1.8690 |
1.0862 | 86.0 | 96406 | 1.8577 |
1.0879 | 87.0 | 97527 | 1.8612 |
1.0783 | 88.0 | 98648 | 1.8410 |
1.0618 | 89.0 | 99769 | 1.8517 |
1.0552 | 90.0 | 100890 | 1.8459 |
1.0516 | 91.0 | 102011 | 1.8723 |
1.0424 | 92.0 | 103132 | 1.8832 |
1.0478 | 93.0 | 104253 | 1.8922 |
1.0523 | 94.0 | 105374 | 1.8753 |
1.027 | 95.0 | 106495 | 1.8625 |
1.0364 | 96.0 | 107616 | 1.8673 |
1.0203 | 97.0 | 108737 | 1.8806 |
1.0309 | 98.0 | 109858 | 1.8644 |
1.0174 | 99.0 | 110979 | 1.8659 |
1.0184 | 100.0 | 112100 | 1.8590 |
1.0234 | 101.0 | 113221 | 1.8614 |
1.013 | 102.0 | 114342 | 1.8866 |
1.0092 | 103.0 | 115463 | 1.8770 |
1.0051 | 104.0 | 116584 | 1.8445 |
1.0105 | 105.0 | 117705 | 1.8512 |
1.0233 | 106.0 | 118826 | 1.8896 |
0.9967 | 107.0 | 119947 | 1.8687 |
0.9795 | 108.0 | 121068 | 1.8618 |
0.9846 | 109.0 | 122189 | 1.8877 |
0.9958 | 110.0 | 123310 | 1.8522 |
0.9689 | 111.0 | 124431 | 1.8765 |
0.9879 | 112.0 | 125552 | 1.8692 |
0.99 | 113.0 | 126673 | 1.8689 |
0.9798 | 114.0 | 127794 | 1.8898 |
0.9676 | 115.0 | 128915 | 1.8782 |
0.9759 | 116.0 | 130036 | 1.8840 |
0.9576 | 117.0 | 131157 | 1.8662 |
0.9637 | 118.0 | 132278 | 1.8984 |
0.9645 | 119.0 | 133399 | 1.8872 |
0.9793 | 120.0 | 134520 | 1.8705 |
0.9643 | 121.0 | 135641 | 1.9036 |
0.961 | 122.0 | 136762 | 1.8683 |
0.9496 | 123.0 | 137883 | 1.8785 |
0.946 | 124.0 | 139004 | 1.8912 |
0.9681 | 125.0 | 140125 | 1.8837 |
0.9403 | 126.0 | 141246 | 1.8824 |
0.9452 | 127.0 | 142367 | 1.8824 |
0.9437 | 128.0 | 143488 | 1.8665 |
0.945 | 129.0 | 144609 | 1.8655 |
0.9453 | 130.0 | 145730 | 1.8695 |
0.9238 | 131.0 | 146851 | 1.8697 |
0.9176 | 132.0 | 147972 | 1.8618 |
0.9405 | 133.0 | 149093 | 1.8679 |
0.9184 | 134.0 | 150214 | 1.9025 |
0.9298 | 135.0 | 151335 | 1.9045 |
0.9215 | 136.0 | 152456 | 1.9014 |
0.9249 | 137.0 | 153577 | 1.8505 |
0.9246 | 138.0 | 154698 | 1.8542 |
0.9205 | 139.0 | 155819 | 1.8731 |
0.9368 | 140.0 | 156940 | 1.8673 |
0.9251 | 141.0 | 158061 | 1.8835 |
0.9224 | 142.0 | 159182 | 1.8727 |
0.9326 | 143.0 | 160303 | 1.8380 |
0.916 | 144.0 | 161424 | 1.8857 |
0.9361 | 145.0 | 162545 | 1.8547 |
0.9121 | 146.0 | 163666 | 1.8587 |
0.9156 | 147.0 | 164787 | 1.8863 |
0.9131 | 148.0 | 165908 | 1.8809 |
0.9185 | 149.0 | 167029 | 1.8734 |
0.9183 | 150.0 | 168150 | 1.8929 |
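Note that validation loss bottoms out at epoch 23 (1.7843) and drifts upward afterward while training loss keeps falling, which points to overfitting in the later epochs. Since the loss is masked-language-modeling cross-entropy, it can be converted to perplexity with `exp(loss)`, a minimal sketch:

```python
import math

final_loss = 1.8929  # final-epoch validation loss from the table
best_loss = 1.7843   # epoch-23 minimum from the table

final_ppl = math.exp(final_loss)  # ≈ 6.64
best_ppl = math.exp(best_loss)    # ≈ 5.96
```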
### Framework versions

- Transformers 4.21.0
- PyTorch 1.5.0
- Datasets 2.4.0
- Tokenizers 0.12.1