<!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->
# distilroberta-base-finetuned-wikitextepoch_150
This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on an unspecified dataset (the model name suggests WikiText, but the dataset was not recorded by the Trainer). It achieves the following results on the evaluation set:
- Loss: 1.8929
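The reported loss is a mean token-level cross-entropy (in nats), so the corresponding perplexity can be recovered as `exp(loss)`, roughly 6.6:

```python
import math

# Masked-LM evaluation loss reported above (mean cross-entropy in nats)
eval_loss = 1.8929

# Perplexity is the exponential of the cross-entropy loss
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")  # → perplexity = 6.64
```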
## Model description
More information needed
## Intended uses & limitations
More information needed
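Since distilroberta-base is a masked language model, a fine-tuned checkpoint like this one is typically used for fill-mask inference. A minimal sketch with the `transformers` pipeline, assuming the checkpoint is available under the (hypothetical) local path or repo id `distilroberta-base-finetuned-wikitextepoch_150`:

```python
def fill_mask_example():
    # Import inside the function so this sketch stays importable
    # even without transformers installed.
    from transformers import pipeline

    # Hypothetical model id; replace with the actual checkpoint path.
    fill_mask = pipeline(
        "fill-mask",
        model="distilroberta-base-finetuned-wikitextepoch_150",
    )
    # RoBERTa-style tokenizers use "<mask>" as the mask token.
    return fill_mask("The capital of France is <mask>.")
```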
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 150
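The hyperparameters above map onto a `Trainer` configuration along these lines. This is a sketch, not the exact training script; the argument names come from the `transformers` `TrainingArguments` API:

```python
def build_training_args(output_dir="distilroberta-base-finetuned-wikitextepoch_150"):
    # Import inside the function so this file runs without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,
        learning_rate=2e-05,
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        seed=42,
        # Adam settings matching betas=(0.9, 0.999) and epsilon=1e-08
        adam_beta1=0.9,
        adam_beta2=0.999,
        adam_epsilon=1e-08,
        lr_scheduler_type="linear",
        num_train_epochs=150,
    )
```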
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 2.2428 | 1.0 | 1121 | 2.0500 |
| 2.1209 | 2.0 | 2242 | 1.9996 |
| 2.0665 | 3.0 | 3363 | 1.9501 |
| 2.0179 | 4.0 | 4484 | 1.9311 |
| 1.9759 | 5.0 | 5605 | 1.9255 |
| 1.9089 | 6.0 | 6726 | 1.8805 |
| 1.9143 | 7.0 | 7847 | 1.8715 |
| 1.8744 | 8.0 | 8968 | 1.8671 |
| 1.858 | 9.0 | 10089 | 1.8592 |
| 1.8141 | 10.0 | 11210 | 1.8578 |
| 1.7917 | 11.0 | 12331 | 1.8574 |
| 1.7752 | 12.0 | 13452 | 1.8423 |
| 1.7722 | 13.0 | 14573 | 1.8287 |
| 1.7354 | 14.0 | 15694 | 1.8396 |
| 1.7217 | 15.0 | 16815 | 1.8244 |
| 1.6968 | 16.0 | 17936 | 1.8278 |
| 1.659 | 17.0 | 19057 | 1.8412 |
| 1.6442 | 18.0 | 20178 | 1.8328 |
| 1.6441 | 19.0 | 21299 | 1.8460 |
| 1.6267 | 20.0 | 22420 | 1.8343 |
| 1.612 | 21.0 | 23541 | 1.8249 |
| 1.5963 | 22.0 | 24662 | 1.8253 |
| 1.6101 | 23.0 | 25783 | 1.7843 |
| 1.5747 | 24.0 | 26904 | 1.8047 |
| 1.5559 | 25.0 | 28025 | 1.8618 |
| 1.5484 | 26.0 | 29146 | 1.8660 |
| 1.5411 | 27.0 | 30267 | 1.8318 |
| 1.5247 | 28.0 | 31388 | 1.8216 |
| 1.5278 | 29.0 | 32509 | 1.8075 |
| 1.4954 | 30.0 | 33630 | 1.8073 |
| 1.4863 | 31.0 | 34751 | 1.7958 |
| 1.4821 | 32.0 | 35872 | 1.8080 |
| 1.4357 | 33.0 | 36993 | 1.8373 |
| 1.4602 | 34.0 | 38114 | 1.8199 |
| 1.447 | 35.0 | 39235 | 1.8325 |
| 1.4292 | 36.0 | 40356 | 1.8075 |
| 1.4174 | 37.0 | 41477 | 1.8168 |
| 1.4103 | 38.0 | 42598 | 1.8095 |
| 1.4168 | 39.0 | 43719 | 1.8233 |
| 1.4005 | 40.0 | 44840 | 1.8388 |
| 1.3799 | 41.0 | 45961 | 1.8235 |
| 1.3657 | 42.0 | 47082 | 1.8298 |
| 1.3559 | 43.0 | 48203 | 1.8165 |
| 1.3723 | 44.0 | 49324 | 1.8059 |
| 1.3535 | 45.0 | 50445 | 1.8451 |
| 1.3533 | 46.0 | 51566 | 1.8458 |
| 1.3469 | 47.0 | 52687 | 1.8237 |
| 1.3247 | 48.0 | 53808 | 1.8264 |
| 1.3142 | 49.0 | 54929 | 1.8209 |
| 1.2958 | 50.0 | 56050 | 1.8244 |
| 1.293 | 51.0 | 57171 | 1.8311 |
| 1.2784 | 52.0 | 58292 | 1.8287 |
| 1.2731 | 53.0 | 59413 | 1.8600 |
| 1.2961 | 54.0 | 60534 | 1.8086 |
| 1.2739 | 55.0 | 61655 | 1.8303 |
| 1.2716 | 56.0 | 62776 | 1.8214 |
| 1.2459 | 57.0 | 63897 | 1.8440 |
| 1.2492 | 58.0 | 65018 | 1.8503 |
| 1.2393 | 59.0 | 66139 | 1.8316 |
| 1.2077 | 60.0 | 67260 | 1.8283 |
| 1.2426 | 61.0 | 68381 | 1.8413 |
| 1.2032 | 62.0 | 69502 | 1.8461 |
| 1.2123 | 63.0 | 70623 | 1.8469 |
| 1.2069 | 64.0 | 71744 | 1.8478 |
| 1.198 | 65.0 | 72865 | 1.8479 |
| 1.1972 | 66.0 | 73986 | 1.8516 |
| 1.1885 | 67.0 | 75107 | 1.8341 |
| 1.1784 | 68.0 | 76228 | 1.8322 |
| 1.1866 | 69.0 | 77349 | 1.8559 |
| 1.1648 | 70.0 | 78470 | 1.8758 |
| 1.1595 | 71.0 | 79591 | 1.8684 |
| 1.1661 | 72.0 | 80712 | 1.8553 |
| 1.1478 | 73.0 | 81833 | 1.8658 |
| 1.1488 | 74.0 | 82954 | 1.8452 |
| 1.1538 | 75.0 | 84075 | 1.8505 |
| 1.1267 | 76.0 | 85196 | 1.8430 |
| 1.1339 | 77.0 | 86317 | 1.8333 |
| 1.118 | 78.0 | 87438 | 1.8419 |
| 1.12 | 79.0 | 88559 | 1.8669 |
| 1.1144 | 80.0 | 89680 | 1.8647 |
| 1.104 | 81.0 | 90801 | 1.8643 |
| 1.0864 | 82.0 | 91922 | 1.8528 |
| 1.0863 | 83.0 | 93043 | 1.8456 |
| 1.0912 | 84.0 | 94164 | 1.8509 |
| 1.0873 | 85.0 | 95285 | 1.8690 |
| 1.0862 | 86.0 | 96406 | 1.8577 |
| 1.0879 | 87.0 | 97527 | 1.8612 |
| 1.0783 | 88.0 | 98648 | 1.8410 |
| 1.0618 | 89.0 | 99769 | 1.8517 |
| 1.0552 | 90.0 | 100890 | 1.8459 |
| 1.0516 | 91.0 | 102011 | 1.8723 |
| 1.0424 | 92.0 | 103132 | 1.8832 |
| 1.0478 | 93.0 | 104253 | 1.8922 |
| 1.0523 | 94.0 | 105374 | 1.8753 |
| 1.027 | 95.0 | 106495 | 1.8625 |
| 1.0364 | 96.0 | 107616 | 1.8673 |
| 1.0203 | 97.0 | 108737 | 1.8806 |
| 1.0309 | 98.0 | 109858 | 1.8644 |
| 1.0174 | 99.0 | 110979 | 1.8659 |
| 1.0184 | 100.0 | 112100 | 1.8590 |
| 1.0234 | 101.0 | 113221 | 1.8614 |
| 1.013 | 102.0 | 114342 | 1.8866 |
| 1.0092 | 103.0 | 115463 | 1.8770 |
| 1.0051 | 104.0 | 116584 | 1.8445 |
| 1.0105 | 105.0 | 117705 | 1.8512 |
| 1.0233 | 106.0 | 118826 | 1.8896 |
| 0.9967 | 107.0 | 119947 | 1.8687 |
| 0.9795 | 108.0 | 121068 | 1.8618 |
| 0.9846 | 109.0 | 122189 | 1.8877 |
| 0.9958 | 110.0 | 123310 | 1.8522 |
| 0.9689 | 111.0 | 124431 | 1.8765 |
| 0.9879 | 112.0 | 125552 | 1.8692 |
| 0.99 | 113.0 | 126673 | 1.8689 |
| 0.9798 | 114.0 | 127794 | 1.8898 |
| 0.9676 | 115.0 | 128915 | 1.8782 |
| 0.9759 | 116.0 | 130036 | 1.8840 |
| 0.9576 | 117.0 | 131157 | 1.8662 |
| 0.9637 | 118.0 | 132278 | 1.8984 |
| 0.9645 | 119.0 | 133399 | 1.8872 |
| 0.9793 | 120.0 | 134520 | 1.8705 |
| 0.9643 | 121.0 | 135641 | 1.9036 |
| 0.961 | 122.0 | 136762 | 1.8683 |
| 0.9496 | 123.0 | 137883 | 1.8785 |
| 0.946 | 124.0 | 139004 | 1.8912 |
| 0.9681 | 125.0 | 140125 | 1.8837 |
| 0.9403 | 126.0 | 141246 | 1.8824 |
| 0.9452 | 127.0 | 142367 | 1.8824 |
| 0.9437 | 128.0 | 143488 | 1.8665 |
| 0.945 | 129.0 | 144609 | 1.8655 |
| 0.9453 | 130.0 | 145730 | 1.8695 |
| 0.9238 | 131.0 | 146851 | 1.8697 |
| 0.9176 | 132.0 | 147972 | 1.8618 |
| 0.9405 | 133.0 | 149093 | 1.8679 |
| 0.9184 | 134.0 | 150214 | 1.9025 |
| 0.9298 | 135.0 | 151335 | 1.9045 |
| 0.9215 | 136.0 | 152456 | 1.9014 |
| 0.9249 | 137.0 | 153577 | 1.8505 |
| 0.9246 | 138.0 | 154698 | 1.8542 |
| 0.9205 | 139.0 | 155819 | 1.8731 |
| 0.9368 | 140.0 | 156940 | 1.8673 |
| 0.9251 | 141.0 | 158061 | 1.8835 |
| 0.9224 | 142.0 | 159182 | 1.8727 |
| 0.9326 | 143.0 | 160303 | 1.8380 |
| 0.916 | 144.0 | 161424 | 1.8857 |
| 0.9361 | 145.0 | 162545 | 1.8547 |
| 0.9121 | 146.0 | 163666 | 1.8587 |
| 0.9156 | 147.0 | 164787 | 1.8863 |
| 0.9131 | 148.0 | 165908 | 1.8809 |
| 0.9185 | 149.0 | 167029 | 1.8734 |
| 0.9183 | 150.0 | 168150 | 1.8929 |
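Validation loss bottoms out around epoch 23 (1.7843) and then drifts upward while training loss keeps falling, i.e. the run overfits well before epoch 150. Selecting the best checkpoint from the table can be sketched as follows (using a small subset of rows copied from the table above):

```python
# (epoch, validation_loss) pairs sampled from the results table
val_losses = [
    (1, 2.0500),
    (10, 1.8578),
    (23, 1.7843),
    (50, 1.8244),
    (100, 1.8590),
    (150, 1.8929),
]

# The checkpoint with the lowest validation loss is the one to keep
best_epoch, best_loss = min(val_losses, key=lambda row: row[1])
print(best_epoch, best_loss)  # → 23 1.7843
```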
### Framework versions
- Transformers 4.21.0
- Pytorch 1.5.0
- Datasets 2.4.0
- Tokenizers 0.12.1