# gpt-2-spiritualtest-LoRA

This model is a LoRA fine-tuned version of Aharneish/gpt2-spiritual on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.6679
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 300
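For reference, the `linear` scheduler decays the learning rate from its initial value to zero over the course of training. Below is a minimal pure-Python sketch of that decay, assuming no warmup steps (the card does not mention any) and roughly 70,800 total optimizer steps, extrapolated from the results table (70,500 steps at epoch 298.73); both figures are assumptions, not stated in the card.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-3) -> float:
    """Learning rate under a linear decay schedule with no warmup.

    Decays from base_lr at step 0 to 0 at total_steps.
    """
    return base_lr * max(0.0, (total_steps - step) / total_steps)


# Full learning rate at the start, half at the midpoint, zero at the end.
print(linear_lr(0, 70_800))       # 0.001
print(linear_lr(35_400, 70_800))  # 0.0005
print(linear_lr(70_800, 70_800))  # 0.0
```

With 300 epochs, the effective learning rate during most of the run is well below the nominal 0.001, which helps explain the slow, steady loss decrease in the table below.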
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
2.489 | 2.12 | 500 | 1.9065 |
2.2722 | 4.24 | 1000 | 1.6764 |
2.1401 | 6.36 | 1500 | 1.5225 |
2.0433 | 8.47 | 2000 | 1.3953 |
1.9827 | 10.59 | 2500 | 1.3053 |
1.9249 | 12.71 | 3000 | 1.2289 |
1.8814 | 14.83 | 3500 | 1.1599 |
1.8562 | 16.95 | 4000 | 1.1164 |
1.8285 | 19.07 | 4500 | 1.0753 |
1.8037 | 21.19 | 5000 | 1.0442 |
1.7835 | 23.31 | 5500 | 1.0104 |
1.7675 | 25.42 | 6000 | 0.9916 |
1.7554 | 27.54 | 6500 | 0.9726 |
1.7389 | 29.66 | 7000 | 0.9672 |
1.7284 | 31.78 | 7500 | 0.9443 |
1.7196 | 33.9 | 8000 | 0.9335 |
1.7104 | 36.02 | 8500 | 0.9153 |
1.7013 | 38.14 | 9000 | 0.9058 |
1.6862 | 40.25 | 9500 | 0.8875 |
1.6828 | 42.37 | 10000 | 0.8942 |
1.6779 | 44.49 | 10500 | 0.8804 |
1.67 | 46.61 | 11000 | 0.8699 |
1.6648 | 48.73 | 11500 | 0.8617 |
1.6576 | 50.85 | 12000 | 0.8481 |
1.6506 | 52.97 | 12500 | 0.8562 |
1.647 | 55.08 | 13000 | 0.8444 |
1.6382 | 57.2 | 13500 | 0.8349 |
1.6401 | 59.32 | 14000 | 0.8380 |
1.6304 | 61.44 | 14500 | 0.8254 |
1.6283 | 63.56 | 15000 | 0.8234 |
1.6159 | 65.68 | 15500 | 0.8119 |
1.622 | 67.8 | 16000 | 0.8119 |
1.6146 | 69.92 | 16500 | 0.8091 |
1.6101 | 72.03 | 17000 | 0.8034 |
1.6049 | 74.15 | 17500 | 0.7934 |
1.5976 | 76.27 | 18000 | 0.7905 |
1.5949 | 78.39 | 18500 | 0.7883 |
1.5907 | 80.51 | 19000 | 0.7874 |
1.5952 | 82.63 | 19500 | 0.7869 |
1.5843 | 84.75 | 20000 | 0.7811 |
1.5857 | 86.86 | 20500 | 0.7793 |
1.5813 | 88.98 | 21000 | 0.7725 |
1.5753 | 91.1 | 21500 | 0.7727 |
1.5725 | 93.22 | 22000 | 0.7663 |
1.5687 | 95.34 | 22500 | 0.7643 |
1.5696 | 97.46 | 23000 | 0.7667 |
1.5605 | 99.58 | 23500 | 0.7615 |
1.5681 | 101.69 | 24000 | 0.7581 |
1.5587 | 103.81 | 24500 | 0.7563 |
1.5573 | 105.93 | 25000 | 0.7559 |
1.5532 | 108.05 | 25500 | 0.7482 |
1.5488 | 110.17 | 26000 | 0.7496 |
1.5468 | 112.29 | 26500 | 0.7440 |
1.5496 | 114.41 | 27000 | 0.7427 |
1.5471 | 116.53 | 27500 | 0.7449 |
1.5367 | 118.64 | 28000 | 0.7405 |
1.5375 | 120.76 | 28500 | 0.7368 |
1.5362 | 122.88 | 29000 | 0.7302 |
1.5347 | 125.0 | 29500 | 0.7294 |
1.5309 | 127.12 | 30000 | 0.7306 |
1.5267 | 129.24 | 30500 | 0.7240 |
1.5289 | 131.36 | 31000 | 0.7288 |
1.523 | 133.47 | 31500 | 0.7268 |
1.5197 | 135.59 | 32000 | 0.7200 |
1.5184 | 137.71 | 32500 | 0.7192 |
1.5188 | 139.83 | 33000 | 0.7140 |
1.5161 | 141.95 | 33500 | 0.7182 |
1.5156 | 144.07 | 34000 | 0.7136 |
1.5066 | 146.19 | 34500 | 0.7079 |
1.5063 | 148.31 | 35000 | 0.7099 |
1.5103 | 150.42 | 35500 | 0.7099 |
1.5046 | 152.54 | 36000 | 0.7059 |
1.503 | 154.66 | 36500 | 0.7057 |
1.5005 | 156.78 | 37000 | 0.7026 |
1.4998 | 158.9 | 37500 | 0.7014 |
1.4989 | 161.02 | 38000 | 0.6996 |
1.4931 | 163.14 | 38500 | 0.6997 |
1.4915 | 165.25 | 39000 | 0.6957 |
1.489 | 167.37 | 39500 | 0.6974 |
1.4906 | 169.49 | 40000 | 0.6969 |
1.4859 | 171.61 | 40500 | 0.6956 |
1.4881 | 173.73 | 41000 | 0.6921 |
1.4836 | 175.85 | 41500 | 0.6928 |
1.4818 | 177.97 | 42000 | 0.6901 |
1.482 | 180.08 | 42500 | 0.6912 |
1.4778 | 182.2 | 43000 | 0.6885 |
1.4763 | 184.32 | 43500 | 0.6885 |
1.4807 | 186.44 | 44000 | 0.6848 |
1.474 | 188.56 | 44500 | 0.6833 |
1.4712 | 190.68 | 45000 | 0.6829 |
1.4715 | 192.8 | 45500 | 0.6826 |
1.4682 | 194.92 | 46000 | 0.6831 |
1.4706 | 197.03 | 46500 | 0.6819 |
1.4674 | 199.15 | 47000 | 0.6818 |
1.4804 | 201.27 | 47500 | 0.6895 |
1.4891 | 203.39 | 48000 | 0.6905 |
1.4856 | 205.51 | 48500 | 0.6900 |
1.4826 | 207.63 | 49000 | 0.6861 |
1.4833 | 209.75 | 49500 | 0.6871 |
1.4844 | 211.86 | 50000 | 0.6865 |
1.4793 | 213.98 | 50500 | 0.6859 |
1.4805 | 216.1 | 51000 | 0.6851 |
1.4749 | 218.22 | 51500 | 0.6838 |
1.4751 | 220.34 | 52000 | 0.6826 |
1.4748 | 222.46 | 52500 | 0.6809 |
1.4738 | 224.58 | 53000 | 0.6816 |
1.4759 | 226.69 | 53500 | 0.6815 |
1.4674 | 228.81 | 54000 | 0.6792 |
1.472 | 230.93 | 54500 | 0.6785 |
1.4681 | 233.05 | 55000 | 0.6767 |
1.4667 | 235.17 | 55500 | 0.6762 |
1.4659 | 237.29 | 56000 | 0.6766 |
1.4697 | 239.41 | 56500 | 0.6764 |
1.4621 | 241.53 | 57000 | 0.6737 |
1.4613 | 243.64 | 57500 | 0.6745 |
1.4624 | 245.76 | 58000 | 0.6737 |
1.4593 | 247.88 | 58500 | 0.6736 |
1.4621 | 250.0 | 59000 | 0.6737 |
1.4742 | 252.12 | 59500 | 0.6798 |
1.4749 | 254.24 | 60000 | 0.6790 |
1.471 | 256.36 | 60500 | 0.6830 |
1.4713 | 258.47 | 61000 | 0.6820 |
1.4777 | 260.59 | 61500 | 0.6793 |
1.4738 | 262.71 | 62000 | 0.6784 |
1.4692 | 264.83 | 62500 | 0.6772 |
1.472 | 266.95 | 63000 | 0.6758 |
1.4707 | 269.07 | 63500 | 0.6762 |
1.4654 | 271.19 | 64000 | 0.6732 |
1.4691 | 273.31 | 64500 | 0.6746 |
1.4658 | 275.42 | 65000 | 0.6746 |
1.4648 | 277.54 | 65500 | 0.6746 |
1.4622 | 279.66 | 66000 | 0.6733 |
1.4641 | 281.78 | 66500 | 0.6708 |
1.4617 | 283.9 | 67000 | 0.6724 |
1.4605 | 286.02 | 67500 | 0.6713 |
1.4593 | 288.14 | 68000 | 0.6701 |
1.4612 | 290.25 | 68500 | 0.6699 |
1.4563 | 292.37 | 69000 | 0.6693 |
1.4555 | 294.49 | 69500 | 0.6677 |
1.4565 | 296.61 | 70000 | 0.6675 |
1.455 | 298.73 | 70500 | 0.6679 |
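As a quick sanity check on the table, validation loss fell from 1.9065 at the first evaluation (step 500) to 0.6679 at the last (step 70,500), a roughly 65% relative reduction:

```python
# Validation loss at the first and last evaluation steps in the table.
first_eval, last_eval = 1.9065, 0.6679

# Relative reduction over the full training run.
reduction = (first_eval - last_eval) / first_eval
print(f"validation loss fell by {reduction:.1%}")  # validation loss fell by 65.0%
```

Note that most of that reduction happens in the first ~50 epochs; the final ~100 epochs improve validation loss by only a few hundredths, so a shorter run may have sufficed.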
### Framework versions

- Transformers 4.34.0
- PyTorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1