<!-- This model card has been generated automatically according to the information Keras had access to. You should probably proofread and complete it, then remove this comment. -->

distilgpt_new_0080

This model was trained from scratch on an unknown dataset. At the final epoch (79), it reaches a train loss of 2.7983 and a validation loss of 2.6896 on the evaluation set.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 5.6632 | 4.5153 | 0 |
| 4.4292 | 4.0923 | 1 |
| 4.1169 | 3.8723 | 2 |
| 3.9326 | 3.7260 | 3 |
| 3.8026 | 3.6281 | 4 |
| 3.7045 | 3.5355 | 5 |
| 3.6254 | 3.4645 | 6 |
| 3.5604 | 3.4093 | 7 |
| 3.5048 | 3.3587 | 8 |
| 3.4569 | 3.3136 | 9 |
| 3.4155 | 3.2778 | 10 |
| 3.3791 | 3.2443 | 11 |
| 3.3470 | 3.2157 | 12 |
| 3.3183 | 3.1854 | 13 |
| 3.2922 | 3.1642 | 14 |
| 3.2685 | 3.1400 | 15 |
| 3.2467 | 3.1193 | 16 |
| 3.2267 | 3.1009 | 17 |
| 3.2078 | 3.0838 | 18 |
| 3.1904 | 3.0689 | 19 |
| 3.1739 | 3.0520 | 20 |
| 3.1584 | 3.0379 | 21 |
| 3.1438 | 3.0255 | 22 |
| 3.1300 | 3.0116 | 23 |
| 3.1168 | 2.9965 | 24 |
| 3.1044 | 2.9866 | 25 |
| 3.0925 | 2.9752 | 26 |
| 3.0812 | 2.9631 | 27 |
| 3.0704 | 2.9539 | 28 |
| 3.0601 | 2.9458 | 29 |
| 3.0502 | 2.9340 | 30 |
| 3.0408 | 2.9251 | 31 |
| 3.0317 | 2.9179 | 32 |
| 3.0230 | 2.9082 | 33 |
| 3.0147 | 2.9002 | 34 |
| 3.0065 | 2.8948 | 35 |
| 2.9987 | 2.8855 | 36 |
| 2.9911 | 2.8779 | 37 |
| 2.9838 | 2.8706 | 38 |
| 2.9767 | 2.8643 | 39 |
| 2.9698 | 2.8570 | 40 |
| 2.9632 | 2.8501 | 41 |
| 2.9567 | 2.8441 | 42 |
| 2.9505 | 2.8385 | 43 |
| 2.9445 | 2.8327 | 44 |
| 2.9385 | 2.8260 | 45 |
| 2.9329 | 2.8213 | 46 |
| 2.9272 | 2.8160 | 47 |
| 2.9217 | 2.8107 | 48 |
| 2.9162 | 2.8052 | 49 |
| 2.9110 | 2.8020 | 50 |
| 2.9060 | 2.7938 | 51 |
| 2.9010 | 2.7896 | 52 |
| 2.8962 | 2.7857 | 53 |
| 2.8913 | 2.7827 | 54 |
| 2.8866 | 2.7768 | 55 |
| 2.8821 | 2.7724 | 56 |
| 2.8776 | 2.7679 | 57 |
| 2.8733 | 2.7642 | 58 |
| 2.8691 | 2.7610 | 59 |
| 2.8649 | 2.7556 | 60 |
| 2.8607 | 2.7513 | 61 |
| 2.8568 | 2.7485 | 62 |
| 2.8529 | 2.7424 | 63 |
| 2.8490 | 2.7395 | 64 |
| 2.8452 | 2.7383 | 65 |
| 2.8414 | 2.7325 | 66 |
| 2.8378 | 2.7292 | 67 |
| 2.8343 | 2.7251 | 68 |
| 2.8307 | 2.7206 | 69 |
| 2.8273 | 2.7177 | 70 |
| 2.8237 | 2.7138 | 71 |
| 2.8204 | 2.7093 | 72 |
| 2.8171 | 2.7073 | 73 |
| 2.8139 | 2.7057 | 74 |
| 2.8106 | 2.7029 | 75 |
| 2.8075 | 2.6991 | 76 |
| 2.8043 | 2.6961 | 77 |
| 2.8013 | 2.6929 | 78 |
| 2.7983 | 2.6896 | 79 |
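
The losses above can be easier to interpret as perplexities. The following is a minimal sketch, assuming the reported loss is a mean per-token cross-entropy in nats (the card does not state this); the `perplexity` helper and the `history` list are illustrative names, and the rows are copied from the table above.

```python
import math

# A few (epoch, train_loss, validation_loss) rows copied from the table above.
history = [
    (0, 5.6632, 4.5153),
    (39, 2.9767, 2.8643),
    (79, 2.7983, 2.6896),
]


def perplexity(loss: float) -> float:
    """Convert a mean per-token cross-entropy loss to perplexity.

    Assumption: the loss is measured in nats, so perplexity = exp(loss).
    """
    return math.exp(loss)


for epoch, train_loss, val_loss in history:
    print(
        f"epoch {epoch:2d}: "
        f"train ppl {perplexity(train_loss):.1f}, "
        f"val ppl {perplexity(val_loss):.1f}"
    )
```

If the loss were instead averaged per sequence or reported in bits, the conversion would differ, so these figures should be read as indicative only.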

Framework versions