
<!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->

gpt-expt-sp-v3-K-600-9-mixed-with-tv-v3

This model is a fine-tuned version of gpt2 on an unknown dataset. The evaluation-set summary is missing from this card; the last validation loss logged under Training results is 0.0607 (step 750000, epoch 494.07).

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The hyperparameters used during training are not listed.

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.1876 | 6.59 | 10000 | 0.0951 |
| 0.0512 | 13.18 | 20000 | 0.0827 |
| 0.0459 | 19.76 | 30000 | 0.0756 |
| 0.0437 | 26.35 | 40000 | 0.0749 |
| 0.0427 | 32.94 | 50000 | 0.0739 |
| 0.042 | 39.53 | 60000 | 0.0728 |
| 0.0416 | 46.11 | 70000 | 0.0720 |
| 0.0411 | 52.7 | 80000 | 0.0707 |
| 0.0408 | 59.29 | 90000 | 0.0699 |
| 0.0405 | 65.88 | 100000 | 0.0709 |
| 0.0401 | 72.46 | 110000 | 0.0686 |
| 0.0399 | 79.05 | 120000 | 0.0681 |
| 0.0396 | 85.64 | 130000 | 0.0676 |
| 0.0394 | 92.23 | 140000 | 0.0670 |
| 0.0392 | 98.81 | 150000 | 0.0676 |
| 0.039 | 105.4 | 160000 | 0.0657 |
| 0.0388 | 111.99 | 170000 | 0.0654 |
| 0.0386 | 118.58 | 180000 | 0.0648 |
| 0.0385 | 125.16 | 190000 | 0.0653 |
| 0.0383 | 131.75 | 200000 | 0.0652 |
| 0.0382 | 138.34 | 210000 | 0.0648 |
| 0.0381 | 144.93 | 220000 | 0.0647 |
| 0.0379 | 151.52 | 230000 | 0.0645 |
| 0.0378 | 158.1 | 240000 | 0.0644 |
| 0.0377 | 164.69 | 250000 | 0.0642 |
| 0.0377 | 171.28 | 260000 | 0.0642 |
| 0.0375 | 177.87 | 270000 | 0.0637 |
| 0.0374 | 184.45 | 280000 | 0.0636 |
| 0.0373 | 191.04 | 290000 | 0.0639 |
| 0.0372 | 197.63 | 300000 | 0.0637 |
| 0.0371 | 204.22 | 310000 | 0.0633 |
| 0.037 | 210.8 | 320000 | 0.0635 |
| 0.0369 | 217.39 | 330000 | 0.0631 |
| 0.0368 | 223.98 | 340000 | 0.0628 |
| 0.0367 | 230.57 | 350000 | 0.0629 |
| 0.0366 | 237.15 | 360000 | 0.0627 |
| 0.0365 | 243.74 | 370000 | 0.0629 |
| 0.0364 | 250.33 | 380000 | 0.0628 |
| 0.0364 | 256.92 | 390000 | 0.0624 |
| 0.0363 | 263.5 | 400000 | 0.0625 |
| 0.0362 | 270.09 | 410000 | 0.0624 |
| 0.0361 | 276.68 | 420000 | 0.0625 |
| 0.036 | 283.27 | 430000 | 0.0629 |
| 0.0359 | 289.86 | 440000 | 0.0621 |
| 0.0358 | 296.44 | 450000 | 0.0623 |
| 0.0358 | 303.03 | 460000 | 0.0619 |
| 0.0357 | 309.62 | 470000 | 0.0620 |
| 0.0356 | 316.21 | 480000 | 0.0619 |
| 0.0355 | 322.79 | 490000 | 0.0617 |
| 0.0354 | 329.38 | 500000 | 0.0621 |
| 0.0353 | 335.97 | 510000 | 0.0615 |
| 0.0353 | 342.56 | 520000 | 0.0615 |
| 0.0352 | 349.14 | 530000 | 0.0616 |
| 0.0351 | 355.73 | 540000 | 0.0614 |
| 0.035 | 362.32 | 550000 | 0.0614 |
| 0.035 | 368.91 | 560000 | 0.0612 |
| 0.0349 | 375.49 | 570000 | 0.0613 |
| 0.0348 | 382.08 | 580000 | 0.0612 |
| 0.0348 | 388.67 | 590000 | 0.0612 |
| 0.0347 | 395.26 | 600000 | 0.0611 |
| 0.0347 | 401.84 | 610000 | 0.0610 |
| 0.0346 | 408.43 | 620000 | 0.0610 |
| 0.0345 | 415.02 | 630000 | 0.0609 |
| 0.0345 | 421.61 | 640000 | 0.0610 |
| 0.0344 | 428.19 | 650000 | 0.0609 |
| 0.0344 | 434.78 | 660000 | 0.0609 |
| 0.0343 | 441.37 | 670000 | 0.0608 |
| 0.0343 | 447.96 | 680000 | 0.0608 |
| 0.0343 | 454.55 | 690000 | 0.0608 |
| 0.0342 | 461.13 | 700000 | 0.0608 |
| 0.0342 | 467.72 | 710000 | 0.0607 |
| 0.0342 | 474.31 | 720000 | 0.0607 |
| 0.0342 | 480.9 | 730000 | 0.0607 |
| 0.0341 | 487.48 | 740000 | 0.0607 |
| 0.0341 | 494.07 | 750000 | 0.0607 |
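
The table above can be summarized with a few lines of Python. The sketch below is illustrative only: it copies a handful of rows from the table, estimates the number of optimizer steps per epoch, and converts a validation loss to perplexity. The helper names (`steps_per_epoch`, `best_validation`, `perplexity`) are hypothetical, and the perplexity conversion assumes the logged loss is a mean token-level cross-entropy in nats, which this card does not state.

```python
import math

# A sample of (training_loss, epoch, step, validation_loss) rows copied
# from the training results table; not the full log.
ROWS = [
    (0.1876, 6.59, 10000, 0.0951),
    (0.0405, 65.88, 100000, 0.0709),
    (0.0383, 131.75, 200000, 0.0652),
    (0.0354, 329.38, 500000, 0.0621),
    (0.0341, 494.07, 750000, 0.0607),
]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    _, epoch, step, _ = rows[-1]
    return step / epoch


def best_validation(rows):
    """Return (step, validation_loss) for the lowest validation loss."""
    _, _, step, val = min(rows, key=lambda r: r[3])
    return step, val


def perplexity(loss):
    """Convert a mean cross-entropy loss to perplexity.

    Assumption: the loss is measured in nats per token.
    """
    return math.exp(loss)


if __name__ == "__main__":
    print(f"steps per epoch: about {steps_per_epoch(ROWS):.0f}")
    step, val = best_validation(ROWS)
    print(f"lowest validation loss in sample: {val} at step {step}")
    print(f"perplexity at that loss: about {perplexity(val):.4f}")
```

Under that assumption, the final validation loss of 0.0607 corresponds to a perplexity of roughly 1.06. Training loss is still decreasing slowly at the end of the log, while validation loss has been flat at 0.0607 since step 710000.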

Framework versions