<!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->

gpt-expt-sp-v3-K-300-9-mixed-with-tv

This model is a fine-tuned version of gpt2 on an unknown dataset. At the last logged evaluation (step 580000, epoch 496.58) it reaches a validation loss of 0.0349 on the evaluation set.
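
To put the loss scale in context, the final validation loss of 0.0349 listed in the training results can be converted to a perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for a causal language model such as gpt2 but is not stated in this card.

```python
import math

# Final validation loss, taken from the last row of the training results.
final_val_loss = 0.0349

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(final_val_loss)

print(f"validation perplexity: {perplexity:.4f}")
```

A perplexity this close to 1 would suggest the evaluation sequences are highly predictable for the model, which may reflect the structure of the (unknown) dataset rather than general language modeling quality.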

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.3825 | 4.28 | 5000 | 0.1047 |
| 0.0883 | 8.56 | 10000 | 0.0668 |
| 0.0633 | 12.84 | 15000 | 0.0505 |
| 0.0545 | 17.12 | 20000 | 0.0472 |
| 0.0503 | 21.4 | 25000 | 0.0441 |
| 0.0477 | 25.68 | 30000 | 0.0420 |
| 0.0457 | 29.97 | 35000 | 0.0408 |
| 0.0444 | 34.25 | 40000 | 0.0401 |
| 0.0435 | 38.53 | 45000 | 0.0399 |
| 0.0428 | 42.81 | 50000 | 0.0391 |
| 0.0423 | 47.09 | 55000 | 0.0388 |
| 0.0418 | 51.37 | 60000 | 0.0386 |
| 0.0414 | 55.65 | 65000 | 0.0384 |
| 0.0411 | 59.93 | 70000 | 0.0381 |
| 0.0408 | 64.21 | 75000 | 0.0381 |
| 0.0405 | 68.49 | 80000 | 0.0379 |
| 0.0403 | 72.77 | 85000 | 0.0376 |
| 0.0401 | 77.05 | 90000 | 0.0375 |
| 0.0399 | 81.34 | 95000 | 0.0375 |
| 0.0398 | 85.62 | 100000 | 0.0374 |
| 0.0396 | 89.9 | 105000 | 0.0373 |
| 0.0395 | 94.18 | 110000 | 0.0373 |
| 0.0393 | 98.46 | 115000 | 0.0372 |
| 0.0392 | 102.74 | 120000 | 0.0370 |
| 0.0391 | 107.02 | 125000 | 0.0369 |
| 0.039 | 111.3 | 130000 | 0.0370 |
| 0.0389 | 115.58 | 135000 | 0.0369 |
| 0.0388 | 119.86 | 140000 | 0.0369 |
| 0.0387 | 124.14 | 145000 | 0.0368 |
| 0.0386 | 128.42 | 150000 | 0.0367 |
| 0.0385 | 132.71 | 155000 | 0.0367 |
| 0.0388 | 136.99 | 160000 | 0.0511 |
| 0.0384 | 141.27 | 165000 | 0.0367 |
| 0.0383 | 145.55 | 170000 | 0.0366 |
| 0.0382 | 149.83 | 175000 | 0.0365 |
| 0.0381 | 154.11 | 180000 | 0.0365 |
| 0.0381 | 158.39 | 185000 | 0.0364 |
| 0.038 | 162.67 | 190000 | 0.0364 |
| 0.0379 | 166.95 | 195000 | 0.0364 |
| 0.0378 | 171.23 | 200000 | 0.0365 |
| 0.0378 | 175.51 | 205000 | 0.0363 |
| 0.0377 | 179.79 | 210000 | 0.0362 |
| 0.0377 | 184.08 | 215000 | 0.0362 |
| 0.0376 | 188.36 | 220000 | 0.0362 |
| 0.0375 | 192.64 | 225000 | 0.0361 |
| 0.0375 | 196.92 | 230000 | 0.0361 |
| 0.0374 | 201.2 | 235000 | 0.0361 |
| 0.0373 | 205.48 | 240000 | 0.0360 |
| 0.0373 | 209.76 | 245000 | 0.0360 |
| 0.0372 | 214.04 | 250000 | 0.0360 |
| 0.0372 | 218.32 | 255000 | 0.0359 |
| 0.0371 | 222.6 | 260000 | 0.0359 |
| 0.0371 | 226.88 | 265000 | 0.0359 |
| 0.037 | 231.16 | 270000 | 0.0359 |
| 0.037 | 235.45 | 275000 | 0.0358 |
| 0.0375 | 239.73 | 280000 | 0.0358 |
| 0.0369 | 244.01 | 285000 | 0.0357 |
| 0.0368 | 248.29 | 290000 | 0.0357 |
| 0.0367 | 252.57 | 295000 | 0.0358 |
| 0.0367 | 256.85 | 300000 | 0.0357 |
| 0.0368 | 261.13 | 305000 | 0.0356 |
| 0.0366 | 265.41 | 310000 | 0.0357 |
| 0.0366 | 269.69 | 315000 | 0.0356 |
| 0.0365 | 273.97 | 320000 | 0.0355 |
| 0.0364 | 278.25 | 325000 | 0.0356 |
| 0.0364 | 282.53 | 330000 | 0.0356 |
| 0.0363 | 286.81 | 335000 | 0.0355 |
| 0.0363 | 291.1 | 340000 | 0.0355 |
| 0.0362 | 295.38 | 345000 | 0.0354 |
| 0.0362 | 299.66 | 350000 | 0.0355 |
| 0.0361 | 303.94 | 355000 | 0.0354 |
| 0.0361 | 308.22 | 360000 | 0.0354 |
| 0.036 | 312.5 | 365000 | 0.0354 |
| 0.036 | 316.78 | 370000 | 0.0354 |
| 0.036 | 321.06 | 375000 | 0.0353 |
| 0.0359 | 325.34 | 380000 | 0.0353 |
| 0.0359 | 329.62 | 385000 | 0.0354 |
| 0.0358 | 333.9 | 390000 | 0.0353 |
| 0.0358 | 338.18 | 395000 | 0.0353 |
| 0.0357 | 342.47 | 400000 | 0.0352 |
| 0.0358 | 346.75 | 405000 | 0.0351 |
| 0.0356 | 351.03 | 410000 | 0.0352 |
| 0.0356 | 355.31 | 415000 | 0.0352 |
| 0.0356 | 359.59 | 420000 | 0.0352 |
| 0.0355 | 363.87 | 425000 | 0.0352 |
| 0.0355 | 368.15 | 430000 | 0.0351 |
| 0.0355 | 372.43 | 435000 | 0.0351 |
| 0.0354 | 376.71 | 440000 | 0.0351 |
| 0.0354 | 380.99 | 445000 | 0.0350 |
| 0.0354 | 385.27 | 450000 | 0.0351 |
| 0.0353 | 389.55 | 455000 | 0.0351 |
| 0.0353 | 393.84 | 460000 | 0.0350 |
| 0.0353 | 398.12 | 465000 | 0.0350 |
| 0.0352 | 402.4 | 470000 | 0.0350 |
| 0.0352 | 406.68 | 475000 | 0.0350 |
| 0.0352 | 410.96 | 480000 | 0.0350 |
| 0.0352 | 415.24 | 485000 | 0.0350 |
| 0.0351 | 419.52 | 490000 | 0.0350 |
| 0.0351 | 423.8 | 495000 | 0.0349 |
| 0.0351 | 428.08 | 500000 | 0.0349 |
| 0.0351 | 432.36 | 505000 | 0.0349 |
| 0.0351 | 436.64 | 510000 | 0.0349 |
| 0.035 | 440.92 | 515000 | 0.0349 |
| 0.035 | 445.21 | 520000 | 0.0349 |
| 0.035 | 449.49 | 525000 | 0.0349 |
| 0.035 | 453.77 | 530000 | 0.0349 |
| 0.035 | 458.05 | 535000 | 0.0349 |
| 0.035 | 462.33 | 540000 | 0.0349 |
| 0.0349 | 466.61 | 545000 | 0.0349 |
| 0.0349 | 470.89 | 550000 | 0.0349 |
| 0.0349 | 475.17 | 555000 | 0.0349 |
| 0.0349 | 479.45 | 560000 | 0.0349 |
| 0.0349 | 483.73 | 565000 | 0.0349 |
| 0.0349 | 488.01 | 570000 | 0.0349 |
| 0.0349 | 492.29 | 575000 | 0.0349 |
| 0.0349 | 496.58 | 580000 | 0.0349 |
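
The log can also be used to recover details the card does not state. The sketch below takes a handful of rows from the training results and estimates the number of optimizer steps per epoch, then flags evaluation points whose validation loss jumps well above both neighbours. The helper names `steps_per_epoch` and `find_spikes` and the 1.2 threshold are illustrative choices, not part of the original training setup.

```python
# A few (step, epoch, validation_loss) rows copied from the training results.
# They are a sample, not the full log; only the rows around step 160000 are
# consecutive entries.
rows = [
    (5000, 4.28, 0.1047),
    (155000, 132.71, 0.0367),
    (160000, 136.99, 0.0511),
    (165000, 141.27, 0.0367),
    (580000, 496.58, 0.0349),
]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the first and last rows."""
    first, last = rows[0], rows[-1]
    return (last[0] - first[0]) / (last[1] - first[1])


def find_spikes(rows, ratio=1.2):
    """Return steps whose validation loss exceeds both neighbours by `ratio`."""
    spikes = []
    for prev, cur, nxt in zip(rows, rows[1:], rows[2:]):
        if cur[2] > ratio * prev[2] and cur[2] > ratio * nxt[2]:
            spikes.append(cur[0])
    return spikes


print(f"approx. steps per epoch: {steps_per_epoch(rows):.0f}")
print(f"validation-loss spikes at steps: {find_spikes(rows)}")
```

On these rows the estimate comes out near 1168 steps per epoch, and the only flagged point is step 160000, where the validation loss rises to 0.0511 before returning to 0.0367 at the next evaluation. The cause of that spike is not recorded in the card.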

Framework versions