# libri-alpha-0.75-Temp-1-attention-3-layers-distil-with-6-layers-take-2-train-extractor
This model is a fine-tuned version of [rohitp1/libri-alpha-0.75-Temp-1-attention-3-layers-distil-with-6-layers](https://huggingface.co/rohitp1/libri-alpha-0.75-Temp-1-attention-3-layers-distil-with-6-layers) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 123.6555
- Wer: 0.2525
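Wer here is word error rate on the evaluation set (lower is better); a value of 0.2525 means roughly one word in four is substituted, inserted, or deleted relative to the reference transcript. Below is a minimal sketch of computing this metric with the `evaluate` library; the prediction/reference pairs are placeholders, not outputs of this model.

```python
import evaluate

# Load the word error rate metric from the Hugging Face evaluate library.
wer_metric = evaluate.load("wer")

# Placeholder hypothesis/reference pairs, purely for illustration.
predictions = ["the cat sat on the mat", "hello world"]
references = ["the cat sat on a mat", "hello world"]

# WER = (substitutions + insertions + deletions) / total reference words.
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")  # 1 error over 8 reference words -> 0.1250
```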
## Model description
More information needed
## Intended uses & limitations
More information needed
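Since usage is not documented in this card, the following is a hypothetical inference sketch. It assumes the checkpoint works with the standard `automatic-speech-recognition` pipeline on 16 kHz mono audio, and the repo id is inferred from this card's title; neither is confirmed by the card.

```python
from transformers import pipeline

# Assumption: the checkpoint is an ASR model loadable through the pipeline API,
# under a repo id inferred from this card's title.
asr = pipeline(
    "automatic-speech-recognition",
    model="rohitp1/libri-alpha-0.75-Temp-1-attention-3-layers-distil-with-6-layers-take-2-train-extractor",
)

# "sample.wav" is a placeholder path to a 16 kHz mono recording.
result = asr("sample.wav")
print(result["text"])
```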
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a code sketch mapping them onto `TrainingArguments` follows the list):
- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.3
- num_epochs: 30
- mixed_precision_training: Native AMP
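The list above maps directly onto `transformers.TrainingArguments`. The following is a hedged reconstruction for reference, not the author's actual training script; the output directory is a placeholder, and any distillation-specific arguments are omitted because they are not documented here.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
# Effective train batch size: 8 per device * 2 accumulation steps = 16.
training_args = TrainingArguments(
    output_dir="libri-distil-take-2",  # placeholder, not the real path
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.3,
    num_train_epochs=30,
    fp16=True,  # "Native AMP" mixed-precision training
)
```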
### Training results
Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
488.7409 | 0.22 | 200 | 175.5911 | 0.4211 |
470.3788 | 0.45 | 400 | 174.7645 | 0.4192 |
472.5283 | 0.67 | 600 | 173.8402 | 0.4184 |
474.1535 | 0.9 | 800 | 173.4610 | 0.4162 |
488.9395 | 1.12 | 1000 | 172.2722 | 0.4172 |
468.5794 | 1.35 | 1200 | 170.7173 | 0.4134 |
473.337 | 1.57 | 1400 | 171.2823 | 0.4069 |
453.5572 | 1.79 | 1600 | 168.4595 | 0.4093 |
456.1514 | 2.02 | 1800 | 166.4398 | 0.4000 |
447.1798 | 2.24 | 2000 | 167.9152 | 0.3994 |
438.2698 | 2.47 | 2200 | 166.1868 | 0.3974 |
438.1535 | 2.69 | 2400 | 164.5998 | 0.3946 |
442.7301 | 2.91 | 2600 | 162.8684 | 0.3956 |
440.5328 | 3.14 | 2800 | 162.3347 | 0.3861 |
449.2731 | 3.36 | 3000 | 160.7815 | 0.3847 |
436.718 | 3.59 | 3200 | 158.1402 | 0.3849 |
425.2622 | 3.81 | 3400 | 157.0624 | 0.3778 |
430.4346 | 4.04 | 3600 | 156.7345 | 0.3764 |
402.7262 | 4.26 | 3800 | 154.0662 | 0.3635 |
405.4374 | 4.48 | 4000 | 153.8651 | 0.3683 |
395.4657 | 4.71 | 4200 | 152.3929 | 0.3609 |
401.6397 | 4.93 | 4400 | 150.4990 | 0.3576 |
397.0791 | 5.16 | 4600 | 151.3244 | 0.3634 |
399.281 | 5.38 | 4800 | 149.6291 | 0.3513 |
392.448 | 5.61 | 5000 | 149.6411 | 0.3474 |
396.3989 | 5.83 | 5200 | 148.5435 | 0.3459 |
381.1296 | 6.05 | 5400 | 147.9963 | 0.3501 |
384.1926 | 6.28 | 5600 | 145.6473 | 0.3435 |
364.3308 | 6.5 | 5800 | 145.9607 | 0.3381 |
365.9475 | 6.73 | 6000 | 142.4151 | 0.3382 |
359.6295 | 6.95 | 6200 | 139.8908 | 0.3315 |
361.9945 | 7.17 | 6400 | 143.2300 | 0.3403 |
370.9596 | 7.4 | 6600 | 140.1414 | 0.3280 |
363.0185 | 7.62 | 6800 | 140.3988 | 0.3240 |
354.5542 | 7.85 | 7000 | 143.5237 | 0.3286 |
356.7341 | 8.07 | 7200 | 145.7105 | 0.3229 |
342.3261 | 8.3 | 7400 | 137.8948 | 0.3188 |
343.8778 | 8.52 | 7600 | 138.7520 | 0.3085 |
327.9473 | 8.74 | 7800 | 136.1127 | 0.3122 |
339.7105 | 8.97 | 8000 | 136.3135 | 0.3084 |
322.9032 | 9.19 | 8200 | 136.0534 | 0.3089 |
332.4099 | 9.42 | 8400 | 136.3784 | 0.3079 |
333.1054 | 9.64 | 8600 | 136.3690 | 0.3020 |
325.0327 | 9.87 | 8800 | 138.1514 | 0.3022 |
326.1452 | 10.09 | 9000 | 130.8793 | 0.2944 |
319.7307 | 10.31 | 9200 | 133.0722 | 0.2945 |
322.89 | 10.54 | 9400 | 131.6615 | 0.2961 |
307.7924 | 10.76 | 9600 | 129.8601 | 0.2917 |
322.2392 | 10.99 | 9800 | 131.7703 | 0.2911 |
306.9055 | 11.21 | 10000 | 130.2165 | 0.2878 |
297.5498 | 11.43 | 10200 | 130.4440 | 0.2920 |
300.9818 | 11.66 | 10400 | 130.6544 | 0.2862 |
300.7568 | 11.88 | 10600 | 128.4007 | 0.2857 |
298.6313 | 12.11 | 10800 | 129.3903 | 0.2808 |
286.8174 | 12.33 | 11000 | 129.0809 | 0.2824 |
290.7518 | 12.56 | 11200 | 130.4312 | 0.2827 |
292.7182 | 12.78 | 11400 | 129.6407 | 0.2829 |
287.0013 | 13.0 | 11600 | 128.5187 | 0.2841 |
262.7644 | 13.23 | 11800 | 128.3923 | 0.2798 |
277.8379 | 13.45 | 12000 | 128.4876 | 0.2786 |
272.4847 | 13.68 | 12200 | 126.7397 | 0.2738 |
286.6665 | 13.9 | 12400 | 129.2148 | 0.2823 |
281.27 | 14.13 | 12600 | 131.3539 | 0.2796 |
266.3464 | 14.35 | 12800 | 127.2011 | 0.2758 |
274.4771 | 14.57 | 13000 | 128.8553 | 0.2784 |
266.4516 | 14.8 | 13200 | 125.6450 | 0.2730 |
266.1086 | 15.02 | 13400 | 125.1995 | 0.2709 |
264.5101 | 15.25 | 13600 | 126.9386 | 0.2723 |
266.8765 | 15.47 | 13800 | 124.8972 | 0.2724 |
255.5908 | 15.7 | 14000 | 125.3817 | 0.2716 |
260.3176 | 15.92 | 14200 | 124.9812 | 0.2698 |
251.0676 | 16.14 | 14400 | 127.1510 | 0.2695 |
255.0812 | 16.37 | 14600 | 127.9661 | 0.2709 |
254.8599 | 16.59 | 14800 | 125.1549 | 0.2670 |
255.7383 | 16.82 | 15000 | 125.9465 | 0.2705 |
242.564 | 17.04 | 15200 | 126.6244 | 0.2669 |
245.8529 | 17.26 | 15400 | 125.0135 | 0.2668 |
250.1366 | 17.49 | 15600 | 123.4417 | 0.2633 |
244.0923 | 17.71 | 15800 | 123.3352 | 0.2654 |
248.4393 | 17.94 | 16000 | 122.9122 | 0.2645 |
252.4732 | 18.16 | 16200 | 122.2313 | 0.2581 |
249.2825 | 18.39 | 16400 | 123.7648 | 0.2618 |
250.1891 | 18.61 | 16600 | 124.0998 | 0.2607 |
243.6611 | 18.83 | 16800 | 123.0910 | 0.2576 |
242.8351 | 19.06 | 17000 | 122.3869 | 0.2576 |
237.169 | 19.28 | 17200 | 123.0963 | 0.2577 |
230.8865 | 19.51 | 17400 | 124.9314 | 0.2589 |
228.3782 | 19.73 | 17600 | 126.1155 | 0.2602 |
235.9318 | 19.96 | 17800 | 121.9966 | 0.2551 |
231.499 | 20.18 | 18000 | 123.4103 | 0.2583 |
234.1825 | 20.4 | 18200 | 122.7898 | 0.2572 |
234.1546 | 20.63 | 18400 | 124.8323 | 0.2577 |
228.4214 | 20.85 | 18600 | 122.2580 | 0.2561 |
229.5802 | 21.08 | 18800 | 122.1630 | 0.2550 |
222.507 | 21.3 | 19000 | 122.7615 | 0.2543 |
223.9583 | 21.52 | 19200 | 123.3316 | 0.2557 |
231.9215 | 21.75 | 19400 | 121.7923 | 0.2542 |
229.7037 | 21.97 | 19600 | 121.5026 | 0.2533 |
232.5929 | 22.2 | 19800 | 123.7730 | 0.2527 |
213.1247 | 22.42 | 20000 | 121.8280 | 0.2506 |
224.965 | 22.65 | 20200 | 123.2294 | 0.2527 |
228.214 | 22.87 | 20400 | 122.9256 | 0.2544 |
216.6104 | 23.09 | 20600 | 124.1280 | 0.2510 |
220.0993 | 23.32 | 20800 | 124.4064 | 0.2523 |
232.2647 | 23.54 | 21000 | 123.6555 | 0.2525 |
### Framework versions
- Transformers 4.23.1
- Pytorch 1.12.1
- Datasets 2.6.1
- Tokenizers 0.13.1