libri-smallw2v2-no-copy-kl-alpha-0.75-T-1-take-3
This model was trained from scratch on an unspecified dataset (the Trainer recorded the dataset name as None). It achieves the following results on the evaluation set:
- Loss: 233.4488
- Wer: 0.3266
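The card does not say how Wer was computed. As a rough illustration only, the sketch below uses the standard definition of word error rate: the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example, not details taken from the training run.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; tokenization is plain whitespace
    splitting, which may differ from what the original evaluation used.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Under this definition, a Wer of 0.3266 corresponds to roughly one word error for every three reference words. Corpus-level scores are usually obtained by summing errors over all utterances and dividing by the total reference word count, rather than averaging per-utterance rates.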
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 100
- mixed_precision_training: Native AMP
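Two of the values above follow from the others, and a minimal sketch may make the relationship clearer. It assumes a single device (which is consistent with 32 × 2 = 64) and models the linear schedule as a ramp from zero to the peak learning rate over the warmup fraction, followed by a linear decay to zero. The total step count of 35,700 is an estimate based on the results table (400 steps ≈ 1.12 epochs, so about 357 steps per epoch over 100 epochs), not a value stated in the card, and the helper names are hypothetical.

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step: int, base_lr: float, total_steps: int,
              warmup_steps: int) -> float:
    """Linear warmup to base_lr, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Values from the hyperparameter list.
BASE_LR = 0.0002
WARMUP_RATIO = 0.2

# Assumption: estimated from the results table, not stated in the card.
TOTAL_STEPS = 35_700
# Assumption: simple rounding; a library may round the warmup length differently.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)

if __name__ == "__main__":
    print(effective_batch_size(32, 2))
    for step in (0, WARMUP_STEPS // 2, WARMUP_STEPS, TOTAL_STEPS):
        print(step, linear_lr(step, BASE_LR, TOTAL_STEPS, WARMUP_STEPS))
```

Under these assumptions, the learning rate peaks at 0.0002 about one fifth of the way through the scheduled run and then declines for the remainder.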
Training results
Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
344.6087 | 1.12 | 400 | 259.4728 | 0.4553 |
331.2433 | 2.24 | 800 | 249.7617 | 0.4505 |
324.318 | 3.36 | 1200 | 243.3380 | 0.4476 |
318.8272 | 4.48 | 1600 | 245.5005 | 0.4397 |
314.193 | 5.6 | 2000 | 245.8428 | 0.4389 |
310.3951 | 6.72 | 2400 | 249.0932 | 0.4358 |
307.6068 | 7.84 | 2800 | 248.7932 | 0.4386 |
308.3191 | 8.96 | 3200 | 242.1969 | 0.4309 |
302.0043 | 10.08 | 3600 | 251.7498 | 0.4271 |
296.6045 | 11.2 | 4000 | 246.3982 | 0.4272 |
295.0746 | 12.32 | 4400 | 249.5031 | 0.4201 |
294.2732 | 13.45 | 4800 | 247.5096 | 0.4167 |
290.1665 | 14.57 | 5200 | 247.6531 | 0.4227 |
293.2169 | 15.69 | 5600 | 246.6296 | 0.4150 |
287.6487 | 16.81 | 6000 | 244.2763 | 0.4132 |
287.8109 | 17.93 | 6400 | 243.7672 | 0.4116 |
282.7126 | 19.05 | 6800 | 241.8889 | 0.4073 |
280.5111 | 20.17 | 7200 | 251.7473 | 0.4015 |
276.0679 | 21.29 | 7600 | 242.1010 | 0.3990 |
275.7184 | 22.41 | 8000 | 244.3330 | 0.3966 |
273.0371 | 23.53 | 8400 | 240.5063 | 0.3908 |
268.2875 | 24.65 | 8800 | 241.2827 | 0.3916 |
262.7938 | 25.77 | 9200 | 236.9669 | 0.3870 |
262.0252 | 26.89 | 9600 | 238.0447 | 0.3837 |
256.3527 | 28.01 | 10000 | 231.9627 | 0.3777 |
253.4488 | 29.13 | 10400 | 241.3886 | 0.3778 |
251.0047 | 30.25 | 10800 | 239.3254 | 0.3716 |
247.1357 | 31.37 | 11200 | 234.5317 | 0.3743 |
245.7466 | 32.49 | 11600 | 237.2660 | 0.3732 |
240.7763 | 33.61 | 12000 | 234.2133 | 0.3719 |
240.4568 | 34.73 | 12400 | 233.6447 | 0.3652 |
236.5008 | 35.85 | 12800 | 231.5317 | 0.3634 |
234.3844 | 36.97 | 13200 | 235.9599 | 0.3667 |
234.8658 | 38.1 | 13600 | 235.8033 | 0.3629 |
229.0455 | 39.22 | 14000 | 235.2613 | 0.3597 |
228.1712 | 40.34 | 14400 | 237.7952 | 0.3559 |
225.4442 | 41.46 | 14800 | 237.6732 | 0.3553 |
223.492 | 42.58 | 15200 | 228.6896 | 0.3549 |
222.2095 | 43.7 | 15600 | 233.7846 | 0.3528 |
221.3752 | 44.82 | 16000 | 235.6401 | 0.3503 |
220.0048 | 45.94 | 16400 | 236.2913 | 0.3486 |
214.734 | 47.06 | 16800 | 233.7592 | 0.3452 |
213.6554 | 48.18 | 17200 | 233.3319 | 0.3468 |
212.7388 | 49.3 | 17600 | 232.7798 | 0.3447 |
210.9421 | 50.42 | 18000 | 239.8152 | 0.3483 |
211.6293 | 51.54 | 18400 | 235.0050 | 0.3450 |
209.8978 | 52.66 | 18800 | 235.3156 | 0.3453 |
207.996 | 53.78 | 19200 | 233.3227 | 0.3429 |
206.4369 | 54.9 | 19600 | 231.2948 | 0.3395 |
202.3726 | 56.02 | 20000 | 226.5554 | 0.3387 |
201.5557 | 57.14 | 20400 | 236.6525 | 0.3413 |
203.2557 | 58.26 | 20800 | 231.1979 | 0.3387 |
200.6613 | 59.38 | 21200 | 232.5989 | 0.3361 |
199.9518 | 60.5 | 21600 | 233.6743 | 0.3368 |
198.7427 | 61.62 | 22000 | 236.3777 | 0.3365 |
195.5101 | 62.75 | 22400 | 230.0184 | 0.3354 |
195.4992 | 63.87 | 22800 | 229.9954 | 0.3332 |
193.023 | 64.99 | 23200 | 233.4538 | 0.3343 |
195.5294 | 66.11 | 23600 | 235.9190 | 0.3328 |
193.4176 | 67.23 | 24000 | 235.0465 | 0.3316 |
191.4303 | 68.35 | 24400 | 234.2414 | 0.3331 |
190.1889 | 69.47 | 24800 | 234.3200 | 0.3306 |
188.0727 | 70.59 | 25200 | 231.8731 | 0.3288 |
189.5906 | 71.71 | 25600 | 234.2662 | 0.3297 |
187.4333 | 72.83 | 26000 | 234.4295 | 0.3298 |
188.8704 | 73.95 | 26400 | 234.3622 | 0.3272 |
186.7061 | 75.07 | 26800 | 233.1743 | 0.3257 |
185.7288 | 76.19 | 27200 | 233.8410 | 0.3255 |
184.4937 | 77.31 | 27600 | 230.8933 | 0.3249 |
183.4367 | 78.43 | 28000 | 233.5291 | 0.3274 |
184.048 | 79.55 | 28400 | 233.0128 | 0.3274 |
184.0794 | 80.67 | 28800 | 232.1064 | 0.3259 |
181.9872 | 81.79 | 29200 | 234.1832 | 0.3259 |
181.738 | 82.91 | 29600 | 233.4488 | 0.3266 |
Framework versions
- Transformers 4.25.1
- Pytorch 1.12.1
- Datasets 2.7.1
- Tokenizers 0.11.0