# film95000roberta-base

## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 14840
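With `lr_scheduler_type: linear` and 500 warmup steps, the learning rate ramps from 0 up to 2e-4 over the first 500 steps, then decays linearly back to 0 by step 14840. A minimal plain-Python sketch of the schedule these settings imply (not the Trainer's actual implementation):

```python
LEARNING_RATE = 2e-4   # learning_rate
WARMUP_STEPS = 500     # lr_scheduler_warmup_steps
TOTAL_STEPS = 14840    # training_steps

def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup to the
    peak rate, then linear decay to zero at the final training step."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    return LEARNING_RATE * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```

For example, the schedule peaks at exactly 2e-4 on step 500 and reaches 0 on the last step.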
### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 2.5804        | 0.34  | 500   | 2.4511          |
| 2.5301        | 0.67  | 1000  | 2.4079          |
| 2.4702        | 1.01  | 1500  | 2.3465          |
| 2.4039        | 1.35  | 2000  | 2.2871          |
| 2.3628        | 1.68  | 2500  | 2.2480          |
| 2.3216        | 2.02  | 3000  | 2.2253          |
| 2.259         | 2.36  | 3500  | 2.1989          |
| 2.2392        | 2.69  | 4000  | 2.1679          |
| 2.2156        | 3.03  | 4500  | 2.1489          |
| 2.17          | 3.37  | 5000  | 2.1207          |
| 2.1497        | 3.7   | 5500  | 2.1003          |
| 2.1281        | 4.04  | 6000  | 2.0753          |
| 2.0873        | 4.38  | 6500  | 2.0626          |
| 2.0658        | 4.71  | 7000  | 2.0411          |
| 2.0446        | 5.05  | 7500  | 2.0332          |
| 2.0091        | 5.39  | 8000  | 2.0082          |
| 1.9974        | 5.72  | 8500  | 1.9966          |
| 1.9802        | 6.06  | 9000  | 1.9752          |
| 1.9498        | 6.4   | 9500  | 1.9578          |
| 1.9426        | 6.73  | 10000 | 1.9451          |
| 1.9199        | 7.07  | 10500 | 1.9226          |
| 1.8933        | 7.41  | 11000 | 1.9161          |
| 1.8836        | 7.74  | 11500 | 1.8952          |
| 1.8625        | 8.08  | 12000 | 1.8846          |
| 1.8405        | 8.42  | 12500 | 1.8810          |
| 1.8311        | 8.75  | 13000 | 1.8703          |
| 1.8187        | 9.09  | 13500 | 1.8634          |
| 1.804         | 9.43  | 14000 | 1.8441          |
| 1.7908        | 9.76  | 14500 | 1.8436          |
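Assuming the reported validation loss is a cross-entropy in nats (as for RoBERTa-style language-model fine-tuning), the final checkpoint's perplexity can be recovered as `exp(loss)`:

```python
import math

# Final validation loss at step 14500 (from the results table).
final_val_loss = 1.8436

# Perplexity is exp(cross-entropy loss); assumes the loss is per-token
# cross-entropy in nats, which the card does not state explicitly.
perplexity = math.exp(final_val_loss)
print(round(perplexity, 2))
```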
### Framework versions
- Transformers 4.27.3
- Pytorch 1.13.1+cu116
- Datasets 2.10.1
- Tokenizers 0.13.2