xlnet-base-cased_fold_3_binary
This model is a fine-tuned version of xlnet-base-cased. The training dataset is not specified. It achieves the following results on the evaluation set, which match the epoch 9 row (step 2601) of the training results table:
- Loss: 1.3616
- F1: 0.7758
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 25
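The hyperparameters above, together with the step counts logged in the training results, are enough to sketch the learning-rate curve. The snippet below is a minimal illustration, not the Trainer's own code. It assumes zero warmup steps, a single device, and no gradient accumulation, none of which the card states. The names `linear_lr` and `max_training_examples` are hypothetical helpers introduced only for this example.

```python
# Values taken from the hyperparameter list and the training results table.
BASE_LR = 2e-05
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 25
STEPS_PER_EPOCH = 289  # step count logged at epoch 1.0
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 7225, the step logged at epoch 25.0

# Assumption: no warmup steps (the card does not list any).
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


def max_training_examples() -> int:
    """Upper bound on the training-set size.

    Assumption: one device and no gradient accumulation, so each step
    consumes at most TRAIN_BATCH_SIZE examples.
    """
    return STEPS_PER_EPOCH * TRAIN_BATCH_SIZE


for epoch in (0, 9, 25):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2} (step {step:>4}): lr = {linear_lr(step):.3e}")
print("training examples <=", max_training_examples())
```

Under these assumptions the rate decays from 2e-05 at step 0 to zero at step 7225, so the later epochs train with a much smaller step size than the early ones.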
Training results
Training Loss | Epoch | Step | Validation Loss | F1 |
---|---|---|---|---|
No log | 1.0 | 289 | 0.4668 | 0.7666 |
0.4142 | 2.0 | 578 | 0.4259 | 0.7631 |
0.4142 | 3.0 | 867 | 0.6744 | 0.7492 |
0.235 | 4.0 | 1156 | 0.8879 | 0.7678 |
0.235 | 5.0 | 1445 | 1.0036 | 0.7639 |
0.1297 | 6.0 | 1734 | 1.1427 | 0.7616 |
0.0894 | 7.0 | 2023 | 1.2126 | 0.7626 |
0.0894 | 8.0 | 2312 | 1.5098 | 0.7433 |
0.0473 | 9.0 | 2601 | 1.3616 | 0.7758 |
0.0473 | 10.0 | 2890 | 1.5966 | 0.7579 |
0.0325 | 11.0 | 3179 | 1.6669 | 0.7508 |
0.0325 | 12.0 | 3468 | 1.7401 | 0.7437 |
0.0227 | 13.0 | 3757 | 1.7797 | 0.7515 |
0.0224 | 14.0 | 4046 | 1.7349 | 0.7418 |
0.0224 | 15.0 | 4335 | 1.7527 | 0.7595 |
0.0152 | 16.0 | 4624 | 1.7492 | 0.7634 |
0.0152 | 17.0 | 4913 | 1.8178 | 0.7628 |
0.0117 | 18.0 | 5202 | 1.7736 | 0.7688 |
0.0117 | 19.0 | 5491 | 1.8449 | 0.7704 |
0.0055 | 20.0 | 5780 | 1.8687 | 0.7652 |
0.0065 | 21.0 | 6069 | 1.8083 | 0.7669 |
0.0065 | 22.0 | 6358 | 1.8568 | 0.7559 |
0.0054 | 23.0 | 6647 | 1.8760 | 0.7678 |
0.0054 | 24.0 | 6936 | 1.8948 | 0.7697 |
0.0048 | 25.0 | 7225 | 1.9109 | 0.7680 |
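The table can be checked programmatically to see which epoch the reported evaluation results come from. The sketch below copies the validation loss and F1 columns from the table and picks the best epoch under each criterion. It suggests, though the card does not confirm it, that the final checkpoint was selected by F1 rather than by validation loss. The names `EpochResult` and `RESULTS` are hypothetical and exist only for this example.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    f1: float


# (validation loss, F1) per epoch, copied from the training results table.
_METRICS = [
    (0.4668, 0.7666), (0.4259, 0.7631), (0.6744, 0.7492), (0.8879, 0.7678),
    (1.0036, 0.7639), (1.1427, 0.7616), (1.2126, 0.7626), (1.5098, 0.7433),
    (1.3616, 0.7758), (1.5966, 0.7579), (1.6669, 0.7508), (1.7401, 0.7437),
    (1.7797, 0.7515), (1.7349, 0.7418), (1.7527, 0.7595), (1.7492, 0.7634),
    (1.8178, 0.7628), (1.7736, 0.7688), (1.8449, 0.7704), (1.8687, 0.7652),
    (1.8083, 0.7669), (1.8568, 0.7559), (1.8760, 0.7678), (1.8948, 0.7697),
    (1.9109, 0.7680),
]

STEPS_PER_EPOCH = 289  # every row in the table is a multiple of this step count

RESULTS = [
    EpochResult(epoch=i, step=i * STEPS_PER_EPOCH, val_loss=loss, f1=f1)
    for i, (loss, f1) in enumerate(_METRICS, start=1)
]

best_by_f1 = max(RESULTS, key=lambda r: r.f1)
best_by_loss = min(RESULTS, key=lambda r: r.val_loss)

print("highest F1:            ", best_by_f1)
print("lowest validation loss:", best_by_loss)
```

The highest F1 falls at epoch 9, which matches the headline Loss and F1 figures, while the lowest validation loss falls at epoch 2. Validation loss rises almost steadily after that point while training loss keeps falling, a pattern usually read as overfitting, even though F1 stays within a narrow band.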
Framework versions
- Transformers 4.21.0
- PyTorch 1.12.0+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1