# distilroberta-base-NER-ind
This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.1396
- F1: 0.8126
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5.750420024069848e-05
- train_batch_size: 16
- eval_batch_size: 48
- seed: 15
- gradient_accumulation_steps: 3
- total_train_batch_size: 48
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
- mixed_precision_training: Native AMP
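
The reported `total_train_batch_size` of 48 follows from the per-device batch size and the gradient accumulation steps. A minimal sketch of that arithmetic, assuming single-device training (values taken from the list above):

```python
# Effective train batch size = per-device batch size x gradient accumulation steps.
# Values come from the hyperparameter list above; single-device training is assumed.
train_batch_size = 16
gradient_accumulation_steps = 3

total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 48, matching the reported total_train_batch_size
```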
### Training results

| Training Loss | Epoch | Step  | Validation Loss | F1     |
|:-------------:|:-----:|:-----:|:---------------:|:------:|
| 0.1765        | 0.2   | 1000  | 0.1597          | 0.7715 |
| 0.1629        | 0.4   | 2000  | 0.1552          | 0.7767 |
| 0.1511        | 0.6   | 3000  | 0.1458          | 0.7906 |
| 0.1497        | 0.8   | 4000  | 0.1416          | 0.7847 |
| 0.1455        | 1.0   | 5000  | 0.1416          | 0.7971 |
| 0.1264        | 1.2   | 6000  | 0.1361          | 0.8024 |
| 0.1235        | 1.39  | 7000  | 0.1356          | 0.8050 |
| 0.1249        | 1.59  | 8000  | 0.1381          | 0.8053 |
| 0.1267        | 1.79  | 9000  | 0.1368          | 0.8059 |
| 0.1241        | 1.99  | 10000 | 0.1366          | 0.8040 |
| 0.1046        | 2.19  | 11000 | 0.1354          | 0.8090 |
| 0.1043        | 2.39  | 12000 | 0.1365          | 0.8118 |
| 0.1069        | 2.59  | 13000 | 0.1349          | 0.8107 |
| 0.1032        | 2.79  | 14000 | 0.1359          | 0.8113 |
| 0.1055        | 2.99  | 15000 | 0.1337          | 0.8091 |
| 0.0909        | 3.19  | 16000 | 0.1393          | 0.8137 |
| 0.0908        | 3.39  | 17000 | 0.1399          | 0.8124 |
| 0.0903        | 3.59  | 18000 | 0.1390          | 0.8118 |
| 0.089         | 3.79  | 19000 | 0.1397          | 0.8128 |
| 0.0878        | 3.98  | 20000 | 0.1396          | 0.8126 |
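
With the `linear` scheduler, the learning rate decays from its peak toward zero over the run. A minimal sketch of that schedule, assuming no warmup and taking the roughly 20,000 total optimizer steps visible in the final row of the table above (an assumption, not a logged value):

```python
# Linear learning-rate decay from the peak rate to 0, assuming no warmup.
# peak_lr is the learning_rate hyperparameter above; total_steps is read off
# the last row of the training-results table (an assumption, not a logged value).
peak_lr = 5.750420024069848e-05
total_steps = 20000

def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    return peak_lr * max(0.0, (total_steps - step) / total_steps)

print(linear_lr(0))      # peak learning rate at the first step
print(linear_lr(10000))  # half the peak at the halfway point
```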
### Framework versions
- Transformers 4.28.1
- Pytorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3