# TimShieh/bert-base-cased-finetuned-semeval2017-MLM-tf
This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on an unknown dataset. It achieves the following results on the evaluation set (values from the final epoch):
- Train Loss: 1.3086
- Validation Loss: 2.3846
- Epoch: 98
## Model description
More information needed
## Intended uses & limitations
More information needed
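As a minimal sketch, assuming the checkpoint is published under the Hub id in the title and exposes a masked-language-modeling head (as the `MLM` suffix suggests), it could be tried with the `fill-mask` pipeline:

```python
from transformers import pipeline

# Hypothetical usage; the Hub id below is taken from the card title.
fill_mask = pipeline(
    "fill-mask",
    model="TimShieh/bert-base-cased-finetuned-semeval2017-MLM-tf",
    framework="tf",  # the checkpoint was trained with TensorFlow/Keras
)

# BERT-cased uses [MASK] as its mask token.
print(fill_mask("The conference was held in [MASK] this year."))
```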
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: AdamWeightDecay (beta_1=0.9, beta_2=0.999, epsilon=1e-08, weight_decay_rate=0.01), wrapped in a dynamic loss-scale optimizer (initial_scale=32768.0, dynamic_growth_steps=2000)
- learning rate schedule: linear warmup over 1000 steps to 2e-05, followed by PolynomialDecay (decay_steps=3500, end_learning_rate=0.0, power=1.0)
- training_precision: mixed_float16
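For reference, a minimal sketch of how an equivalent optimizer and precision policy could be set up with `transformers.create_optimizer` is shown below. The total step count (1000 warmup + 3500 decay = 4500 steps) is inferred from the schedule config above and is an assumption, not something stated in the card.

```python
import tensorflow as tf
from transformers import create_optimizer

# Corresponds to training_precision: mixed_float16. When the model is compiled
# under this policy, Keras wraps the optimizer in a dynamic LossScaleOptimizer
# (initial_scale=32768 and growth every 2000 steps are the Keras defaults,
# matching the config above).
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Assumed 4500 total steps = 1000 warmup steps + 3500 polynomial-decay steps.
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=4500,
    num_warmup_steps=1000,
    weight_decay_rate=0.01,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```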
### Training results
Train Loss | Validation Loss | Epoch |
---|---|---|
2.8433 | 2.6662 | 0 |
2.7598 | 2.5861 | 1 |
2.7534 | 2.5040 | 2 |
2.7240 | 2.4523 | 3 |
2.6318 | 2.5183 | 4 |
2.5957 | 2.4422 | 5 |
2.5625 | 2.4058 | 6 |
2.5154 | 2.3935 | 7 |
2.4640 | 2.3379 | 8 |
2.4819 | 2.3405 | 9 |
2.4415 | 2.3790 | 10 |
2.3893 | 2.3233 | 11 |
2.3388 | 2.3790 | 12 |
2.3699 | 2.3036 | 13 |
2.2846 | 2.2498 | 14 |
2.3063 | 2.2773 | 15 |
2.2476 | 2.2761 | 16 |
2.1773 | 2.2033 | 17 |
2.2143 | 2.2317 | 18 |
2.2070 | 2.2894 | 19 |
2.1538 | 2.2133 | 20 |
2.0956 | 2.3578 | 21 |
2.1071 | 2.2503 | 22 |
2.0790 | 2.3071 | 23 |
2.0288 | 2.3034 | 24 |
2.0533 | 2.3170 | 25 |
2.0120 | 2.2095 | 26 |
1.9467 | 2.2245 | 27 |
1.9542 | 2.2078 | 28 |
1.9432 | 2.3161 | 29 |
1.9461 | 2.3124 | 30 |
1.9065 | 2.2365 | 31 |
1.8745 | 2.1271 | 32 |
1.8496 | 2.3026 | 33 |
1.8842 | 2.3762 | 34 |
1.8175 | 2.2070 | 35 |
1.8243 | 2.2866 | 36 |
1.7952 | 2.2107 | 37 |
1.7777 | 2.3179 | 38 |
1.7657 | 2.2338 | 39 |
1.7292 | 2.2267 | 40 |
1.7378 | 2.2141 | 41 |
1.6489 | 2.2372 | 42 |
1.6734 | 2.2508 | 43 |
1.6657 | 2.2625 | 44 |
1.6595 | 2.3270 | 45 |
1.6092 | 2.2561 | 46 |
1.5995 | 2.1516 | 47 |
1.6329 | 2.3223 | 48 |
1.6295 | 2.3140 | 49 |
1.5856 | 2.2300 | 50 |
1.6024 | 2.2130 | 51 |
1.5409 | 2.1686 | 52 |
1.5413 | 2.2330 | 53 |
1.5623 | 2.3331 | 54 |
1.5061 | 2.2902 | 55 |
1.5053 | 2.3790 | 56 |
1.5196 | 2.2583 | 57 |
1.4747 | 2.2444 | 58 |
1.5104 | 2.3633 | 59 |
1.4382 | 2.3433 | 60 |
1.4945 | 2.2789 | 61 |
1.4732 | 2.2114 | 62 |
1.4642 | 2.2824 | 63 |
1.4017 | 2.3451 | 64 |
1.4142 | 2.2902 | 65 |
1.3868 | 2.3788 | 66 |
1.4233 | 2.2854 | 67 |
1.4314 | 2.4351 | 68 |
1.3653 | 2.2845 | 69 |
1.4202 | 2.2994 | 70 |
1.3817 | 2.2924 | 71 |
1.3771 | 2.3026 | 72 |
1.3795 | 2.2778 | 73 |
1.3843 | 2.3807 | 74 |
1.3457 | 2.3068 | 75 |
1.3408 | 2.3956 | 76 |
1.3737 | 2.2845 | 77 |
1.3277 | 2.3673 | 78 |
1.3330 | 2.3530 | 79 |
1.2938 | 2.2045 | 80 |
1.3352 | 2.3202 | 81 |
1.3095 | 2.3502 | 82 |
1.2982 | 2.2561 | 83 |
1.3452 | 2.2632 | 84 |
1.2995 | 2.3359 | 85 |
1.3235 | 2.2482 | 86 |
1.3153 | 2.3753 | 87 |
1.3092 | 2.3334 | 88 |
1.3178 | 2.4354 | 89 |
1.3300 | 2.3763 | 90 |
1.3079 | 2.3011 | 91 |
1.2800 | 2.3430 | 92 |
1.3166 | 2.2550 | 93 |
1.2893 | 2.3375 | 94 |
1.2898 | 2.3382 | 95 |
1.3280 | 2.4488 | 96 |
1.3429 | 2.3147 | 97 |
1.3086 | 2.3846 | 98 |
### Framework versions
- Transformers 4.26.1
- TensorFlow 2.11.0
- Datasets 2.10.1
- Tokenizers 0.13.2