Hyperparameters:
- learning rate: 2e-5
- weight decay: 0.01
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 1
- eval steps: 50000
- max_length: 512
- num_epochs: 1
- hidden_dropout_prob: 0.3
- attention_probs_dropout_prob: 0.25
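As a rough sketch, the values above can be collected in a plain dictionary and used to work out how many samples contribute to each optimizer step. The key names `learning_rate`, `weight_decay` and `eval_steps` are normalized spellings of the list entries, and `num_devices` is a hypothetical parameter, since the number of devices used for training is not stated here.

```python
# Hyperparameters copied from the list above (key spellings normalized).
hyperparameters = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "gradient_accumulation_steps": 1,
    "eval_steps": 50_000,
    "max_length": 512,
    "num_epochs": 1,
    "hidden_dropout_prob": 0.3,
    "attention_probs_dropout_prob": 0.25,
}


def samples_per_optimizer_step(hp, num_devices=1):
    """Return the number of samples consumed per optimizer update.

    num_devices is an assumption: the device count is not given
    in this document, so it defaults to a single device.
    """
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


# With gradient_accumulation_steps = 1, this is just the per-device batch size.
print(samples_per_optimizer_step(hyperparameters))  # 16 on a single device
```

With no gradient accumulation, each optimizer step sees 16 samples per device, so an evaluation interval of 50000 steps covers 800000 samples per device.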
Dataset version:
- taskydata/tasky_or_not/v_1
Checkpoint:
- 455000 steps.
Results on the validation set:
| Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|---|
| 50000 | 0.0148 | 0.10890 | 0.9798 | 0.9755 | 0.9843 | 0.9799 |
| 100000 | 0.0121 | 0.09090 | 0.9863 | 0.9958 | 0.9767 | 0.9862 |
| 150000 | 0.0080 | 0.11800 | 0.9863 | 0.9779 | 0.9950 | 0.9864 |
| 200000 | 0.0116 | 0.08965 | 0.9877 | 0.9905 | 0.9848 | 0.9876 |
| 250000 | 0.0073 | 3.50100 | 0.6507 | 0.5905 | 0.9830 | 0.7378 |
| 300000 | 0.0072 | 0.09807 | 0.9850 | 0.9863 | 0.9870 | 0.9849 |
| 350000 | 0.0053 | 0.09830 | 0.9854 | 0.9939 | 0.9870 | 0.9852 |
| 400000 | 0.0046 | 0.08130 | 0.9893 | 0.9957 | 0.9828 | 0.9892 |
| 450000 | 0.0054 | 0.61280 | 0.9095 | 0.5835 | 0.9888 | 0.9162 |
| 455000 | 0.0055 | 0.15790 | 0.9710 | 0.9561 | 0.9874 | 0.9715 |

Uploaded checkpoint:
- 400000
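The criterion used to choose the uploaded checkpoint is not stated. A minimal sketch, assuming the choice is made by lowest validation loss or highest F1, shows that both rules point to the same step in the results table. The function names are illustrative only.

```python
# (step, validation_loss, accuracy, f1) copied from the results table.
results = [
    (50_000, 0.10890, 0.9798, 0.9799),
    (100_000, 0.09090, 0.9863, 0.9862),
    (150_000, 0.11800, 0.9863, 0.9864),
    (200_000, 0.08965, 0.9877, 0.9876),
    (250_000, 3.50100, 0.6507, 0.7378),
    (300_000, 0.09807, 0.9850, 0.9849),
    (350_000, 0.09830, 0.9854, 0.9852),
    (400_000, 0.08130, 0.9893, 0.9892),
    (450_000, 0.61280, 0.9095, 0.9162),
    (455_000, 0.15790, 0.9710, 0.9715),
]


def best_by_validation_loss(rows):
    """Return the step with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])[0]


def best_by_f1(rows):
    """Return the step with the highest F1 score."""
    return max(rows, key=lambda row: row[3])[0]


print(best_by_validation_loss(results))  # 400000
print(best_by_f1(results))               # 400000
```

Both rules skip the final 455000-step checkpoint, whose validation loss is about twice that of the 400000-step one.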