DSPFirst-Finetuning-1
This model is a fine-tuned version of ahotrod/electra_large_discriminator_squad2_512, trained on a question-answering dataset generated from the DSPFirst textbook and formatted to follow SQuAD 2.0.
Dataset
A visualization of the dataset can be found here. The data is split 80% for training and 20% for testing.
DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 4755
    })
    test: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 1189
    })
})
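The feature list above matches the SQuAD 2.0 layout, where `answers` holds parallel `text` and `answer_start` lists and an unanswerable question has both lists empty. The following is a minimal sketch in plain Python of what such rows might look like and how an 80/20 split could be cut. The sample text, the `is_consistent` and `split_80_20` helpers, and the seed are illustrative assumptions, not the code used to build this dataset.

```python
import random

# Hypothetical context and answer, made up purely for illustration.
context = "A sinusoid is defined by its amplitude, frequency, and phase."
answer = "its amplitude, frequency, and phase"

# An answerable row: answer_start is the character offset of the answer in context.
answerable = {
    "id": "example-0001",
    "title": "Sinusoids",
    "context": context,
    "question": "What defines a sinusoid?",
    "answers": {"text": [answer], "answer_start": [context.index(answer)]},
}

# An unanswerable row (SQuAD 2.0 style): both answer lists are empty.
unanswerable = {
    "id": "example-0002",
    "title": "Sinusoids",
    "context": context,
    "question": "Who invented the sinusoid?",
    "answers": {"text": [], "answer_start": []},
}


def is_consistent(row):
    """Return True if every answer text appears in the context at its stated offset."""
    ctx = row["context"]
    answers = row["answers"]
    return all(
        ctx[start:start + len(text)] == text
        for text, start in zip(answers["text"], answers["answer_start"])
    )


def split_80_20(rows, seed=42):
    """Shuffle a copy of rows and cut it into 80% train and 20% test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


train, test = split_80_20(range(4755 + 1189))
print(len(train), len(test))
```

Applied to the 5944 rows implied by the counts above, this cut yields 4755 training rows and 1189 test rows.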
The model achieves the following results on the evaluation set:
- Loss: 0.9236
Model description
More information needed
Intended uses & limitations
Since the dataset is generated from the DSPFirst textbook, its quality is not guaranteed.
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 6
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 86
- total_train_batch_size: 516
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
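Two of these values are related by simple arithmetic: the total train batch size is the per-step batch size multiplied by the number of gradient accumulation steps. The sketch below checks that relationship and illustrates what a linear schedule does to the learning rate. It assumes a single device, no warmup, and a hypothetical total of 112 optimizer steps (extrapolated from the roughly 28 steps per epoch visible in the results table); none of these three assumptions is stated in this card.

```python
learning_rate = 2e-05
train_batch_size = 6
gradient_accumulation_steps = 86

# Examples contributing to each optimizer update (single device assumed).
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def linear_lr(step, total_steps, base_lr=learning_rate, warmup_steps=0):
    """Linear warmup from 0 to base_lr, then linear decay from base_lr to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length; the card does not state the total number of steps.
assumed_total_steps = 112

for step in (0, 56, 112):
    print(step, linear_lr(step, assumed_total_steps))
```

Under these assumptions the learning rate starts at 2e-05, reaches half that value midway through training, and falls to zero at the final step.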
Model hyperparameters
- hidden_dropout_prob: 0.5
- attention_probs_dropout_prob: 0.5
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
6.0131 | 0.7 | 20 | 0.9549 |
6.1542 | 1.42 | 40 | 0.9302 |
6.1472 | 2.14 | 60 | 0.9249 |
5.9662 | 2.84 | 80 | 0.9248 |
6.1467 | 3.56 | 100 | 0.9236 |
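To make the trend in the table easier to read, the short sketch below computes the change in validation loss between consecutive checkpoints and picks the checkpoint with the lowest value. The rows are copied from the table; the variable names are illustrative.

```python
# (training_loss, epoch, step, validation_loss) rows copied from the table above.
rows = [
    (6.0131, 0.70, 20, 0.9549),
    (6.1542, 1.42, 40, 0.9302),
    (6.1472, 2.14, 60, 0.9249),
    (5.9662, 2.84, 80, 0.9248),
    (6.1467, 3.56, 100, 0.9236),
]

val_losses = [row[3] for row in rows]

# Change in validation loss from one logged checkpoint to the next.
deltas = [round(later - earlier, 4) for earlier, later in zip(val_losses, val_losses[1:])]

# Step of the checkpoint with the lowest validation loss.
best_step = min(rows, key=lambda row: row[3])[2]

print(deltas)
print(best_step)
```

Validation loss falls at every logged checkpoint, with most of the improvement in the first 40 steps. The lowest value, 0.9236 at step 100, matches the evaluation loss reported earlier in this card.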
Framework versions
- Transformers 4.18.0
- Pytorch 1.10.0+cu111
- Datasets 2.1.0
- Tokenizers 0.12.1