# german-jeopardy-longt5-base
This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.8533
- Brevity Penalty: 0.8910 (see the check after this list)
- System Length: 18642
- Reference Length: 20793
- ROUGE-1: 35.31
- ROUGE-2: 16.35
- ROUGE-L: 33.91
- ROUGE-Lsum: 33.96
- Exact Match: 1.36
- BLEU: 10.80
- F1: 34.41
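The brevity penalty reported above follows directly from the system and reference lengths via the standard BLEU definition; a quick check:

```python
# Standard BLEU brevity penalty: BP = exp(1 - reference_length / system_length)
# when the system output is shorter than the reference.
import math

system_length = 18642
reference_length = 20793
bp = math.exp(1 - reference_length / system_length)
print(f"{bp:.4f}")  # 0.8910
```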
## Model description
See [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) for more information about the model architecture.
The model was trained on a single NVIDIA RTX 3090 GPU with 24GB of VRAM.
## Intended uses & limitations
This model can be used for question generation on German text.
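A minimal inference sketch is shown below. The repository id is a placeholder, and the `generate question:` prefix and `<hl>` answer markers are assumptions borrowed from common lmqg-style question-generation fine-tunes; adjust them to match the input format actually used during training.

```python
# Minimal inference sketch; the repository id, task prefix, and <hl> markers
# are assumptions and should be checked against the actual training setup.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "<user>/german-jeopardy-longt5-base"  # placeholder repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# German context with the expected answer span marked by <hl> tokens.
text = (
    "generate question: Das Brandenburger Tor steht in <hl> Berlin <hl> "
    "und wurde Ende des 18. Jahrhunderts fertiggestellt."
)

inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```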
## Training and evaluation data
See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).
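The dataset can be loaded with the `datasets` library; the field names below follow the published lmqg question-generation schema and should be verified against the dataset card.

```python
# Load the German question-generation dataset used for fine-tuning.
from datasets import load_dataset

dataset = load_dataset("lmqg/qg_dequad")
print(dataset)  # train / validation / test splits

example = dataset["train"][0]
print(example["question"])          # target question (field name assumed)
print(example["paragraph_answer"])  # paragraph with the answer highlighted (field name assumed)
```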
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (an illustrative `Seq2SeqTrainingArguments` sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 4
- seed: 7
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
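For illustration only, the values above map onto `Seq2SeqTrainingArguments` roughly as follows; this is a sketch reconstructed from the listed hyperparameters, not the original training script.

```python
# Rough reconstruction of the listed hyperparameters; output_dir and any
# argument not listed above are placeholders, not the original configuration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="german-jeopardy-longt5-base",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,  # effective train batch size: 8 * 8 = 64
    num_train_epochs=20,
    seed=7,
    optim="adafactor",
    lr_scheduler_type="constant",
)
```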
### Training results
Training Loss | Epoch | Step | BLEU | Brevity Penalty | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Exact Match | F1 | Gen Len | Validation Loss | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | System Length | Totals 1 | Totals 2 | Totals 3 | Totals 4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3.1671 | 1.0 | 145 | 5.9441 | 0.7156 | 6177 | 1669 | 604 | 179 | 0.0023 | 0.2528 | 12.0218 | 2.1902 | 38.7954 | 12.1665 | 5.2458 | 1.9227 | 21250 | 0.2595 | 0.1035 | 0.2491 | 0.2492 | 15922 | 15922 | 13718 | 11514 | 9310 |
2.5597 | 2.0 | 291 | 7.7787 | 0.7556 | 6785 | 2044 | 804 | 293 | 0.0064 | 0.2864 | 12.6084 | 2.0164 | 40.876 | 14.1994 | 6.595 | 2.9338 | 21250 | 0.2931 | 0.1291 | 0.2817 | 0.2818 | 16599 | 16599 | 14395 | 12191 | 9987 |
2.3464 | 2.99 | 436 | 9.2407 | 0.7935 | 7251 | 2326 | 969 | 400 | 0.0073 | 0.3114 | 13.2296 | 1.9138 | 42.0129 | 15.45 | 7.5403 | 3.7569 | 21250 | 0.3162 | 0.1456 | 0.3031 | 0.3031 | 17259 | 17259 | 15055 | 12851 | 10647 |
2.1679 | 4.0 | 582 | 9.6363 | 0.7795 | 7382 | 2393 | 1006 | 434 | 0.0109 | 0.3226 | 13.1207 | 1.8524 | 43.3903 | 16.1591 | 7.981 | 4.1727 | 21250 | 0.3272 | 0.1504 | 0.3147 | 0.3149 | 17013 | 17013 | 14809 | 12605 | 10401 |
2.0454 | 5.0 | 728 | 10.3812 | 0.7665 | 7581 | 2555 | 1111 | 482 | 0.0132 | 0.3357 | 12.9782 | 1.7997 | 45.1599 | 17.5204 | 8.9749 | 4.7371 | 21250 | 0.3401 | 0.1606 | 0.3278 | 0.3279 | 16787 | 16787 | 14583 | 12379 | 10175 |
1.9502 | 5.99 | 873 | 10.7668 | 0.7992 | 7759 | 2618 | 1162 | 511 | 0.0127 | 0.3406 | 13.4841 | 1.7696 | 44.6973 | 17.2748 | 8.9723 | 4.7548 | 21250 | 0.3452 | 0.1631 | 0.3321 | 0.3319 | 17359 | 17359 | 15155 | 12951 | 10747 |
1.8414 | 7.0 | 1019 | 11.3408 | 0.7721 | 7791 | 2693 | 1236 | 570 | 0.015 | 0.347 | 13.0563 | 1.7472 | 46.147 | 18.3459 | 9.9078 | 5.5496 | 21250 | 0.3513 | 0.1679 | 0.3391 | 0.3391 | 16883 | 16883 | 14679 | 12475 | 10271 |
1.7614 | 8.0 | 1165 | 11.8447 | 0.8198 | 8024 | 2799 | 1296 | 610 | 0.0145 | 0.352 | 13.515 | 1.7203 | 45.2643 | 18.0313 | 9.7305 | 5.4881 | 21250 | 0.3565 | 0.1711 | 0.3422 | 0.3423 | 17727 | 17727 | 15523 | 13319 | 11115 |
1.6997 | 9.0 | 1310 | 11.9689 | 0.8027 | 8046 | 2835 | 1314 | 615 | 0.0168 | 0.3568 | 13.4306 | 1.7167 | 46.183 | 18.6293 | 10.0968 | 5.6892 | 21250 | 0.3613 | 0.1746 | 0.3466 | 0.3466 | 17422 | 17422 | 15218 | 13014 | 10810 |
1.6159 | 10.0 | 1456 | 12.5678 | 0.8182 | 8087 | 2928 | 1395 | 681 | 0.0181 | 0.3564 | 13.5268 | 1.6892 | 45.6944 | 18.8976 | 10.4966 | 6.1429 | 21250 | 0.3612 | 0.1795 | 0.3485 | 0.3482 | 17698 | 17698 | 15494 | 13290 | 11086 |
1.5681 | 10.99 | 1601 | 12.497 | 0.813 | 8154 | 2933 | 1383 | 664 | 0.0168 | 0.3605 | 13.6044 | 1.6923 | 46.3164 | 19.0442 | 10.4797 | 6.0402 | 21250 | 0.3654 | 0.1789 | 0.3506 | 0.3505 | 17605 | 17605 | 15401 | 13197 | 10993 |
1.4987 | 12.0 | 1747 | 12.8959 | 0.8169 | 8295 | 3011 | 1432 | 697 | 0.0181 | 0.3675 | 13.6134 | 1.6825 | 46.928 | 19.461 | 10.7929 | 6.2997 | 21250 | 0.3734 | 0.1846 | 0.3576 | 0.3577 | 17676 | 17676 | 15472 | 13268 | 11064 |
1.4461 | 13.0 | 1893 | 12.8688 | 0.8139 | 8246 | 3005 | 1424 | 700 | 0.0191 | 0.3658 | 13.5812 | 1.6784 | 46.7964 | 19.4915 | 10.7773 | 6.3584 | 21250 | 0.3725 | 0.1857 | 0.358 | 0.3576 | 17621 | 17621 | 15417 | 13213 | 11009 |
1.4002 | 13.99 | 2038 | 13.4526 | 0.8329 | 8457 | 3130 | 1504 | 745 | 0.02 | 0.3727 | 13.9179 | 1.6725 | 47.0749 | 19.8591 | 11.0939 | 6.5621 | 21250 | 0.3797 | 0.1915 | 0.3637 | 0.3634 | 17965 | 17965 | 15761 | 13557 | 11353 |
1.3391 | 15.0 | 2184 | 13.211 | 0.8283 | 8443 | 3091 | 1468 | 719 | 0.0204 | 0.3737 | 13.9133 | 1.6783 | 47.2177 | 19.7168 | 10.8959 | 6.3803 | 21250 | 0.3804 | 0.1901 | 0.3634 | 0.363 | 17881 | 17881 | 15677 | 13473 | 11269 |
1.2921 | 16.0 | 2330 | 13.4907 | 0.8373 | 8457 | 3147 | 1511 | 747 | 0.0195 | 0.3716 | 13.9882 | 1.6738 | 46.8662 | 19.8662 | 11.0801 | 6.5337 | 21250 | 0.3782 | 0.1902 | 0.3624 | 0.3624 | 18045 | 18045 | 15841 | 13637 | 11433 |
1.2572 | 17.0 | 2475 | 13.8581 | 0.8267 | 8473 | 3219 | 1561 | 783 | 0.02 | 0.3753 | 13.7618 | 1.6770 | 47.4598 | 20.57 | 11.6103 | 6.9656 | 21250 | 0.3821 | 0.1948 | 0.3669 | 0.3665 | 17853 | 17853 | 15649 | 13445 | 11241 |
1.199 | 18.0 | 2621 | 13.7496 | 0.8326 | 8484 | 3190 | 1551 | 771 | 0.0186 | 0.3745 | 13.8798 | 1.6934 | 47.2409 | 20.2475 | 11.4456 | 6.7947 | 21250 | 0.3812 | 0.1922 | 0.3657 | 0.3658 | 17959 | 17959 | 15755 | 13551 | 11347 |
1.1668 | 18.99 | 2766 | 13.7379 | 0.8395 | 8504 | 3179 | 1541 | 776 | 0.0204 | 0.376 | 13.9256 | 1.6926 | 47.0198 | 20.0164 | 11.2663 | 6.7631 | 21250 | 0.3828 | 0.1939 | 0.3665 | 0.3665 | 18086 | 18086 | 15882 | 13678 | 11474 |
1.1164 | 19.91 | 2900 | 14.1906 | 0.8529 | 8625 | 3250 | 1609 | 820 | 0.0204 | 0.3803 | 14.069 | 1.7026 | 47.0463 | 20.15 | 11.5548 | 6.996 | 21250 | 0.3874 | 0.1964 | 0.3716 | 0.3715 | 18333 | 18333 | 16129 | 13925 | 11721 |
### Framework versions
- Transformers 4.34.1
- PyTorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.14.1