BART_corrector

This model is a fine-tuned version of ainize/bart-base-cnn on a homemade dataset. Each sample in the dataset is an English sentence duplicated 10 times, with random errors (7%) added to the copies.
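
The card does not describe the noise process in detail, so the following is only a minimal sketch of how such a sample might be built. It assumes that each character is independently replaced by a random ASCII letter with probability 0.07 and that the 10 noisy copies are joined with a single space. The function names add_noise and make_sample are hypothetical and are not taken from the original training code.

```python
import random
import string


def add_noise(sentence, error_rate=0.07, rng=None):
    """Replace each character by a random ASCII letter with probability error_rate.

    Assumption: character-level substitution; the real dataset may use other error types.
    """
    if rng is None:
        rng = random.Random()
    chars = []
    for ch in sentence:
        if rng.random() < error_rate:
            chars.append(rng.choice(string.ascii_letters))
        else:
            chars.append(ch)
    return "".join(chars)


def make_sample(sentence, copies=10, error_rate=0.07, seed=None):
    """Return (source, target): noisy copies joined by spaces, and the clean sentence."""
    rng = random.Random(seed)
    noisy = [add_noise(sentence, error_rate, rng) for _ in range(copies)]
    return " ".join(noisy), sentence


source, target = make_sample("The design of the Eiffel Tower is attributed to Ma.", seed=0)
```

Here source plays the role of the model input and target the role of the label the model is trained to produce.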

It achieves the following results on the evaluation set:

Loss: 0.0025
Rouge1: 81.4214
Rouge2: 80.2027
Rougel: 81.4202
Rougelsum: 81.4241
Gen Len: 19.3962

Model description

More information needed

Intended uses & limitations

The goal of this model is to recover the correct sentence from several versions of it, each containing different mistakes.

Text sample: TheIdeSbgn of thh Eiffel Toweg is aYtribeted to Ma. . ahd design of The Eijfel Tower is attribQtedBto ta. . The designYof the EifZel Tower Vs APtWibuteQ to Ma. . The xeQign oC the EiffelXTower ik attributed to Ma. . ghebFesign of theSbiffel TJwer is atMributed to Ma. . The desOBn of thQ Eiffel ToweP isfattributnd toBMa. . The design of the EBfUel Fower is JtAriOuted tx Ma. . The design of Jhe ENffel LoweF is aptrVbuted Lo Ma. . The deslgX of the lPffel Towermis attributedhtohMa. . The desRgn of thekSuffel Tower is Ttkribufed to Ma. .
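
A text like the sample above could be passed to the model roughly as follows. This is a hedged sketch, not the author's inference code: the model identifier "your-username/BART_corrector" is a placeholder (the card does not give the hub path), the helper names join_versions and correct are hypothetical, joining the versions with a single space is an assumption, and max_new_tokens=64 is an arbitrary choice.

```python
def join_versions(versions, sep=" "):
    """Concatenate the noisy versions into one input string (separator is an assumption)."""
    return sep.join(v.strip() for v in versions)


def correct(text, model_id="your-username/BART_corrector", max_new_tokens=64):
    """Run the fine-tuned BART model on one concatenated input and return the decoded output.

    model_id is a placeholder; replace it with the real hub path or a local directory.
    """
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, correct(join_versions(list_of_noisy_sentences)) would return a single corrected sentence, provided the placeholder identifier is replaced with a real checkpoint.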

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 4
mixed_precision_training: Native AMP
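
For readers who want to reproduce a similar run, the hyperparameters above map onto the Transformers training arguments roughly as sketched below. The output directory and the function name make_training_args are hypothetical, and settings not listed in the card (evaluation schedule, generation options, and so on) are deliberately left at their defaults. "Native AMP" is expressed here as fp16=True, which needs a CUDA device.

```python
# Keyword arguments mirroring the hyperparameters listed in the card.
TRAINING_KWARGS = dict(
    learning_rate=2e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=4,
    fp16=True,  # mixed precision (Native AMP); requires a CUDA GPU
)


def make_training_args(output_dir="bart_corrector_out"):
    """Build Seq2SeqTrainingArguments from the values above (output_dir is a placeholder)."""
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```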

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.0071	1.0	2365	0.0039	81.3664	80.0861	81.3601	81.3667	19.3967
0.0033	2.0	4730	0.0029	81.3937	80.1548	81.3902	81.3974	19.3961
0.0018	3.0	7095	0.0029	81.3838	80.1404	81.385	81.3878	19.3965
0.001	4.0	9460	0.0025	81.4214	80.2027	81.4202	81.4241	19.3962
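
Since the training data is not documented, the step counts in the table give a rough idea of its size. The short calculation below assumes a single device and no gradient accumulation, neither of which is stated in the card, so the result is an estimate only.

```python
steps_per_epoch = 2365   # step count at the end of epoch 1 in the table
train_batch_size = 8     # from the hyperparameters

# The last batch of an epoch may be partial, so the sample count lies in a small range.
max_samples = steps_per_epoch * train_batch_size
min_samples = max_samples - train_batch_size + 1

print(min_samples, max_samples)  # 18913 18920
```

Under these assumptions the training set would contain roughly 18,900 samples.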

Framework versions

Transformers 4.21.1
Pytorch 1.12.1+cu113
Datasets 2.4.0
Tokenizers 0.12.1