
# text_shortening_model_v1

This model is a fine-tuned version of t5-small trained on a dataset of 699 original–shortened text pairs of advertising texts. Its results on the evaluation set are reported under Training results below.

## Model description

The data are cleaned and preprocessed: a "summarize" prefix is prepended to each original input text.
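As a minimal sketch of the prefixing step (the exact cleaning pipeline is not shown in this card, so the helper below is an assumption), it could look like:

```python
def add_summarize_prefix(texts):
    """Prepend the T5 task prefix to each original input text.

    Hypothetical helper: the card only states that a "summarize" prefix
    is added; the exact prefix string and cleaning steps are assumed.
    """
    return ["summarize: " + t.strip() for t in texts]


# Example: prefixed strings are then passed to the tokenizer as model inputs.
inputs = add_summarize_prefix(["  Big summer sale on all shoes! "])
```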

The loss is a weighted combination of a custom loss and cross-entropy:

Loss = theta * CustomLoss + (1 - theta) * CrossEntropy, with theta = 0.3
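A minimal sketch of this weighting (the custom loss term itself is not specified in the card, so it is left as an input here):

```python
def combined_loss(custom_loss, cross_entropy_loss, theta=0.3):
    """Weighted sum of the (unspecified) custom loss and cross-entropy.

    Loss = theta * custom loss + (1 - theta) * cross-entropy, theta = 0.3.
    """
    return theta * custom_loss + (1 - theta) * cross_entropy_loss


# Example: custom loss of 2.0 and cross-entropy of 1.0 combine to 1.3.
total = combined_loss(2.0, 1.0)
```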

## Intended uses & limitations

More information needed

## Training and evaluation data

699 original–shortened text pairs of advertising texts of various lengths.

Split among sub-datasets:

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bert precision | Bert recall | Average word count | Max word count | Min word count | Average token count |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:--------------:|:-----------:|:------------------:|:--------------:|:--------------:|:-------------------:|
| 1.7188 | 1.0 | 8 | 1.9266 | 0.4797 | 0.2787 | 0.4325 | 0.4321 | 0.8713 | 0.8594 | 10.0714 | 18 | 1 | 15.45 |
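The Rouge1 column above is a unigram-overlap F-score. As a simplified illustration (not the exact scorer used for these results, which typically comes from a library such as `rouge_score`, and ignoring stemming and tokenization details):

```python
from collections import Counter


def rouge1_f(reference, candidate):
    """Simplified unigram ROUGE-1 F-score over whitespace tokens.

    Illustrative only: real ROUGE implementations also handle stemming,
    casing, and tokenization, which are omitted here.
    """
    ref = Counter(reference.split())
    cand = Counter(candidate.split())
    overlap = sum(min(ref[w], c) for w, c in cand.items())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Example: a shortened text that keeps 2 of 3 reference words
# scores precision 1.0, recall 2/3, F-score 0.8.
score = rouge1_f("the cat sat", "the cat")
```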

### Framework versions