mt5-small-finetuned-amazon-en-es
This model is a fine-tuned version of google/mt5-small on the amazon_reviews_multi dataset. It achieves the following results on the evaluation set:
- Loss: 3.0205
- Rouge1: 16.4636
- Rouge2: 8.2233
- Rougel: 16.3489
- Rougelsum: 16.3382
Model Description
This model is a fine-tuned version of mT5-small, a multilingual Transformer-based model pretrained on a large multilingual corpus. It has been further fine-tuned on a dataset of book reviews and their titles, making it particularly well suited to summarizing book reviews and similar short texts.
Intended Uses & Limitations
Intended Uses:
- Text Summarization: The model is intended for generating concise and coherent summaries from longer texts in English and Spanish, with a specific focus on book reviews (a usage sketch follows this list).
Limitations:
- Length Constraints: Generated summaries are short and may not capture all details of the source text.
- Quality Variance: The quality of generated summaries may vary depending on the complexity of the source text and the quality of training data.
- Bilingual Considerations: While the model supports both English and Spanish, summary quality can differ between the two languages; performance may be less robust in one than the other.
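A minimal usage sketch with the Transformers pipeline API. The checkpoint name below is a placeholder, and the example review and generation settings are illustrative, not taken from the card:

```python
from transformers import pipeline

# Placeholder checkpoint name; substitute your own Hub namespace or a local path.
checkpoint = "your-username/mt5-small-finetuned-amazon-en-es"
summarizer = pipeline("summarization", model=checkpoint)

review = (
    "I absolutely loved this book. The characters are vivid and the plot kept "
    "me hooked until the very last page. Highly recommended for fans of the genre."
)
print(summarizer(review, max_length=30)[0]["summary_text"])
```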
Training and Evaluation Data
The training and evaluation data used for fine-tuning consisted of product reviews and their corresponding titles from the amazon_reviews_multi dataset. This dataset serves as the foundation for the model's text summarization capabilities. The key aspects of the dataset preparation are:
- Size and Splits: _______.
- Main Domain Selection: The focus was placed on summarizing book reviews, consistent with Amazon's origins as a bookseller. Within the dataset, two product categories are relevant for this purpose: "book" and "digital_ebook_purchase". The English and Spanish datasets were therefore filtered to retain only examples from these categories, ensuring the model's specialization (see the filtering sketch after this list).
- Filtering for Quality: To help the model generate meaningful summaries, examples with very short titles were filtered out, improving its ability to produce informative and contextually relevant summaries. The heuristic splits each title on whitespace and uses the Dataset.filter() method to retain only examples that meet the length criterion.
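A sketch of the filtering steps above, assuming the amazon_reviews_multi column names (product_category, review_title) and an illustrative "more than two words" title threshold:

```python
from datasets import load_dataset

def is_book_review(example):
    # Keep only the two product categories relevant to book reviews.
    return example["product_category"] in ("book", "digital_ebook_purchase")

def has_informative_title(example):
    # Heuristic: keep examples whose titles contain more than two whitespace-separated words.
    return len(example["review_title"].split()) > 2

english_dataset = load_dataset("amazon_reviews_multi", "en")
spanish_dataset = load_dataset("amazon_reviews_multi", "es")

english_books = english_dataset.filter(is_book_review).filter(has_informative_title)
spanish_books = spanish_dataset.filter(is_book_review).filter(has_informative_title)
```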
Baseline: Lead-3 Summarization
A common baseline for text summarization tasks is the "Lead-3" baseline, which simply extracts the first three sentences of the source text as the summary. This baseline provides a reference point for evaluating the model's performance. On the validation set, the Lead-3 baseline achieved the following ROUGE scores (a sketch of computing this baseline follows the list):
- ROUGE-1: 16.75
- ROUGE-2: 8.81
- ROUGE-L: 15.61
- ROUGE-Lsum: 15.96
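A sketch of the Lead-3 baseline using NLTK sentence splitting and the evaluate implementation of ROUGE; for brevity it runs on the raw English validation split rather than the filtered set described above, and scores are scaled to percentages to match the card:

```python
import nltk
import evaluate
from datasets import load_dataset

nltk.download("punkt")
rouge = evaluate.load("rouge")

def lead_3(text):
    # Use the first three sentences of the review body as the summary.
    return "\n".join(nltk.tokenize.sent_tokenize(text)[:3])

# review_body is the source text and review_title the reference summary.
validation = load_dataset("amazon_reviews_multi", "en")["validation"]
predictions = [lead_3(body) for body in validation["review_body"]]
scores = rouge.compute(predictions=predictions, references=validation["review_title"])
print({name: round(value * 100, 2) for name, value in scores.items()})
```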
Training procedure
Training hyperparameters
The following hyperparameters were used during training; a Seq2SeqTrainingArguments sketch follows the list:
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 8
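A sketch of how these hyperparameters map onto Seq2SeqTrainingArguments. The Adam betas and epsilon listed above are the Transformers defaults, so they are not set explicitly; output_dir, the per-epoch evaluation/save strategies, and predict_with_generate are assumptions rather than values stated in the card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-amazon-en-es",  # assumed output directory
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=8,
    evaluation_strategy="epoch",   # assumed: matches the per-epoch results table
    save_strategy="epoch",         # assumed
    predict_with_generate=True,    # assumed: needed to compute ROUGE during evaluation
)
```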
Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 3.4071 | 1.0 | 1209 | 3.1603 | 17.3175 | 8.3009 | 16.7074 | 16.755 |
| 3.0542 | 2.0 | 2418 | 3.1411 | 18.3538 | 9.0086 | 17.8745 | 17.8275 |
| 3.3216 | 3.0 | 3627 | 3.0424 | 15.7882 | 7.908 | 15.5215 | 15.5397 |
| 3.2157 | 4.0 | 4836 | 3.0497 | 15.6788 | 7.7739 | 15.3788 | 15.4032 |
| 3.1488 | 5.0 | 6045 | 3.0347 | 15.8221 | 7.8918 | 15.6714 | 15.6797 |
| 3.0838 | 6.0 | 7254 | 3.0254 | 16.2869 | 8.2442 | 16.1594 | 16.1527 |
| 3.0639 | 7.0 | 8463 | 3.0197 | 17.1527 | 8.4248 | 16.9826 | 16.9533 |
| 3.0388 | 8.0 | 9672 | 3.0205 | 16.4636 | 8.2233 | 16.3489 | 16.3382 |
Framework versions
- Transformers 4.32.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3