RO-Sentiment
This model is a fine-tuned version of readerbench/RoBERT-base on the Decathlon reviews and Cinemagia reviews datasets. It achieves the following results on the evaluation set:
- Loss: 0.3923
- Accuracy: 0.8307
- Precision: 0.8366
- Recall: 0.8959
- F1: 0.8652
- F1 Weighted: 0.8287
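As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the two values above. A minimal sketch (the function name `f1_from_pr` is only illustrative):

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (binary F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the evaluation set above.
f1 = f1_from_pr(0.8366, 0.8959)
print(round(f1, 4))  # 0.8652
```

The recomputed value agrees with the reported F1 to four decimal places.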
Output labels:
- LABEL_0 = Negative Sentiment
- LABEL_1 = Positive Sentiment
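The label mapping above can be applied to predictions roughly as follows. This is a sketch, not an official usage snippet: it assumes the model is loaded through the `transformers` text-classification pipeline, and the helper names `to_sentiment` and `classify` are hypothetical. The repository id of the model is not stated in this card, so it is left as a parameter.

```python
# Mapping taken from the "Output labels" list above.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}


def to_sentiment(prediction: dict) -> str:
    """Turn one pipeline prediction, e.g. {"label": "LABEL_1", "score": 0.98},
    into a readable sentiment name."""
    return LABEL_NAMES[prediction["label"]]


def classify(texts, model_id):
    """Classify a list of Romanian texts.

    `model_id` is the Hub repository id or a local path of this model.
    It is not given in this card, so the caller has to supply it.
    """
    # Imported lazily so the mapping helper works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    return [to_sentiment(p) for p in classifier(texts)]


# Example call (needs transformers, torch, and the real model id):
# classify(["Produsul este excelent!"], model_id="<id-of-this-model>")
```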
Evaluation on other datasets
RO_SENT
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| Negative (0) | 0.79 | 0.83 | 0.81 | 11,675 |
| Positive (1) | 0.88 | 0.85 | 0.87 | 17,271 |
| Accuracy | | | 0.85 | 28,946 |
| Macro Avg | 0.84 | 0.84 | 0.84 | 28,946 |
| Weighted Avg | 0.85 | 0.85 | 0.85 | 28,946 |
LaRoSeDa
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| Negative (0) | 0.79 | 0.94 | 0.86 | 7,500 |
| Positive (1) | 0.93 | 0.75 | 0.83 | 7,500 |
| Accuracy | | | 0.85 | 15,000 |
| Macro Avg | 0.86 | 0.85 | 0.84 | 15,000 |
| Weighted Avg | 0.86 | 0.85 | 0.84 | 15,000 |
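For readers who want to reproduce tables of this shape on their own data, the per-class precision, recall, F1, and support, plus the macro and weighted F1 averages, can be computed from gold labels and predictions. The sketch below is plain Python with hypothetical helper names and made-up toy data. It is not the evaluation script used for this card, which is not included here.

```python
def per_class_report(y_true, y_pred, labels=(0, 1)):
    """Return {label: (precision, recall, f1, support)} for each class."""
    report = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1, tp + fn)
    return report


def macro_and_weighted_f1(report):
    """Unweighted mean of per-class F1, and mean weighted by class support."""
    total = sum(support for *_, support in report.values())
    macro = sum(f1 for _, _, f1, _ in report.values()) / len(report)
    weighted = sum(f1 * support for _, _, f1, support in report.values()) / total
    return macro, weighted


# Toy data, made up for illustration (0 = negative, 1 = positive).
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 0, 1]
report = per_class_report(y_true, y_pred)
macro_f1, weighted_f1 = macro_and_weighted_f1(report)
```

The macro average treats both classes equally, while the weighted average follows the class sizes, which is why the two can differ on an imbalanced set.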
Model description
A Romanian BERT model fine-tuned for sentiment classification.
Trained on a mix of product reviews from the Decathlon retailer website and movie reviews from Cinemagia.
Intended uses & limitations
Sentiment classification for Romanian-language text.
The model is biased towards product reviews.
There is no "neutral" sentiment label, so every input is classified as either negative or positive.
Training and evaluation data
Trained on:
- Decathlon reviews dataset (available on request)
- Cinemagia movie reviews (publicly available on Kaggle)
Evaluated on:
- Holdout data from the training dataset
- RO_SENT dataset
- LaRoSeDa dataset
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 6e-05
- train_batch_size: 64
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 10 (early stopping at epoch 3; best epoch: 2)
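To make the scheduler settings concrete, the sketch below computes the learning rate a linear schedule with warmup would give at the end of each completed epoch. It rests on two assumptions not stated in the list above: that one epoch is 1629 optimizer steps (read off the Step column of the training results table), and that the schedule was planned over all 10 epochs. The function name `linear_lr` is hypothetical.

```python
BASE_LR = 6e-05          # learning_rate from the list above
WARMUP_RATIO = 0.2       # lr_scheduler_warmup_ratio from the list above
NUM_EPOCHS = 10          # num_epochs from the list above
STEPS_PER_EPOCH = 1629   # assumption: taken from the Step column of the results table

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS        # 16290
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)  # 3258


def linear_lr(step: int) -> float:
    """Linear warmup from 0 to BASE_LR, then linear decay back to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    return BASE_LR * max(0, TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)


for epoch in (1, 2, 3):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the warmup phase ends exactly at the end of epoch 2, so the learning rate was still ramping up during the first two epochs and had only just started decaying when training stopped.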
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 | F1 Weighted |
---|---|---|---|---|---|---|---|---|
0.4198 | 1.0 | 1629 | 0.3983 | 0.8377 | 0.8791 | 0.8721 | 0.8756 | 0.8380 |
0.3861 | 2.0 | 3258 | 0.4312 | 0.8429 | 0.8963 | 0.8665 | 0.8812 | 0.8442 |
0.3189 | 3.0 | 4887 | 0.3923 | 0.8307 | 0.8366 | 0.8959 | 0.8652 | 0.8287 |
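The card names epoch 2 as the best epoch without saying which metric decided that. The sketch below, with a hypothetical `best_epoch` helper, picks the best row of the table above under different criteria. It only illustrates how the choice depends on the metric and is not the selection logic actually used in training.

```python
# Rows copied from the training results table above.
results = [
    {"epoch": 1, "val_loss": 0.3983, "accuracy": 0.8377, "f1": 0.8756, "f1_weighted": 0.8380},
    {"epoch": 2, "val_loss": 0.4312, "accuracy": 0.8429, "f1": 0.8812, "f1_weighted": 0.8442},
    {"epoch": 3, "val_loss": 0.3923, "accuracy": 0.8307, "f1": 0.8652, "f1_weighted": 0.8287},
]


def best_epoch(rows, metric, higher_is_better=True):
    """Return the epoch number of the row that is best under `metric`."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[metric])["epoch"]


print(best_epoch(results, "f1"))                                 # 2
print(best_epoch(results, "val_loss", higher_is_better=False))   # 3
```

Accuracy, F1, and weighted F1 all point to epoch 2, while validation loss alone would pick epoch 3. The headline metrics at the top of this card match the epoch 3 row.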
Framework versions
- Transformers 4.31.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.3
- Tokenizers 0.13.3