distilbert-base-uncased-Regression-Edmunds_Car_Reviews-American_Made
This model is a fine-tuned version of distilbert-base-uncased on a subset of the Edmunds consumer car ratings and reviews dataset. It achieves the following results on the evaluation set:
- Loss: 0.2486
- Mae: 0.3469
- Mse: 0.2486
- Rmse: 0.4986
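The reported loss equals the MSE, which is what one would expect if the model was trained with a mean-squared-error objective (an assumption; the card does not state the loss function). RMSE is the square root of MSE. A minimal sketch of how the three metrics relate, using made-up ratings rather than data from this project:

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE and RMSE for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


# Made-up ratings and predictions, for illustration only.
metrics = regression_metrics([5.0, 4.0, 3.0, 4.5], [4.5, 4.0, 3.5, 4.0])

# Sanity check on the reported numbers: sqrt(0.2486) rounds to the listed Rmse.
reported_rmse = round(math.sqrt(0.2486), 4)
```

The same function can be applied to the model's predictions and the true ratings of any held-out set.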
Model description
This project predicts the rating of a car from the text of its review. Only reviews of cars from American-headquartered automobile manufacturers are included.
For more information on how it was created, check out the following link: https://github.com/DunnBC22/NLP_Projects/blob/main/NLP%20Regression/HF-Edmunds_Consumer_car-Regression-American.ipynb
Intended uses & limitations
I used this project to improve my skill set. I thank all of the authors of the different technologies and dataset(s) for the contributions that have made this possible. I am not too worried about getting credit for my part, but please make sure to properly cite the authors of the different technologies and dataset(s), as they absolutely deserve credit for their contributions.
Training and evaluation data
Dataset Source: https://www.kaggle.com/datasets/ankkur13/edmundsconsumer-car-ratings-and-reviews
I only used reviews for non-luxury car manufacturers headquartered in America. Additionally, I removed manufacturers with only a limited number of reviews.
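The filtering described above could look roughly like the sketch below. The brand set, the minimum review count, and the field names (`brand`, `review`, `rating`) are all hypothetical placeholders; the card does not list the brands that were kept or the cutoff that was used.

```python
from collections import Counter

# Hypothetical values: not the author's actual brand list or threshold.
ALLOWED_BRANDS = {"ford", "chevrolet"}
MIN_REVIEWS = 2


def filter_reviews(rows, allowed=ALLOWED_BRANDS, min_reviews=MIN_REVIEWS):
    """Keep rows whose brand is allowed and has at least min_reviews reviews."""
    kept = [r for r in rows if r["brand"].lower() in allowed]
    counts = Counter(r["brand"].lower() for r in kept)
    return [r for r in kept if counts[r["brand"].lower()] >= min_reviews]


# Made-up rows, for illustration only.
rows = [
    {"brand": "Ford", "review": "Solid truck.", "rating": 4.5},
    {"brand": "Ford", "review": "Comfortable ride.", "rating": 4.0},
    {"brand": "Chevrolet", "review": "Good value.", "rating": 4.0},
    {"brand": "Toyota", "review": "Very reliable.", "rating": 5.0},
]
filtered = filter_reviews(rows)
```

Here the Toyota row is dropped because the brand is not in the allowed set, and the single Chevrolet row is dropped because it falls below the review-count threshold.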
Training procedure
The script for this project will be uploaded to my GitHub profile soon. Once it is, I will make sure to add the link here.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
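To make the `linear` scheduler setting concrete, here is a small sketch of the learning rate at a given optimizer step. It assumes zero warmup steps (the card does not list a warmup setting) and a total of 2331 steps, the final step count shown in the training results. The helper `linear_lr` is written for illustration and is not part of any library.

```python
def linear_lr(step, total_steps, base_lr=2e-05, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 2331  # final step count reported for 3 epochs

lr_start = linear_lr(0, TOTAL_STEPS)             # full base learning rate
lr_after_epoch_1 = linear_lr(777, TOTAL_STEPS)   # two thirds of the base rate
lr_end = linear_lr(TOTAL_STEPS, TOTAL_STEPS)     # decayed to zero
```

Under these assumptions the learning rate falls by one third of its initial value per epoch.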
Training results
| Training Loss | Epoch | Step | Validation Loss | Mae | Mse | Rmse |
|---|---|---|---|---|---|---|
| 0.6385 | 1.0 | 777 | 0.2743 | 0.3633 | 0.2743 | 0.5237 |
| 0.2551 | 2.0 | 1554 | 0.2588 | 0.3536 | 0.2588 | 0.5088 |
| 0.2161 | 3.0 | 2331 | 0.2568 | 0.3508 | 0.2568 | 0.5068 |
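The step counts also give a rough idea of the training set size. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these are stated in the card, so the bounds are an estimate only.

```python
def train_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest example counts that yield this many steps per epoch."""
    lower = (steps_per_epoch - 1) * batch_size + 1  # last batch holds one example
    upper = steps_per_epoch * batch_size            # last batch is full
    return lower, upper


# 777 steps per epoch at train_batch_size 64.
lower, upper = train_size_bounds(777, 64)
```

Under those assumptions the training split holds between 49,665 and 49,728 reviews.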
Framework versions
- Transformers 4.22.2
- Pytorch 1.12.1
- Datasets 2.5.2
- Tokenizers 0.12.1