finetuned_distilBERT_forDetecting_GeneratedReviews_TR

This model was trained on the Generated_Restaurant_Reviews_GPT3.5 dataset and the yorumsepeti dataset (https://www.kaggle.com/datasets/dgknrsln/yorumsepeti). On the validation set it reaches a loss of 0.1056 and an accuracy of 0.9688.

Model description

This model is a multilingual DistilBERT model fine-tuned on Turkish restaurant reviews. Its purpose is to determine whether a restaurant review was written by a human or generated by a large language model.

Intended uses & limitations

The model classifies restaurant reviews as either human-written or generated by a large language model. Because it was trained on Turkish data, it should be applied to Turkish restaurant reviews.
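A minimal inference sketch, assuming the fine-tuned weights are available at a local path or Hub id passed in as `model_path` (a hypothetical placeholder, since this card does not state the repository id). The label names "human" and "generated" are illustrative; the class ids follow the convention given under Training procedure (0 = human, 1 = language model). The helper names `softmax`, `decode`, and `classify_reviews` are assumptions introduced here, not part of the original project.

```python
import math

# Class ids as described in this card: 0 = human-written, 1 = LLM-generated.
# The string names are illustrative, not taken from the model config.
ID2LABEL = {0: "human", 1: "generated"}


def softmax(logits):
    """Convert a list of raw logits into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for one row of two-class logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


def classify_reviews(texts, model_path):
    """Classify Turkish restaurant reviews with the fine-tuned model.

    `model_path` is a hypothetical local directory or Hub id.
    Imports are deferred so the helpers above work without TensorFlow.
    """
    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = TFAutoModelForSequenceClassification.from_pretrained(model_path)
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="tf")
    logits = model(**inputs).logits.numpy().tolist()
    return [decode(row) for row in logits]
```

Calling `classify_reviews(["Yemekler çok lezzetliydi."], model_path)` would return one `(label, probability)` pair per review, provided the path points to a compatible checkpoint.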

Training and evaluation data

The "Generated_Restaurant_Reviews_GPT3.5" and "yorumsepeti" datasets were used as training and validation data.

Training procedure

The "distilbert-base-multilingual-cased" model was fine-tuned on Turkish data with two classes: 0 for human-written restaurant reviews and 1 for restaurant reviews generated by the large language model. Training used a batch size of 32, 4 epochs, the Adam optimizer, and a learning rate set by a custom schedule (CustomSchedule).
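The setup described above could be sketched roughly as follows. This is not the original training script. The base checkpoint is written with its Hugging Face Hub spelling, `distilbert-base-multilingual-cased`. Batch size, epoch count, class count, and optimizer come from this card. The card does not describe its CustomSchedule, so a linear decay via `PolynomialDecay` stands in for it here, and the initial learning rate of 5e-5 is likewise an assumption. `TrainConfig`, `total_steps`, and `build_model` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    base_model: str = "distilbert-base-multilingual-cased"
    num_labels: int = 2        # 0 = human, 1 = LLM-generated
    batch_size: int = 32       # from the card
    epochs: int = 4            # from the card
    initial_lr: float = 5e-5   # ASSUMPTION: not stated in the card


def total_steps(n_examples, cfg):
    """Number of optimizer steps over the whole run."""
    steps_per_epoch = math.ceil(n_examples / cfg.batch_size)
    return steps_per_epoch * cfg.epochs


def build_model(cfg, n_examples):
    """Build and compile a TF DistilBERT classifier.

    Imports are deferred so the config helpers work without TensorFlow.
    """
    import tensorflow as tf
    from transformers import TFAutoModelForSequenceClassification

    # ASSUMPTION: linear decay stands in for the card's unspecified CustomSchedule.
    schedule = tf.keras.optimizers.schedules.PolynomialDecay(
        initial_learning_rate=cfg.initial_lr,
        decay_steps=total_steps(n_examples, cfg),
        end_learning_rate=0.0,
    )
    model = TFAutoModelForSequenceClassification.from_pretrained(
        cfg.base_model, num_labels=cfg.num_labels
    )
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=schedule),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model
```

The returned model can then be trained with `model.fit(...)` on tokenized batches of size `cfg.batch_size` for `cfg.epochs` epochs.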

Training hyperparameters

The following hyperparameters were used during training:

optimizer: Adam optimizer
training_precision: float32

Training results

Training loss: 0.0375 Training accuracy: 0.9892

Validation loss: 0.1056 Validation accuracy: 0.9688

Framework versions

Transformers 4.32.1
TensorFlow 2.12.0
Tokenizers 0.13.3