
Machine-generated text-detection by fine-tuning of language models

This project is related to a bachelor's thesis with the title "Turning Poachers into Gamekeepers: Detecting Machine-Generated Text in Academia using Large Language Models" (not yet published) written by Nicolai Thorer Sivesind and Andreas Bentzen Winje at the Department of Computer Science at the Norwegian University of Science and Technology.

It contains text classification models trained to distinguish human-written text from text generated by language models such as ChatGPT and GPT-3. The best models achieved an accuracy of 100% on real and GPT-3-generated Wikipedia articles (4500 samples), and an accuracy of 98.4% on real and ChatGPT-generated research abstracts (3000 samples).

The dataset card for the dataset that was created in relation to this project can be found here.

NOTE: The hosted inference on this site only works for the RoBERTa-models, not for the Bloomz-models. When you run the Bloomz-models yourself, they can produce wrong predictions unless the attention mask from the tokenizer is explicitly passed to the model at inference. To be safe, use the pipeline-library, which seems to produce the most consistent results.
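The two inference routes mentioned in the note can be sketched as follows. This is a minimal sketch, not the authors' own inference code: it assumes the `transformers` and `torch` packages are installed, and it uses the model ID from the citation URL in this card as the default. The function names are hypothetical, and the IDs of the other detectors are not given here, so pass your own `model_id` for those.

```python
# Model ID taken from the citation URL in this card; other detector IDs are not listed here.
MODEL_ID = "andreas122001/roberta-academic-detector"


def detect_with_pipeline(texts, model_id=MODEL_ID):
    """Classify texts with the transformers pipeline, which handles tokenization and masking."""
    from transformers import pipeline  # imported lazily so the module loads without transformers

    detector = pipeline("text-classification", model=model_id)
    # truncation=True is forwarded to the tokenizer so long inputs do not overflow the model.
    return detector(texts, truncation=True)


def detect_with_attention_mask(texts, model_id=MODEL_ID):
    """Classify texts manually, passing the tokenizer's attention mask to the model explicitly."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # padding=True makes the batch rectangular; the attention mask marks which tokens are padding.
    encoded = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(
            input_ids=encoded["input_ids"],
            attention_mask=encoded["attention_mask"],  # the step the note warns about omitting
        ).logits

    probabilities = torch.softmax(logits, dim=-1)
    predicted = probabilities.argmax(dim=-1).tolist()
    return [
        {"label": model.config.id2label[label_id], "score": probabilities[row, label_id].item()}
        for row, label_id in enumerate(predicted)
    ]
```

Calling `detect_with_pipeline(["Some text to check."])` downloads the model on first use and returns one label and score per input text. The label names come from the model's own configuration and are not assumed here.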

Fine-tuned detectors

This project includes 12 fine-tuned models: RoBERTa-base and three sizes of Bloomz (560m, 1b7 and 3b), each fine-tuned as a wiki-, academic- and mixed-detector.

| Base-model | RoBERTa-base | Bloomz-560m | Bloomz-1b7 | Bloomz-3b |
|---|---|---|---|---|
| Wiki | roberta-wiki | Bloomz-560m-wiki | Bloomz-1b7-wiki | Bloomz-3b-wiki |
| Academic | roberta-academic | Bloomz-560m-academic | Bloomz-1b7-academic | Bloomz-3b-academic |
| Mixed | roberta-mixed | Bloomz-560m-mixed | Bloomz-1b7-mixed | Bloomz-3b-mixed |

Datasets

The models were trained on selections from two datasets, GPT-wiki-intros and ChatGPT-Research-Abstracts, and are separated into three types: wiki-detectors (trained on GPT-wiki-intros), academic-detectors (trained on ChatGPT-Research-Abstracts) and mixed-detectors (trained on a mix of the two).

Hyperparameters

All models were trained using the same hyperparameters:

{
 "num_train_epochs": 1,
 "adam_beta1": 0.9,
 "adam_beta2": 0.999,
 "batch_size": 8,
 "adam_epsilon": 1e-08,
 "optim": "adamw_torch", # the optimizer (AdamW)
 "learning_rate": 5e-05, # (LR)
 "lr_scheduler_type": "linear", # scheduler type for LR
 "seed": 42, # seed for the PyTorch random number generator
}
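As an illustration, these settings could be passed to the Hugging Face `Trainer` roughly as shown below. This is a sketch under assumptions rather than the training script used for the project: it assumes that `batch_size` corresponds to `per_device_train_batch_size`, and both the `output_dir` value and the helper names are hypothetical.

```python
# Hyperparameters as listed in this model card.
HYPERPARAMETERS = {
    "num_train_epochs": 1,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "batch_size": 8,
    "adam_epsilon": 1e-08,
    "optim": "adamw_torch",
    "learning_rate": 5e-05,
    "lr_scheduler_type": "linear",
    "seed": 42,
}


def to_training_kwargs(hparams, output_dir="detector-checkpoints"):
    """Translate the card's hyperparameters into TrainingArguments keyword arguments."""
    kwargs = dict(hparams)  # copy so the original dict is left untouched
    # Assumption: "batch_size" in the card means the per-device training batch size.
    kwargs["per_device_train_batch_size"] = kwargs.pop("batch_size")
    # Hypothetical checkpoint directory; the card does not name one.
    kwargs["output_dir"] = output_dir
    return kwargs


def build_training_arguments(hparams=HYPERPARAMETERS):
    """Build a TrainingArguments object (requires the transformers package)."""
    from transformers import TrainingArguments  # imported lazily

    return TrainingArguments(**to_training_kwargs(hparams))
```

The resulting object can be handed to `Trainer(args=...)` together with a model and a tokenized dataset, neither of which is shown here.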

Metrics

Metrics can be found at https://wandb.ai/idatt2900-072/IDATT2900-072.

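In-domain performance of wiki-detectors (an asterisk marks the best score in each column, here and in the tables below):

| Base model | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|
| Bloomz-560m | 0.973 | *1.000 | 0.945 | 0.972 |
| Bloomz-1b7 | 0.972 | *1.000 | 0.945 | 0.972 |
| Bloomz-3b | *1.000 | *1.000 | *1.000 | *1.000 |
| RoBERTa | 0.998 | 0.999 | 0.997 | 0.998 |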

In-domain performance of academic-detectors:

| Base model | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|
| Bloomz-560m | 0.964 | 0.963 | 0.965 | 0.964 |
| Bloomz-1b7 | 0.946 | 0.941 | 0.951 | 0.946 |
| Bloomz-3b | *0.984 | *0.983 | 0.985 | *0.984 |
| RoBERTa | 0.982 | 0.968 | *0.997 | 0.982 |
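The F1-score column is the harmonic mean of precision and recall, so it can be recomputed from the other two columns. The short sketch below does this for a few rows from the tables above; the function and variable names are hypothetical, and small deviations are possible because the published precision and recall values are already rounded to three decimals.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the in-domain tables in this card.
REPORTED = {
    "Bloomz-560m (wiki)": (1.000, 0.945, 0.972),
    "RoBERTa (wiki)": (0.999, 0.997, 0.998),
    "Bloomz-1b7 (academic)": (0.941, 0.951, 0.946),
    "RoBERTa (academic)": (0.968, 0.997, 0.982),
}

for name, (precision, recall, reported_f1) in REPORTED.items():
    recomputed = round(f1_score(precision, recall), 3)
    print(f"{name}: recomputed F1 = {recomputed:.3f}, reported F1 = {reported_f1:.3f}")
```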

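F1-scores of the mixed-detectors on all three datasets (CRA = ChatGPT-Research-Abstracts):

| Base model | Mixed | Wiki | CRA |
|---|---|---|---|
| Bloomz-560m | 0.948 | 0.972 | *0.848 |
| Bloomz-1b7 | 0.929 | 0.964 | 0.816 |
| Bloomz-3b | 0.988 | 0.996 | 0.772 |
| RoBERTa | *0.993 | *0.997 | 0.829 |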

Credits

Citation

Please use the following citation:

@misc {sivesind_2023,
    author       = { {Nicolai Thorer Sivesind} and {Andreas Bentzen Winje} },
    title        = { Machine-generated text-detection by fine-tuning of language models },
    url          = { https://huggingface.co/andreas122001/roberta-academic-detector },
    year         = 2023,
    publisher    = { Hugging Face }
}