# gpt2-arxiv

A GPT-2-powered predictive keyboard trained on ~1.6M manuscript abstracts from arXiv, using the [arXiv dataset on Kaggle](https://www.kaggle.com/datasets/Cornell-University/arxiv).

```python
from transformers import pipeline, GPT2TokenizerFast

# Reuse the base GPT-2 tokenizer with the fine-tuned arXiv checkpoint.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
llm = pipeline('text-generation', model='pearsonkyle/gpt2-arxiv', tokenizer=tokenizer)

# Sample five continuations of an abstract-style prompt.
texts = llm("Directly imaged exoplanets probe",
            max_length=50, do_sample=True, num_return_sequences=5,
            penalty_alpha=0.65, top_k=40, repetition_penalty=1.25,
            temperature=0.95)

for text in texts:
    print(text['generated_text'] + '\n')
```

## Model description

GPT-2: 12-layer, 768-hidden, 12-heads, 117M parameters
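
As a quick check on the size quoted above, the checkpoint can be loaded and its parameters counted (a minimal sketch; the exact total reported by `transformers` may differ slightly from the rounded 117M figure):

```python
from transformers import GPT2LMHeadModel

# Load the fine-tuned checkpoint and report its architecture and parameter count.
model = GPT2LMHeadModel.from_pretrained("pearsonkyle/gpt2-arxiv")
n_params = sum(p.numel() for p in model.parameters())
print(f"layers: {model.config.n_layer}, hidden: {model.config.n_embd}, "
      f"heads: {model.config.n_head}, params: {n_params / 1e6:.1f}M")
```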

## Intended uses & limitations

Coming soon...

Be careful when generating a lot of text or when changing the sampling mode of the language model: it can sometimes produce statements that sound plausible but are not truthful.
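
If factual drift is a concern, a more deterministic decoding setup such as contrastive search can help (a sketch reusing the `llm` pipeline from the usage example above; the `penalty_alpha` and `top_k` values here are illustrative). It tends to stay closer to common phrasing, at the cost of variety:

```python
# Contrastive search: deterministic, no sampling; less prone to drifting into
# confident-sounding but unsupported statements, but also less varied.
texts = llm("Directly imaged exoplanets probe",
            max_length=30, do_sample=False,
            penalty_alpha=0.6, top_k=4)
print(texts[0]['generated_text'])
```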

## Training procedure

Trained for ~49 hours on a single RTX 3090 (~1.25M iterations).
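
The original training script is not included in this card. The sketch below shows one way a comparable run could be set up with the Hugging Face `Trainer`, assuming the abstracts have been extracted from the Kaggle JSON dump into a one-abstract-per-line text file; the file name `arxiv_abstracts.txt`, batch size, and sequence length are illustrative, not the values used for this checkpoint:

```python
from datasets import load_dataset
from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel,
                          GPT2TokenizerFast, Trainer, TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical file with one abstract per line, extracted from the Kaggle dump.
dataset = load_dataset("text", data_files={"train": "arxiv_abstracts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="gpt2-arxiv",
    per_device_train_batch_size=8,  # illustrative; tune to fit GPU memory
    max_steps=1_250_000,            # ~1.25M iterations, as noted above
    save_steps=10_000,
    logging_steps=1_000,
)

Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()
```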

### Training hyperparameters

The following hyperparameters were used during training:

### Framework versions