# BART fine-tuned for keyphrase generation
This is the <a href="https://huggingface.co/facebook/bart-base">bart-base</a> (<a href="https://arxiv.org/abs/1910.13461">Lewis et al., 2019</a>) model <a href="https://arxiv.org/abs/2209.03791">fine-tuned for the keyphrase generation task</a> on fragments of the following corpora:
- Krapivin (<a href = "http://eprints.biblio.unitn.it/1671/1/disi09055%2Dkrapivin%2Dautayeu%2Dmarchese.pdf">Krapivin et al., 2009</a>)
- Inspec (<a href = "https://aclanthology.org/W03-1028.pdf">Hulth, 2003</a>)
- KPTimes (<a href="https://aclanthology.org/W19-8617.pdf">Gallina et al., 2019</a>)
- DUC-2001 (<a href="https://cdn.aaai.org/AAAI/2008/AAAI08-136.pdf">Wan & Xiao, 2008</a>)
- PubMed (<a href = "https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=08b75d31a90f206b36e806a7ec372f6f0d12457e">Schutz, 2008</a>)
- NamedKeys (<a href = "https://joyceho.github.io/assets/pdf/paper/gero-bcb19.pdf">Gero & Ho, 2019</a>).
## How to Use

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("beogradjanka/bart_finetuned_keyphrase_extraction")
model = AutoModelForSeq2SeqLM.from_pretrained("beogradjanka/bart_finetuned_keyphrase_extraction")

text = "In this paper, we investigate cross-domain limitations of keyphrase generation using the models for abstractive text summarization. " \
       "We present an evaluation of BART fine-tuned for keyphrase generation across three types of texts, " \
       "namely scientific texts from computer science and biomedical domains and news texts. " \
       "We explore the role of transfer learning between different domains to improve the model performance on small text corpora."

# Tokenize the input document (the deprecated prepare_seq2seq_batch call is
# replaced by invoking the tokenizer directly)
tokenized_text = tokenizer([text], return_tensors="pt")

# Generate and decode the keyphrase sequence
outputs = model.generate(**tokenized_text)
keyphrases = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(keyphrases)
```
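The model returns the keyphrases for a document as one decoded string. Continuing from the snippet above, here is a minimal sketch of controlling decoding and splitting the output into a list; the `num_beams` and `max_length` values are illustrative assumptions, not documented settings of this model, and the comma delimiter is likewise assumed:

```python
# Illustrative decoding parameters (assumptions, not the authors' settings):
# beam search often yields more stable keyphrase sequences than greedy decoding.
outputs = model.generate(**tokenized_text, num_beams=4, max_length=64)
decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

# Assuming keyphrases are delimited by commas, split the string into a list.
keyphrases = [kp.strip() for kp in decoded.split(",")]
print(keyphrases)
```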
## Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4e-5
- train_batch_size: 8
- optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
- num_epochs: 6
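As a sketch of how these settings map onto the `transformers` training API, the following hypothetical fine-tuning script wires the listed hyperparameters into `Seq2SeqTrainingArguments`; the toy dataset, column names, and `output_dir` are illustrative placeholders, not the authors' actual training setup:

```python
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

# Toy stand-in for the keyphrase corpora listed above: each example pairs a
# source text with its gold keyphrases serialized as one target string.
raw = Dataset.from_dict({
    "text": ["A toy abstract about keyphrase generation with BART."],
    "keyphrases": ["keyphrase generation, BART"],
})

def preprocess(batch):
    model_inputs = tokenizer(batch["text"], truncation=True, max_length=512)
    labels = tokenizer(text_target=batch["keyphrases"], truncation=True, max_length=64)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_dataset = raw.map(preprocess, batched=True, remove_columns=["text", "keyphrases"])

# The hyperparameters listed above; Trainer uses AdamW by default, so only
# the betas and epsilon need to be set explicitly.
args = Seq2SeqTrainingArguments(
    output_dir="bart_keyphrase_generation",  # placeholder path
    learning_rate=4e-5,
    per_device_train_batch_size=8,
    num_train_epochs=6,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```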
## Citation

BibTeX:

```bibtex
@article{glazkova2023applying,
  title={Applying Transformer-Based Text Summarization for Keyphrase Generation},
  author={Glazkova, Anna and Morozov, Dmitry},
  journal={Lobachevskii Journal of Mathematics},
  volume={44},
  number={1},
  pages={123--136},
  year={2023},
  doi={10.1134/S1995080223010134}
}
```