BART model that generates the title of a scientific paper given the paper's highlights and abstract. This model is specifically tuned for biology and medicine papers.
This model was obtained by fine-tuning sshleifer/distilbart-cnn-12-6. We performed a first fine-tuning epoch on CSPubSumm (Ed Collins, et al. "A supervised approach to extractive summarisation of scientific papers."), BIOPubSumm, and AIPubSumm (L. Cagliero, M. La Quatra, "Extracting highlights of scientific articles: A supervised summarization approach.").
A second fine-tuning epoch was performed only on BIOPubSumm to let the model better understand how biology and medicine titles are composed.
You can find more details in the GitHub repo.
Usage
This checkpoint should be loaded into BartForConditionalGeneration.from_pretrained. See the BART docs for more information.
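A minimal sketch of how loading and generation could look with the `transformers` library. Several things here are assumptions, not details taken from this model card: the `checkpoint` argument is a placeholder for this model's Hub ID or a local path, the helper names `build_input` and `generate_title` are hypothetical, and joining the highlights and the abstract with a single space is only a guess at the input format (check the GitHub repo for the format used during fine-tuning). The generation settings are illustrative as well.

```python
def build_input(highlights, abstract, sep=" "):
    """Join highlights and abstract into one source string.

    ASSUMPTION: the model card does not state the input format;
    simple concatenation is used here for illustration only.
    """
    parts = [h.strip() for h in highlights if h.strip()]
    parts.append(abstract.strip())
    return sep.join(parts)


def generate_title(checkpoint, highlights, abstract, max_length=64, num_beams=4):
    """Generate a title with a BART checkpoint (Hub ID or local path)."""
    # Imported lazily so build_input works without transformers installed.
    from transformers import AutoTokenizer, BartForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = BartForConditionalGeneration.from_pretrained(checkpoint)

    text = build_input(highlights, abstract)
    # Truncate to 1024 tokens, the input limit of the base checkpoint.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)

    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        num_beams=num_beams,    # illustrative value, not from the model card
        max_length=max_length,  # illustrative value, not from the model card
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

To try it, call `generate_title` with this checkpoint's identifier as the first argument, a list of highlight sentences, and the abstract text.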
Metrics
We tested the model on all three test sets, with the following results:
Dataset | Rouge-1 F1 | Rouge-2 F1 | Rouge-L F1 | BERTScore F1 |
---|---|---|---|---|
BIOPubSumm | 0.45979 | 0.25406 | 0.39607 | 0.90272 |
AIPubSumm | 0.44455 | 0.23214 | 0.35779 | 0.90721 |
CSPubSumm | 0.49769 | 0.30773 | 0.43376 | 0.91561 |
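For readers unfamiliar with the Rouge columns, the sketch below shows how a Rouge-1 F1 score can be computed as unigram overlap between a generated title and the reference title. It is a simplified illustration, not the evaluation script used to produce the table: it assumes lowercased whitespace tokenization with no stemming, so its values will generally differ from those of a full ROUGE implementation. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate and a reference string.

    ASSUMPTION: lowercased whitespace tokens, no stemming or
    punctuation handling (a simplification of ROUGE-1).
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Rouge-2 follows the same pattern with bigrams in place of unigrams, while Rouge-L is based on the longest common subsequence.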