CodeT5+ 220m Py Sum

This model is based on CodeT5+ (220m) from Salesforce and was fine-tuned for the code summarization task on the CodeXGLUE dataset. The code is accessible on GitHub.

Results

Model BLEU
CodeT5-base-sum-python 23.564
CodeT5-base-multi-sum 23.985
Code-Trans-S-ST 5.495
Code-Trans-S-TF 21.093
Code-Trans-S-MT 5.450
Code-Trans-S-MT-TF 16.378
Code-Trans-B-ST 4.638
Code-Trans-B-TF 21.671
Code-Trans-B-MT 2.957
Code-Trans-B-MT-TF 13.766
Code-Trans-L-TF 23.306
Code-Trans-L-MT 13.487
Code-Trans-L-MT-TF 16.362
CodeT5+ 220m Py Sum* 25.245
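
The exact BLEU variant behind these scores is not stated here. As a rough illustration of what the metric measures, the following is a minimal sketch of a sentence-level BLEU-4 on a 0 to 100 scale. Whitespace tokenization and add-one smoothing on every n-gram order are assumptions made for this sketch, and the function name smoothed_bleu is hypothetical, so its output should not be expected to reproduce the numbers above.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def smoothed_bleu(reference: str, hypothesis: str, max_n: int = 4) -> float:
    """Sentence-level BLEU with add-one smoothing (illustrative sketch only)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        ref_counts, hyp_counts = ngrams(ref, n), ngrams(hyp, n)
        # Clipped matches: a hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        matches = sum((hyp_counts & ref_counts).values())
        total = sum(hyp_counts.values())
        # Add-one smoothing (an assumption of this sketch) avoids log(0).
        log_precision += math.log((matches + 1) / (total + 1)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hyp) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))
    return 100 * brevity_penalty * math.exp(log_precision)
```

A generated summary that matches the reference docstring word for word scores 100, and the score drops as fewer n-grams overlap or as the summary gets shorter than the reference.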

Usage example

The model can be easily downloaded from Hugging Face and used in a summarization pipeline.

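from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, SummarizationPipeline

# CodeT5+ is an encoder-decoder model, so load it with the seq2seq auto class
# (AutoModelWithLMHead is deprecated in transformers).
pipeline = SummarizationPipeline(
    model=AutoModelForSeq2SeqLM.from_pretrained("Paul-B98/codet5p_220m_py_sum"),
    tokenizer=AutoTokenizer.from_pretrained("Salesforce/codet5p-220m"),
    device=0,  # index of the GPU to use; set to -1 to run on CPU
)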

example_method = """
def greet(name):
    print(f"Hello, {name}!")
"""

# Print the generated summary so it is also visible when run as a script.
print(pipeline([example_method])[0]["summary_text"])
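
The example above summarizes a single function. To summarize every function in a larger source file, one option is to split the file into functions first and summarize each one separately. The following is a minimal sketch using only the standard library; the helper name summarize_functions and its summarize callback parameter are hypothetical and not part of this model or of transformers.

```python
import ast
from typing import Callable, Dict


def summarize_functions(source: str, summarize: Callable[[str], str]) -> Dict[str, str]:
    """Map each function name found in `source` to the summary of its code.

    `summarize` is any callable that turns a code string into a summary string.
    Functions that share a name overwrite each other in the result.
    """
    tree = ast.parse(source)
    functions = [
        node for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    summaries = {}
    # Process functions in the order they appear in the file.
    for node in sorted(functions, key=lambda n: n.lineno):
        # Recover the exact source text of this function (decorators excluded).
        snippet = ast.get_source_segment(source, node)
        summaries[node.name] = summarize(snippet)
    return summaries
```

With the pipeline defined above, the callback could be passed as lambda code: pipeline([code])[0]["summary_text"]. Keeping the summarizer as a parameter also makes it possible to try the helper with a stub before loading the model.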