int8 summarization translation

This is the t5-small model exported to the ONNX format and dynamically quantized to int8.
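To give a rough idea of what int8 quantization does to a tensor of weights, here is a simplified sketch of symmetric per-tensor quantization in plain Python. It is an illustration only and is not the exact scheme ONNX Runtime applied to this model; the function names `quantize_int8` and `dequantize` are hypothetical. In dynamic quantization, the weights are quantized ahead of time, while the quantization parameters for activations are computed at inference time.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integer values and the scale."""
    return [q * scale for q in quantized]


# Hypothetical example weights, not taken from the actual model.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q, scale)
print(restored)
```

Each restored value differs from the original by at most half the scale, which is the rounding error that quantization trades for a smaller, faster model.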

Model description

T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks, in which each task is converted into a text-to-text format.
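As an illustration of the text-to-text format, the sketch below builds a model input by prepending a task prefix to the raw text. The prefixes follow those described in the T5 paper; the helper `to_text_to_text` and the `TASK_PREFIXES` table are hypothetical names used only for this example.

```python
# Task prefixes as described in the T5 paper; the table itself is illustrative.
TASK_PREFIXES = {
    "summarization": "summarize: ",
    "translation_en_to_fr": "translate English to French: ",
    "translation_en_to_de": "translate English to German: ",
}


def to_text_to_text(task, text):
    """Turn a (task, text) pair into a single input string for the model."""
    return TASK_PREFIXES[task] + text


prompt = to_text_to_text(
    "translation_en_to_fr",
    "He never went out without a book under his arm.",
)
print(prompt)
```

When the model is used through a Transformers pipeline, the prefix for a task such as `translation_en_to_fr` is typically read from the model configuration, so it usually does not need to be prepended by hand.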

For more information, please take a look at the original paper.

Paper: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Authors: Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu

Usage example

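You can use this model with a Transformers pipeline. The model is loaded with Optimum's ORTModelForSeq2SeqLM class so that inference runs on ONNX Runtime.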

from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSeq2SeqLM

# Load the tokenizer and the quantized ONNX model from the Hub
tokenizer = AutoTokenizer.from_pretrained("echarlaix/t5-small-dynamic")
model = ORTModelForSeq2SeqLM.from_pretrained("echarlaix/t5-small-dynamic")

# Build an English-to-French translation pipeline backed by ONNX Runtime
translator = pipeline("translation_en_to_fr", model=model, tokenizer=tokenizer)
text = "He never went out without a book under his arm, and he often came back with two."
results = translator(text)
print(results)