cT5-small left-to-right
GitHub: https://github.com/mtreviso/chunked-t5
This is a variant of cT5 trained with a left-to-right autoregressive decoding mask. As a consequence, it does not support parallel decoding, but it still predicts the end-of-chunk token </c> at the end of each chunk.
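
The two ideas above can be illustrated with a minimal, library-free sketch. It is not code from the chunked-t5 repository: the helper names `causal_mask` and `split_chunks` are hypothetical, and the sample string in the usage note is made up for illustration. The first helper builds a left-to-right mask in which position i may only attend to positions j <= i; the second splits already-decoded text at the end-of-chunk token </c>.

```python
END_OF_CHUNK = "</c>"  # end-of-chunk token named in the description above


def causal_mask(n):
    """Return an n x n left-to-right mask as nested lists.

    Entry [i][j] is 1 when position i may attend to position j (j <= i),
    and 0 otherwise. Illustrative only; the real model applies its mask
    inside the decoder's attention layers.
    """
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def split_chunks(decoded, end_token=END_OF_CHUNK):
    """Split decoded text into (complete_chunks, unfinished_tail).

    A chunk counts as complete only if it was followed by end_token.
    Any text after the last end_token is returned separately as the tail.
    """
    *complete, tail = decoded.split(end_token)
    return [piece.strip() for piece in complete], tail.strip()
```

For example, `causal_mask(3)` gives `[[1, 0, 0], [1, 1, 0], [1, 1, 1]]`, and `split_chunks("hello world </c> foo bar </c>")` gives `(["hello world", "foo bar"], "")`. Because each position depends on everything to its left, chunks have to be generated one after another, which is why this variant cannot decode chunks in parallel.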