Donut demo

This model was produced by fine-tuning a VisionEncoderDecoderModel on the naver-clova-ix/cord-v2 dataset.
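A minimal inference sketch for a checkpoint of this kind is shown below. It is not taken from this repository and rests on several assumptions: the checkpoint ships a DonutProcessor alongside the model weights, the task start token is `<s_cord-v2>` (the convention used by Donut checkpoints trained on CORD, not confirmed for this model), and `model_id` is a placeholder you supply. The names `clean_sequence` and `parse_receipt` are hypothetical helpers introduced only for illustration.

```python
import re


def clean_sequence(sequence: str, eos_token: str, pad_token: str) -> str:
    """Strip EOS/PAD tokens and the leading task-start tag from decoded text."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # Remove only the first <...> tag, which is assumed to be the task prompt.
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def parse_receipt(image, model_id: str, task_prompt: str = "<s_cord-v2>") -> dict:
    """Run one image through the model and return the parsed fields.

    `image` is expected to be an RGB PIL image. `model_id` is a placeholder
    for the Hub id or local path of the fine-tuned checkpoint.
    `task_prompt` is an assumption; check the checkpoint's added tokens.
    """
    # Heavy dependencies are imported here so clean_sequence stays usable alone.
    import torch
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    # Encode the image and the task prompt that starts decoding.
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
            return_dict_in_generate=True,
        )

    # Decode the generated tokens, drop special tokens, convert tags to a dict.
    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = clean_sequence(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    return processor.token2json(sequence)
```

To try it, open a receipt image with Pillow (for example `Image.open(path).convert("RGB")`) and call `parse_receipt(image, model_id)` with the id of this checkpoint. If the model was trained with a different task start token, pass it through `task_prompt`.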

The Weights and Biases report can be found here.