

The purpose of the DeUnCaser is to fix text that lacks punctuation. It is particularly targeted at the output of Automatic Speech Recognition (ASR) software. In addition to lacking casing and punctuation, such output often lacks spaces between words. Try this demo to see the effect.

The DeUnCaser is based on North-T5, a sequence-to-sequence mT5 model. It attempts to add punctuation, spaces and capitalisation to any text it is given. It is trained primarily to fix Norwegian text.
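
To make the task concrete, the sketch below produces the kind of degraded input the DeUnCaser is meant to repair: lowercased text with punctuation stripped and, optionally, some spaces removed. The function `uncase` and its parameters `drop_space_prob` and `seed` are hypothetical names chosen for illustration. This is an assumption about what such input looks like, not the preprocessing actually used to train the model.

```python
import random
import string
from typing import Optional


def uncase(text: str, drop_space_prob: float = 0.0, seed: Optional[int] = None) -> str:
    """Degrade text so it resembles raw ASR output.

    Hypothetical helper for illustration only; not the model's real
    training pipeline.
    """
    rng = random.Random(seed)
    # Strip ASCII punctuation; letters such as æ, ø and å are left untouched.
    stripped = text.translate(str.maketrans("", "", string.punctuation))
    words = stripped.lower().split()
    if not words:
        return ""
    out = [words[0]]
    for word in words[1:]:
        # With probability drop_space_prob, glue this word onto the previous one.
        if rng.random() < drop_space_prob:
            out[-1] += word
        else:
            out.append(word)
    return " ".join(out)


print(uncase("Hei, hvordan har du det?"))
# hei hvordan har du det
print(uncase("Hei, hvordan har du det?", drop_space_prob=1.0))
# heihvordanhardudet
```

Given either of the printed strings, the model would be expected to return something close to the original sentence, with the capital letter, comma, question mark and word boundaries restored.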