audio speech wav2vec2 pt portuguese-speech-corpus italian-speech-corpus english-speech-corpus arabic-speech-corpus spontaneous speech PyTorch

Wav2vec 2.0 XLS-R For Spontaneous Speech Emotion Recognition

This model took first place in the SER track of the Automatic Speech Recognition for spontaneous and prepared speech & Speech Emotion Recognition in Portuguese (SE&R 2022) Workshop.

The following datasets were used for training:

The test set is a held-out portion of CORAA SER v1.0 that was set aside for evaluation.
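This card does not describe how the held-out portion was selected. As a generic illustration only, a minimal sketch of setting aside a label-stratified test portion could look like the following. The function name `stratified_holdout`, the file names, the label names and the 20% fraction are all hypothetical and are not taken from CORAA SER or from the authors' repository.

```python
import random
from collections import defaultdict


def stratified_holdout(items, labels, test_fraction=0.2, seed=0):
    """Split (item, label) pairs into train and test lists, per label.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    # Group items by their label so each class is split separately.
    by_label = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = list(by_label[label])
        rng.shuffle(group)
        # Hold out at least one example of every label.
        n_test = max(1, round(len(group) * test_fraction))
        test.extend((x, label) for x in group[:n_test])
        train.extend((x, label) for x in group[n_test:])
    return train, test


# Hypothetical file names and labels, for demonstration only.
files = [f"clip_{i}.wav" for i in range(10)]
labels = ["neutral"] * 6 + ["non-neutral"] * 4
train_set, test_set = stratified_holdout(files, labels, test_fraction=0.2)
print(len(train_set), len(test_set))
```

Splitting per label keeps every class represented in the held-out portion, which matters when the label distribution is uneven.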

It achieves the following results on the test set:

Datasets Details

The following image shows the overall distribution of the datasets:


The following image shows the number of instances by label:
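A per-label count like this can be computed from a list of annotations. The sketch below assumes the labels are available as plain strings. The label names and the helper `count_by_label` are hypothetical and do not reflect the actual label set or counts of the datasets.

```python
from collections import Counter


def count_by_label(labels):
    """Return a dict mapping each label to its number of instances."""
    return dict(Counter(labels))


# Hypothetical annotations, for demonstration only.
example_labels = ["neutral", "non-neutral", "neutral", "neutral", "non-neutral"]
counts = count_by_label(example_labels)

# Print labels in alphabetical order for a stable listing.
for label, n in sorted(counts.items()):
    print(f"{label}: {n}")
```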



The repository that implements training and testing of the model is available here.