vit-base-vocalsound

This model is a fine-tuned version of google/vit-base-patch16-224 on the VocalSound dataset. It achieves the following results on the evaluation set:

Training and evaluation data

Training: VocalSound training split (#samples = 15570)

Evaluation: VocalSound test split (#samples = 3594)
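
As a quick sanity check on the two split sizes listed above, the sketch below computes their combined size and relative proportions. It uses only the sample counts stated in this card. The helper name `split_fractions` is hypothetical, and the ratio covers only the two splits listed here, not any other split the dataset may provide.

```python
# Split sizes as stated in this model card.
TRAIN_SAMPLES = 15570
TEST_SAMPLES = 3594


def split_fractions(train: int, test: int) -> tuple[float, float]:
    """Return the (train, test) fractions of the combined sample count."""
    total = train + test
    return train / total, test / total


train_frac, test_frac = split_fractions(TRAIN_SAMPLES, TEST_SAMPLES)
print(f"total: {TRAIN_SAMPLES + TEST_SAMPLES}")
print(f"train: {train_frac:.1%}, test: {test_frac:.1%}")
```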

Training hyperparameters

The following hyperparameters were used during training:

Framework versions