speecht5 TTS

Fine-tuned SpeechT5 TTS Model for Haitian Creole

This model is a fine-tuned version of microsoft/speecht5_tts for Haitian Creole, trained on the CMU Haitian dataset.

Model Description

The model is based on SpeechT5, a unified encoder-decoder Transformer for spoken-language tasks whose design was inspired by T5 (Text-to-Text Transfer Transformer). This checkpoint uses the text-to-speech configuration of that architecture: it takes Haitian Creole text as input and produces the corresponding speech.
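
As a rough illustration of the text-to-speech path described above, the sketch below uses the SpeechT5 classes from the transformers library. Several things here are assumptions rather than facts from this card: the `model_id` argument stands in for this model's Hub repository name, which the card does not state; the repository is assumed to include processor files; `microsoft/speecht5_hifigan` is assumed as the vocoder; and `speaker_embedding` is assumed to be a 512-dimensional x-vector tensor of shape (1, 512) supplied by the caller.

```python
def synthesize(text, model_id, speaker_embedding,
               vocoder_id="microsoft/speecht5_hifigan"):
    """Return a 1-D waveform tensor (16 kHz) for `text`.

    `model_id` is a placeholder for the fine-tuned checkpoint's repository
    name (not given in this card). `speaker_embedding` is assumed to be a
    torch tensor of shape (1, 512) holding an x-vector.
    """
    # Heavy dependencies are imported lazily, only when synthesis is requested.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    # Assumption: the fine-tuned repository ships its own processor files.
    processor = SpeechT5Processor.from_pretrained(model_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    # The vocoder turns the predicted mel spectrogram into a waveform.
    vocoder = SpeechT5HifiGan.from_pretrained(vocoder_id)

    inputs = processor(text=text, return_tensors="pt")
    with torch.no_grad():
        speech = model.generate_speech(
            inputs["input_ids"], speaker_embedding, vocoder=vocoder
        )
    return speech
```

A call such as `synthesize("Bonjou, kijan ou ye?", model_id, xvector)` would return a waveform that can be saved at a 16 kHz sampling rate, for example with `soundfile.write("out.wav", speech.numpy(), samplerate=16000)`. Output quality depends heavily on how close the chosen speaker embedding is to the voices seen during fine-tuning.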

Intended Uses & Limitations

The model is intended for text-to-speech (TTS) applications in Haitian Creole. It generates speech from written text and can support applications such as audiobook narration and voice assistants.

However, there are some limitations to be aware of:

Training and Evaluation Data

The model was fine-tuned on the CMU Haitian dataset, which contains text and corresponding audio samples in Haitian Creole. The dataset was split into training and evaluation sets to assess the model's performance.

Training Procedure

Training Hyperparameters

The following hyperparameters were used during training:

Training Results

The training progress and evaluation results are as follows:

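| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.5147        | 2.42  | 1000 | 0.4753          |
| 0.4932        | 4.84  | 2000 | 0.4629          |
| 0.4926        | 7.26  | 3000 | 0.4566          |
| 0.4907        | 9.69  | 4000 | 0.4542          |
| 0.4839        | 12.11 | 5000 | 0.4532          |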
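
The logged checkpoints above allow two quick sanity checks: how many optimizer steps make up one epoch, and how fast the validation loss is flattening. The snippet below is only a small sketch over the numbers reported in this card; the helper names are illustrative and not part of any training script.

```python
# Rows copied from the results above:
# (training_loss, epoch, step, validation_loss)
ROWS = [
    (0.5147, 2.42, 1000, 0.4753),
    (0.4932, 4.84, 2000, 0.4629),
    (0.4926, 7.26, 3000, 0.4566),
    (0.4907, 9.69, 4000, 0.4542),
    (0.4839, 12.11, 5000, 0.4532),
]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from each logged (epoch, step) pair."""
    return [step / epoch for _, epoch, step, _ in rows]


def validation_deltas(rows):
    """Drop in validation loss between consecutive checkpoints."""
    losses = [row[3] for row in rows]
    return [round(prev - cur, 4) for prev, cur in zip(losses, losses[1:])]


print([round(x) for x in steps_per_epoch(ROWS)])  # [413, 413, 413, 413, 413]
print(validation_deltas(ROWS))  # [0.0124, 0.0063, 0.0024, 0.001]
```

Each checkpoint implies roughly 413 steps per epoch, and the improvement in validation loss roughly halves at every 1000-step interval, which suggests the run was close to a plateau by step 5000.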

Training Output

The training was completed with the following output:

Framework Versions