faster-whisper fine-tuned model for Polish (PL) phonetic transcription

About the model:

This model is the result of fine-tuning openai/whisper-medium on a custom Polish (PL) dataset and then converting it to the faster-whisper (CTranslate2) format. The training dataset also included 5 English speakers and 4 Japanese speakers, for whom Polish transcriptions were created manually.
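The conversion step can be reproduced with CTranslate2's Transformers converter; the snippet below is only a sketch, assuming a locally saved fine-tuned checkpoint (the paths and the quantization setting are illustrative placeholders, not the exact values used for this model).

# Sketch: convert a fine-tuned Transformers Whisper checkpoint into the
# CTranslate2 format that faster-whisper loads. Paths are placeholders.
from ctranslate2.converters import TransformersConverter

converter = TransformersConverter(
    "path/to/finetuned-whisper-medium-pl",  # placeholder: local fine-tuned checkpoint
    copy_files=["tokenizer.json", "preprocessor_config.json"],
)
converter.convert("shmisper-medium-PL-ct2", quantization="float16")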

Example:

from faster_whisper import WhisperModel
import huggingface_hub

# Download the converted model from the Hugging Face Hub and load it with faster-whisper.
model_path = huggingface_hub.snapshot_download("shmart/shmisper-medium-PL")
model = WhisperModel(model_path, device="cuda", compute_type="float16")

# Decoding options used for phonetic transcription.
options = {
    'language': "pl",
    'beam_size': 5,
    'without_timestamps': True,
    'suppress_tokens': [],        # empty list: do not suppress any tokens
    'log_prob_threshold': None,
    'no_speech_threshold': 0.05
}

input_wav_path = './audio.wav'
segments, info = model.transcribe(input_wav_path, **options)

# transcribe() returns a generator of segments; join their texts into one string.
text = ' '.join([segment.text for segment in segments])
print(text)
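
If no GPU is available, the same model can also be loaded on the CPU; the short variation below reuses the example above with faster-whisper's CPU-friendly int8 compute type.

from faster_whisper import WhisperModel
import huggingface_hub

model_path = huggingface_hub.snapshot_download("shmart/shmisper-medium-PL")
# int8 keeps memory usage low when running without a GPU.
model = WhisperModel(model_path, device="cpu", compute_type="int8")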