Whisper Model Quantized

This repository contains a Whisper model quantized with SmoothQuant using ONNX Runtime.

  1. Only the Whisper decoder is quantized; the encoder is left unquantized.
  2. The model has been modified to accept fixed-shape inputs: (1, 80, 3000) for the encoder and (1, 448) for the decoder.
  3. For inference, the unquantized encoder model and the quantized decoder model are used together.
  4. This model is intended for testing and may be replaced by improved versions in the future.
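
Because the input shapes are fixed, shorter inputs have to be padded before they are passed to the models. The snippet below is a minimal sketch of that step using NumPy only. The helper names `pad_mel` and `pad_tokens`, the `float32`/`int64` dtypes, and the `PAD_TOKEN_ID` value are assumptions made for illustration and are not taken from this repository; check the actual model inputs and tokenizer before relying on them.

```python
import numpy as np

ENCODER_SHAPE = (1, 80, 3000)  # (batch, mel bins, frames), from the list above
DECODER_SHAPE = (1, 448)       # (batch, tokens), from the list above
PAD_TOKEN_ID = 0               # hypothetical placeholder; use the tokenizer's real pad id


def pad_mel(mel: np.ndarray) -> np.ndarray:
    """Zero-pad or truncate an (80, T) log-mel array to the fixed encoder shape."""
    _, n_mels, n_frames = ENCODER_SHAPE
    if mel.ndim != 2 or mel.shape[0] != n_mels:
        raise ValueError(f"expected shape ({n_mels}, T), got {mel.shape}")
    out = np.zeros(ENCODER_SHAPE, dtype=np.float32)  # dtype is an assumption
    t = min(mel.shape[1], n_frames)
    out[0, :, :t] = mel[:, :t]
    return out


def pad_tokens(token_ids: list[int]) -> tuple[np.ndarray, int]:
    """Right-pad token ids to the fixed decoder shape; also return the real length."""
    max_len = DECODER_SHAPE[1]
    if len(token_ids) > max_len:
        raise ValueError(f"at most {max_len} tokens are supported, got {len(token_ids)}")
    out = np.full(DECODER_SHAPE, PAD_TOKEN_ID, dtype=np.int64)  # dtype is an assumption
    out[0, : len(token_ids)] = token_ids
    return out, len(token_ids)
```

The real token count is returned alongside the padded array so that a decoding loop can read the logits at the last real position rather than at a padding position.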

Evaluation:

The model achieves a word error rate (WER) of 6.02% on the test split of librispeech_asr (clean).
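
For reference, WER is the number of word-level substitutions, deletions, and insertions divided by the number of reference words. The sketch below shows one way to compute it over a set of transcripts. The text normalization here (lowercasing and whitespace splitting) is an assumption; the normalization used to obtain the figure above is not stated, so results from this snippet may not match it exactly.

```python
def edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level WER: total word errors divided by total reference words."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    errors = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.lower().split()  # assumed normalization
        hyp_words = hyp.lower().split()
        errors += edit_distance(ref_words, hyp_words)
        words += len(ref_words)
    if words == 0:
        raise ValueError("references contain no words")
    return errors / words


# One substitution (sat -> sit) and one deletion (mat) over five reference words.
print(wer(["the cat sat on mat"], ["the cat sit on"]))  # 0.4
```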