yuekai/model_repo_whisper_large_v2 - AI Model Zoo

Client

https://huggingface.co/spaces/yuekai/triton-asr-client

https://github.com/yuekaizhang/Triton-ASR-Client

Server

docker pull soar97/triton-whisper:23.06

docker run -it --name "whisper-server" --gpus all --net host -v $your_mount_dir --shm-size=2g soar97/triton-whisper:23.06

apt-get install git-lfs
git-lfs install
git clone https://huggingface.co/yuekai/model_repo_whisper_large_v2.git

export CUDA_VISIBLE_DEVICES="1"

model_repo_path=./model_repo_whisper

tritonserver --model-repository $model_repo_path \
            --pinned-memory-pool-byte-size=2048000000 \
            --cuda-memory-pool-byte-size=0:4096000000 \
            --http-port 10086 \
            --metrics-port 10087

Benchmark Results

Decoding on a single V100 GPU, audios are padding to 30s, using aishell1 test set files

Model	Backend	Concurrency	RTF
Large-v2	ONNX FP16	4	0.14

Module	Time Distribution
feature_extractor	0.8%
encoder	9.6%
decoder	67.4%
greedy search	22.2%

Client

Server

Benchmark Results

NSDT 3DConvert

UnrealSynth

DreamTexture.js