speech - AI Model Zoo - BimAnt

Models retrieved with tag "speech": 692

anuragshas/wav2vec2-large-xlsr-53-odia speech

nvidia/stt_ru_conformer_ctc_large speech

asapp/sew-d-base-plus-400k-ft-ls100h speech

chompk/wav2vec2-large-xlsr-thai-tokenized speech

vumichien/wav2vec2-large-xlsr-japanese speech

carlosdanielhernandezmena/stt_es_quartznet15x5_ft_ep53_944h speech

Narrativa/byt5-base-tweet-hate-detection speech

OthmaneJ/distil-wav2vec2 speech

imvladikon/wav2vec2-large-xlsr-53-hebrew speech

CuongLD/wav2vec2-large-xlsr-vietnamese speech

KBLab/wav2vec2-large-voxpopuli-sv-swedish speech

amoghsgopadi/wav2vec2-large-xlsr-kn speech

anton-l/wav2vec2-large-xlsr-53-estonian speech

seba3y/speecht5-asr-punctuation-sensitive speech

Sakil/distilbert_lazylearner_hatespeech_detection speech

wietsedv/wav2vec2-large-xlsr-53-dutch speech

qinyue/wav2vec2-large-xlsr-53-chinese-zn-cn-aishell1 speech

Edresson/wav2vec2-large-100k-voxpopuli-ft-Common-Voice_plus_TTS-Dataset-portuguese speech

m3hrdadfi/wav2vec2-large-xlsr-georgian speech

pgwi/en_tr_titanet_large speech