Results for "automatic-speech-recognition"
100 matches found.
pyannote/speaker-diarization-3.1
No description available.
openai/whisper-large-v3
Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. There are two flavours of Whisper mo...
argmaxinc/whisperkit-coreml
No description available.
jonatasgrosman/wav2vec2-large-xlsr-53-russian
- commonvoice - mozilla-foundation/commonvoice60 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...
openai/whisper-large-v3-turbo
Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. There are two flavours of Whisper mo...
facebook/wav2vec2-base-960h
- librispeechasr - audio - automatic-speech-recognition - hf-asr-leaderboard - exampletitle: Librispeech sample 1 src: https://cdn-media.hug...
jonatasgrosman/wav2vec2-large-xlsr-53-portuguese
- commonvoice - mozilla-foundation/commonvoice60 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...
MahmoudAshraf/mms-300m-1130-forced-aligner
No description available.
jonatasgrosman/wav2vec2-large-xlsr-53-polish
- commonvoice - mozilla-foundation/commonvoice60 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...
jonatasgrosman/wav2vec2-large-xlsr-53-greek
No description available.
indonesian-nlp/wav2vec2-indonesian-javanese-sundanese
- id - jv - sun - mozilla-foundation/commonvoice70 - openslr - magicdata - titml - wer - audio - automatic-speech-recognition - hf-asr-leade...
jonatasgrosman/wav2vec2-large-xlsr-53-dutch
- commonvoice - mozilla-foundation/commonvoice60 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...
jonatasgrosman/wav2vec2-large-xlsr-53-arabic
No description available.
openai/whisper-small
Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. It was trained on 680k hours of labe...
jonatasgrosman/wav2vec2-large-xlsr-53-hungarian
No description available.
Khalsuu/filipino-wav2vec2-l-xls-r-300m-official
More information needed...
jonatasgrosman/wav2vec2-large-xlsr-53-japanese
No description available.
anuragshas/wav2vec2-large-xlsr-53-telugu
Fine-tuned facebook/wav2vec2-large-xlsr-53 on Telugu using the OpenSLR SLR66 dataset. When using this model, make sure that your speech inpu...
kingabzpro/wav2vec2-large-xls-r-300m-Urdu
No description available.
gigant/romanian-wav2vec2
The architecture is based on facebook/wav2vec2-xls-r-300m with a speech recognition CTC head and an added 5-gram language model (using pyctc...
Harveenchadha/vakyansh-wav2vec2-tamil-tam-250
No description available.
KBLab/wav2vec2-large-voxrex-swedish
No description available.
nguyenvulebinh/wav2vec2-base-vi-vlsp2020
Our models use the wav2vec2 architecture, pre-trained on 13k hours of Vietnamese YouTube audio (unlabeled data) and fine-tuned on 250 hours label...
jonatasgrosman/wav2vec2-large-xlsr-53-persian
No description available.
airesearch/wav2vec2-large-xlsr-53-th
No description available.
mesolitica/wav2vec2-xls-r-300m-mixed
No description available.
distil-whisper/distil-large-v3
Distil-Whisper inherits the encoder-decoder architecture from Whisper. The encoder maps a sequence of speech vector inputs to a sequence of ...
jonatasgrosman/wav2vec2-large-xlsr-53-finnish
No description available.
NbAiLab/nb-wav2vec2-1b-bokmaal-v2
No description available.
Yehor/w2v-xls-r-uk
No description available.
comodoro/wav2vec2-xls-r-300m-cs-250
Fine-tuned facebook/wav2vec2-large-xlsr-53 on Czech using the Common Voice dataset. When using this model, make sure that your speech input ...
arijitx/wav2vec2-xls-r-300m-bengali
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the OPENSLRSLR53 - bengali dataset. It achieves the following results ...
theainerd/Wav2Vec2-large-xlsr-hindi
No description available.
openai/whisper-base
Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. It was trained on 680k hours of labe...
NbAiLab/nb-wav2vec2-1b-nynorsk
This is one of several Wav2Vec models our team created during the 🤗 hosted Robust Speech Event. This is the complete list of our models and ...
softcatala/wav2vec2-large-xlsr-catala
No description available.
nvidia/parakeet-ctc-1.1b
- en libraryname: nemo - librispeechasr - fishercorpus - Switchboard-1 - WSJ-0 - WSJ-1 - National-Singapore-Corpus-Part-1 - National-Singapo...
pyannote/speaker-diarization-community-1
- pyannote - pyannote-audio - pyannote-audio-pipeline - audio - voice - speech - speaker - speaker-diarization - speaker-change-detection - ...
imvladikon/wav2vec2-xls-r-300m-hebrew
More information needed...
gagan3012/wav2vec2-xlsr-nepali
No description available.
saattrupdan/wav2vec2-xls-r-300m-ftspeech
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the FTSpeech dataset, a dataset of 1,800 hours of transcribed sp...
kresnik/wav2vec2-large-xlsr-korean
- kresnik/zerothkorean - speech - audio - automatic-speech-recognition...
eddiegulay/wav2vec2-large-xlsr-mvc-swahili
No description available.
jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn
No description available.
pyannote/overlapped-speech-detection
No description available.
pyannote/voice-activity-detection
No description available.
openai/whisper-tiny
Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. It was trained on 680k hours of labe...
microsoft/VibeVoice-ASR
- en - zh - es - pt - de - ja - ko - fr - ru - id - sv - it - he - nl - pl - no - tr - th - ar - hu - ca - cs - da - fa - af - hi - fi - et ...
Qwen/Qwen3-ASR-1.7B
No description available.
pyannote/speaker-diarization
No description available.
Systran/faster-whisper-small
No description available.
Systran/faster-whisper-large-v3
No description available.
classla/wav2vec2-xls-r-parlaspeech-hr
- parlaspeech-hr - audio - automatic-speech-recognition - parlaspeech - exampletitle: example 1 src: https://huggingface.co/classla/wav2vec2...
openai/whisper-medium
Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. It was trained on 680k hours of labe...
Harveenchadha/vakyansh-wav2vec2-sanskrit-sam-60
No description available.
gvs/wav2vec2-large-xlsr-malayalam
No description available.
gagan3012/wav2vec2-xlsr-khmer
No description available.
kingabzpro/wav2vec2-large-xlsr-53-punjabi
No description available.
Revai/reverb-diarization-v1
Details on the model, its performance, and more are available on arXiv. For more information on how to run this diarization model see https://g...
pyannote/speaker-diarization-3.0
No description available.
stefan-it/wav2vec2-large-xlsr-53-basque
No description available.
Systran/faster-whisper-base
No description available.
sumedh/wav2vec2-large-xlsr-marathi
No description available.
facebook/wav2vec2-xlsr-53-espeak-cv-ft
No description available.
mlx-community/parakeet-tdt-0.6b-v3
No description available.
argmaxinc/parakeetkit-pro
No description available.
comodoro/wav2vec2-xls-r-300m-sk-cv8
- sk - automatic-speech-recognition - mozilla-foundation/commonvoice80 - robust-speech-event - xlsr-fine-tuning-week - hf-asr-leaderboard - ...
mistralai/Voxtral-Mini-4B-Realtime-2602
No description available.
t-tech/T-one
No description available.
anton-l/wav2vec2-large-xlsr-53-estonian
No description available.
mlx-community/parakeet-tdt-0.6b-v2
No description available.
mpoyraz/wav2vec2-xls-r-300m-cv7-turkish
This ASR model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Turkish language....
lgris/wav2vec2-large-xlsr-open-brazilian-portuguese-v2
No description available.
infinitejoy/wav2vec2-large-xls-r-300m-welsh
More information needed...
freds0/distil-whisper-large-v3-ptbr
The model aims to perform automatic speech transcription in Brazilian Portuguese with high accuracy. By combining data from Common Voice 16 ...
ifrz/wav2vec2-large-xlsr-galician
Fine-tuned model for the Galician language...
facebook/mms-1b-all
Developed by: Vineel Pratap et al. - Model type: Multi-Lingual Automatic Speech Recognition model - Language(s): 1000+ languages, see suppor...
UsefulSensors/moonshine-tiny-zh
This Moonshine model is trained for the speech recognition task, capable of transcribing Chinese speech audio into Chinese text. Moonshine A...
Systran/faster-whisper-tiny
No description available.
microsoft/Phi-4-multimodal-instruct
🎉Phi-4: [mini-reasoning | [reasoning](https://huggi......
SpideyDLK/wav2vec2-large-xls-r-300m-sinhala-low-LR-part1
No description available.
m3hrdadfi/wav2vec2-large-xlsr-lithuanian
No description available.
Systran/faster-whisper-medium
No description available.
anton-l/wav2vec2-large-xlsr-53-slovenian
No description available.
zai-org/GLM-ASR-Nano-2512
No description available.
nvidia/canary-1b-v2
trackdownloads: true - nvidia/Granary - nvidia/nemo-asr-set-3.0 - bg - hr - cs - da - nl - en - et - fi - fr - de - el - hu - it - lv - lt -...
tristayqc/my_zh_CN_asr_cv13_model
More information needed...
amoghsgopadi/wav2vec2-large-xlsr-kn
- openslr - wer - audio - automatic-speech-recognition - speech - xlsr-fine-tuning-week - name: XLSR Wav2Vec2 Large 53 Kannada by Amogh Gopa...
nyrahealth/CrisperWhisper
No description available.
DrishtiSharma/wav2vec2-large-xls-r-300m-bg-d2
- bg - automatic-speech-recognition - bg - generatedfromtrainer - hf-asr-leaderboard - mozilla-foundation/commonvoice80 - robust-speech-even...
Qwen/Qwen3-ASR-0.6B
No description available.
facebook/wav2vec2-conformer-rope-large-960h-ft
No description available.
gchhablani/wav2vec2-large-xlsr-gu
No description available.
facebook/hubert-large-ls960-ft
No description available.
ihanif/wav2vec2-xls-r-300m-pashto
More information needed...
jonatasgrosman/wav2vec2-large-xlsr-53-english
- commonvoice - mozilla-foundation/commonvoice60 - wer - cer - audio - automatic-speech-recognition - en - hf-asr-leaderboard - mozilla-foun...
Systran/faster-whisper-large-v2
No description available.
aismlv/wav2vec2-large-xlsr-kazakh
No description available.
jbetker/wav2vec2-large-robust-ft-libritts-voxpopuli
This checkpoint is a wav2vec2-large model that is useful for generating transcriptions with punctuation. It is intended for use in building ...
jimregan/wav2vec2-large-xlsr-latvian-cv
No description available.