Results for "automatic-speech-recognition"

100 matches found.

pyannote

pyannote/speaker-diarization-3.1

No description available.

🎙️ automatic-speech-recognition 13,708,933
openai

openai/whisper-large-v3

Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. There are two flavours of Whisper mo...

🎙️ automatic-speech-recognition 5,963,450
argmaxinc

argmaxinc/whisperkit-coreml

No description available.

🎙️ automatic-speech-recognition 5,147,800
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-russian

- common_voice - mozilla-foundation/common_voice_6_0 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...

🎙️ automatic-speech-recognition 4,346,023
openai

openai/whisper-large-v3-turbo

Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. There are two flavours of Whisper mo...

🎙️ automatic-speech-recognition 4,274,459
facebook

facebook/wav2vec2-base-960h

- librispeech_asr - audio - automatic-speech-recognition - hf-asr-leaderboard - example_title: Librispeech sample 1 src: https://cdn-media.hug...

🎙️ automatic-speech-recognition 3,540,061
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-portuguese

- common_voice - mozilla-foundation/common_voice_6_0 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...

🎙️ automatic-speech-recognition 3,370,633
MahmoudAshraf

MahmoudAshraf/mms-300m-1130-forced-aligner

No description available.

🎙️ automatic-speech-recognition 3,050,006
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-polish

- common_voice - mozilla-foundation/common_voice_6_0 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...

🎙️ automatic-speech-recognition 3,017,504
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-greek

No description available.

🎙️ automatic-speech-recognition 2,749,675
indonesian-nlp

indonesian-nlp/wav2vec2-indonesian-javanese-sundanese

- id - jv - sun - mozilla-foundation/common_voice_7_0 - openslr - magicdata - titml - wer - audio - automatic-speech-recognition - hf-asr-leade...

🎙️ automatic-speech-recognition 2,735,282
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-dutch

- common_voice - mozilla-foundation/common_voice_6_0 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...

🎙️ automatic-speech-recognition 2,553,499
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-arabic

No description available.

🎙️ automatic-speech-recognition 2,264,585
openai

openai/whisper-small

Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. It was trained on 680k hours of labe...

🎙️ automatic-speech-recognition 2,036,069
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-hungarian

No description available.

🎙️ automatic-speech-recognition 1,955,076
Khalsuu

Khalsuu/filipino-wav2vec2-l-xls-r-300m-official

More information needed...

🎙️ automatic-speech-recognition 1,733,412
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-japanese

No description available.

🎙️ automatic-speech-recognition 1,685,235
anuragshas

anuragshas/wav2vec2-large-xlsr-53-telugu

Fine-tuned facebook/wav2vec2-large-xlsr-53 on Telugu using the OpenSLR SLR66 dataset. When using this model, make sure that your speech inpu...

🎙️ automatic-speech-recognition 1,676,773
kingabzpro

kingabzpro/wav2vec2-large-xls-r-300m-Urdu

No description available.

🎙️ automatic-speech-recognition 1,590,162
gigant

gigant/romanian-wav2vec2

The architecture is based on facebook/wav2vec2-xls-r-300m with a speech recognition CTC head and an added 5-gram language model (using pyctc...

🎙️ automatic-speech-recognition 1,528,073
Harveenchadha

Harveenchadha/vakyansh-wav2vec2-tamil-tam-250

No description available.

🎙️ automatic-speech-recognition 1,448,478
KBLab

KBLab/wav2vec2-large-voxrex-swedish

No description available.

🎙️ automatic-speech-recognition 1,359,384
nguyenvulebinh

nguyenvulebinh/wav2vec2-base-vi-vlsp2020

Our models use the wav2vec2 architecture, pre-trained on 13k hours of Vietnamese YouTube audio (unlabeled data) and fine-tuned on 250 hours label...

🎙️ automatic-speech-recognition 1,328,043
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-persian

No description available.

🎙️ automatic-speech-recognition 1,299,243
airesearch

airesearch/wav2vec2-large-xlsr-53-th

No description available.

🎙️ automatic-speech-recognition 1,298,490
mesolitica

mesolitica/wav2vec2-xls-r-300m-mixed

No description available.

🎙️ automatic-speech-recognition 1,264,671
distil-whisper

distil-whisper/distil-large-v3

Distil-Whisper inherits the encoder-decoder architecture from Whisper. The encoder maps a sequence of speech vector inputs to a sequence of ...

🎙️ automatic-speech-recognition 1,216,096
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-finnish

No description available.

🎙️ automatic-speech-recognition 1,014,291
NbAiLab

NbAiLab/nb-wav2vec2-1b-bokmaal-v2

No description available.

🎙️ automatic-speech-recognition 1,005,161
Yehor

Yehor/w2v-xls-r-uk

No description available.

🎙️ automatic-speech-recognition 1,002,445
comodoro

comodoro/wav2vec2-xls-r-300m-cs-250

Fine-tuned facebook/wav2vec2-large-xlsr-53 on Czech using the Common Voice dataset. When using this model, make sure that your speech input ...

🎙️ automatic-speech-recognition 991,024
arijitx

arijitx/wav2vec2-xls-r-300m-bengali

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the OPENSLR_SLR53 - bengali dataset. It achieves the following results ...

🎙️ automatic-speech-recognition 973,587
theainerd

theainerd/Wav2Vec2-large-xlsr-hindi

No description available.

🎙️ automatic-speech-recognition 968,169
openai

openai/whisper-base

Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. It was trained on 680k hours of labe...

🎙️ automatic-speech-recognition 966,269
NbAiLab

NbAiLab/nb-wav2vec2-1b-nynorsk

This is one of several Wav2Vec-models our team created during the 🤗 hosted Robust Speech Event. This is the complete list of our models and ...

🎙️ automatic-speech-recognition 966,228
softcatala

softcatala/wav2vec2-large-xlsr-catala

No description available.

🎙️ automatic-speech-recognition 930,774
nvidia

nvidia/parakeet-ctc-1.1b

- en library_name: nemo - librispeech_asr - fisher_corpus - Switchboard-1 - WSJ-0 - WSJ-1 - National-Singapore-Corpus-Part-1 - National-Singapo...

🎙️ automatic-speech-recognition 925,544
pyannote

pyannote/speaker-diarization-community-1

- pyannote - pyannote-audio - pyannote-audio-pipeline - audio - voice - speech - speaker - speaker-diarization - speaker-change-detection - ...

🎙️ automatic-speech-recognition 910,780
imvladikon

imvladikon/wav2vec2-xls-r-300m-hebrew

More information needed...

🎙️ automatic-speech-recognition 895,054
gagan3012

gagan3012/wav2vec2-xlsr-nepali

No description available.

🎙️ automatic-speech-recognition 888,040
saattrupdan

saattrupdan/wav2vec2-xls-r-300m-ftspeech

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the FTSpeech dataset, being a dataset of 1,800 hours of transcribed sp...

🎙️ automatic-speech-recognition 881,942
kresnik

kresnik/wav2vec2-large-xlsr-korean

- kresnik/zeroth_korean - speech - audio - automatic-speech-recognition...

🎙️ automatic-speech-recognition 856,529
eddiegulay

eddiegulay/wav2vec2-large-xlsr-mvc-swahili

No description available.

🎙️ automatic-speech-recognition 826,874
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn

No description available.

🎙️ automatic-speech-recognition 812,141
pyannote

pyannote/overlapped-speech-detection

No description available.

🎙️ automatic-speech-recognition 791,303
pyannote

pyannote/voice-activity-detection

No description available.

🎙️ automatic-speech-recognition 767,520
openai

openai/whisper-tiny

Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. It was trained on 680k hours of labe...

🎙️ automatic-speech-recognition 760,711
microsoft

microsoft/VibeVoice-ASR

- en - zh - es - pt - de - ja - ko - fr - ru - id - sv - it - he - nl - pl - no - tr - th - ar - hu - ca - cs - da - fa - af - hi - fi - et ...

🎙️ automatic-speech-recognition 754,595
Qwen

Qwen/Qwen3-ASR-1.7B

No description available.

🎙️ automatic-speech-recognition 752,872
pyannote

pyannote/speaker-diarization

No description available.

🎙️ automatic-speech-recognition 741,374
Systran

Systran/faster-whisper-small

No description available.

🎙️ automatic-speech-recognition 715,974
Systran

Systran/faster-whisper-large-v3

No description available.

🎙️ automatic-speech-recognition 656,477
classla

classla/wav2vec2-xls-r-parlaspeech-hr

- parlaspeech-hr - audio - automatic-speech-recognition - parlaspeech - example_title: example 1 src: https://huggingface.co/classla/wav2vec2...

🎙️ automatic-speech-recognition 607,293
openai

openai/whisper-medium

Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. It was trained on 680k hours of labe...

🎙️ automatic-speech-recognition 599,501
Harveenchadha

Harveenchadha/vakyansh-wav2vec2-sanskrit-sam-60

No description available.

🎙️ automatic-speech-recognition 595,589
gvs

gvs/wav2vec2-large-xlsr-malayalam

No description available.

🎙️ automatic-speech-recognition 566,412
gagan3012

gagan3012/wav2vec2-xlsr-khmer

No description available.

🎙️ automatic-speech-recognition 554,716
kingabzpro

kingabzpro/wav2vec2-large-xlsr-53-punjabi

No description available.

🎙️ automatic-speech-recognition 547,020
Revai

Revai/reverb-diarization-v1

Details on the model, its performance, and more are available on arXiv. For more information on how to run this diarization model see https://g...

🎙️ automatic-speech-recognition 535,929
pyannote

pyannote/speaker-diarization-3.0

No description available.

🎙️ automatic-speech-recognition 535,467
stefan-it

stefan-it/wav2vec2-large-xlsr-53-basque

No description available.

🎙️ automatic-speech-recognition 491,477
Systran

Systran/faster-whisper-base

No description available.

🎙️ automatic-speech-recognition 437,070
sumedh

sumedh/wav2vec2-large-xlsr-marathi

No description available.

🎙️ automatic-speech-recognition 434,993
facebook

facebook/wav2vec2-xlsr-53-espeak-cv-ft

No description available.

🎙️ automatic-speech-recognition 432,038
mlx-community

mlx-community/parakeet-tdt-0.6b-v3

No description available.

🎙️ automatic-speech-recognition 422,877
argmaxinc

argmaxinc/parakeetkit-pro

No description available.

🎙️ automatic-speech-recognition 416,796
comodoro

comodoro/wav2vec2-xls-r-300m-sk-cv8

- sk - automatic-speech-recognition - mozilla-foundation/common_voice_8_0 - robust-speech-event - xlsr-fine-tuning-week - hf-asr-leaderboard - ...

🎙️ automatic-speech-recognition 390,903
mistralai

mistralai/Voxtral-Mini-4B-Realtime-2602

No description available.

🎙️ automatic-speech-recognition 390,821
t-tech

t-tech/T-one

No description available.

🎙️ automatic-speech-recognition 366,390
anton-l

anton-l/wav2vec2-large-xlsr-53-estonian

No description available.

🎙️ automatic-speech-recognition 366,084
mlx-community

mlx-community/parakeet-tdt-0.6b-v2

No description available.

🎙️ automatic-speech-recognition 363,805
mpoyraz

mpoyraz/wav2vec2-xls-r-300m-cv7-turkish

This ASR model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Turkish language....

🎙️ automatic-speech-recognition 354,288
lgris

lgris/wav2vec2-large-xlsr-open-brazilian-portuguese-v2

No description available.

🎙️ automatic-speech-recognition 354,252
infinitejoy

infinitejoy/wav2vec2-large-xls-r-300m-welsh

More information needed...

🎙️ automatic-speech-recognition 350,288
freds0

freds0/distil-whisper-large-v3-ptbr

The model aims to perform automatic speech transcription in Brazilian Portuguese with high accuracy. By combining data from Common Voice 16 ...

🎙️ automatic-speech-recognition 348,699
ifrz

ifrz/wav2vec2-large-xlsr-galician

Fine-tuned model for the Galician language...

🎙️ automatic-speech-recognition 347,237
facebook

facebook/mms-1b-all

Developed by: Vineel Pratap et al. - Model type: Multi-Lingual Automatic Speech Recognition model - Language(s): 1000+ languages, see suppor...

🎙️ automatic-speech-recognition 339,545
UsefulSensors

UsefulSensors/moonshine-tiny-zh

This Moonshine model is trained for the speech recognition task, capable of transcribing Chinese speech audio into Chinese text. Moonshine A...

🎙️ automatic-speech-recognition 338,818
Systran

Systran/faster-whisper-tiny

No description available.

🎙️ automatic-speech-recognition 335,764
microsoft

microsoft/Phi-4-multimodal-instruct

🎉 Phi-4: [mini-reasoning | [reasoning](https://huggi...

🎙️ automatic-speech-recognition 322,381
SpideyDLK

SpideyDLK/wav2vec2-large-xls-r-300m-sinhala-low-LR-part1

No description available.

🎙️ automatic-speech-recognition 316,058
m3hrdadfi

m3hrdadfi/wav2vec2-large-xlsr-lithuanian

No description available.

🎙️ automatic-speech-recognition 309,364
Systran

Systran/faster-whisper-medium

No description available.

🎙️ automatic-speech-recognition 308,823
anton-l

anton-l/wav2vec2-large-xlsr-53-slovenian

No description available.

🎙️ automatic-speech-recognition 291,866
zai-org

zai-org/GLM-ASR-Nano-2512

No description available.

🎙️ automatic-speech-recognition 289,861
nvidia

nvidia/canary-1b-v2

track_downloads: true - nvidia/Granary - nvidia/nemo-asr-set-3.0 - bg - hr - cs - da - nl - en - et - fi - fr - de - el - hu - it - lv - lt -...

🎙️ automatic-speech-recognition 284,761
tristayqc

tristayqc/my_zh_CN_asr_cv13_model

More information needed...

🎙️ automatic-speech-recognition 279,947
amoghsgopadi

amoghsgopadi/wav2vec2-large-xlsr-kn

- openslr - wer - audio - automatic-speech-recognition - speech - xlsr-fine-tuning-week - name: XLSR Wav2Vec2 Large 53 Kannada by Amogh Gopa...

🎙️ automatic-speech-recognition 278,602
nyrahealth

nyrahealth/CrisperWhisper

No description available.

🎙️ automatic-speech-recognition 277,381
DrishtiSharma

DrishtiSharma/wav2vec2-large-xls-r-300m-bg-d2

- bg - automatic-speech-recognition - bg - generated_from_trainer - hf-asr-leaderboard - mozilla-foundation/common_voice_8_0 - robust-speech-even...

🎙️ automatic-speech-recognition 273,744
Qwen

Qwen/Qwen3-ASR-0.6B

No description available.

🎙️ automatic-speech-recognition 258,648
facebook

facebook/wav2vec2-conformer-rope-large-960h-ft

No description available.

🎙️ automatic-speech-recognition 257,041
gchhablani

gchhablani/wav2vec2-large-xlsr-gu

No description available.

🎙️ automatic-speech-recognition 256,296
facebook

facebook/hubert-large-ls960-ft

No description available.

🎙️ automatic-speech-recognition 236,938
ihanif

ihanif/wav2vec2-xls-r-300m-pashto

More information needed...

🎙️ automatic-speech-recognition 180,521
jonatasgrosman

jonatasgrosman/wav2vec2-large-xlsr-53-english

- common_voice - mozilla-foundation/common_voice_6_0 - wer - cer - audio - automatic-speech-recognition - en - hf-asr-leaderboard - mozilla-foun...

🎙️ automatic-speech-recognition 179,374
Systran

Systran/faster-whisper-large-v2

No description available.

🎙️ automatic-speech-recognition 178,937
aismlv

aismlv/wav2vec2-large-xlsr-kazakh

No description available.

🎙️ automatic-speech-recognition 178,775
jbetker

jbetker/wav2vec2-large-robust-ft-libritts-voxpopuli

This checkpoint is a wav2vec2-large model that is useful for generating transcriptions with punctuation. It is intended for use in building ...

🎙️ automatic-speech-recognition 175,856
jimregan

jimregan/wav2vec2-large-xlsr-latvian-cv

No description available.

🎙️ automatic-speech-recognition 173,371