Results for "audio-classification"

100 matches found.

laion

laion/clap-htsat-fused

No description available.

🎧 audio-classification 20,352,951
audeering

audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim

No description available.

🎧 audio-classification 1,654,605
speechbrain

speechbrain/emotion-recognition-wav2vec2-IEMOCAP

No description available.

🎧 audio-classification 601,207
MIT

MIT/ast-finetuned-audioset-10-10-0.4593

The Audio Spectrogram Transformer is equivalent to ViT, but applied on audio. Audio is first turned into an image (as a spectrogram), after ...

🎧 audio-classification 592,560
xbgoose

xbgoose/hubert-large-speech-emotion-recognition-russian-dusha-finetuned

No description available.

🎧 audio-classification 313,092
audeering

audeering/wav2vec2-large-robust-24-ft-age-gender

No description available.

🎧 audio-classification 298,191
mo-thecreator

mo-thecreator/Deepfake-audio-detection

More information needed...

🎧 audio-classification 258,465
prithivMLmods

prithivMLmods/Common-Voice-Gender-Detection

No description available.

🎧 audio-classification 237,478
superb

superb/wav2vec2-base-superb-er

This is a ported version of S3PRL's Wav2Vec2 for the SUPERB Emotion Recognition task. The base model is wav2vec2-base, which is pretrained o...

🎧 audio-classification 221,649
OpenMuQ

OpenMuQ/MuQ-large-msd-iter

No description available.

🎧 audio-classification 214,936
speechbrain

speechbrain/lang-id-voxlingua107-ecapa

This is a spoken language recognition model trained on the VoxLingua107 dataset using SpeechBrain. The model uses the ECAPA-TDNN architectur...

🎧 audio-classification 205,123
facebook

facebook/audiobox-aesthetics

No description available.

🎧 audio-classification 163,559
OpenMuQ

OpenMuQ/MuQ-MuLan-large

No description available.

🎧 audio-classification 150,099
JaesungHuh

JaesungHuh/voice-gender-classifier

No description available.

🎧 audio-classification 71,688
m-a-p

m-a-p/MERT-v1-95M

No description available.

🎧 audio-classification 65,664
m-a-p

m-a-p/MERT-v1-330M

No description available.

🎧 audio-classification 51,037
facebook

facebook/mms-lid-1024

Developed by: Vineel Pratap et al. - Model type: Multi-Lingual Automatic Speech Recognition model - Language(s): 1024 languages, see support...

🎧 audio-classification 46,693
facebook

facebook/mms-lid-256

Developed by: Vineel Pratap et al. - Model type: Multi-Lingual Automatic Speech Recognition model - Language(s): 256 languages, see supporte...

🎧 audio-classification 45,086
bvallegc

bvallegc/wav2vec2_spoof_dection1-finetuned-spoofing-classifier

No description available....

🎧 audio-classification 30,372
firdhokk

firdhokk/speech-emotion-recognition-with-openai-whisper-large-v3

No description available.

🎧 audio-classification 29,276
MIT

MIT/ast-finetuned-audioset-14-14-0.443

The Audio Spectrogram Transformer is equivalent to ViT, but applied on audio. Audio is first turned into an image (as a spectrogram), after ...

🎧 audio-classification 26,837
jakeBland

jakeBland/wav2vec-vm-finetune

This model builds on wav2vec2-xls-r-300m, a self-supervised speech model trained on large-scale multilingual data. We fine-tuned it on the f...

🎧 audio-classification 26,071
facebook

facebook/mms-lid-126

Developed by: Vineel Pratap et al. - Model type: Multi-Lingual Automatic Speech Recognition model - Language(s): 126 languages, see supporte...

🎧 audio-classification 24,566
ehcalabres

ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition

More information needed...

🎧 audio-classification 24,030
facebook

facebook/mms-lid-4017

Developed by: Vineel Pratap et al. - Model type: Multi-Lingual Automatic Speech Recognition model - Language(s): 4017 languages, see support...

🎧 audio-classification 20,705
alefiury

alefiury/wav2vec2-large-xlsr-53-gender-recognition-librispeech

No description available.

🎧 audio-classification 19,362
DBD-research-group

DBD-research-group/AST-BirdSet-XCM

No description available.

🎧 audio-classification 17,726
speechbrain

speechbrain/spkrec-xvect-voxceleb

No description available.

🎧 audio-classification 17,406
padmalcom

padmalcom/wav2vec2-large-nonverbalvocalization-classification

This language-independent wav2vec2 classification model is based on this dataset....

🎧 audio-classification 16,321
SeyedAli

SeyedAli/Musical-genres-Classification-Hubert-V1

More information needed...

🎧 audio-classification 15,844
speechbrain

speechbrain/lang-id-commonlanguage_ecapa

No description available.

🎧 audio-classification 12,099
superb

superb/wav2vec2-base-superb-ks

This is a ported version of S3PRL's Wav2Vec2 for the SUPERB Keyword Spotting task. The base model is wav2vec2-base, which is pretrained on 1...

🎧 audio-classification 11,775
Gustking

Gustking/wav2vec2-large-xlsr-deepfake-audio-classification

pipeline_tag: audio-classification...

🎧 audio-classification 11,699
audeering

audeering/wav2vec2-large-robust-6-ft-age-gender

No description available.

🎧 audio-classification 11,070
bookbot

bookbot/distil-ast-audioset

No description available.

🎧 audio-classification 10,330
awsaf49

awsaf49/sonics-spectttra-alpha-120s

No description available.

🎧 audio-classification 10,202
MIT

MIT/ast-finetuned-audioset-16-16-0.442

The Audio Spectrogram Transformer is equivalent to ViT, but applied on audio. Audio is first turned into an image (as a spectrogram), after ...

🎧 audio-classification 9,973
MIT

MIT/ast-finetuned-audioset-12-12-0.447

The Audio Spectrogram Transformer is equivalent to ViT, but applied on audio. Audio is first turned into an image (as a spectrogram), after ...

🎧 audio-classification 9,843
superb

superb/hubert-base-superb-er

This is a ported version of S3PRL's Hubert for the SUPERB Emotion Recognition task. The base model is hubert-base-ls960, which is pretrained...

🎧 audio-classification 8,586
DBD-research-group

DBD-research-group/AST-BirdSet-XCL

No description available.

🎧 audio-classification 8,110
mispeech

mispeech/dasheng-base

No description available.

🎧 audio-classification 7,999
Jzuluaga

Jzuluaga/accent-id-commonaccent_xlsr-en-english

No description available.

🎧 audio-classification 7,064
firdhokk

firdhokk/speech-emotion-recognition-with-facebook-wav2vec2-large-xlsr-53

No description available.

🎧 audio-classification 6,960
3loi

3loi/SER-Odyssey-Baseline-WavLM-Categorical

The model was trained on MSP-Podcast for the Odyssey 2024 Emotion Recognition competition baseline. This particular model is the categorical ...

🎧 audio-classification 6,636
superb

superb/hubert-large-superb-er

This is a ported version of S3PRL's Hubert for the SUPERB Emotion Recognition task. The base model is hubert-large-ll60k, which is pretraine...

🎧 audio-classification 6,174
Dpngtm

Dpngtm/wav2vec2-emotion-recognition

Model Architecture: Wav2Vec2 with a frozen CNN feature extractor and a trainable sequence classification head. - Language: English - Task: S...

🎧 audio-classification 5,144
mtg-upf

mtg-upf/discogs-maest-30s-pw-129e-519l

No description available.

🎧 audio-classification 4,143
tiantiaf

tiantiaf/whisper-large-v3-narrow-accent

This model includes the implementation of narrow accent classification described in Vox-Profile: A Speech Foundation Model Benchmark for Cha...

🎧 audio-classification 4,121
tiantiaf

tiantiaf/wavlm-large-age-sex

This model includes the implementation of age and sex classification described in Vox-Profile: A Speech Foundation Model Benchmark for Chara...

🎧 audio-classification 4,061
tiantiaf

tiantiaf/whisper-large-v3-speech-flow

This model includes the implementation of speech fluency classification described in Vox-Profile: A Speech Foundation Model Benchmark for Ch...

🎧 audio-classification 3,951
tiantiaf

tiantiaf/whisper-large-v3-voice-quality

This model includes the implementation of voice quality classification described in Vox-Profile: A Speech Foundation Model Benchmark for Cha...

🎧 audio-classification 3,650
HTill

HTill/flexEAT-base_epoch30_pretrain

⚠️ Codebase Update: Input Flexibility & Fine-Tuning Preparation...

🎧 audio-classification 3,524
dima806

dima806/music_genres_classification

Music genre classification is a fundamental and versatile application in many domains. Some possible use cases for music genre class...

🎧 audio-classification 3,400
tiantiaf

tiantiaf/whisper-large-v3-msp-podcast-emotion

This model includes the implementation of categorical emotion classification described in Vox-Profile: A Speech Foundation Model Benchmark f...

🎧 audio-classification 3,337
tiantiaf

tiantiaf/whisper-large-v3-msp-podcast-emotion-dim

This model includes the implementation of dimensional emotion classification described in Vox-Profile: A Speech Foundation Model Benchmark f...

🎧 audio-classification 3,253
tiantiaf

tiantiaf/whisper-large-v3-broad-accent

This model includes the implementation of broader accent classification described in Vox-Profile: A Speech Foundation Model Benchmark for Ch...

🎧 audio-classification 3,174
7wolf

7wolf/wav2ast-gender-classification

More information needed...

🎧 audio-classification 2,964
Aniemore

Aniemore/wav2vec2-xlsr-53-russian-emotion-recognition

No description available.

🎧 audio-classification 2,822
anton-l

anton-l/wav2vec2-random-tiny-classifier

No description available....

🎧 audio-classification 2,749
Jzuluaga

Jzuluaga/accent-id-commonaccent_ecapa

- en - audio-classification - speechbrain - embeddings - Accent - Identification - pytorch - ECAPA-TDNN - TDNN - CommonAccent - CommonVoice ...

🎧 audio-classification 2,669
superb

superb/wav2vec2-large-superb-er

This is a ported version of S3PRL's Wav2Vec2 for the SUPERB Emotion Recognition task. The base model is wav2vec2-large-lv60, which is pretra...

🎧 audio-classification 2,617
MelodyMachine

MelodyMachine/Deepfake-audio-detection-V2

More information needed...

🎧 audio-classification 2,403
Khoa

Khoa/w2v-speech-emotion-recognition

This model is fine-tuned for recognizing emotions in English speech using the Wav2Vec2 architecture. It is capable of detecting the followin...

🎧 audio-classification 2,322
SeaBenSea

SeaBenSea/hubert-large-turkish-speech-emotion-recognition

No description available.

🎧 audio-classification 2,283
griko

griko/gender_cls_svm_ecapa_voxceleb

Input: Audio file (will be converted to 16kHz, mono, single channel) - Output: Gender prediction ("male" or "female") - Speaker embedding: 1...

🎧 audio-classification 2,036
superb

superb/wav2vec2-base-superb-sid

This is a ported version of S3PRL's Wav2Vec2 for the SUPERB Speaker Identification task. The base model is wav2vec2-base, which is pretraine...

🎧 audio-classification 1,825
Speech-Arena-2025

Speech-Arena-2025/DF_Arena_1B_V_1

No description available.

🎧 audio-classification 1,769
hzhongresearch

hzhongresearch/yamnetp_ahead_ds

No description available.

🎧 audio-classification 1,701
Wiam

Wiam/wav2vec2-lg-xlsr-en-speech-emotion-recognition-finetuned-ravdess-v8

More information needed...

🎧 audio-classification 1,643
MIT

MIT/ast-finetuned-speech-commands-v2

The Audio Spectrogram Transformer is equivalent to ViT, but applied on audio. Audio is first turned into an image (as a spectrogram), after ...

🎧 audio-classification 1,639
somosnlp-hackathon-2022

somosnlp-hackathon-2022/wav2vec2-base-finetuned-sentiment-classification-MESD

This model was trained to classify underlying sentiment of Spanish audio/speech....

🎧 audio-classification 1,634
aufklarer

aufklarer/WeSpeaker-ResNet34-LM-MLX

No description available.

🎧 audio-classification 1,634
prithivMLmods

prithivMLmods/Speech-Emotion-Classification

No description available.

🎧 audio-classification 1,506
xmj2002

xmj2002/hubert-base-ch-speech-emotion-recognition

This model uses TencentGameMate/chinese-hubert-base as the pre-trained model for training on the CASIA dataset....

🎧 audio-classification 1,393
garystafford

garystafford/wav2vec2-deepfake-voice-detector

No description available.

🎧 audio-classification 1,363
ALM

ALM/wav2vec2-base-audioset

pipeline_tag: audio-classification - music - audio - speech - audio-representation-learning - arch-benchmark - general-audio...

🎧 audio-classification 1,319
MarekCech

MarekCech/GenreVim-Music-Classification-DistilHuBERT

This model is a fine-tuned version of ntu-spml/distilhubert for music genre classification. - Blues - Classical music - Country music - Drum & ...

🎧 audio-classification 1,318
awsaf49

awsaf49/sonics-spectttra-gamma-5s

No description available.

🎧 audio-classification 1,266
alkiskoudounas

alkiskoudounas/voc2vec-hubert-ls-pt

voc2vec-hubert is built upon the HuBERT framework and follows its pre-training setup. The pre-training datasets include: AudioSet (vocalizat...

🎧 audio-classification 1,072
WasuratS

WasuratS/distilhubert-finetuned-gtzan

DistilHuBERT is a distilled version of HuBERT, pretrained on a dataset sampled at 16 kHz. The architecture of this model is CTC or Connecti...

🎧 audio-classification 1,056
chrisjay

chrisjay/afrospeech-wav2vec-all-6

No description available.

🎧 audio-classification 1,024
mispeech

mispeech/ced-base

No description available.

🎧 audio-classification 986
lewtun

lewtun/distilhubert-finetuned-gtzan

More information needed...

🎧 audio-classification 982
pedromatias97

pedromatias97/genre-recognizer-finetuned-gtzan_dset

More information needed...

🎧 audio-classification 906
dima806

dima806/english_accents_classification

Returns the common English accent of a given voice audio sample....

🎧 audio-classification 886
Bagus

Bagus/wav2vec2-xlsr-japanese-speech-emotion-recognition

No description available.

🎧 audio-classification 878
MIT

MIT/ast-finetuned-audioset-10-10-0.448

The Audio Spectrogram Transformer is equivalent to ViT, but applied on audio. Audio is first turned into an image (as a spectrogram), after ...

🎧 audio-classification 849
Krithika-p

Krithika-p/my_awesome_emotions_model

More information needed...

🎧 audio-classification 820
Hatman

Hatman/audio-emotion-detection

A model that returns labels for Angry, Disgusted, Fearful, Happy, Neutral, Sad, Surprised. All audio was trained at a sampling rate of 16000 ...

🎧 audio-classification 818
KELONMYOSA

KELONMYOSA/wav2vec2-xls-r-300m-emotion-ru

No description available.

🎧 audio-classification 811
lugan

lugan/SynTTS-Commands-Media-Benchmarks

No description available.

🎧 audio-classification 807
MTUCI

MTUCI/AASIST3

No description available.

🎧 audio-classification 795
gaunernst

gaunernst/vit_base_patch16_1024_128.audiomae_as2m_ft_as20k

Model Type: Audio classification / feature backbone - Papers: - Masked Autoencoders that Listen: https://arxiv.org/abs/2207.06405 - Pretrain...

🎧 audio-classification 753
anton-l

anton-l/wav2vec2-base-superb-sv

No description available.

🎧 audio-classification 722
Aniemore

Aniemore/wavlm-emotion-russian-resd

No description available....

🎧 audio-classification 676
tiantiaf

tiantiaf/wavlm-large-categorical-emotion

This model includes the implementation of categorical emotion classification described in Vox-Profile: A Speech Foundation Model Benchmark f...

🎧 audio-classification 671
ALM

ALM/hubert-base-audioset

pipeline_tag: audio-classification - music - audio - speech - audio-representation-learning - arch-benchmark - general-audio...

🎧 audio-classification 669
Hemgg

Hemgg/Deepfake-audio-detection

The model is fine-tuned from facebook/wav2vec2-base...

🎧 audio-classification 637
DunnBC22

DunnBC22/wav2vec2-base-Speech_Emotion_Recognition

This model predicts the emotion of the person speaking in the audio sample. For more information on how it was created, check out the follow...

🎧 audio-classification 632
forwarder1121

forwarder1121/voice-based-stress-recognition

Model name: Voice-Based Stress Recognition (StudentNet) - Repository: https://huggingface.co/forwarder1121/voice-based-stress-recognition - ...

🎧 audio-classification 613