Results for "text-to-audio"
86 matches found.
facebook/musicgen-medium
Organization developing the model: The FAIR team of Meta AI. Model date: MusicGen was trained between April 2023 and May 2023. Model version...
facebook/musicgen-small
Organization developing the model: The FAIR team of Meta AI. Model date: MusicGen was trained between April 2023 and May 2023. Model version...
ACE-Step/Ace-Step1.5
🚀 ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer har...
stabilityai/stable-audio-open-1.0
`Stable Audio Open 1.0` generates variable-length (up to 47s) stereo audio at 44.1kHz from text prompts. It comprises three components: an a...
facebook/musicgen-large
Organization developing the model: The FAIR team of Meta AI. Model date: MusicGen was trained between April 2023 and May 2023. Model version...
razhan/mms-tts-ckb
No description available.
ACE-Step/acestep-5Hz-lm-0.6B
🚀 ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer har...
ACE-Step/acestep-v15-base
🚀 ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer har...
ACE-Step/acestep-5Hz-lm-4B
🚀 ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer har...
ACE-Step/ACE-Step-v1-chinese-rap-LoRA
ACE-Step is a novel open-source foundation model for music generation that overcomes key limitations of existing approaches through a holist...
OpenMOSS-Team/MOSS-SoundEffect
No description available.
ACE-Step/acestep-v15-sft
🚀 ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer har...
ylacombe/musicgen-melody
No description available.
ACE-Step/acestep-captioner
No description available.
facebook/musicgen-melody
Organization developing the model: The FAIR team of Meta AI. Model date: MusicGen was trained between April 2023 and May 2023. Model version...
HeartMuLa/HeartMuLa-oss-3B
No description available.
HeartMuLa/HeartMuLa-oss-3B-happy-new-year
The best open-sourced music generation model in terms of lyrics controllability and music quality....
Xenova/musicgen-small
No description available.
stabilityai/stable-audio-open-small
`Stable Audio Open Small` generates variable-length (up to 11s) stereo audio at 44.1kHz from text prompts. It comprises three components: an...
mradermacher/zen-musician-i1-GGUF
No description available.
slseanwu/MIDI-LLM_Llama-3.2-1B
Base Model: `meta-llama/Llama-3.2-1B` - Model Size: 1.4B parameters - Extended Vocabulary: 183,286 tokens (128,256 for text + 55,030 for MID...
ACE-Step/acestep-v15-turbo-shift3
🚀 ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer har...
facebook/musicgen-stereo-small
Organization developing the model: The FAIR team of Meta AI. Model date: MusicGen was trained between April 2023 and May 2023. Model version...
declare-lab/mustango
No description available.
eustlb/higgs-audio-v2-generation-3B-base
No description available.
ACE-Step/acestep-v15-turbo-continuous
🚀 ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer har...
FabioSarracino/VibeVoice-Large-Q8
No description available.
facebook/musicgen-stereo-medium
Organization developing the model: The FAIR team of Meta AI. Model date: MusicGen was trained between April 2023 and May 2023. Model version...
ACE-Step/acestep-v15-turbo-shift1
🚀 ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer har...
Marvis-AI/marvis-tts-250m-v0.2
Marvis is built on the Sesame CSM-1B (Conversational Speech Model) architecture, a multimodal transformer that operates directly on Residual...
calcuis/ace-gguf
Base model from ace-step - Full set GGUF (model + encoder + VAE) works right away...
ylacombe/musicgen-stereo-melody
No description available.
Marvis-AI/marvis-tts-250m-v0.1
Marvis is built on the Sesame CSM-1B (Conversational Speech Model) architecture, a multimodal transformer that operates directly on Residual...
facebook/musicgen-stereo-large
Organization developing the model: The FAIR team of Meta AI. Model date: MusicGen was trained between April 2023 and May 2023. Model version...
espnet/fastspeech2_conformer
The FastSpeech2Conformer model was proposed in the paper "Recent Developments on ESPnet Toolkit Boosted by Conformer" by Pengcheng Guo, Flor...
riffusion/riffusion-model-v1
Developed by: Seth Forsgren, Hayk Martiros - Model type: Diffusion-based text-to-image generation model - Language(s): English - License: Th...
2Noise/ChatTTS
No description available.
echarlaix/tiny-random-vits
No description available.
mingyi456/Ace-Step1.5-DF11-ComfyUI
For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModel...
Marvis-AI/marvis-tts-250m-v0.1-transformers
Marvis is built on the Sesame CSM-1B (Conversational Speech Model) architecture, a multimodal transformer that operates directly on Residual...
LiquidAI/LFM2.5-Audio-1.5B-ONNX
No description available.
facebook/magnet-small-10secs
Organization developing the model: The FAIR team of Meta AI. Model date: MAGNeT was trained between November 2023 and January 2024. Model ve...
sil-ai/senga-nt-asr-inferred-force-aligned-speecht5-NT-va-acoustic
No description available.
mradermacher/zen-musician-GGUF
No description available.
CypressYang/SongBloom
No description available.
declare-lab/TangoFlux
TangoFlux consists of FluxTransformer blocks which are Diffusion Transformer (DiT) and Multimodal Diffusion Transformer (MMDiT), conditioned...
nateraw/musicgen-songstarter-v0.2
No description available.
Marvis-AI/marvis-tts-250m-v0.2-MLX-6bit
No description available.
2121-8/japanese-parler-tts-mini-bate
No description available.
Marvis-AI/marvis-tts-100m-v0.2
Marvis is built on the Sesame CSM-1B (Conversational Speech Model) architecture, a multimodal transformer that operates directly on Residual...
benjiaiplayground/HeartMuLa-oss-3B-bf16
No description available.
Marvis-AI/marvis-tts-250m-v0.1-MLX-8bit
No description available.
benjiaiplayground/HeartCodec-oss-bf16
No description available.
Marvis-AI/marvis-tts-250m-v0.2-transformers
Marvis is built on the Sesame CSM-1B (Conversational Speech Model) architecture, a multimodal transformer that operates directly on Residual...
Matthijs/mms-tts-eng
No description available.
facebook/magnet-medium-30secs
Organization developing the model: The FAIR team of Meta AI. Model date: MAGNeT was trained between November 2023 and January 2024. Model ve...
tencent/SongGeneration
No description available.
HKUSTAudio/AudioX-MAF
No description available.
HKUSTAudio/AudioX-MAF-MMDiT
No description available.
espnet/fastspeech2_conformer_with_hifigan
No description available.
Beehzod/speechT5_tts_uzbek
More information needed...
facebook/musicgen-melody-large
Organization developing the model: The FAIR team of Meta AI. Model date: MusicGen was trained between April 2023 and May 2023. Model version...
Lingalingeswaran/facebook_mms_tamil
No description available.
Marvis-AI/marvis-tts-250m-v0.2-MLX-8bit
No description available.
eustlb/higgs-v2-archive
No description available.
facebook/audio-magnet-medium
Organization developing the model: The FAIR team of Meta AI. Model date: MAGNeT was trained between November 2023 and January 2024. Model ve...
atul10/nepali_male_v1
Language: Nepali - VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is an end-to-end speech synthes...
tencent/HunyuanVideo-Foley
No description available.
bcruz/MIDI-LLM_Llama-3.2-1B-Q4_K_M-GGUF
No description available.
ford442/stable-audio-open-1.0
`Stable Audio Open 1.0` generates variable-length (up to 47s) stereo audio at 44.1kHz from text prompts. It comprises three components: an a...
ychenqz/emotion_classifier
No description available.
sil-ai/senga-nt-asr-inferred-force-aligned-speecht5-MAT-l1-pure
More information needed...
CypressYang/SongBloom_long
No description available.
mradermacher/CiSiMi-GGUF
No description available.
suhaibrashid17/MMS_TTS_Urdu_3
No description available.
froabera/speecht5_finetuned
More information needed...
Urabewe/Ace-Step-Captioner-fp8
ACE-Step Captioner is the annotation model used by ACE-Step v1.5 for training data labeling. It is a professional-grade music ca...
Marvis-AI/marvis-tts-100m-v0.2-MLX-6bit
No description available.
sil-ai/senga-nt-asr-inferred-force-aligned-speecht5-MAT-l1blend-0.7
More information needed...
alakxender/mms-tts-div-ft-spk01-f01
Model ID: `alakxender/mms-tts-div-ft-spk01-f01` - Base Architecture: MMS-TTS (VITS) - Language: Divehi (dv)...
KandirResearch/CiSiMi-v0.1
No description available.
Omarrran/turkish_finetuned_speecht5_tts
No description available.
ManuD/speecht5_finetuned_voxpopuli_de_Merkel
More information needed...
MuzaffarSharofitdinov/mms-tts-uzbek-qiz-ovozi_v2
No description available.
Nekochu/stable-audio-open-1.0-Music
No description available.
sil-ai/senga-nt-asr-inferred-force-aligned-speecht5-MAT-l1blend
More information needed...