Results for "text-generation"
100 matches found.
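A listing like this can be reproduced programmatically with the `huggingface_hub` client. The sketch below is a minimal, hedged example: `HfApi.list_models` and its `pipeline_tag` / `sort` / `limit` parameters are real, but the sort order used to produce this particular listing is unknown, and `format_entry` is a hypothetical helper introduced only to mimic the two-line entry format above.

```python
# Sketch: reproducing a "text-generation" model listing.
# Assumptions: huggingface_hub is installed; the query itself needs network
# access, so it is kept in a separate function and not run here.
from typing import Iterable, Optional


def format_entry(model_id: str, description: Optional[str], width: int = 140) -> str:
    """Render one result as in the listing: model id, then a truncated description."""
    text = (description or "No description available.").strip()
    if len(text) > width:
        text = text[: width - 3] + "..."
    return f"{model_id}\n{text}"


def search_text_generation(limit: int = 100) -> Iterable[str]:
    """Query the Hub for text-generation models (requires network access)."""
    from huggingface_hub import HfApi  # imported lazily so format_entry works offline

    api = HfApi()
    # sort/direction are illustrative; the ranking behind this listing is not stated.
    for m in api.list_models(
        pipeline_tag="text-generation", sort="downloads", direction=-1, limit=limit
    ):
        yield m.id


# Offline demo using an entry that appears in the results above:
print(format_entry("openai-community/gpt2", None))
```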
Qwen/Qwen2.5-7B-Instruct
No description available.
Qwen/Qwen3-0.6B
Qwen3-0.6B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: ...
openai-community/gpt2
GPT-2 is a transformers model pretrained on a very large corpus of English data in a self-supervised fashion. This means it was pretrained o...
Qwen/Qwen2.5-1.5B-Instruct
No description available.
openai/gpt-oss-20b
No description available.
meta-llama/Llama-3.1-8B-Instruct
Languages: en, de, fr, it, pt, hi, es, th. Base model: meta-llama/Meta-Llama-3.1-8B. Pipeline tag: text-generation. Tags: facebook, meta, pytorch, l...
Qwen/Qwen2.5-0.5B-Instruct
No description available.
Qwen/Qwen2.5-3B-Instruct
No description available.
Qwen/Qwen3-1.7B
Qwen3-1.7B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: ...
Qwen/Qwen3-8B
Qwen3-8B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 8....
trl-internal-testing/tiny-Qwen2ForCausalLM-2.5
No description available.
facebook/opt-125m
OPT was predominantly pretrained with English text, but a small amount of non-English data is still present within the training corpus via C...
Qwen/Qwen3-4B
Qwen3-4B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 4....
dphn/dolphin-2.9.1-yi-1.5-34b
More information needed...
openai/gpt-oss-120b
No description available.
mlx-community/Kimi-K2.5
No description available.
Qwen/Qwen3-32B
Qwen3-32B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 3...
Qwen/Qwen3-4B-Instruct-2507
Qwen3-4B-Instruct-2507 has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of ...
meta-llama/Llama-3.2-1B-Instruct
Languages: en, de, fr, it, pt, hi, es, th. Library: transformers. Pipeline tag: text-generation. Tags: facebook, meta, pytorch, llama, llama-3...
Qwen/Qwen2.5-32B-Instruct
No description available.
meta-llama/Llama-3.2-3B-Instruct
Languages: en, de, fr, it, pt, hi, es, th. Library: transformers. Pipeline tag: text-generation. Tags: facebook, meta, pytorch, llama, llama-3...
Qwen/Qwen2-1.5B-Instruct
Qwen2 is a language model series including decoder language models of different model sizes. For each size, we release the base language mod...
google/gemma-3-1b-it
No description available.
mistralai/Mistral-7B-Instruct-v0.2
No description available.
meta-llama/Meta-Llama-3-8B
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned genera...
EleutherAI/pythia-160m
Developed by: EleutherAI - Model type: Transformer-based Language Model - Language: English - Learn more: Pythia's GitHub repository for tra...
zai-org/GLM-5-FP8
No description available.
Qwen/Qwen3-30B-A3B-Instruct-2507
Qwen3-30B-A3B-Instruct-2507 has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Numbe...
Qwen/Qwen2.5-7B
No description available.
Qwen/Qwen2.5-14B-Instruct
No description available.
distilbert/distilgpt2
Developed by: Hugging Face - Model type: Transformer-based Language Model - Language: English - License: Apache 2.0 - Model Description: Dis...
TinyLlama/TinyLlama-1.1B-Chat-v1.0
No description available.
Qwen/Qwen3-14B
Qwen3-14B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 1...
RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic
Model Architecture: Meta-Llama-3.2 - Input: Text - Output: Text - Model Optimizations: - Weight quantization: FP8 - Activation quantization:...
Qwen/Qwen2.5-Coder-1.5B-Instruct
No description available.
openai-community/gpt2-large
Model Description: GPT-2 Large is the 774M parameter version of GPT-2, a transformer-based language model created and released by OpenAI. Th...
Qwen/Qwen2.5-32B-Instruct-AWQ
No description available.
microsoft/phi-2
No description available.
zai-org/GLM-4.7-Flash
No description available.
Qwen/Qwen3-0.6B-FP8
This repo contains the FP8 version of Qwen3-0.6B, which has the following features: - Type: Causal Language Models - Training Stage: Pretrai...
Qwen/Qwen2.5-Coder-7B-Instruct
No description available.
apple/OpenELM-1_1B-Instruct
No description available.
Qwen/Qwen2.5-32B
No description available.
deepseek-ai/DeepSeek-V3
No description available.
meta-llama/Llama-3.2-1B
Languages: en, de, fr, it, pt, hi, es, th. Library: transformers. Pipeline tag: text-generation. Tags: facebook, meta, pytorch, llama, llama-3...
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
No description available.
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8
Model Developer: NVIDIA Corporation. Model Dates: September 2025 - December 2025. Data Freshness: The post-training data has a cutoff date of...
meta-llama/Llama-3.2-3B
Languages: en, de, fr, it, pt, hi, es, th. Library: transformers. Pipeline tag: text-generation. Tags: facebook, meta, pytorch, llama, llama-3...
meta-llama/Meta-Llama-3-8B-Instruct
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned genera...
meta-llama/Llama-3.1-8B
Languages: en, de, fr, it, pt, hi, es, th. Pipeline tag: text-generation. Tags: facebook, meta, pytorch, llama, llama-3. Extra gated prompt: ...
hmellor/tiny-random-LlamaForCausalLM
No description available.
Qwen/Qwen2.5-Coder-0.5B-Instruct
No description available.
bigscience/bloomz-560m
Dataset: bigscience/xP3. Languages: ak, ar, as, bm, bn, ca, code, en, es, eu, fon, fr, gu, hi, id, ig, ki, kn, lg, ln, ml, mr, ne, nso...
bullpoint/Qwen3-Coder-Next-AWQ-4bit
Qwen3-Coder-Next has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parame...
Qwen/Qwen3-Next-80B-A3B-Instruct
Note: Qwen3-Next-80B-A3B-Instruct supports only instruct (non-thinking) mode and does not generate `<think></think>` blocks in its output. Qwen3-Ne...
Qwen/Qwen3-30B-A3B
Qwen3-30B-A3B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameter...
Qwen/Qwen2.5-0.5B
No description available.
Qwen/Qwen2.5-Coder-7B-Instruct-AWQ
No description available.
Qwen/Qwen2.5-14B-Instruct-AWQ
No description available.
Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4
No description available.
RedHatAI/Qwen2.5-1.5B-quantized.w8a8
Model Architecture: Qwen2 - Input: Text - Output: Text - Model Optimizations: - Activation quantization: INT8 - Weight quantization: INT8 - ...
Qwen/Qwen3-Coder-Next
Qwen3-Coder-Next has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parame...
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
Model Developer: NVIDIA Corporation. Model Dates: September 2025 - December 2025. Data Freshness: The post-training data has a cutoff date of...
deepseek-ai/DeepSeek-R1
No description available.
Qwen/Qwen2.5-Math-1.5B
For more details, please refer to our blog post and GitHub repo....
deepseek-ai/DeepSeek-R1-0528
No description available.
h2oai/h2ovl-mississippi-800m
📜 H2OVL-Mississippi Paper (https://arxiv.org/abs/2410.13611); 🤗 HF Demo (https://huggingface.co/spaces/h2oai/h2ovl-mississippi); ...
h2oai/h2ovl-mississippi-2b
📜 H2OVL-Mississippi Paper (https://arxiv.org/abs/2410.13611); 🤗 HF Demo (https://huggingface.co/spaces/h2oai/h2ovl-mississippi); ...
HuggingFaceTB/SmolLM2-135M
No description available.
Qwen/Qwen3-Coder-30B-A3B-Instruct
Qwen3-Coder-30B-A3B-Instruct has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Numb...
Qwen/Qwen2.5-Coder-32B-Instruct-AWQ
No description available.
llamafactory/tiny-random-Llama-3
No description available.
lmstudio-community/GLM-4.7-Flash-MLX-8bit
No description available.
lmstudio-community/GLM-4.7-Flash-MLX-6bit
No description available.
Qwen/Qwen2.5-72B-Instruct-AWQ
No description available.
Qwen/Qwen3-30B-A3B-Instruct-2507-FP8
This repo contains the FP8 version of Qwen3-30B-A3B-Instruct-2507, which has the following features: - Type: Causal Language Models - Traini...
microsoft/phi-4
No description available.
lmsys/vicuna-7b-v1.5
Vicuna is a chat assistant trained by fine-tuning Llama 2 on user-shared conversations collected from ShareGPT. - Developed by: LMSYS - Mode...
kaitchup/Phi-3-mini-4k-instruct-gptq-4bit
No description available.
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
No description available.
Qwen/Qwen2.5-Coder-32B-Instruct
No description available.
RedHatAI/Llama-3.2-1B-Instruct-FP8
Model Architecture: Llama-3 - Input: Text - Output: Text - Model Optimizations: - Activation quantization: FP8 - Weight quantization: FP8 - ...
Qwen/Qwen2.5-Coder-1.5B
No description available.
microsoft/Phi-3-mini-4k-instruct
🎉 Phi-3.5: mini-instruct (https://huggingface.co/microsoft/Phi-3.5-mini-instruct); MoE-instruct (https://huggingface.co/microsoft/Phi-...
llm-jp/llm-jp-3-3.7b-instruct
Model type: Transformer-based Language Model. Total seen tokens: 2.1T. Parameter table columns: Params | Layers | Hidden size | Heads | Context length | Embedding parameters | ...
meta-llama/Llama-3.1-70B-Instruct
Languages: en, de, fr, it, pt, hi, es, th. Library: transformers. Base model: meta-llama/Meta-Llama-3.1-70B. New version: meta-llama/Llama-3.3-...
Qwen/Qwen3-8B-Base
Qwen3-8B-Base has the following features: - Type: Causal Language Models - Training Stage: Pretraining - Number of Parameters: 8.2B - Number...
deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
No description available.
openai-community/gpt2-medium
Model Description: GPT-2 Medium is the 355M parameter version of GPT-2, a transformer-based language model created and released by OpenAI. T...
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
No description available.
Qwen/Qwen3-235B-A22B-Instruct-2507-FP8
This repo contains the FP8 version of Qwen3-235B-A22B-Instruct-2507, which has the following features: - Type: Causal Language Models - Trai...
Qwen/Qwen2.5-1.5B-Instruct-AWQ
No description available.
deepseek-ai/DeepSeek-R1-Distill-Llama-8B
No description available.
Qwen/Qwen3-235B-A22B
Qwen3-235B-A22B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Paramet...
casperhansen/llama-3.3-70b-instruct-awq
No description available.
RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8
Model Architecture: Meta-Llama-3.1 - Input: Text - Output: Text - Model Optimizations: - Weight quantization: FP8 - Activation quantization:...
EleutherAI/pythia-70m-deduped
Developed by: EleutherAI - Model type: Transformer-based Language Model - Language: English - Learn more: Pythia's GitHub repository for tra...
meta-llama/Llama-2-7b-hf
Note: Use of this model is governed by the Meta license. In order to download the model weights and tokenizer, please visit the website and ...
meta-llama/Llama-3.3-70B-Instruct
Library: transformers. Languages: en, fr, it, pt, hi, es, th, de. Base model: meta-llama/Llama-3.1-70B. Tags: facebook, meta, pytorch, llama...
meta-llama/Llama-3.1-405B
Languages: en, de, fr, it, pt, hi, es, th. Pipeline tag: text-generation. Tags: facebook, meta, pytorch, llama, llama-3. Extra gated prompt: ...