Results for "text-generation"

100 matches found.

Qwen

Qwen/Qwen2.5-7B-Instruct

No description available.

📝 text-generation 20,825,802
Qwen

Qwen/Qwen3-0.6B

Qwen3-0.6B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: ...

📝 text-generation 11,578,352
openai-community

openai-community/gpt2

GPT-2 is a transformers model pretrained on a very large corpus of English data in a self-supervised fashion. This means it was pretrained o...

📝 text-generation 10,210,526
Qwen

Qwen/Qwen2.5-1.5B-Instruct

No description available.

📝 text-generation 7,235,449
openai

openai/gpt-oss-20b

No description available.

📝 text-generation 7,221,396
meta-llama

meta-llama/Llama-3.1-8B-Instruct

- en - de - fr - it - pt - hi - es - th base_model: meta-llama/Meta-Llama-3.1-8B pipeline_tag: text-generation - facebook - meta - pytorch - l...

📝 text-generation 7,163,568
Qwen

Qwen/Qwen2.5-0.5B-Instruct

No description available.

📝 text-generation 6,964,401
Qwen

Qwen/Qwen2.5-3B-Instruct

No description available.

📝 text-generation 6,652,654
Qwen

Qwen/Qwen3-1.7B

Qwen3-1.7B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: ...

📝 text-generation 6,438,824
Qwen

Qwen/Qwen3-8B

Qwen3-8B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 8....

📝 text-generation 6,387,534
trl-internal-testing

trl-internal-testing/tiny-Qwen2ForCausalLM-2.5

No description available.

📝 text-generation 6,139,720
facebook

facebook/opt-125m

OPT was predominantly pretrained with English text, but a small amount of non-English data is still present within the training corpus via C...

📝 text-generation 5,925,294
Qwen

Qwen/Qwen3-4B

Qwen3-4B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 4....

📝 text-generation 5,459,946
dphn

dphn/dolphin-2.9.1-yi-1.5-34b

More information needed...

📝 text-generation 4,654,456
openai

openai/gpt-oss-120b

No description available.

📝 text-generation 4,279,000
mlx-community

mlx-community/Kimi-K2.5

No description available.

📝 text-generation 3,912,906
Qwen

Qwen/Qwen3-32B

Qwen3-32B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 3...

📝 text-generation 3,790,952
Qwen

Qwen/Qwen3-4B-Instruct-2507

Qwen3-4B-Instruct-2507 has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of ...

📝 text-generation 3,777,419
meta-llama

meta-llama/Llama-3.2-1B-Instruct

- en - de - fr - it - pt - hi - es - th library_name: transformers pipeline_tag: text-generation - facebook - meta - pytorch - llama - llama-3...

📝 text-generation 3,768,116
Qwen

Qwen/Qwen2.5-32B-Instruct

No description available.

📝 text-generation 3,523,535
meta-llama

meta-llama/Llama-3.2-3B-Instruct

- en - de - fr - it - pt - hi - es - th library_name: transformers pipeline_tag: text-generation - facebook - meta - pytorch - llama - llama-3...

📝 text-generation 3,488,269
Qwen

Qwen/Qwen2-1.5B-Instruct

Qwen2 is a language model series including decoder language models of different model sizes. For each size, we release the base language mod...

📝 text-generation 3,482,090
google

google/gemma-3-1b-it

No description available.

📝 text-generation 3,149,748
mistralai

mistralai/Mistral-7B-Instruct-v0.2

No description available.

📝 text-generation 2,933,361
meta-llama

meta-llama/Meta-Llama-3-8B

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned genera...

📝 text-generation 2,571,135
EleutherAI

EleutherAI/pythia-160m

Developed by: EleutherAI - Model type: Transformer-based Language Model - Language: English - Learn more: Pythia's GitHub repository for tra...

📝 text-generation 2,524,662
zai-org

zai-org/GLM-5-FP8

No description available.

📝 text-generation 2,284,370
Qwen

Qwen/Qwen3-30B-A3B-Instruct-2507

Qwen3-30B-A3B-Instruct-2507 has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Numbe...

📝 text-generation 2,196,902
Qwen

Qwen/Qwen2.5-7B

No description available.

📝 text-generation 2,032,682
Qwen

Qwen/Qwen2.5-14B-Instruct

No description available.

📝 text-generation 1,992,973
distilbert

distilbert/distilgpt2

Developed by: Hugging Face - Model type: Transformer-based Language Model - Language: English - License: Apache 2.0 - Model Description: Dis...

📝 text-generation 1,985,882
TinyLlama

TinyLlama/TinyLlama-1.1B-Chat-v1.0

No description available.

📝 text-generation 1,902,683
Qwen

Qwen/Qwen3-14B

Qwen3-14B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 1...

📝 text-generation 1,873,834
RedHatAI

RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic

Model Architecture: Meta-Llama-3.2 - Input: Text - Output: Text - Model Optimizations: - Weight quantization: FP8 - Activation quantization:...

📝 text-generation 1,813,922
Qwen

Qwen/Qwen2.5-Coder-1.5B-Instruct

No description available.

📝 text-generation 1,811,818
openai-community

openai-community/gpt2-large

Model Description: GPT-2 Large is the 774M parameter version of GPT-2, a transformer-based language model created and released by OpenAI. Th...

📝 text-generation 1,750,516
Qwen

Qwen/Qwen2.5-32B-Instruct-AWQ

No description available.

📝 text-generation 1,701,553
microsoft

microsoft/phi-2

No description available.

📝 text-generation 1,669,937
zai-org

zai-org/GLM-4.7-Flash

No description available.

📝 text-generation 1,667,490
Qwen

Qwen/Qwen3-0.6B-FP8

This repo contains the FP8 version of Qwen3-0.6B, which has the following features: - Type: Causal Language Models - Training Stage: Pretrai...

📝 text-generation 1,655,644
Qwen

Qwen/Qwen2.5-Coder-7B-Instruct

No description available.

📝 text-generation 1,626,050
apple

apple/OpenELM-1_1B-Instruct

No description available.

📝 text-generation 1,599,430
Qwen

Qwen/Qwen2.5-32B

No description available.

📝 text-generation 1,569,598
deepseek-ai

deepseek-ai/DeepSeek-V3

No description available.

📝 text-generation 1,494,494
meta-llama

meta-llama/Llama-3.2-1B

- en - de - fr - it - pt - hi - es - th library_name: transformers pipeline_tag: text-generation - facebook - meta - pytorch - llama - llama-3...

📝 text-generation 1,474,173
deepseek-ai

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

No description available.

📝 text-generation 1,466,870
nvidia

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8

Model Developer: NVIDIA Corporation Model Dates: September 2025 - December 2025 Data Freshness: The post-training data has a cutoff date of...

📝 text-generation 1,427,762
meta-llama

meta-llama/Llama-3.2-3B

- en - de - fr - it - pt - hi - es - th library_name: transformers pipeline_tag: text-generation - facebook - meta - pytorch - llama - llama-3...

📝 text-generation 1,414,722
meta-llama

meta-llama/Meta-Llama-3-8B-Instruct

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned genera...

📝 text-generation 1,377,777
meta-llama

meta-llama/Llama-3.1-8B

- en - de - fr - it - pt - hi - es - th pipeline_tag: text-generation - facebook - meta - pytorch - llama - llama-3 extra_gated_prompt: >-...

📝 text-generation 1,368,085
hmellor

hmellor/tiny-random-LlamaForCausalLM

No description available.

📝 text-generation 1,336,028
Qwen

Qwen/Qwen2.5-Coder-0.5B-Instruct

No description available.

📝 text-generation 1,311,734
bigscience

bigscience/bloomz-560m

- bigscience/xP3 - ak - ar - as - bm - bn - ca - code - en - es - eu - fon - fr - gu - hi - id - ig - ki - kn - lg - ln - ml - mr - ne - nso...

📝 text-generation 1,303,887
bullpoint

bullpoint/Qwen3-Coder-Next-AWQ-4bit

Qwen3-Coder-Next has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parame...

📝 text-generation 1,288,011
Qwen

Qwen/Qwen3-Next-80B-A3B-Instruct

Note: Qwen3-Next-80B-A3B-Instruct supports only instruct (non-thinking) mode and does not generate `<think></think>` blocks in its output. Qwen3-Ne...

📝 text-generation 1,240,316
Qwen

Qwen/Qwen3-30B-A3B

Qwen3-30B-A3B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameter...

📝 text-generation 1,210,375
Qwen

Qwen/Qwen2.5-0.5B

No description available.

📝 text-generation 1,189,562
Qwen

Qwen/Qwen2.5-Coder-7B-Instruct-AWQ

No description available.

📝 text-generation 1,156,750
Qwen

Qwen/Qwen2.5-14B-Instruct-AWQ

No description available.

📝 text-generation 1,132,830
Qwen

Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4

No description available.

📝 text-generation 1,108,851
RedHatAI

RedHatAI/Qwen2.5-1.5B-quantized.w8a8

Model Architecture: Qwen2 - Input: Text - Output: Text - Model Optimizations: - Activation quantization: INT8 - Weight quantization: INT8 - ...

📝 text-generation 1,084,672
Qwen

Qwen/Qwen3-Coder-Next

Qwen3-Coder-Next has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parame...

📝 text-generation 1,067,139
nvidia

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

Model Developer: NVIDIA Corporation Model Dates: September 2025 - December 2025 Data Freshness: The post-training data has a cutoff date of...

📝 text-generation 1,061,648
deepseek-ai

deepseek-ai/DeepSeek-R1

No description available.

📝 text-generation 1,052,482
Qwen

Qwen/Qwen2.5-Math-1.5B

For more details, please refer to our blog post and GitHub repo....

📝 text-generation 1,047,585
deepseek-ai

deepseek-ai/DeepSeek-R1-0528

No description available.

📝 text-generation 1,030,783
h2oai

h2oai/h2ovl-mississippi-800m

[[📜 H2OVL-Mississippi Paper]](https://arxiv.org/abs/2410.13611) [[🤗 HF Demo]](https://huggingface.co/spaces/h2oai/h2ovl-mississippi) [[...

📝 text-generation 1,029,870
h2oai

h2oai/h2ovl-mississippi-2b

[[📜 H2OVL-Mississippi Paper]](https://arxiv.org/abs/2410.13611) [[🤗 HF Demo]](https://huggingface.co/spaces/h2oai/h2ovl-mississippi) [[...

📝 text-generation 1,021,968
HuggingFaceTB

HuggingFaceTB/SmolLM2-135M

No description available.

📝 text-generation 1,005,943
Qwen

Qwen/Qwen3-Coder-30B-A3B-Instruct

Qwen3-Coder-30B-A3B-Instruct has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Numb...

📝 text-generation 992,338
Qwen

Qwen/Qwen2.5-Coder-32B-Instruct-AWQ

No description available.

📝 text-generation 985,791
llamafactory

llamafactory/tiny-random-Llama-3

No description available.

📝 text-generation 967,349
lmstudio-community

lmstudio-community/GLM-4.7-Flash-MLX-8bit

No description available.

📝 text-generation 957,389
lmstudio-community

lmstudio-community/GLM-4.7-Flash-MLX-6bit

No description available.

📝 text-generation 947,219
Qwen

Qwen/Qwen2.5-72B-Instruct-AWQ

No description available.

📝 text-generation 920,509
Qwen

Qwen/Qwen3-30B-A3B-Instruct-2507-FP8

This repo contains the FP8 version of Qwen3-30B-A3B-Instruct-2507, which has the following features: - Type: Causal Language Models - Traini...

📝 text-generation 906,081
microsoft

microsoft/phi-4

No description available.

📝 text-generation 896,280
lmsys

lmsys/vicuna-7b-v1.5

Vicuna is a chat assistant trained by fine-tuning Llama 2 on user-shared conversations collected from ShareGPT. - Developed by: LMSYS - Mode...

📝 text-generation 895,601
kaitchup

kaitchup/Phi-3-mini-4k-instruct-gptq-4bit

No description available.

📝 text-generation 887,090
deepseek-ai

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

No description available.

📝 text-generation 884,169
Qwen

Qwen/Qwen2.5-Coder-32B-Instruct

No description available.

📝 text-generation 854,516
RedHatAI

RedHatAI/Llama-3.2-1B-Instruct-FP8

Model Architecture: Llama-3 - Input: Text - Output: Text - Model Optimizations: - Activation quantization: FP8 - Weight quantization: FP8 - ...

📝 text-generation 829,888
Qwen

Qwen/Qwen2.5-Coder-1.5B

No description available.

📝 text-generation 814,176
microsoft

microsoft/Phi-3-mini-4k-instruct

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-...

📝 text-generation 813,831
llm-jp

llm-jp/llm-jp-3-3.7b-instruct

Model type: Transformer-based Language Model - Total seen tokens: 2.1T |Params|Layers|Hidden size|Heads|Context length|Embedding parameters|...

📝 text-generation 810,489
meta-llama

meta-llama/Llama-3.1-70B-Instruct

- en - de - fr - it - pt - hi - es - th library_name: transformers base_model: meta-llama/Meta-Llama-3.1-70B new_version: meta-llama/Llama-3.3-...

📝 text-generation 809,344
Qwen

Qwen/Qwen3-8B-Base

Qwen3-8B-Base has the following features: - Type: Causal Language Models - Training Stage: Pretraining - Number of Parameters: 8.2B - Number...

📝 text-generation 774,844
deepseek-ai

deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

No description available.

📝 text-generation 749,001
openai-community

openai-community/gpt2-medium

Model Description: GPT-2 Medium is the 355M parameter version of GPT-2, a transformer-based language model created and released by OpenAI. T...

📝 text-generation 737,021
deepseek-ai

deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

No description available.

📝 text-generation 736,882
Qwen

Qwen/Qwen3-235B-A22B-Instruct-2507-FP8

This repo contains the FP8 version of Qwen3-235B-A22B-Instruct-2507, which has the following features: - Type: Causal Language Models - Trai...

📝 text-generation 727,557
Qwen

Qwen/Qwen2.5-1.5B-Instruct-AWQ

No description available.

📝 text-generation 723,758
deepseek-ai

deepseek-ai/DeepSeek-R1-Distill-Llama-8B

No description available.

📝 text-generation 718,145
Qwen

Qwen/Qwen3-235B-A22B

Qwen3-235B-A22B has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Paramet...

📝 text-generation 704,690
casperhansen

casperhansen/llama-3.3-70b-instruct-awq

No description available.

📝 text-generation 694,286
RedHatAI

RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8

Model Architecture: Meta-Llama-3.1 - Input: Text - Output: Text - Model Optimizations: - Weight quantization: FP8 - Activation quantization:...

📝 text-generation 691,349
EleutherAI

EleutherAI/pythia-70m-deduped

Developed by: EleutherAI - Model type: Transformer-based Language Model - Language: English - Learn more: Pythia's GitHub repository for tra...

📝 text-generation 672,177
meta-llama

meta-llama/Llama-2-7b-hf

Note: Use of this model is governed by the Meta license. In order to download the model weights and tokenizer, please visit the website and ...

📝 text-generation 664,952
meta-llama

meta-llama/Llama-3.3-70B-Instruct

library_name: transformers - en - fr - it - pt - hi - es - th - de base_model: - meta-llama/Llama-3.1-70B - facebook - meta - pytorch - llama ...

📝 text-generation 645,521
meta-llama

meta-llama/Llama-3.1-405B

- en - de - fr - it - pt - hi - es - th pipeline_tag: text-generation - facebook - meta - pytorch - llama - llama-3 extra_gated_prompt: >-...

📝 text-generation 641,417