Nirman.online | Premium AI Directory

DAMO-NLP-SG

DAMO-NLP-SG/VideoLLaMA3-7B

No description available.

🎥 video-text-to-text 103,618

llava-hf

llava-hf/LLaVA-NeXT-Video-7B-hf

No description available.

🎥 video-text-to-text 91,811

Kwai-Keye

Kwai-Keye/Keye-VL-8B-Preview

No description available.

🎥 video-text-to-text 64,185

lmms-lab

lmms-lab/LLaVA-Video-7B-Qwen2

- lmms-lab/LLaVA-OneVision-Data - lmms-lab/LLaVA-Video-178K - en library_name: transformers - accuracy - multimodal pipeline_tag: video-text-t...

🎥 video-text-to-text 55,871

Kwai-Keye

Kwai-Keye/Keye-VL-1_5-8B

No description available.

🎥 video-text-to-text 52,740

DAMO-NLP-SG

DAMO-NLP-SG/VideoLLaMA2.1-7B-16F

No description available.

🎥 video-text-to-text 24,981

zai-org

zai-org/cogvlm2-llama3-caption

No description available.

🎥 video-text-to-text 11,120

OpenGVLab

OpenGVLab/InternVideo2_5_Chat_8B

- en library_name: transformers - accuracy - multimodal pipeline_tag: video-text-to-text - name: InternVideo2.5 results: - task: type: multimo...

🎥 video-text-to-text 5,317

TIGER-Lab

TIGER-Lab/VideoScore-v1.1

No description available.

🎥 video-text-to-text 4,005

lmms-lab

lmms-lab/LLaVA-NeXT-Video-7B-DPO

Model type: LLaVA-Next-Video is an open-source chatbot trained by fine-tuning an LLM on multimodal instruction-following data. This model is th...

🎥 video-text-to-text 2,548

OpenGVLab

OpenGVLab/VideoChat-Flash-Qwen2-7B_res448

- en library_name: transformers - accuracy - multimodal pipeline_tag: video-text-to-text - name: VideoChat-Flash-Qwen2-7B_res448 results: - tas...

🎥 video-text-to-text 2,412

DAMO-NLP-SG

DAMO-NLP-SG/VideoLLaMA3-2B

No description available.

🎥 video-text-to-text 2,303

lmms-lab

lmms-lab/LLaVA-NeXT-Video-7B

Model type: LLaVA-Next-Video is an open-source chatbot trained by fine-tuning an LLM on multimodal instruction-following data. This model is th...

🎥 video-text-to-text 2,077

mlx-community

mlx-community/SmolVLM2-500M-Video-Instruct-mlx

No description available.

🎥 video-text-to-text 1,979

llava-hf

llava-hf/LLaVA-NeXT-Video-7B-DPO-hf

No description available.

🎥 video-text-to-text 1,935

PhilipC

PhilipC/HumanOmniV2

No description available.

🎥 video-text-to-text 1,869

OpenGVLab

OpenGVLab/VideoChat-Flash-Qwen2_5-7B_InternVideo2-1B

- en library_name: transformers - accuracy - multimodal pipeline_tag: video-text-to-text - name: VideoChat-Flash-Qwen2_5-7B_InternVideo2-1B resu...

🎥 video-text-to-text 1,414

Video-R1

Video-R1/Video-R1-7B

No description available.

🎥 video-text-to-text 1,312

chenjoya

chenjoya/videollm-online-8b-v1plus

LLM: meta-llama/Meta-Llama-3-8B-Instruct Vision Strategy: Frame Encoder: google/siglip-large-patch16-384 Frame Tokens: CLS Token + Avg Poole...

🎥 video-text-to-text 1,138

Diankun

Diankun/Spatial-MLLM-subset-sft

No description available.

🎥 video-text-to-text 1,110

allenai

allenai/Molmo2-VideoPoint-4B

No description available.

🎥 video-text-to-text 964

OpenGVLab

OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448

- en library_name: transformers - accuracy - multimodal pipeline_tag: video-text-to-text - name: VideoChat-Flash-Qwen25-15Bres448 results: - t...

🎥 video-text-to-text 934

TencentARC

TencentARC/TimeLens-8B

No description available.

🎥 video-text-to-text 860

TencentARC

TencentARC/ARC-Hunyuan-Video-7B

No description available.

🎥 video-text-to-text 834

mlx-community

mlx-community/SmolVLM2-256M-Video-Instruct-mlx

No description available.

🎥 video-text-to-text 786

Video-R1

Video-R1/Qwen2.5-VL-7B-COT-SFT

No description available.

🎥 video-text-to-text 691

Diankun

Diankun/Spatial-MLLM-v1.1-Instruct-135K

No description available.

🎥 video-text-to-text 556

Zhang199

Zhang199/TinyLLaVA-Video-Qwen2.5-3B-Group-16-512

No description available.

🎥 video-text-to-text 541

Skywork

Skywork/SkyCaptioner-V1

No description available.

🎥 video-text-to-text 482

prithivMLmods

prithivMLmods/SAGE-MM-Qwen2.5-VL-7B-SFT_RL-GGUF

No description available.

🎥 video-text-to-text 474

mradermacher

mradermacher/SmolVLM2-2.2B-Instruct-GGUF

No description available.

🎥 video-text-to-text 439

OpenGVLab

OpenGVLab/InternVL_2_5_HiCo_R16

- en library_name: transformers - accuracy - multimodal pipeline_tag: video-text-to-text - name: InternVL2.5HiCoR16 results: - task: type: mul...

🎥 video-text-to-text 426

Alibaba-DAMO-Academy

Alibaba-DAMO-Academy/PixelRefer-7B

No description available.

🎥 video-text-to-text 395

OpenGVLab

OpenGVLab/VideoChat-R1_7B

No description available.

🎥 video-text-to-text 371

mradermacher

mradermacher/SmolVLM2-2.2B-Instruct-i1-GGUF

No description available.

🎥 video-text-to-text 335

llava-hf

llava-hf/LLaVA-NeXT-Video-34B-hf

No description available.

🎥 video-text-to-text 303

VITA-MLLM

VITA-MLLM/VITA-1.5

This repository contains the model of the paper VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction....

🎥 video-text-to-text 291

Chat-UniVi

Chat-UniVi/Chat-UniVi-7B-v1.5

No description available.

🎥 video-text-to-text 279

OpenGVLab

OpenGVLab/InternVideo2-Chat-8B

No description available.

🎥 video-text-to-text 277

prithivMLmods

prithivMLmods/KAIROS-MM-Qwen2.5-VL-7B-RL-AIO-GGUF

No description available.

🎥 video-text-to-text 256

TIGER-Lab

TIGER-Lab/Vamba-Qwen2-VL-7B

No description available.

🎥 video-text-to-text 236

yaolily

yaolily/TimeChat-Captioner-GRPO-7B

No description available.

🎥 video-text-to-text 236

MLAdaptiveIntelligence

MLAdaptiveIntelligence/LLaVAction-0.5B

No description available.

🎥 video-text-to-text 235

Mungert

Mungert/SkyCaptioner-V1-GGUF

No description available.

🎥 video-text-to-text 225

OpenGVLab

OpenGVLab/VideoChat2_HD_stage4_Mistral_7B_hf

No description available.

🎥 video-text-to-text 222

tsinghua-ee

tsinghua-ee/video-SALMONN-2

No description available.

🎥 video-text-to-text 212

Rihong

Rihong/VideoChat2_HD_Infinity_Mistral_7B

No description available.

🎥 video-text-to-text 196

OpenGVLab

OpenGVLab/VideoChat-Flash-Qwen2-7B_res224

- en library_name: transformers - accuracy - multimodal pipeline_tag: video-text-to-text - name: VideoChat-Flash-Qwen2-7B_res448 results: - tas...

🎥 video-text-to-text 193

chancharikm

chancharikm/qwen2.5-vl-7b-cam-motion

This model is a fine-tuned version of Qwen/Qwen2.5-VL-7B-Instruct on what is currently the highest-quality camera motion dataset that is publicly...

🎥 video-text-to-text 181

QiWang98

QiWang98/VideoRFT-3B

No description available.

🎥 video-text-to-text 172

QiWang98

QiWang98/VideoRFT

No description available.

🎥 video-text-to-text 170

BAAI

BAAI/Video-XL-2

No description available.

🎥 video-text-to-text 169

TencentARC

TencentARC/GRPO-CARE

No description available.

🎥 video-text-to-text 169

prithivMLmods

prithivMLmods/SAGE-MM-Qwen3-VL-4B-SFT_RL-GGUF

No description available.

🎥 video-text-to-text 162

chancharikm

chancharikm/qwen2.5-vl-72b-cam-motion

This model is a fine-tuned version of Qwen/Qwen2.5-VL-72B-Instruct on what is currently the highest-quality camera motion dataset that is publicall...

🎥 video-text-to-text 156
