Discover the Best AI Models
Search, analyze, and download from our global directory of 3,000+ open-source models.
Model Index 55939 Total
microsoft/table-transformer-detection
The Table Transformer is equivalent to DETR, a Transformer-based object detection model. Note that the authors decided to use the "normalize...
facebook/w2v-bert-2.0
No description available.
ETH-CVG/lightglue_superpoint
No description available.
jonatasgrosman/wav2vec2-large-xlsr-53-portuguese
- common_voice - mozilla-foundation/common_voice_6_0 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...
magic-leap-community/superpoint
No description available.
speechbrain/spkrec-resnet-voxceleb
No description available.
llava-hf/llava-1.5-7b-hf
Model type: LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It i...
google-bert/bert-base-cased
BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on the ...
nomic-ai/nomic-embed-text-v1
library_name: sentence-transformers pipeline_tag: sentence-similarity - feature-extraction - sentence-similarity - mteb - transformers - trans...
google/gemma-3-1b-it
No description available.
timm/resnet50.a1_in1k
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 25.6 - GMACs: 4.1 - Activations (M): 11.1 - Image size: tra...
Salesforce/blip-image-captioning-base
No description available.
intfloat/multilingual-e5-small
- multilingual - af - am - ar - as - az - be - bg - bn - br - bs - ca - cs - cy - da - de - el - en - eo - es - et - eu - fa - fi - fr - fy ...
MahmoudAshraf/mms-300m-1130-forced-aligner
No description available.
jonatasgrosman/wav2vec2-large-xlsr-53-polish
- common_voice - mozilla-foundation/common_voice_6_0 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...
w11wo/indonesian-roberta-base-posp-tagger
More information needed...
apple/mobilevit-small
MobileViT is a lightweight, low-latency convolutional neural network that combines MobileNetV2-style layers with a new block that replaces ...
Bingsu/yolo-world-mirror
No description available.
laion/CLIP-ViT-B-32-laion2B-s34B-b79K
No description available.
mistralai/Mistral-7B-Instruct-v0.2
No description available.
microsoft/TRELLIS-image-large
No description available.
intfloat/multilingual-e5-base
- mteb - Sentence Transformers - sentence-similarity - sentence-transformers - name: multilingual-e5-base results: - task: type: Classificat...
patrickjohncyh/fashion-clip
UPDATE (10/03/23): We have updated the model! We found that laion/CLIP-ViT-B-32-laion2B-s34B-b79K checkpoint (thanks Bin!) worked better tha...
facebook/dinov2-small
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a self-supervised fash...
jonatasgrosman/wav2vec2-large-xlsr-53-greek
No description available.
indonesian-nlp/wav2vec2-indonesian-javanese-sundanese
- id - jv - sun - mozilla-foundation/common_voice_7_0 - openslr - magic_data - titml - wer - audio - automatic-speech-recognition - hf-asr-leade...
Qwen/Qwen2.5-VL-3B-Instruct
No description available.
openai/clip-vit-base-patch16
The CLIP model was developed by researchers at OpenAI to learn about what contributes to robustness in computer vision tasks. The model was ...
google/siglip-so400m-patch14-384
SigLIP is CLIP, a multimodal model, with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a...
Comfy-Org/z_image_turbo
No description available.
WhereIsAI/UAE-Large-V1
- mteb - sentence_embedding - feature_extraction - sentence-transformers - transformers - transformers.js - name: UAE-Large-V1 results: - task...
meta-llama/Meta-Llama-3-8B
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned genera...
sentence-transformers/LaBSE
No description available.
jonatasgrosman/wav2vec2-large-xlsr-53-dutch
- common_voice - mozilla-foundation/common_voice_6_0 - wer - cer - audio - automatic-speech-recognition - hf-asr-leaderboard - mozilla-foundatio...
answerdotai/JaColBERTv2.5
No description available.
emilyalsentzer/Bio_ClinicalBERT
- fill-mask...
EleutherAI/pythia-160m
Developed by: EleutherAI - Model type: Transformer-based Language Model - Language: English - Learn more: Pythia's GitHub repository for tra...
zai-org/GLM-OCR
No description available.
ibm-granite/granite-timeseries-ttm-r1
TTM falls under the category of "focused pre-trained models", wherein each pre-trained TTM is tailored for a particular forecasting setting ...
hustvl/vitmatte-small-composition-1k
ViTMatte is a simple approach to image matting, the task of accurately estimating the foreground object in an image. The model consists of a...
EssentialAI/eai-distill-0.5b
Website | Code | Paper...
stabilityai/stable-diffusion-xl-base-1.0
Developed by: Stability AI - Model type: Diffusion-based text-to-image generative model - License: CreativeML Open RAIL++-M License - Model ...
rhasspy/faster-whisper-tiny-int8
No description available.
google-t5/t5-base
No description available.
zai-org/GLM-5-FP8
No description available.
facebook/bart-large-cnn
BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BAR...
pyannote/segmentation
No description available.
jonatasgrosman/wav2vec2-large-xlsr-53-arabic
No description available.
Qwen/Qwen3-30B-A3B-Instruct-2507
Qwen3-30B-A3B-Instruct-2507 has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Numbe...
Comfy-Org/Qwen-Image_ComfyUI
No description available.