Results for "zero-shot-image-classification"
100 matches found.
openai/clip-vit-base-patch32
The CLIP model was developed by researchers at OpenAI to learn about what contributes to robustness in computer vision tasks. The model was ...
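Every entry in these results targets the same task, so a minimal usage sketch may help orient readers; it assumes the transformers zero-shot pipeline, with a placeholder image path and candidate labels:

```python
from transformers import pipeline

# Build a zero-shot image classifier from the top-ranked checkpoint above.
classifier = pipeline(
    "zero-shot-image-classification",
    model="openai/clip-vit-base-patch32",
)

# "cat.jpg" is a placeholder path; candidate labels are free-form text.
preds = classifier("cat.jpg", candidate_labels=["a cat", "a dog", "a car"])
print(preds)  # list of {"label": ..., "score": ...}, best match first
```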
openai/clip-vit-large-patch14
The CLIP model was developed by researchers at OpenAI to learn about what contributes to robustness in computer vision tasks. The model was ...
openai/clip-vit-large-patch14-336
More information needed...
laion/CLIP-ViT-B-32-DataComp.XL-s13B-b90K
No description available.
laion/CLIP-ViT-B-32-laion2B-s34B-b79K
No description available.
patrickjohncyh/fashion-clip
UPDATE (10/03/23): We have updated the model! We found that the laion/CLIP-ViT-B-32-laion2B-s34B-b79K checkpoint (thanks Bin!) worked better tha...
openai/clip-vit-base-patch16
The CLIP model was developed by researchers at OpenAI to learn about what contributes to robustness in computer vision tasks. The model was ...
google/siglip-so400m-patch14-384
SigLIP is CLIP, a multimodal model, with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a...
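The truncated description refers to SigLIP's pairwise sigmoid loss. As a rough orientation, here is a minimal PyTorch sketch of that loss (my own rendering of the idea, not the released implementation; `t` and `b` stand for the learned temperature and bias):

```python
import torch
import torch.nn.functional as F

def siglip_loss(img_emb, txt_emb, t, b):
    """Pairwise sigmoid loss over L2-normalized (N, D) embeddings.

    Each image-text pair is scored independently, so, unlike CLIP's
    softmax loss, no normalization across the whole batch is required.
    """
    logits = img_emb @ txt_emb.T * t + b                  # (N, N) pair scores
    n = logits.size(0)
    labels = 2 * torch.eye(n, device=logits.device) - 1   # +1 diag, -1 off-diag
    return -F.logsigmoid(labels * logits).sum() / n
```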
google/siglip-base-patch16-224
SigLIP is CLIP, a multimodal model, with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a...
google/siglip2-so400m-patch16-naflex
No description available.
google/siglip2-base-patch16-naflex
No description available.
Marqo/marqo-fashionSigLIP
No description available.
laion/CLIP-ViT-H-14-laion2B-s32B-b79K
No description available.
laion/CLIP-ViT-L-14-laion2B-s32B-b82K
No description available.
laion/CLIP-convnext_base_w-laion2B-s13B-b82K-augreg
No description available.
microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224
Tags: clip, biology, medical. library_name: openclip. src: https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224/resolve/...
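The metadata names openclip as the library, so this checkpoint is loaded through open_clip rather than transformers; a sketch assuming open_clip's hf-hub loader, with placeholder image path and prompts:

```python
import torch
import open_clip
from PIL import Image

MODEL = "hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224"
model, preprocess = open_clip.create_model_from_pretrained(MODEL)
tokenizer = open_clip.get_tokenizer(MODEL)

image = preprocess(Image.open("scan.png")).unsqueeze(0)    # placeholder image
texts = tokenizer(["chest X-ray", "histopathology slide"])

with torch.no_grad():
    # Forward pass returns normalized features plus the learned logit scale.
    img_feat, txt_feat, logit_scale = model(image, texts)
    probs = (logit_scale * img_feat @ txt_feat.T).softmax(dim=-1)
print(probs)
```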
google/siglip2-so400m-patch14-384
No description available.
facebook/PE-Core-L14-336
[Tech Report](https://arxiv.org/abs/2504.13181) [GitHub](https://github.com/facebookresearch/perception_models/) Perception Encod...
google/siglip2-base-patch16-224
No description available.
timm/ViT-B-16-SigLIP-i18n-256
Model Type: Contrastive Image-Text, Zero-Shot Image Classification. - Original: https://github.com/google-research/big_vision - Dataset: WebL...
timm/MobileCLIP2-S3-OpenCLIP
These weights and model card are adapted from the original Apple model at https://huggingface.co/apple/MobileCLIP2-S3. This version uses can...
laion/CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup
No description available.
wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M
No description available.
yuvalkirstain/PickScore_v1
No description available.
q-future/one-align
No description available.
laion/CLIP-ViT-B-16-laion2B-s34B-b88K
No description available.
google/siglip2-so400m-patch16-384
No description available.
timm/vit_base_patch16_plus_clip_240.laion400m_e31
dataset: LAION-400M...
timm/ViT-B-16-SigLIP2-256
A SigLIP 2 Vision-Language model trained on WebLI. This model has been converted for use in OpenCLIP from the original JAX checkpoints in Big...
google/siglip-base-patch16-256
SigLIP is CLIP, a multimodal model, with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a...
google/siglip2-so400m-patch14-224
No description available.
BAAI/AltCLIP
No description available.
apple/MobileCLIP-S2-OpenCLIP
No description available.
timm/vit_base_patch32_clip_224.laion400m_e32
dataset: LAION-400M...
imageomics/bioclip
No description available.
google/siglip2-giant-opt-patch16-384
No description available.
Xenova/clip-vit-base-patch32
No description available.
vinid/plip
No description available.
google/siglip2-base-patch16-512
No description available.
timm/ViT-SO400M-14-SigLIP-384
Model Type: Contrastive Image-Text, Zero-Shot Image Classification. - Original: https://github.com/google-research/big_vision - Dataset: WebL...
google/siglip-base-patch16-384
SigLIP is CLIP, a multimodal model, with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a...
OFA-Sys/chinese-clip-vit-base-patch16
No description available.
flaviagiammarino/pubmed-clip-vit-base-patch32
PubMedCLIP was trained on the Radiology Objects in COntext (ROCO) dataset, a large-scale multimodal medical imaging dataset. The ROCO datase...
Marqo/marqo-fashionCLIP
No description available.
google/siglip2-base-patch16-256
No description available.
laion/CLIP-ViT-bigG-14-laion2B-39B-b160k
No description available.
timm/vit_large_patch14_clip_336.openai
The CLIP model was developed by researchers at OpenAI to learn about what contributes to robustness in computer vision tasks. The model was ...
google/siglip2-large-patch16-512
No description available.
google/siglip2-large-patch16-384
No description available.
google/siglip2-large-patch16-256
No description available.
facebook/metaclip-b32-400m
The Demystifying CLIP Data paper aims to reveal CLIP's method around training data curation. OpenAI never open-sourced code regarding their ...
timm/ViT-SO400M-14-SigLIP
Model Type: Contrastive Image-Text, Zero-Shot Image Classification. - Original: https://github.com/google-research/big_vision - Dataset: WebL...
timm/ViT-B-32-SigLIP2-256
A SigLIP 2 Vision-Language model trained on WebLI. This model has been converted for use in OpenCLIP from the original JAX checkpoints in Big...
timm/vit_large_patch14_clip_224.metaclip_2pt5b
dataset: MetaCLIP-2.5B...
google/siglip2-base-patch16-384
No description available.
laion/CLIP-ViT-L-14-DataComp.XL-s13B-b90K
No description available.
google/siglip2-so400m-patch16-512
No description available.
laion/CLIP-ViT-L-14-CommonPool.XL-s13B-b90K
No description available.
timm/resnet50_clip.openai
No description available.
timm/vit_base_patch32_clip_224.laion400m_e31
dataset: LAION-400M...
timm/ViT-B-16-SigLIP2-512
A SigLIP 2 Vision-Language model trained on WebLI. This model has been converted for use in OpenCLIP from the original JAX checkpoints in Big...
LanguageBind/LanguageBind_Image
[ICLR 2024] LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment. If you like our project, ...
google/medsiglip-448
No description available.
google/siglip2-so400m-patch16-256
No description available.
laion/CLIP-convnext_xxlarge-laion2B-s34B-b82K-augreg-soup
No description available.
UCSC-VLAA/ViT-L-16-HTxt-Recap-CLIP
Model Type: Contrastive Image-Text, Zero-Shot Image Classification. - Original: https://github.com/UCSC-VLAA/Recap-DataComp-1B - Dataset: ht...
LanguageBind/LanguageBind_Video_merge
[ICLR 2024] LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment. If you like our project, ...
facebook/PE-Core-G14-448
[Tech Report](https://arxiv.org/abs/2504.13181) [GitHub](https://github.com/facebookresearch/perception_models/) Perception Encod...
timm/ViT-B-16-SigLIP-256
Model Type: Contrastive Image-Text, Zero-Shot Image Classification. - Original: https://github.com/google-research/big_vision - Dataset: WebL...
imageomics/bioclip-2
No description available.
google/siglip-large-patch16-384
SigLIP is CLIP, a multimodal model, with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a...
timm/ViT-SO400M-16-SigLIP2-512
A SigLIP 2 Vision-Language model trained on WebLI. This model has been converted for use in OpenCLIP from the original JAX checkpoints in Big...
zer0int/CLIP-GmP-ViT-L-14
New and greatly improved version of the model, check out: https://huggingface.co/zer0int/CLIP-KO-LITE-TypoAttack-Attn-Dropout-ViT-L-14...
timm/ViT-SO400M-16-SigLIP2-384
A SigLIP 2 Vision-Language model trained on WebLI. This model has been converted for use in OpenCLIP from the original JAX checkpoints in Big...
timm/ViT-SO400M-14-SigLIP2
A SigLIP 2 Vision-Language model trained on WebLI. This model has been converted for use in OpenCLIP from the original JAX checkpoints in Big...
facebook/metaclip-2-worldwide-huge-quickgelu
No description available.
timm/eva02_enormous_patch14_plus_clip_224.laion2b_s9b_b144k
No description available.
Xenova/clip-vit-base-patch16
No description available.
Salesforce/blip2-itm-vit-g
No description available.
timm/ViT-B-16-SigLIP
Model Type: Contrastive Image-Text, Zero-Shot Image Classification. - Original: https://github.com/google-research/big_vision - Dataset: WebL...
timm/ViT-B-16-SigLIP2
A SigLIP 2 Vision-Language model trained on WebLI. This model has been converted for use in OpenCLIP from the original JAX checkpoints in Big...
qihoo360/fg-clip2-so400m
No description available.
google/siglip-base-patch16-512
SigLIP is CLIP, a multimodal model, with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a...
timm/PE-Core-L-14-336
This is an OpenCLIP (image + text) remapped version of the original...
timm/ViT-gopt-16-SigLIP2-384
A SigLIP 2 Vision-Language model trained on WebLI. This model has been converted for use in OpenCLIP from the original JAX checkpoints in Big...
wkcn/TinyCLIP-ViT-61M-32-Text-29M-LAION400M
No description available.
google/siglip-so400m-patch14-224
SigLIP is CLIP, a multimodal model, with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a...
google/siglip-base-patch16-256-multilingual
SigLIP is CLIP, a multimodal model, with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a...
timm/eva02_base_patch16_clip_224.merged2b_s8b_b131k
No description available.
timm/eva02_large_patch14_clip_224.merged2b_s4b_b131k
No description available.
yujiepan/clip-vit-tiny-random-patch14-336
No description available.
laion/CLIP-ViT-B-16-DataComp.XL-s13B-b90K
No description available.
kakaobrain/align-base
No description available.
timm/ViT-SO400M-16-SigLIP2-256
A SigLIP 2 Vision-Language model trained on WebLI. This model has been converted for use in OpenCLIP from the original JAX checkpoints in Big...
laion/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K
No description available.
facebook/metaclip-h14-fullcc2.5b
The Demystifying CLIP Data paper aims to reveal CLIP's method around training data curation. OpenAI never open-sourced code regarding their ...
laion/CLIP-ViT-g-14-laion2B-s34B-b88K
No description available.
laion/CLIP-convnext_base_w-laion2B-s13B-b82K
No description available.
wisdomik/QuiltNet-B-32
Tags: zero-shot-image-classification, clip, vision, language, histopathology, histology, medical. library_tag: openclip. src: https://qu...
google/siglip-large-patch16-256
SigLIP is CLIP, a multimodal model, with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a...