Results for "image-feature-extraction"
100 matches found.
facebook/dinov2-small
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a self-supervised fash...
google/vit-base-patch16-224-in21k
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a supervised fashion, ...
facebook/dinov2-base
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a self-supervised fash...
facebook/dinov2-large
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a self-supervised fash...
facebook/dinov3-vitb16-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
facebook/dinov3-vitl16-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
timm/vit_small_patch14_reg4_dinov2.lvd142m
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 22.1 - GMACs: 29.6 - Activations (M): 57.5 - Image size: 51...
timm/vit_base_patch14_dinov2.lvd142m
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 86.6 - GMACs: 151.7 - Activations (M): 397.6 - Image size: ...
facebook/dino-vitb16
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a self-supervised fash...
nomic-ai/nomic-embed-vision-v1.5
No description available.
timm/vit_large_patch14_reg4_dinov2.lvd142m
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 304.4 - GMACs: 416.1 - Activations (M): 305.3 - Image size:...
timm/vit_base_patch14_reg4_dinov2.lvd142m
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 86.6 - GMACs: 117.5 - Activations (M): 115.0 - Image size: ...
timm/vit_base_patch16_clip_224.openai
The CLIP model was developed by researchers at OpenAI to learn about what contributes to robustness in computer vision tasks. The model was ...
facebook/dinov3-vits16-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
timm/vit_base_patch16_dinov3.lvd1689m
Model Type: Image Feature Encoder - Model Stats: - Params (M): 85.6 - GMACs: 23.6 - Activations (M): 34.1 - Image size: 256 x 256 - Original...
facebook/dinov2-giant
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a self-supervised fash...
timm/vit_large_patch16_dinov3.lvd1689m
Model Type: Image Feature Encoder - Model Stats: - Params (M): 303.1 - GMACs: 82.4 - Activations (M): 90.6 - Image size: 256 x 256 - Origina...
facebook/dinov3-vith16plus-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
facebook/dino-vits16
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a self-supervised fash...
facebook/dinov2-with-registers-base
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) originally introduced to do supervised image classification on Image...
paige-ai/Virchow2
Developed by: Paige, NYC, USA and Microsoft Research, Cambridge, MA USA - Model Type: Image feature backbone - Model Stats: - Params (M): 63...
timm/vit_large_patch14_dinov2.lvd142m
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 304.4 - GMACs: 507.1 - Activations (M): 1058.8 - Image size...
microsoft/rad-dino
RAD-DINO is described in detail in Exploring Scalable Medical Image Encoders Beyond Text Supervision (F. Pérez-García, H. Sharma, S. Bond-Ta...
timm/vit_small_patch14_dinov2.lvd142m
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 22.1 - GMACs: 46.8 - Activations (M): 198.8 - Image size: 5...
timm/samvit_base_patch16.sa1b
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 89.7 - GMACs: 486.4 - Activations (M): 1343.3 - Image size:...
facebook/dinov3-vits16plus-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
timm/vit_base_patch16_224.orig_in21k
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 85.8 - GMACs: 16.9 - Activations (M): 16.5 - Image size: 22...
google/vit-large-patch16-224-in21k
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a supervised fashion, ...
timm/vit_small_patch16_dinov3.lvd1689m
Model Type: Image Feature Encoder - Model Stats: - Params (M): 21.6 - GMACs: 6.3 - Activations (M): 17.0 - Image size: 256 x 256 - Original:...
prov-gigapath/prov-gigapath
Overview of Prov-GigaPath model architecture...
timm/convnext_tiny.dinov3_lvd1689m
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 27.8 - GMACs: 4.5 - Activations (M): 13.4 - Image size: 224...
MahmoodLab/TITAN
Developed by: Mahmood Lab AI for Pathology @ Harvard/BWH - Model type: Pretrained vision-language encoders - Pretraining dataset: Mass-340K,...
MahmoodLab/UNI2-h
Developed by: Mahmood Lab AI for Pathology @ Harvard/BWH - Model type: Pretrained vision backbone (ViT-H/14 via DINOv2) for multi-purpose ev...
facebook/dinov3-convnext-base-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
MahmoodLab/CONCH
No description available.
timm/vit_base_patch16_224.dino
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 85.8 - GMACs: 16.9 - Activations (M): 16.5 - Image size: 22...
TTPlanet/TTPLanet_SDXL_Controlnet_Tile_Realistic
Here's a refined version of the update notes for the Tile V2: -Introducing the new Tile V2, enhanced with a vastly improved training dataset...
google/vit-huge-patch14-224-in21k
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a supervised fashion, ...
StanfordAIMI/dinov2-base-xray-224
AIMI FMs: A Collection of Foundation Models in Radiology...
timm/vit_large_patch16_siglip_256.v2_webli
Dataset: webli - Papers: - SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Fea...
histai/hibou-L
No description available.
facebook/dinov3-convnext-small-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
timm/vit_base_patch16_siglip_512.v2_webli
Dataset: webli - Papers: - SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Fea...
PIA-SPACE-LAB/dinov3-vitl-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
facebook/dinov3-convnext-tiny-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
timm/vit_small_patch16_224.dino
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 21.7 - GMACs: 4.3 - Activations (M): 8.2 - Image size: 224 ...
timm/vit_large_patch16_dinov3.sat493m
Model Type: Image Feature Encoder - Model Stats: - Params (M): 303.1 - GMACs: 82.4 - Activations (M): 90.6 - Image size: 256 x 256 - Origina...
timm/vit_so400m_patch16_siglip_512.v2_webli
Dataset: webli - Papers: - SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Fea...
timm/convnextv2_tiny.fcmae
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 27.9 - GMACs: 4.5 - Activations (M): 13.4 - Image size: 224...
nvidia/RADIO-L
No description available.
bioptimus/H-optimus-0
- image-feature-extraction - timm - pathology - histology - medical imaging - self-supervised learning - vision transformer - foundation mod...
timm/convnext_small.dinov3_lvd1689m
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 49.5 - GMACs: 8.7 - Activations (M): 21.6 - Image size: 224...
timm/naflexvit_so400m_patch16_siglip.v2_webli
No description available.
Lin-Chen/ShareGPT4V-7B_Pretrained_vit-large336-l12
Model type: This is the vision tower of ShareGPT4V-7B fine-tuned with our ShareGPT4V dataset. Model date: This vision tower was trained in N...
py-feat/img2pose
img2pose uses Faster R-CNN to predict 6 Degree of Freedom Pose (DoF) for all faces in the photo. An interesting property of this model is th...
py-feat/resmasknet
resmasknet combines residual masking with unet architecture to predict 7 facial emotion categories from images....
timm/aimv2_large_patch14_224.apple_pt_dist
No description available.
facebook/dinov2-with-registers-large
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) originally introduced to do supervised image classification on Image...
MahmoodLab/UNI
Developed by: Mahmood Lab AI for Pathology @ Harvard/BWH - Model type: Pretrained vision backbone (ViT-L/16 via DINOv2) for multi-purpose ev...
timm/vit_base_patch16_224.mae
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 85.8 - GMACs: 17.6 - Activations (M): 23.9 - Image size: 22...
DAMO-NLP-SG/VL3-SigLIP-NaViT
No description available.
microsoft/rad-dino-maira-2
RAD-DINO-MAIRA-2 is a vision transformer model trained to encode chest X-rays using the self-supervised learning method DINOv2. RAD-DINO-MAI...
timm/vit_base_patch16_siglip_256.v2_webli
Dataset: webli - Papers: - SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Fea...
owkin/phikon-v2
Developed by: Owkin, Inc - Model type: Pretrained vision backbone (ViT-L/16 via DINOv2) - Pretraining dataset: PANCAN-XL, sourced from publi...
timm/vit_so400m_patch14_siglip_384.webli
No description available.
timm/vit_small_plus_patch16_dinov3.lvd1689m
Model Type: Image Feature Encoder - Model Stats: - Params (M): 28.7 - GMACs: 8.1 - Activations (M): 21.8 - Image size: 256 x 256 - Original:...
Xenova/dinov2-small
No description available.
timm/convnext_base.dinov3_lvd1689m
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 87.6 - GMACs: 15.4 - Activations (M): 28.8 - Image size: 22...
facebook/dinov3-vitl16-pretrain-sat493m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
timm/vit_so400m_patch16_siglip_256.v2_webli
Dataset: webli - Papers: - SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Fea...
timm/vit_small_patch8_224.dino
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 21.7 - GMACs: 16.8 - Activations (M): 32.9 - Image size: 22...
facebook/dinov2-with-registers-small
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) originally introduced to do supervised image classification on Image...
facebook/dinov3-vit7b16-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
timm/vit_base_patch16_siglip_224.v2_webli
Dataset: webli - Papers: - SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Fea...
timm/vit_huge_patch14_clip_224.laion2b
No description available.
paige-ai/Virchow
Developed by: Paige, NYC, USA and Microsoft Research, Cambridge, MA USA - Model Type: Image feature backbone - Model Stats: - Params (M): 63...
gwkrsrch2/siglip2-so400m-patch16-384
No description available.
timm/eva02_base_patch14_224.mim_in22k
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 85.8 - GMACs: 23.2 - Activations (M): 36.6 - Image size: 22...
timm/vit_base_patch16_dinov3_qkvb.lvd1689m
Model Type: Image Feature Encoder - Model Stats: - Params (M): 85.7 - GMACs: 23.6 - Activations (M): 34.1 - Image size: 256 x 256 - Original...
timm/vit_so400m_patch14_siglip_224.v2_webli
Dataset: webli - Papers: - SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Fea...
timm/vit_large_patch16_224.mae
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 303.3 - GMACs: 61.6 - Activations (M): 63.5 - Image size: 2...
timm/vit_7b_patch16_dinov3.lvd1689m
Model Type: Image Feature Encoder - Model Stats: - Params (M): 6716.0 - GMACs: 1775.1 - Activations (M): 515.9 - Image size: 256 x 256 - Ori...
timm/convnextv2_base.fcmae
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 87.7 - GMACs: 15.4 - Activations (M): 28.8 - Image size: 22...
yujiepan/tiny-random-swin-patch4-window7-224
No description available.
timm/vit_huge_plus_patch16_dinov3.lvd1689m
Model Type: Image Feature Encoder - Model Stats: - Params (M): 840.5 - GMACs: 224.9 - Activations (M): 193.6 - Image size: 256 x 256 - Origi...
bioptimus/H-optimus-1
- image-feature-extraction - timm - pathology - histology - medical imaging - self-supervised learning - vision transformer - foundation mod...
timm/vit_base_patch8_224.dino
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 85.8 - GMACs: 66.9 - Activations (M): 65.7 - Image size: 22...
timm/sam2_hiera_small.fb_r896_2pt1
No description available.
facebook/dinov3-convnext-large-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
google/vit-base-patch32-224-in21k
The Vision Transformer (ViT) is a transformer encoder model (BERT-like) pretrained on a large collection of images in a supervised fashion, ...
hi-wesley/gemma3-vision-encoder
No description available.
camenduru/dinov3-vitl16-pretrain-lvd1689m
These are Vision Transformer and ConvNeXt models trained following the method described in the DINOv3 paper. 12 models are provided: - 10 mo...
facebook/ijepa_vith14_1k
No description available.
timm/convnext_large_mlp.clip_laion2b_ft_soup_320
No description available.
timm/vit_small_patch16_dinov3_qkvb.lvd1689m
Model Type: Image Feature Encoder - Model Stats: - Params (M): 21.6 - GMACs: 6.3 - Activations (M): 17.0 - Image size: 256 x 256 - Original:...
bioptimus/H0-mini
- image-feature-extraction - timm - pathology - histology - medical imaging - self-supervised learning - vision transformer - foundation mod...
timm/convnext_large.dinov3_lvd1689m
Model Type: Image classification / feature backbone - Model Stats: - Params (M): 196.2 - GMACs: 34.4 - Activations (M): 43.1 - Image size: 2...
Xenova/dino-vits16
No description available.
timm/vit_base_patch32_clip_224.laion2b
No description available.
OpenGVLab/InternViT-300M-448px
Model Type: vision foundation model, feature backbone - Model Stats: - Params (M): 304 - Image size: 448 x 448, training with 1 - 12 tiles -...
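Checkpoints tagged with this task can typically be loaded through the `transformers` image-feature-extraction pipeline. A minimal sketch, assuming `transformers`, `torch`, and `Pillow` are installed; `facebook/dinov2-small` is just one entry from the list above (any compatible checkpoint works), and its weights are downloaded from the Hub on first use:

```python
# Minimal sketch: extract image embeddings with a model from the results above.
# Assumes transformers, torch, and Pillow are installed; the checkpoint is
# fetched from the Hugging Face Hub on first use.
from PIL import Image
from transformers import pipeline

extractor = pipeline(
    task="image-feature-extraction",
    model="facebook/dinov2-small",  # swap in any checkpoint listed above
)

# Stand-in for a real image; replace with Image.open("photo.jpg") in practice.
image = Image.new("RGB", (224, 224))

# Returns nested lists of floats: [batch, num_tokens, hidden_dim].
features = extractor(image)
print(len(features[0]), len(features[0][0]))  # token count, embedding width
```

For `timm/`-prefixed checkpoints, the equivalent route is `timm.create_model(name, pretrained=True, num_classes=0)`, which drops the classifier head and returns pooled features from `forward()`.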