Results for "video-classification"
51 matches found.
microsoft/xclip-base-patch32
X-CLIP is a minimal extension of CLIP for general video-language understanding. The model is trained in a contrastive way on (video, text) p...
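The contrastive (video, text) training mentioned above means that at inference a video embedding is scored against a set of candidate label embeddings. A minimal numpy sketch of that matching rule, using random placeholder embeddings rather than real X-CLIP outputs:

```python
import numpy as np

def contrastive_probs(video_emb, text_embs, temperature=0.01):
    """L2-normalize both sides, take cosine similarities, and softmax
    them into a probability over the candidate labels -- the CLIP-style
    matching rule used at inference."""
    v = video_emb / np.linalg.norm(video_emb)
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = t @ v / temperature
    sims -= sims.max()              # subtract max for numerical stability
    exp = np.exp(sims)
    return exp / exp.sum()

# Placeholder embeddings standing in for real model outputs.
rng = np.random.default_rng(0)
video = rng.normal(size=512)        # one video embedding
texts = rng.normal(size=(3, 512))   # three candidate label embeddings
probs = contrastive_probs(video, texts)
```

`probs` is a length-3 distribution over the candidate labels; the temperature value here is illustrative, not the trained model's.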
google/videoprism-base-f16r288
We release the following model variants: | Model Name | Configuration Name | Model Type | Backbone | #Params | File Size | Checkpoint | | --...
ai-forever/kandinsky-videomae-large-camera-motion
VideoMAE model (`large`) variant that has been finetuned for multi-label video classification (a video can belong to multiple classes simulta...
google/videoprism-lvt-base-f16r288
We release the following model variants: | Model Name | Configuration Name | Model Type | Backbone | #Params | File Size | Checkpoint | | --...
facebook/vjepa2-vitg-fpc64-256
No description available.
facebook/timesformer-base-finetuned-k400
No description available.
facebook/vjepa2-vitl-fpc64-256
No description available.
MCG-NJU/videomae-base
VideoMAE is an extension of Masked Autoencoders (MAE) to video. The architecture of the model is very similar to that of a standard Vision T...
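VideoMAE's masked-autoencoding pretraining hides a very high fraction of tokens using tube masking: the same spatial patch positions are masked in every frame. A minimal sketch of that masking scheme, assuming a 16-frame clip and 14x14 patches per frame (224x224 input, 16x16 patches):

```python
import numpy as np

def tube_mask(num_frames=16, patches_per_frame=196, mask_ratio=0.9, seed=0):
    """Tube masking: draw one spatial mask and repeat it across all
    frames, so masked patch positions form 'tubes' through time."""
    rng = np.random.default_rng(seed)
    num_masked = int(patches_per_frame * mask_ratio)
    masked = rng.choice(patches_per_frame, size=num_masked, replace=False)
    frame_mask = np.zeros(patches_per_frame, dtype=bool)
    frame_mask[masked] = True
    return np.tile(frame_mask, (num_frames, 1))  # shape (frames, patches)

mask = tube_mask()
```

Every row of `mask` is identical, and roughly 90% of tokens are hidden; the exact ratio and patch grid are illustrative defaults, not read from any one checkpoint's config.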
google/vivit-b-16x2-kinetics400
ViViT is an extension of the Vision Transformer (ViT) to video. We refer to the paper for details....
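The "16x2" in the model id refers to the tubelet embedding: the clip is split into non-overlapping blocks of 16x16 pixels by 2 frames, each becoming one transformer token. A small sketch of the resulting token count, assuming a 32-frame 224x224 clip:

```python
def vivit_token_count(frames=32, height=224, width=224, patch=16, tubelet=2):
    """Number of tokens after ViViT tubelet embedding: one token per
    non-overlapping (tubelet x patch x patch) block of the clip."""
    return (frames // tubelet) * (height // patch) * (width // patch)

tokens = vivit_token_count()  # 16 temporal x 14 x 14 spatial = 3136
```

The 32-frame clip length is an assumption for illustration; the per-block arithmetic is what the "16x2" naming encodes.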
MCG-NJU/videomae-base-finetuned-kinetics
VideoMAE is an extension of Masked Autoencoders (MAE) to video. The architecture of the model is very similar to that of a standard Vision T...
facebook/vjepa2-vitg-fpc64-384-ssv2
No description available.
microsoft/xclip-base-patch16-16-frames
X-CLIP is a minimal extension of CLIP for general video-language understanding. The model is trained in a contrastive way on (video, text) p...
facebook/vjepa2-vitg-fpc64-384
No description available.
facebook/vjepa2-vitl-fpc16-256-ssv2
No description available.
OpenGVLab/VideoMAEv2-Base
No description available.
microsoft/xclip-base-patch32-16-frames
X-CLIP is a minimal extension of CLIP for general video-language understanding. The model is trained in a contrastive way on (video, text) p...
google/vivit-b-16x2
ViViT is an extension of the Vision Transformer (ViT) to video. We refer to the paper for details....
google/videoprism-lvt-large-f8r288
We release the following model variants: | Model Name | Configuration Name | Model Type | Backbone | #Params | File Size | Checkpoint | | --...
OpenGVLab/VideoMAEv2-Large
No description available.
microsoft/xclip-large-patch14
X-CLIP is a minimal extension of CLIP for general video-language understanding. The model is trained in a contrastive way on (video, text) p...
microsoft/xclip-base-patch16
X-CLIP is a minimal extension of CLIP for general video-language understanding. The model is trained in a contrastive way on (video, text) p...
facebook/timesformer-base-finetuned-k600
No description available.
microsoft/xclip-base-patch16-zero-shot
X-CLIP is a minimal extension of CLIP for general video-language understanding. The model is trained in a contrastive way on (video, text) p...
facebook/vjepa2-vith-fpc64-256
No description available.
MCG-NJU/videomae-large
VideoMAE is an extension of Masked Autoencoders (MAE) to video. The architecture of the model is very similar to that of a standard Vision T...
OpenGVLab/VideoMAEv2-Huge
No description available.
OpenGVLab/InternVideo2-Stage2_6B
No description available.
MCG-NJU/videomae-small-finetuned-kinetics
VideoMAE is an extension of Masked Autoencoders (MAE) to video. The architecture of the model is very similar to that of a standard Vision T...
MCG-NJU/videomae-large-finetuned-kinetics
VideoMAE is an extension of Masked Autoencoders (MAE) to video. The architecture of the model is very similar to that of a standard Vision T...
MCG-NJU/videomae-base-finetuned-ssv2
VideoMAE is an extension of Masked Autoencoders (MAE) to video. The architecture of the model is very similar to that of a standard Vision T...
MCG-NJU/videomae-huge-finetuned-kinetics
VideoMAE is an extension of Masked Autoencoders (MAE) to video. The architecture of the model is very similar to that of a standard Vision T...
google/videoprism-large-f8r288
We release the following model variants: | Model Name | Configuration Name | Model Type | Backbone | #Params | File Size | Checkpoint | | --...
facebook/timesformer-base-finetuned-ssv2
No description available.
qubvel-hf/vjepa2-vitl-fpc16-256-ssv2
No description available.
OpenGVLab/VideoMAEv2-giant
No description available.
MCG-NJU/videomae-small-finetuned-ssv2
VideoMAE is an extension of Masked Autoencoders (MAE) to video. The architecture of the model is very similar to that of a standard Vision T...
facebook/vjepa2-vitl-fpc32-256-diving48
No description available.
ttyh/videomae-base-finetuned-ucf101-subset
More information needed...
mitegvg/videomae-tiny-92-kinetics-binary-finetuned-xd-violence
More information needed...
MCG-NJU/videomae-base-ssv2
VideoMAE is an extension of Masked Autoencoders (MAE) to video. The architecture of the model is very similar to that of a standard Vision T...
MCG-NJU/videomae-base-short
VideoMAE is an extension of Masked Autoencoders (MAE) to video. The architecture of the model is very similar to that of a standard Vision T...
Nikeytas/videomae-crime-detector-ultra-v1
No description available.
KhoiBui/tiktok-video-safety-classifier
Base Model: VideoMAE (MCG-NJU/videomae-base-finetuned-kinetics) - Task: Binary classification (safe/harmful) - Input: 16 frames, 224x224...
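Fixed-length inputs like the "16 frames, 224x224" above are usually built by sampling frame indices uniformly across the whole video. A minimal sketch of that sampling step (the helper name and the 300-frame example are illustrative, not from the model card):

```python
import numpy as np

def sample_frame_indices(total_frames, clip_len=16):
    """Uniformly sample `clip_len` frame indices spanning the video,
    the usual way a fixed-length clip is built for VideoMAE-style models."""
    return np.linspace(0, total_frames - 1, num=clip_len).astype(int)

idx = sample_frame_indices(300)  # e.g. a 10 s video at 30 fps
```

The selected frames would then be resized/cropped to 224x224 before being stacked into the model input.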
qualcomm/ResNet-Mixed-Convolution
Model Type: video classification Model Stats: - Model checkpoint: Kinetics-400 - Input resolution: 112x112 - Number of parameter...
DanJoshua/videomae-base-finetuned-rwf2000-subset
More information needed...
Naman712/Deep-fake-detection
No description available.
Shawon16/timesformer_wlasl_100_200ep_coR_
More information needed...
nateraw/videomae-base-finetuned-ucf101-subset
More information needed...
Ammar2k/videomae-base-finetuned-deepfake-subset
No description available.
muneeb1812/videomae-base-fake-video-classification
More information needed...
TanAlexanderlz/ALL_RGBCROP_ori16F-8B16F-GACWD1
More information needed...