iitolstykh/mivolo_v2

Model Documentation

Model Card for MiVOLO V2



🤗 Space | 🌐 GitHub | 📜 MiVOLO Paper (2023) | 📜 MiVOLO Paper (2024)



We introduce a state-of-the-art multi-input transformer for age and gender estimation.

This model was trained on proprietary and open-source datasets.

[Figure: MiVOLO V1 (224x224) architecture]

Inference Requirements and Model Introduction



+ Resolution: Width and height of face/body crops must be 384px
+ Precision: FP32 / FP16
+ mivolo library
shell
pip install git+https://github.com/WildChlamydia/MiVOLO.git
+ transformers==4.51.0
+ accelerate==1.8.1
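
Because the requirements pin exact versions of transformers and accelerate, it can help to confirm the environment matches before loading the model. The following is a minimal sketch using only the standard library; the helper name `find_version_mismatches` and the `PINNED` dictionary are hypothetical and not part of the mivolo library.

```python
from importlib import metadata

# Pins copied from the requirements list in this model card.
PINNED = {"transformers": "4.51.0", "accelerate": "1.8.1"}


def find_version_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed at all.
    `get_version` is injectable so the check can be exercised without
    touching the real environment.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_version_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means both pins are satisfied. Whether other versions also work is not stated in this card, so a mismatch is a warning sign rather than a guaranteed failure.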

Quick start



python
from transformers import AutoModelForImageClassification, AutoConfig, AutoImageProcessor
import torch
import cv2
import numpy as np
import requests

# load model and image processor
config = AutoConfig.from_pretrained("iitolstykh/mivolo_v2", trust_remote_code=True)
mivolo_model = AutoModelForImageClassification.from_pretrained(
    "iitolstykh/mivolo_v2", trust_remote_code=True, torch_dtype=torch.float16
)
image_processor = AutoImageProcessor.from_pretrained(
    "iitolstykh/mivolo_v2", trust_remote_code=True
)

# download test image
resp = requests.get('https://variety.com/wp-content/uploads/2023/04/MCDNOHA_SP001.jpg')
arr = np.asarray(bytearray(resp.content), dtype=np.uint8)
image = cv2.imdecode(arr, -1)

# face crops
x1, y1, x2, y2 = [625, 46, 686, 121]
faces_crops = [image[y1:y2, x1:x2]]  # may be [None] if bodies_crops is not None

# body crops
x1, y1, x2, y2 = [534, 16, 790, 559]
bodies_crops = [image[y1:y2, x1:x2]]  # may be [None] if faces_crops is not None

# prepare BGR inputs
faces_input = image_processor(images=faces_crops)["pixel_values"]
body_input = image_processor(images=bodies_crops)["pixel_values"]

faces_input = faces_input.to(dtype=mivolo_model.dtype, device=mivolo_model.device)
body_input = body_input.to(dtype=mivolo_model.dtype, device=mivolo_model.device)

# inference
output = mivolo_model(faces_input=faces_input, body_input=body_input)

# print results
age = output.age_output[0].item()
print(f"age: {round(age, 2)}")

id2label = config.gender_id2label
gender = id2label[output.gender_class_idx[0].item()]
gender_prob = output.gender_probs[0].item()
print(f"gender: {gender} [{int(gender_prob * 100)}%]")
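
The quick start uses hard-coded box coordinates. With boxes from a detector, coordinates can fall outside the image, and NumPy slicing silently wraps negative indices instead of raising an error. The sketch below shows one way to guard the cropping step. Both helpers (`clamp_box`, `check_crop_pairs`) are hypothetical and not part of the mivolo library, and the pairing rule (equal-length lists, at least one crop per person) is an assumption inferred from the "may be [None]" comments in the quick start.

```python
def clamp_box(box, width, height):
    """Clip an (x1, y1, x2, y2) box to a width x height image.

    Returns the clipped box, or None when nothing of the box lies
    inside the image.
    """
    x1, y1, x2, y2 = box
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    if x2 <= x1 or y2 <= y1:
        return None
    return x1, y1, x2, y2


def check_crop_pairs(faces_crops, bodies_crops):
    """Raise ValueError unless the two lists align person by person.

    Assumption: entry i of each list describes the same person, and a
    person needs a face crop, a body crop, or both.
    """
    if len(faces_crops) != len(bodies_crops):
        raise ValueError("faces_crops and bodies_crops must have the same length")
    for i, (face, body) in enumerate(zip(faces_crops, bodies_crops)):
        if face is None and body is None:
            raise ValueError(f"person {i} has neither a face crop nor a body crop")
```

In the quick start this would be applied as `clamp_box((x1, y1, x2, y2), image.shape[1], image.shape[0])` before slicing, using `None` as the crop when the helper returns `None`.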



Model Metrics



Model | Test Dataset | Age Accuracy (%) | Gender Accuracy (%)
mivolov2_384x384 (fp16) | Adience | 70.2 | 97.3
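
For readers reproducing numbers of this kind, accuracy is conventionally the percentage of samples whose predicted label equals the ground-truth label. The sketch below illustrates that definition only; it is not the authors' evaluation script, and the function name `accuracy_percent` is hypothetical. For age on Adience the labels would be age groups rather than exact ages, and how a predicted age is mapped to a group is not described in this card.

```python
def accuracy_percent(predicted, expected):
    """Percentage of positions where the predicted label equals the expected one."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one sample is required")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)


# Toy labels for illustration only, not real evaluation data.
toy_predicted = ["male", "female", "female", "male"]
toy_expected = ["male", "female", "male", "male"]
print(accuracy_percent(toy_predicted, toy_expected))  # 3 of 4 correct -> 75.0
```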


Citation



🌟 If you find our work helpful, please consider citing our papers and starring the repository.

bibtex
@article{mivolo2023,
   Author = {Maksim Kuprashevich and Irina Tolstykh},
   Title = {MiVOLO: Multi-input Transformer for Age and Gender Estimation},
   Year = {2023},
   Eprint = {arXiv:2307.04616},
}
bibtex
@article{mivolo2024,
   Author = {Maksim Kuprashevich and Grigorii Alekseenko and Irina Tolstykh},
   Title = {Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation},
   Year = {2024},
   Eprint = {arXiv:2403.02302},
}


License



Please see here.

Files & Weights

Filename | Size
model.safetensors | 0.11 GB