# Model card for vit_small_patch16_256.tcga_brca_dino

A Vision Transformer (ViT) image feature model, trained with DINO self-supervision on 2M histology patches from TCGA-BRCA.
## Model Details

- **Model Type:** Feature backbone
- **Model Stats:**
  - Params (M): 21.7
  - Image size: 256 x 256 x 3
- **Papers:**
  - Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology: https://arxiv.org/abs/2203.00585
- **Dataset:** TCGA BRCA: https://portal.gdc.cancer.gov/
- **Original:** https://github.com/Richarizardd/Self-Supervised-ViT-Path/
- **License:** GPLv3
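The model name encodes the patch geometry: 16x16 patches over a 256x256 input. As a quick sanity check on the stats above, the implied token count can be computed directly (a sketch; the class token is the standard ViT addition):

```python
# patch grid implied by the model name vit_small_patch16_256
img_size, patch_size = 256, 16
grid = img_size // patch_size       # patches per side
num_patches = grid * grid           # patch tokens fed to the transformer
print(grid, num_patches)            # 16 patches per side, 256 tokens (plus one class token)
```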
## Model Usage

### Image Embeddings
```python
from urllib.request import urlopen

from PIL import Image
import timm

# get example histology image
img = Image.open(
    urlopen(
        "https://github.com/owkin/HistoSSLscaling/raw/main/assets/example.tif"
    )
)

# load model from the hub
model = timm.create_model(
    model_name="hf-hub:1aurent/vit_small_patch16_256.tcga_brca_dino",
    pretrained=True,
).eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor
```
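The resulting `(batch_size, num_features)` embeddings are typically compared with cosine similarity, e.g. for patch retrieval or clustering. A minimal sketch using random stand-in embeddings (384 is ViT-Small's feature dimension; substitute real model outputs in practice):

```python
import torch
import torch.nn.functional as F

# stand-in for model outputs: 4 patch embeddings of dimension 384 (ViT-Small)
emb = torch.randn(4, 384)

# L2-normalize, then compute the pairwise cosine-similarity matrix
emb = F.normalize(emb, dim=-1)
sim = emb @ emb.T               # shape (4, 4)

print(sim.shape)
print(sim.diagonal())           # each embedding's self-similarity is ~1
```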
## Citation

```bibtex
@misc{chen2022selfsupervised,
  title         = {Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology},
  author        = {Richard J. Chen and Rahul G. Krishnan},
  year          = {2022},
  eprint        = {2203.00585},
  archiveprefix = {arXiv},
  primaryclass  = {cs.CV}
}
```