Tags: convnext, convolutional neural network, simpool, dino, computer vision, deep learning

Self-supervised ConvNeXt-S model

The official ConvNeXt-S model, trained on ImageNet-1k for 100 epochs with DINO self-supervision. Reproduced for the ICCV 2023 SimPool paper.

SimPool is a simple attention-based pooling method applied at the end of the network, released in this repository. Disclaimer: This model card is written by the author of SimPool, i.e. Bill Psomas.
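To make "attention-based pooling at the end of the network" concrete, here is a minimal sketch in plain Python. It is a generic illustration and not the SimPool implementation: it assumes the query is the global average of the feature vectors and uses unprojected scaled dot-product attention, and the function name `attention_pool` is hypothetical. See the SimPool paper for the actual formulation.

```python
import math


def attention_pool(features):
    """Pool N feature vectors (each of length d) into a single vector.

    Generic sketch (an assumption, not SimPool itself): the query is the
    mean of the features, the attention weights are a softmax over scaled
    dot products, and the output is the attention-weighted average.
    """
    n, d = len(features), len(features[0])
    # Query: global average of all feature vectors.
    query = [sum(f[j] for f in features) / n for j in range(d)]
    # Scaled dot-product score between the query and each feature vector.
    scores = [sum(q * x for q, x in zip(query, f)) / math.sqrt(d) for f in features]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted average of the feature vectors.
    pooled = [sum(w * f[j] for w, f in zip(weights, features)) for j in range(d)]
    return pooled, weights
```

Compared with plain average pooling, which weights every spatial location equally, this kind of pooling lets locations that agree with the query contribute more to the final representation.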

Evaluation with k-NN

k     top-1 (%)   top-5 (%)
10    59.342      80.058
20    59.224      82.252
100   56.468      83.256
200   54.878      82.754
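For readers unfamiliar with k-NN evaluation of frozen features, the sketch below shows how top-1 and top-5 numbers of this kind could be computed. It is an illustration under stated assumptions, not the script that produced the table: it assumes cosine similarity and similarity-weighted voting with a temperature of 0.07, and the function names (`normalize`, `knn_predict`, `topk_accuracy`) are hypothetical. The exact settings used for the results above are not stated in this card.

```python
import math
from collections import defaultdict


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def knn_predict(query, train_feats, train_labels, k, temperature=0.07):
    """Rank class labels by weighted votes among the k nearest neighbours.

    Assumption: neighbours are found by cosine similarity and each one
    votes with weight exp(similarity / temperature).
    """
    q = normalize(query)
    sims = [
        (sum(a * b for a, b in zip(q, normalize(f))), y)
        for f, y in zip(train_feats, train_labels)
    ]
    sims.sort(key=lambda t: t[0], reverse=True)
    votes = defaultdict(float)
    for sim, label in sims[:k]:
        votes[label] += math.exp(sim / temperature)
    return sorted(votes, key=votes.get, reverse=True)


def topk_accuracy(test_feats, test_labels, train_feats, train_labels, k, topn=1):
    """Percentage of test samples whose label is among the top-n ranked classes."""
    hits = 0
    for f, y in zip(test_feats, test_labels):
        ranked = knn_predict(f, train_feats, train_labels, k)
        hits += y in ranked[:topn]
    return 100.0 * hits / len(test_labels)
```

In practice the features would come from the frozen backbone, and a vectorised implementation would be needed at ImageNet scale; this version only illustrates the logic.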

BibTeX entry and citation info

@misc{psomas2023simpool,
      title={Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?}, 
      author={Bill Psomas and Ioannis Kakogeorgiou and Konstantinos Karantzalos and Yannis Avrithis},
      year={2023},
      eprint={2309.06891},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
@inproceedings{liu2022convnet,
  title={A convnet for the 2020s},
  author={Liu, Zhuang and Mao, Hanzi and Wu, Chao-Yuan and Feichtenhofer, Christoph and Darrell, Trevor and Xie, Saining},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  pages={11976--11986},
  year={2022}
}
@inproceedings{caron2021emerging,
  title={Emerging properties in self-supervised vision transformers},
  author={Caron, Mathilde and Touvron, Hugo and Misra, Ishan and J{\'e}gou, Herv{\'e} and Mairal, Julien and Bojanowski, Piotr and Joulin, Armand},
  booktitle={Proceedings of the IEEE/CVF international conference on computer vision},
  pages={9650--9660},
  year={2021}
}