Model Card: vit-base-patch16-224-in21k Fine-Tuned on GLDv2 Top 51 Categories
This model is vit-base-patch16-224-in21k fine-tuned on a subset of the dataset from the 2021 Kaggle Google Landmark competition, restricted to the top 51 categories. The dataset is available on Hugging Face at: https://huggingface.co/datasets/pemujo/GLDv2_Top_51_Categories
- Developed by: Pedro Melendez
- Model type: Vision transformer
- Finetuned from model: vit-base-patch16-224-in21k
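A minimal sketch of how the starting point described above could be set up with the `transformers` library. The Hub ID `google/vit-base-patch16-224-in21k` (this card does not state the `google/` prefix) and the helper names `num_patches` and `load_base_for_finetuning` are assumptions for illustration, not the author's training script.

```python
# Assumption: the base checkpoint lives on the Hugging Face Hub under the
# "google/" namespace; this card only names "vit-base-patch16-224-in21k".
BASE_CHECKPOINT = "google/vit-base-patch16-224-in21k"
NUM_LABELS = 51  # the top 51 landmark categories


def num_patches(image_size: int = 224, patch_size: int = 16) -> int:
    """Number of non-overlapping square patches a ViT cuts an image into."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def load_base_for_finetuning(checkpoint: str = BASE_CHECKPOINT,
                             num_labels: int = NUM_LABELS):
    """Hypothetical helper: load the image processor and a ViT with a new 51-way head."""
    # Imported lazily so the rest of this sketch runs without transformers installed.
    from transformers import ViTForImageClassification, ViTImageProcessor

    processor = ViTImageProcessor.from_pretrained(checkpoint)
    # num_labels sizes the classification layer, which is newly initialized
    # and has to be trained during fine-tuning.
    model = ViTForImageClassification.from_pretrained(checkpoint, num_labels=num_labels)
    return processor, model
```

With the defaults, `num_patches()` gives 196 patches (14 per side) for each 224x224 input image; the encoder sees these plus one classification token.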
Training Data
Classes with more than 500 images in the 2021 Kaggle Google Landmark competition dataset: https://huggingface.co/datasets/pemujo/GLDv2_Top_51_Categories
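The selection rule above can be sketched as a simple count-and-filter step. This is an illustration of the criterion only, not the author's preprocessing code; the helper names are hypothetical, and the dataset ID is taken from the URL above.

```python
from collections import Counter
from typing import Hashable, Iterable, List

DATASET_ID = "pemujo/GLDv2_Top_51_Categories"  # taken from the dataset URL


def classes_above_threshold(labels: Iterable[Hashable],
                            min_images: int = 500) -> List[Hashable]:
    """Return classes with strictly more than `min_images` examples, most frequent first."""
    counts = Counter(labels)
    return [label for label, n in counts.most_common() if n > min_images]


def load_landmark_subset():
    """Hypothetical helper: download the published subset from the Hugging Face Hub."""
    # Lazy import; requires the `datasets` package and network access.
    from datasets import load_dataset

    return load_dataset(DATASET_ID)
```

For example, given one label per image, `classes_above_threshold(labels)` keeps a class with 501 images and drops one with exactly 500.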
Results
| Metric | Value |
|---|---|
| epoch | 4 |
| eval_accuracy | 0.97411 |
| eval_loss | 0.11560 |
| eval_runtime (secs) | 79.0939 |
| eval_samples_per_second | 115.255 |
| eval_steps_per_second | 14.413 |
| train_runtime (secs) | 4082.92 |
| train_samples_per_second | 35.722 |
| train_steps_per_second | 2.233 |
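The throughput figures above imply a few quantities the card does not state directly. The sketch below derives them, assuming the numbers follow the usual Hugging Face `Trainer` convention (samples or steps divided by wall-clock runtime). The batch sizes and sample counts it produces are inferences, not values reported by the author.

```python
# Figures copied from the results above.
EPOCHS = 4
EVAL_RUNTIME_S = 79.0939
EVAL_SAMPLES_PER_S = 115.255
EVAL_STEPS_PER_S = 14.413
TRAIN_RUNTIME_S = 4082.92
TRAIN_SAMPLES_PER_S = 35.722
TRAIN_STEPS_PER_S = 2.233


def derived_quantities() -> dict:
    """Back out sample counts and effective batch sizes from the reported rates."""
    eval_samples = EVAL_RUNTIME_S * EVAL_SAMPLES_PER_S
    train_samples_seen = TRAIN_RUNTIME_S * TRAIN_SAMPLES_PER_S  # summed over all epochs
    return {
        "train_minutes": TRAIN_RUNTIME_S / 60,
        "eval_samples": eval_samples,
        "train_samples_per_epoch": train_samples_seen / EPOCHS,
        # samples per step = effective batch size (includes any gradient accumulation)
        "train_batch_size": TRAIN_SAMPLES_PER_S / TRAIN_STEPS_PER_S,
        "eval_batch_size": EVAL_SAMPLES_PER_S / EVAL_STEPS_PER_S,
    }


if __name__ == "__main__":
    for name, value in derived_quantities().items():
        print(f"{name}: {value:.1f}")
```

Under that assumption, the training runtime comes to about 68 minutes, which agrees with the figure under Environmental Impact, and the rates suggest an effective batch size of roughly 16 for training and 8 for evaluation, over an evaluation set of about 9,116 images.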
Environmental Impact
- Hardware Type: NVIDIA V100
- Minutes used: 68
- Cloud Provider: Google Cloud
- Compute Region: us-central
Compute Infrastructure
Google Cloud Workbench Instance
Hardware
GCP Workbench n1-highmem-8 instance with an NVIDIA V100 GPU
Software
Python 3.9, PyTorch 2.0.1+cu117