---
tags:
- vision
- image-segmentation
- generated_from_trainer
---


# segformer-b0-finetuned-segments-stamp-verification

This model is a fine-tuned version of [nvidia/mit-b0](https://huggingface.co/nvidia/mit-b0) on the bilal01/stamp-verification dataset. It achieves the following results on the evaluation set (final epoch; see the full table under Training results):

- Loss: 0.0372
- Mean IoU: 0.1908
- Mean Accuracy: 0.3817
- Overall Accuracy: 0.3817

## Model description

StampSegNet is a semantic segmentation model fine-tuned on a custom dataset built specifically for stamp segmentation. It is based on the SegFormer architecture (nvidia/mit-b0) and was fine-tuned with the Hugging Face Transformers library to segment stamps from images accurately and efficiently.

The model classifies each pixel of an image as either belonging to a stamp or not. By leveraging stamp-specific features such as intricate designs, borders, and distinct colors, StampSegNet produces pixel-level segmentation maps that trace the exact boundaries of stamps within an image.
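A minimal inference sketch with the Transformers library follows; the checkpoint id and the label-to-index mapping (taken from the results table below) are assumptions, not confirmed by this card:

```python
import torch
from PIL import Image
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation

ckpt = "bilal01/segformer-b0-finetuned-segments-stamp-verification"  # assumed Hub repo id
processor = SegformerImageProcessor.from_pretrained(ckpt)
model = SegformerForSemanticSegmentation.from_pretrained(ckpt)

image = Image.open("document_with_stamp.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # (1, num_labels, H/4, W/4)

# SegFormer predicts at 1/4 resolution; upsample to the input size, then
# take the per-pixel argmax to get the segmentation map.
upsampled = torch.nn.functional.interpolate(
    logits, size=image.size[::-1], mode="bilinear", align_corners=False
)
mask = upsampled.argmax(dim=1)[0]  # 0 = unlabeled, 1 = stamp (assumed mapping)
```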

## Intended uses & limitations

Stamp Collection Management: The StampSegNet model can be used by stamp collectors and enthusiasts to automatically segment stamps from images. It simplifies the process of organizing and cataloging stamp collections by accurately identifying and isolating stamps, saving time and effort.

E-commerce Platforms: Online marketplaces and auction platforms catering to stamp sellers and buyers can integrate the StampSegNet model to enhance their user experience. Sellers can easily upload images of stamps, and the model can automatically extract and display segmented stamps, facilitating search, categorization, and valuation for potential buyers.

While StampSegNet performs well on stamp segmentation, it may struggle with heavily damaged or obscured stamps, unusual stamp shapes, or images with poor lighting. As with any AI-based model, biases present in the training data can influence the segmentation results, so outputs should be evaluated carefully and any ethical implications mitigated.

## Training and evaluation data

The dataset was taken from Kaggle: [Stamp Verification (StaVer) dataset](https://www.kaggle.com/datasets/rtatman/stamp-verification-staver-dataset).

We used 60 samples and annotated them on Segments.ai.

## Training procedure

**Data Collection and Preparation:**

Collect a diverse dataset of stamp images along with pixel-level annotations indicating the stamp regions within each image. Ensure the dataset covers a wide variety of stamp designs, sizes, colors, backgrounds, and lighting conditions. Split the dataset into training and validation sets, as in the sketch below.
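A hedged sketch of the split step, assuming the Segments.ai annotations were exported to the Hub as bilal01/stamp-verification; the 80/20 ratio and seed are illustrative, not taken from the original run:

```python
from datasets import load_dataset

# Load the annotated samples (assumed to live on the Hub under this id).
ds = load_dataset("bilal01/stamp-verification", split="train")

# Hold out part of the 60 annotated samples for validation; the ratio and
# seed here are illustrative placeholders.
splits = ds.train_test_split(test_size=0.2, seed=42)
train_ds, valid_ds = splits["train"], splits["test"]
```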

**Model Selection and Configuration:**

Choose a semantic segmentation architecture suited to stamp segmentation. We used nvidia/mit-b0 as the pretrained model and fine-tuned it. Configure the architecture and the necessary hyperparameters, such as the learning rate, batch size, and optimizer; a configuration sketch follows.
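A sketch of this configuration step; the two-class label map follows the Unlabeled/Stamp columns in the results table below but is otherwise an assumption:

```python
from transformers import SegformerForSemanticSegmentation

id2label = {0: "unlabeled", 1: "stamp"}  # assumed, based on the results table
label2id = {name: idx for idx, name in id2label.items()}

model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/mit-b0",           # pretrained SegFormer-B0 encoder
    num_labels=len(id2label),
    id2label=id2label,
    label2id=label2id,
)
```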

**Training:**

Train the model on the labeled stamp dataset starting from the pretrained weights. Use a loss function suited to semantic segmentation, such as cross-entropy loss or Dice loss. Update the model's parameters with mini-batch stochastic gradient descent (SGD) or an optimizer such as Adam. Monitor training progress with metrics such as pixel accuracy, mean Intersection over Union (IoU), or F1 score; see the Trainer sketch below.
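A hedged training-loop sketch using the Trainer API; the hyperparameter values are placeholders (the originals are not recorded in this card), and only the epoch count and evaluation interval mirror the results table below:

```python
import evaluate
import torch
from transformers import Trainer, TrainingArguments

metric = evaluate.load("mean_iou")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    # Logits come out at 1/4 resolution; upsample before the argmax.
    logits = torch.nn.functional.interpolate(
        torch.from_numpy(logits), size=labels.shape[-2:],
        mode="bilinear", align_corners=False,
    )
    preds = logits.argmax(dim=1).numpy()
    return metric.compute(
        predictions=preds, references=labels,
        num_labels=2, ignore_index=255, reduce_labels=False,
    )

args = TrainingArguments(
    output_dir="segformer-b0-finetuned-segments-stamp-verification",
    learning_rate=6e-5,             # placeholder value
    per_device_train_batch_size=2,  # placeholder value
    num_train_epochs=20,            # matches the 20 epochs in the results table
    eval_strategy="steps",          # `evaluation_strategy` in older releases
    eval_steps=20,                  # the table logs metrics every 20 steps
)

# `model`, `train_ds`, and `valid_ds` come from the sketches above; the
# datasets are assumed to be preprocessed into pixel_values/labels tensors.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=valid_ds,
    compute_metrics=compute_metrics,
)
trainer.train()
```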

### Training hyperparameters

The following hyperparameters were used during training:

### Training results

| Training Loss | Epoch | Step | Validation Loss | Mean IoU | Mean Accuracy | Overall Accuracy | Accuracy (Unlabeled) | Accuracy (Stamp) | IoU (Unlabeled) | IoU (Stamp) |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------------:|:----------------:|:--------------------:|:----------------:|:---------------:|:-----------:|
| 0.3384 | 0.83  | 20  | 0.2769 | 0.0335 | 0.0670 | 0.0670 | nan | 0.0670 | 0.0 | 0.0670 |
| 0.2626 | 1.67  | 40  | 0.2201 | 0.1256 | 0.2512 | 0.2512 | nan | 0.2512 | 0.0 | 0.2512 |
| 0.1944 | 2.5   | 60  | 0.1918 | 0.2030 | 0.4060 | 0.4060 | nan | 0.4060 | 0.0 | 0.4060 |
| 0.2665 | 3.33  | 80  | 0.1564 | 0.1574 | 0.3148 | 0.3148 | nan | 0.3148 | 0.0 | 0.3148 |
| 0.1351 | 4.17  | 100 | 0.1194 | 0.1817 | 0.3634 | 0.3634 | nan | 0.3634 | 0.0 | 0.3634 |
| 0.1156 | 5.0   | 120 | 0.1035 | 0.1334 | 0.2668 | 0.2668 | nan | 0.2668 | 0.0 | 0.2668 |
| 0.1103 | 5.83  | 140 | 0.0895 | 0.1819 | 0.3638 | 0.3638 | nan | 0.3638 | 0.0 | 0.3638 |
| 0.0882 | 6.67  | 160 | 0.0746 | 0.0833 | 0.1665 | 0.1665 | nan | 0.1665 | 0.0 | 0.1665 |
| 0.0778 | 7.5   | 180 | 0.0655 | 0.1927 | 0.3854 | 0.3854 | nan | 0.3854 | 0.0 | 0.3854 |
| 0.0672 | 8.33  | 200 | 0.0585 | 0.1327 | 0.2654 | 0.2654 | nan | 0.2654 | 0.0 | 0.2654 |
| 0.0612 | 9.17  | 220 | 0.0615 | 0.1640 | 0.3279 | 0.3279 | nan | 0.3279 | 0.0 | 0.3279 |
| 0.0611 | 10.0  | 240 | 0.0546 | 0.2466 | 0.4933 | 0.4933 | nan | 0.4933 | 0.0 | 0.4933 |
| 0.0537 | 10.83 | 260 | 0.0499 | 0.1129 | 0.2258 | 0.2258 | nan | 0.2258 | 0.0 | 0.2258 |
| 0.0504 | 11.67 | 280 | 0.0502 | 0.1857 | 0.3713 | 0.3713 | nan | 0.3713 | 0.0 | 0.3713 |
| 0.0707 | 12.5  | 300 | 0.0442 | 0.1710 | 0.3419 | 0.3419 | nan | 0.3419 | 0.0 | 0.3419 |
| 0.0508 | 13.33 | 320 | 0.0434 | 0.2003 | 0.4006 | 0.4006 | nan | 0.4006 | 0.0 | 0.4006 |
| 0.0396 | 14.17 | 340 | 0.0420 | 0.1409 | 0.2818 | 0.2818 | nan | 0.2818 | 0.0 | 0.2818 |
| 0.0395 | 15.0  | 360 | 0.0417 | 0.1640 | 0.3280 | 0.3280 | nan | 0.3280 | 0.0 | 0.3280 |
| 0.0387 | 15.83 | 380 | 0.0397 | 0.1827 | 0.3655 | 0.3655 | nan | 0.3655 | 0.0 | 0.3655 |
| 0.0458 | 16.67 | 400 | 0.0387 | 0.1582 | 0.3165 | 0.3165 | nan | 0.3165 | 0.0 | 0.3165 |
| 0.0363 | 17.5  | 420 | 0.0390 | 0.1724 | 0.3449 | 0.3449 | nan | 0.3449 | 0.0 | 0.3449 |
| 0.0401 | 18.33 | 440 | 0.0382 | 0.2018 | 0.4036 | 0.4036 | nan | 0.4036 | 0.0 | 0.4036 |
| 0.0355 | 19.17 | 460 | 0.0382 | 0.2032 | 0.4064 | 0.4064 | nan | 0.4064 | 0.0 | 0.4064 |
| 0.0447 | 20.0  | 480 | 0.0372 | 0.1908 | 0.3817 | 0.3817 | nan | 0.3817 | 0.0 | 0.3817 |

### Framework versions