art

These models predict the type of an anime image, choosing among four categories.

| Model | FLOPs | Accuracy | Confusion Matrix | Description |
|---|---|---|---|---|
| caformer_s36 | 22.10G | 88.19% | Confusion Matrix | Model: caformer_s36 from timm |
| caformer_s36_plus | 22.10G | 93.47% | Confusion Matrix | Model: caformer_s36.sail_in22k_ft_in1k_384, pretrained, from timm |
| mobilenetv3 | 0.63G | 88.96% | Confusion Matrix | Model: mobilenetv3_large_100 from timm |
| mobilenetv3_dist | 0.63G | 91.98% | Confusion Matrix | Distilled from caformer_s36_plus, using mobilenetv3_large_100 with focal loss |
| mobilenetv3_sce | 0.63G | 89.92% | Confusion Matrix | Model: mobilenetv3_large_100 from timm, using SCELoss as the loss function |
| mobilenetv3_sce_dist | 0.63G | 92.35% | Confusion Matrix | Distilled from caformer_s36_plus, using mobilenetv3_large_100 with SCELoss |
| mobilevitv2_150 | 9.09G | 88.21% | Confusion Matrix | Model: mobilevitv2_150 from timm |
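
For readers unfamiliar with the SCELoss named in the descriptions, here is a minimal pure-Python sketch of symmetric cross entropy for a single sample. It is not the training code used for these models. The function names, the `alpha` and `beta` weights, and the clamping constants are all assumptions chosen for illustration. The loss adds a reverse cross-entropy term to the standard one, which is generally intended to make training more tolerant of noisy labels.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sce_loss(logits, target, alpha=1.0, beta=1.0, eps=1e-7, label_clip=1e-4):
    """Symmetric cross entropy for one sample (illustrative sketch).

    logits: list of raw class scores; target: integer class index.
    alpha, beta, eps and label_clip are assumed values, not the
    settings used to train the models listed above.
    """
    # Clamp predicted probabilities so log() never sees zero.
    probs = [min(max(p, eps), 1.0) for p in softmax(logits)]

    # Standard cross entropy: -sum(q * log p) with a one-hot q.
    ce = -math.log(probs[target])

    # Reverse cross entropy: -sum(p * log q); the one-hot labels are
    # clamped to label_clip so log(0) is avoided.
    one_hot = [1.0 if i == target else 0.0 for i in range(len(logits))]
    clipped = [min(max(q, label_clip), 1.0) for q in one_hot]
    rce = -sum(p * math.log(q) for p, q in zip(probs, clipped))

    return alpha * ce + beta * rce


if __name__ == "__main__":
    # Four classes, matching the four categories of this task.
    print(sce_loss([2.0, 0.5, 0.1, -1.0], target=0))
```

With equal weights the two terms contribute on very different scales, so in practice `alpha` and `beta` are tuned per dataset.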