aistrova/safesearch-v3.1 - AI Model Zoo

v3.1 may not have been the world's most accurate Safesearch model, but the v5.0 series will significantly outperform Google Safesearch and perhaps Bing Safesearch (at least on the benchmark). I will make sure that anyone can reproduce the results in the v5.0 series reports. Please look here for updates: https://huggingface.co/datasets/aistrova/cmad.


AIstrova Safesearch v3.1 is an ultra-precise & efficient multi-class image classifier that accurately detects sexually suggestive or gory images & videos with near-zero false positives.

For website classification, please run the demo on your local computer and change `interactive=False` to `interactive=True`. On the demo hosted on HuggingFace Space, fetching pages with the `requests` library produced incorrect output: the demo works fine on my laptop (in Canada) but not on the hosted Space, because adult websites look like this when fetched from the Space: <img src="./image.png" alt="blocked" width="20%">

Click here for an official demo of this state-of-the-art model that can be customized based on personal preferences & sensitivity.

| | Statistics |
| --- | --- |
| Epochs | 4 |
| Optimizer | AdamW<br>lr=1e-5<br>weight_decay=1e-2 |
| Training Images and GIFs | 4,210,000+ |
| Architecture | efficientnet_b1_pruned |
| Training Accuracy | 99.0% |
| Validation Accuracy | 98.5% |
| Training F1-score | 96.8% |
| Training Cross-Entropy Loss | 0.104 |
| Human Evaluation | The training accuracy and F1-score underestimate this model's performance on the dataset.<br>When we evaluated the model on the test set, we found that most incorrect predictions were the result of mislabeled data.<br>In other words, the model identified mislabeled data during training and did not overfit. Instead, it predicted what is actually correct. |
| Drastic Improvements from v2 | 1. a more balanced training dataset (before data augmentation: ≈326,000 for `nsfw_suggestive`, ≈345,000 for `safe`, and ≈3,600 for `nsfw_gore`) <br>2. a more robust optimization algorithm<br>3. a careful selection of the base model<br>4. the use of ≈5,000 handpicked AI-generated images for training |

<br>

Data Augmentation

I applied the following transform pipeline to the training set to improve the model's ability to generalize to NSFW patterns, particularly in rotated images/GIFs.

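from torchvision import transforms

# Training-time augmentation: resize to 299x299, then apply exactly one
# randomly chosen option from the RandomChoice list.
transform_train = transforms.Compose([
    transforms.Resize((299, 299)),
    transforms.RandomChoice([
        # Note: RandomRotation(d) samples an angle uniformly from [-d, +d].
        transforms.RandomRotation(90),
        transforms.RandomRotation(180),
        transforms.RandomRotation(270),
        transforms.RandomHorizontalFlip(p=0.1),
        # Three identity entries: the image is left unchanged in 3 of 7 draws.
        transforms.Lambda(lambda x: x),
        transforms.Lambda(lambda x: x),
        transforms.Lambda(lambda x: x)
    ]),
    transforms.ToTensor(),  # PIL image -> float tensor in [0, 1]
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # [0, 1] -> [-1, 1]
])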
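To make the augmentation mix concrete, here is a small arithmetic sketch of how often each kind of option fires. It assumes `RandomChoice` picks uniformly among its seven entries (the torchvision default when no weights are passed). The option names in the list are illustrative labels, not part of the original code.

```python
from fractions import Fraction

# Options inside RandomChoice, in the order they appear in transform_train.
# The string names are illustrative labels only.
options = ["rot90", "rot180", "rot270", "hflip", "identity", "identity", "identity"]

pick = Fraction(1, len(options))  # uniform choice among the 7 entries (assumed default)
flip_p = Fraction(1, 10)          # RandomHorizontalFlip(p=0.1)

p_rotation = 3 * pick                         # one of the three rotation entries is picked
p_flip = pick * flip_p                        # flip entry is picked AND the flip fires
p_unchanged = 3 * pick + pick * (1 - flip_p)  # identity, or flip entry picked but not fired

print(p_rotation, p_flip, p_unchanged)  # 3/7 1/70 39/70
```

Note that `p_rotation` is the probability that a rotation entry is selected; since each rotation samples its angle from a range, the sampled angle can still be close to zero.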

<br>

PyTorch

You must request access first: sign up for a HuggingFace account and agree to share your contact information with us using the form at the top of the model page.

pip install timm huggingface_hub

import timm
from huggingface_hub import login

HUGGINGFACE_TOKEN = "" # https://huggingface.co/settings/tokens

login(HUGGINGFACE_TOKEN)
model = timm.create_model("hf_hub:aistrova/safesearch-v3.1", pretrained=True)
model.eval()  # inference mode: disables dropout, uses BatchNorm running statistics
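The snippet above only loads the model. As a rough sketch of how its raw outputs might be turned into a decision with adjustable sensitivity, the following pure-Python helper applies a softmax and per-class thresholds. It assumes the model returns one logit per class. The class order in `LABELS`, the `classify` helper, and the default threshold values are all hypothetical and are not taken from the model card; check the model's configuration for the real label order.

```python
import math

# ASSUMPTION: this class order is hypothetical; verify it against the model's config.
LABELS = ["nsfw_gore", "nsfw_suggestive", "safe"]


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, thresholds=None):
    """Return the most probable unsafe label that meets its threshold, else "safe".

    thresholds maps an unsafe label to the minimum probability needed to flag it.
    Lower values mean stricter filtering. The defaults are illustrative only.
    """
    thresholds = thresholds or {"nsfw_gore": 0.5, "nsfw_suggestive": 0.5}
    probs = dict(zip(LABELS, softmax(logits)))
    flagged = {k: p for k, p in probs.items() if k in thresholds and p >= thresholds[k]}
    if flagged:
        return max(flagged, key=flagged.get)
    return "safe"
```

With PyTorch, the input could come from something like `model(batch)[0].tolist()` for a single preprocessed image. Lowering a threshold, for example `classify(logits, {"nsfw_suggestive": 0.2})`, flags borderline images that the default setting would let through.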

<br>

TensorFlow

Here's how to convert a timm model to TensorFlow's SavedModel format.

pip install timm tensorflow onnx onnx-tf

import timm
import torch
import tensorflow as tf
import onnx
from onnx_tf.backend import prepare

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = timm.create_model("hf_hub:aistrova/safesearch-v3.1", pretrained=True)
model.to(device)
model.eval()  # switch to inference mode before exporting

# Export the model to ONNX
batch_size = 1
img_size = 299
sample_input = torch.rand((batch_size, 3, img_size, img_size)).to(device)
onnx_model_path = 'model.onnx'
torch.onnx.export(
    model,
    sample_input,
    onnx_model_path,
    verbose=False,
    input_names=['input'],
    output_names=['output'],
    opset_version=12
)

# Convert the ONNX model to TensorFlow 2
tf_model_path = 'model_tf'
onnx_model = onnx.load(onnx_model_path)
tf_rep = prepare(onnx_model)
tf_rep.export_graph(tf_model_path)

# Load the converted model
model_tf = tf.saved_model.load(tf_model_path)

<br>

For a less accurate deep learning model with a more permissive license, please see v2.

Attribution

For personal or research projects, please cite aistrova/safesearch-v3.1 or "AIstrova Technologies Inc." in your README.md or project description.

Attribution-NonCommercial-ShareAlike (cc-by-nc-sa-4.0) lets others remix, adapt, and build upon your work non-commercially, as long as they credit you and license their new creations under the identical terms.

License

The license is currently strictly cc-by-nc-sa-4.0, but may change to apache-2.0 in the future.
When the license is changed, the apache-2.0 tag will replace cc-by-nc-sa-4.0 at the top of the page, below the model name aistrova/safesearch-v3.1.
