
# Statically quantized DistilBERT base uncased fine-tuned on MRPC

## Table of Contents

- [Model Details](#model-details)
- [How to Get Started With the Model](#how-to-get-started-with-the-model)
- [Test result](#test-result)

## Model Details

**Model Description:** This model is a DistilBERT fine-tuned on MRPC, statically quantized with huggingface/optimum-intel through the usage of Intel® Neural Compressor.
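The card itself does not include the quantization recipe. As a rough sketch, assuming the `INCQuantizer` API of optimum-intel (the exact code used to produce this checkpoint is not shown here, and the starting checkpoint below is a placeholder), post-training static quantization of a fine-tuned MRPC model could look like this:

```python
# Hedged sketch of post-training static INT8 quantization with optimum-intel;
# the FP32 checkpoint path and calibration settings are assumptions.
from functools import partial

from neural_compressor.config import PostTrainingQuantConfig
from optimum.intel import INCQuantizer
from transformers import AutoModelForSequenceClassification, AutoTokenizer

fp32_model_id = "path/to/distilbert-finetuned-mrpc"  # hypothetical FP32 model fine-tuned on MRPC
model = AutoModelForSequenceClassification.from_pretrained(fp32_model_id)
tokenizer = AutoTokenizer.from_pretrained(fp32_model_id)

def preprocess(examples, tokenizer):
    # MRPC examples are sentence pairs.
    return tokenizer(examples["sentence1"], examples["sentence2"],
                     padding="max_length", max_length=128, truncation=True)

quantizer = INCQuantizer.from_pretrained(model)
# Static quantization calibrates activation ranges on a small sample set.
calibration_dataset = quantizer.get_calibration_dataset(
    "glue",
    dataset_config_name="mrpc",
    preprocess_function=partial(preprocess, tokenizer=tokenizer),
    num_samples=100,
)
quantizer.quantize(
    quantization_config=PostTrainingQuantConfig(approach="static"),
    calibration_dataset=calibration_dataset,
    save_directory="distilbert-mrpc-int8-static",
)
```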

## How to Get Started With the Model

### PyTorch

To load the quantized model, you can do as follows:

```python
from optimum.intel.neural_compressor.quantization import IncQuantizedModelForSequenceClassification

model = IncQuantizedModelForSequenceClassification.from_pretrained(
    "Intel/distilbert-base-uncased-MRPC-int8-static"
)
```
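Once loaded, the model behaves like a regular transformers sequence-classification model. Below is a minimal inference sketch, assuming the repository also hosts the tokenizer files; the example sentence pair is made up:

```python
import torch
from transformers import AutoTokenizer
from optimum.intel.neural_compressor.quantization import IncQuantizedModelForSequenceClassification

model_id = "Intel/distilbert-base-uncased-MRPC-int8-static"
tokenizer = AutoTokenizer.from_pretrained(model_id)  # assumes tokenizer files ship with the model
model = IncQuantizedModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# MRPC is a paraphrase-detection task over sentence pairs.
inputs = tokenizer(
    "The company posted record profits this quarter.",
    "Quarterly earnings for the firm hit an all-time high.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()
print(model.config.id2label[pred])  # label names depend on the checkpoint config
```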

## Test result

|                        |  INT8  |  FP32  |
|------------------------|:------:|:------:|
| **Accuracy (eval-f1)** | 0.9007 | 0.9027 |
| **Model size (MB)**    |  242   |  268   |
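The card does not include the evaluation script; the following is a minimal sketch for reproducing an eval-f1 number of this kind on the GLUE MRPC validation split (an assumed harness, not necessarily the one used for the table above):

```python
# Hedged sketch: score the quantized model on GLUE MRPC validation.
import torch
from datasets import load_dataset
from evaluate import load as load_metric
from transformers import AutoTokenizer
from optimum.intel.neural_compressor.quantization import IncQuantizedModelForSequenceClassification

model_id = "Intel/distilbert-base-uncased-MRPC-int8-static"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = IncQuantizedModelForSequenceClassification.from_pretrained(model_id)
model.eval()

dataset = load_dataset("glue", "mrpc", split="validation")
metric = load_metric("glue", "mrpc")  # reports both accuracy and F1

predictions = []
for example in dataset:
    inputs = tokenizer(example["sentence1"], example["sentence2"],
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        predictions.append(model(**inputs).logits.argmax(dim=-1).item())

print(metric.compute(predictions=predictions, references=dataset["label"]))
```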