text-classification nlp neural-compressor PostTrainingDynamic int8 Intel® Neural Compressor albert

Dynamically quantized ALBERT base fine-tuned on MRPC

Model Details

Model Description: This model is an ALBERT base model fine-tuned on MRPC and dynamically quantized to INT8 with huggingface/optimum-intel through the usage of Intel® Neural Compressor.

How to Get Started With the Model

PyTorch

To load the quantized model, do the following:

from optimum.intel.neural_compressor.quantization import IncQuantizedModelForSequenceClassification

model = IncQuantizedModelForSequenceClassification.from_pretrained("Intel/albert-base-v2-MRPC-int8")

Test result

                     INT8     FP32
Accuracy (eval-f1)   0.9193   0.9263
Model size (MB)      45.0     46.7
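As a quick sanity check on the table above, the quantization trade-off can be computed directly from the reported figures (a minimal sketch; the numbers are taken verbatim from the test-result table):

```python
# Figures from the test-result table above
fp32_f1, int8_f1 = 0.9263, 0.9193
fp32_mb, int8_mb = 46.7, 45.0

# Absolute F1 drop from dynamic INT8 quantization
f1_drop = fp32_f1 - int8_f1

# Relative model-size reduction in percent
size_reduction_pct = (1 - int8_mb / fp32_mb) * 100

print(round(f1_drop, 4))             # 0.007
print(round(size_reduction_pct, 1))  # 3.6
```

That is, the quantized model trades roughly 0.7 points of eval-f1 for about a 3.6% smaller file.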