task: text-classification
Backend: sagemaker-training
Backend args: {'instance_type': 'ml.m5.2xlarge', 'supported_instructions': 'avx512'}
Number of evaluation samples: All dataset
Fixed parameters:
- dataset: [{'path': 'glue', 'eval_split': 'validation', 'data_keys': {'primary': 'sentence'}, 'ref_keys': ['label'], 'name': 'sst2', 'calibration_split': None}]
- name_or_path:
distilbert-base-uncased-finetuned-sst-2-english
- from_transformers:
True
- quantization_approach:
dynamic
- node_exclusion:
[]
Benchmarked parameters:
- framework:
onnxruntime
,pytorch
- operators_to_quantize:
['Add', 'MatMul']
,['Add']
- per_channel:
False
,True
- framework_args:
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4}
,{}
- apply_quantization:
True
,False
Evaluation
Non-time metrics
framework | operators_to_quantize | per_channel | framework_args | apply_quantization | accuracy | |
---|---|---|---|---|---|---|
onnxruntime |
None |
None |
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} |
False |
| | 0.911 |
onnxruntime |
['Add', 'MatMul'] |
False |
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} |
True |
| | 0.898 |
onnxruntime |
['Add', 'MatMul'] |
True |
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} |
True |
| | 0.490 |
onnxruntime |
['Add'] |
False |
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} |
True |
| | 0.911 |
onnxruntime |
['Add'] |
True |
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} |
True |
| | 0.911 |
pytorch |
None |
None |
{} |
None |
| | 0.911 |
Time metrics
Time benchmarks were run for 15 seconds per config.
Below, time metrics for batch size = 1, input length = 224.
framework | operators_to_quantize | per_channel | framework_args | apply_quantization | latency_mean (ms) | throughput (/s) | ||
---|---|---|---|---|---|---|---|---|
onnxruntime |
None |
None |
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} |
False |
| | 83.23 | | | 12.07 |
onnxruntime |
['Add', 'MatMul'] |
False |
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} |
True |
| | 64.31 | | | 15.60 |
onnxruntime |
['Add', 'MatMul'] |
True |
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} |
True |
| | 64.78 | | | 15.47 |
onnxruntime |
['Add'] |
False |
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} |
True |
| | 82.63 | | | 12.13 |
onnxruntime |
['Add'] |
True |
{'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} |
True |
| | 83.82 | | | 11.93 |
pytorch |
None |
None |
{} |
None |
| | 84.34 | | | 11.87 |