BERT-base compressed by JPQD with Regularization Factor 0.03
F1: 87.66
EM: 80.23
Description of important files
├── r0.030-squad-bert-b-mvmt-8bit
│ ├── 8bit_ref_bert_squad_nncf_mvmt.json (NNCF config used with the ssbs-feb branch)
│ ├── checkpoint-110000 (trained checkpoint for generation)
│ ├── ir
│ │ ├── sparsity_structures.csv
│ │ ├── sparsity_structures.md (layer-wise sparsity report, for linear layers in transformer blocks only)
│ │ ├── sparsity_structures.pkl (contains pruned structure IDs, e.g. a particular head in MHSA or dimension in FFN; useful for debugging)
│ │ └── squad-BertForQuestionAnswering.cropped.8bit.xml (pruned dimensions discarded by a custom step, then exported to ONNX and translated to IR)
│ ├── ir_uncropped
│ │ ├── mo-pruned-ir
│ │ │ ├── mo.log (see Model Optimizer version here)
│ │ │ └── squad-BertForQuestionAnswering.8bit.xml (pruned structures are removed using Model Optimizer --transform=Pruning)
│ │ └── squad-BertForQuestionAnswering.8bit.xml (pruned structures are sparsified/zeroed only)
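
The tree above distinguishes IR files where pruned structures are only zeroed from files where they are physically removed ("cropped"). The sketch below illustrates that difference on a toy matrix. It is not the code used to produce these files: the helper names (zeroed_rows, crop_rows, sparsity) and the 4x3 weight are hypothetical, and treating one row as one pruned structure is a simplifying assumption.

```python
# Toy illustration only: a nested-list "weight", not the real BERT tensors.
# Assumption: one row stands for one prunable structure (e.g. an FFN dimension).

def zeroed_rows(weight):
    """Return indices of rows whose entries are all zero."""
    return [i for i, row in enumerate(weight) if all(v == 0 for v in row)]

def crop_rows(weight, pruned):
    """Physically drop the given rows, shrinking the matrix."""
    pruned = set(pruned)
    return [row for i, row in enumerate(weight) if i not in pruned]

def sparsity(weight):
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in weight)
    zeros = sum(1 for row in weight for v in row if v == 0)
    return zeros / total if total else 0.0

# "Uncropped" case: pruned rows are still present, but filled with zeros.
weight = [
    [0.5, -1.0, 0.25],
    [0.0, 0.0, 0.0],
    [1.5, 0.0, -0.75],
    [0.0, 0.0, 0.0],
]

pruned = zeroed_rows(weight)          # rows that carry no information
cropped = crop_rows(weight, pruned)   # "cropped" case: those rows are gone

print("pruned row ids:", pruned)
print("shape before:", (len(weight), len(weight[0])))
print("shape after: ", (len(cropped), len(cropped[0])))
```

In the zeroed-only form the matrix keeps its original shape, so any saving depends on the runtime exploiting the zeros. In the cropped form the matrix itself is smaller.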