# bert-tiny-Massive-intent-KD-BERT_and_distilBERT
This model is a fine-tuned version of [google/bert_uncased_L-2_H-128_A-2](https://huggingface.co/google/bert_uncased_L-2_H-128_A-2) on the MASSIVE intent-classification dataset. It achieves the following results on the evaluation set:
- Loss: 2.3729
- Accuracy: 0.8470
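
A minimal inference sketch using the 🤗 Transformers `pipeline` API. The Hub model id below is an assumed placeholder (the actual repository path is not stated in this card), and the example label depends on the dataset's intent set:

```python
# Minimal usage sketch. The model id is an assumed placeholder;
# replace it with the actual Hub repository path for this checkpoint.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="bert-tiny-Massive-intent-KD-BERT_and_distilBERT",  # placeholder id
)

print(classifier("wake me up at seven in the morning"))
# e.g. [{'label': 'alarm_set', 'score': ...}] -- label names follow the dataset config
```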
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 33
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
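
As a rough sketch, these settings map onto `TrainingArguments` as shown below. The `output_dir` value is an assumption, and the knowledge-distillation loss implied by the model name (KD from BERT and DistilBERT teachers) is not covered by this card, so no custom trainer is shown:

```python
# Sketch of TrainingArguments matching the hyperparameters above.
# output_dir is an assumed placeholder, not taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-tiny-massive-intent-kd",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=33,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,  # "Native AMP" mixed-precision training
)
```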
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 15.1159 | 1.0 | 720 | 12.8257 | 0.2253 |
| 12.9949 | 2.0 | 1440 | 10.9891 | 0.4304 |
| 11.3865 | 3.0 | 2160 | 9.5622 | 0.5032 |
| 10.0553 | 4.0 | 2880 | 8.3700 | 0.5539 |
| 8.9431 | 5.0 | 3600 | 7.4127 | 0.6104 |
| 8.0135 | 6.0 | 4320 | 6.6185 | 0.6286 |
| 7.1987 | 7.0 | 5040 | 5.9517 | 0.6818 |
| 6.5168 | 8.0 | 5760 | 5.3879 | 0.7118 |
| 5.9352 | 9.0 | 6480 | 4.9426 | 0.7275 |
| 5.4299 | 10.0 | 7200 | 4.5637 | 0.7413 |
| 5.0017 | 11.0 | 7920 | 4.2379 | 0.7585 |
| 4.5951 | 12.0 | 8640 | 3.9699 | 0.7678 |
| 4.2849 | 13.0 | 9360 | 3.7416 | 0.7737 |
| 3.991 | 14.0 | 10080 | 3.5502 | 0.7865 |
| 3.7455 | 15.0 | 10800 | 3.4090 | 0.7900 |
| 3.5315 | 16.0 | 11520 | 3.3053 | 0.7914 |
| 3.345 | 17.0 | 12240 | 3.1670 | 0.8003 |
| 3.1767 | 18.0 | 12960 | 3.0739 | 0.8013 |
| 3.0322 | 19.0 | 13680 | 2.9927 | 0.8047 |
| 2.8864 | 20.0 | 14400 | 2.9366 | 0.8037 |
| 2.7728 | 21.0 | 15120 | 2.8666 | 0.8091 |
| 2.6732 | 22.0 | 15840 | 2.8146 | 0.8126 |
| 2.5726 | 23.0 | 16560 | 2.7588 | 0.8195 |
| 2.493 | 24.0 | 17280 | 2.7319 | 0.8273 |
| 2.4183 | 25.0 | 18000 | 2.6847 | 0.8249 |
| 2.3526 | 26.0 | 18720 | 2.6317 | 0.8323 |
| 2.2709 | 27.0 | 19440 | 2.6071 | 0.8288 |
| 2.2125 | 28.0 | 20160 | 2.5982 | 0.8323 |
| 2.1556 | 29.0 | 20880 | 2.5546 | 0.8337 |
| 2.1042 | 30.0 | 21600 | 2.5278 | 0.8318 |
| 2.054 | 31.0 | 22320 | 2.5005 | 0.8411 |
| 2.0154 | 32.0 | 23040 | 2.4891 | 0.8347 |
| 1.9785 | 33.0 | 23760 | 2.4633 | 0.8367 |
| 1.9521 | 34.0 | 24480 | 2.4451 | 0.8421 |
| 1.9247 | 35.0 | 25200 | 2.4370 | 0.8416 |
| 1.8741 | 36.0 | 25920 | 2.4197 | 0.8446 |
| 1.8659 | 37.0 | 26640 | 2.4081 | 0.8406 |
| 1.8367 | 38.0 | 27360 | 2.3979 | 0.8426 |
| 1.8153 | 39.0 | 28080 | 2.3758 | 0.8451 |
| 1.7641 | 40.0 | 28800 | 2.3729 | 0.8470 |
| 1.7608 | 41.0 | 29520 | 2.3683 | 0.8460 |
| 1.7647 | 42.0 | 30240 | 2.3628 | 0.8446 |
| 1.7656 | 43.0 | 30960 | 2.3492 | 0.8470 |
### Framework versions
- Transformers 4.22.1
- Pytorch 1.12.1+cu113
- Datasets 2.5.1
- Tokenizers 0.12.1