mobilebert_sa_GLUE_Experiment_logit_kd_data_aug_mrpc_128

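This model is a fine-tuned version of google/mobilebert-uncased on the GLUE MRPC dataset. It achieves the following results on the evaluation set; the figures match the epoch 38 row (step 74442) of the training results, which has the lowest validation loss in the log: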

Loss: 0.1262
Accuracy: 0.9877
F1: 0.9911
Combined Score: 0.9894
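
The card does not define Combined Score. The reported numbers are consistent with it being the arithmetic mean of Accuracy and F1. The sketch below checks that reading against the values above; the function name `combined_score` is hypothetical and the definition is an assumption inferred from the figures, not something the card states.

```python
def combined_score(accuracy: float, f1: float) -> float:
    """Arithmetic mean of accuracy and F1 (assumed definition, inferred from the card)."""
    return (accuracy + f1) / 2


# Accuracy and F1 as reported for the evaluation set above.
reported = combined_score(0.9877, 0.9911)
print(f"{reported:.4f}")  # prints 0.9894
```

The same relation holds for the rows of the training results, for example 0.9510 and 0.9636 average to 0.9573.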

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 128
eval_batch_size: 128
seed: 10
distributed_type: multi-GPU
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
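
To illustrate what `lr_scheduler_type: linear` implies for the learning rate, here is a minimal sketch of a linear decay from the listed `learning_rate` to zero. It rests on assumptions the card does not confirm: no warmup steps (none are listed), and a schedule length of 50 epochs at 1959 optimizer steps per epoch (the step count shown for epoch 1.0 in the training results). The function `linear_lr` is hypothetical and written in plain Python, not taken from the training script.

```python
BASE_LR = 5e-05          # learning_rate from the list above
STEPS_PER_EPOCH = 1959   # step count at epoch 1.0 in the training results
NUM_EPOCHS = 50          # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # assumed schedule length


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to BASE_LR during warmup.
        factor = step / max(1, warmup_steps)
    else:
        # Linear decay from BASE_LR down to 0 at TOTAL_STEPS.
        remaining = max(0, TOTAL_STEPS - step)
        factor = remaining / max(1, TOTAL_STEPS - warmup_steps)
    return BASE_LR * factor


for epoch in (0, 25, 43, 50):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the learning rate is halved by epoch 25 and has not yet reached zero at epoch 43, the last epoch that appears in the training results.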

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1	Combined Score
0.3021	1.0	1959	0.2407	0.9510	0.9636	0.9573
0.2326	2.0	3918	0.2009	0.9779	0.9841	0.9810
0.224	3.0	5877	0.1790	0.9730	0.9807	0.9769
0.219	4.0	7836	0.1789	0.9804	0.9859	0.9831
0.2153	5.0	9795	0.1804	0.9804	0.9859	0.9831
0.2121	6.0	11754	0.1754	0.9755	0.9824	0.9789
0.2088	7.0	13713	0.1661	0.9804	0.9859	0.9831
0.2056	8.0	15672	0.1654	0.9779	0.9841	0.9810
0.2031	9.0	17631	0.1714	0.9828	0.9876	0.9852
0.2012	10.0	19590	0.1617	0.9828	0.9876	0.9852
0.1993	11.0	21549	0.1610	0.9828	0.9876	0.9852
0.1978	12.0	23508	0.1507	0.9853	0.9894	0.9873
0.1964	13.0	25467	0.1496	0.9902	0.9929	0.9915
0.1953	14.0	27426	0.1569	0.9828	0.9876	0.9852
0.1943	15.0	29385	0.1524	0.9877	0.9911	0.9894
0.1934	16.0	31344	0.1492	0.9877	0.9911	0.9894
0.1926	17.0	33303	0.1480	0.9902	0.9929	0.9915
0.1918	18.0	35262	0.1416	0.9828	0.9876	0.9852
0.1912	19.0	37221	0.1420	0.9853	0.9894	0.9873
0.1905	20.0	39180	0.1396	0.9853	0.9894	0.9873
0.1899	21.0	41139	0.1458	0.9853	0.9894	0.9873
0.1893	22.0	43098	0.1484	0.9877	0.9911	0.9894
0.1888	23.0	45057	0.1407	0.9877	0.9911	0.9894
0.1883	24.0	47016	0.1372	0.9804	0.9858	0.9831
0.1878	25.0	48975	0.1354	0.9877	0.9911	0.9894
0.1873	26.0	50934	0.1368	0.9902	0.9929	0.9915
0.1869	27.0	52893	0.1378	0.9926	0.9947	0.9936
0.1865	28.0	54852	0.1381	0.9853	0.9894	0.9873
0.1861	29.0	56811	0.1329	0.9877	0.9911	0.9894
0.1857	30.0	58770	0.1342	0.9902	0.9929	0.9915
0.1854	31.0	60729	0.1346	0.9902	0.9929	0.9915
0.1849	32.0	62688	0.1323	0.9902	0.9929	0.9915
0.1846	33.0	64647	0.1317	0.9877	0.9911	0.9894
0.1843	34.0	66606	0.1318	0.9877	0.9911	0.9894
0.1839	35.0	68565	0.1311	0.9926	0.9947	0.9936
0.1837	36.0	70524	0.1291	0.9877	0.9911	0.9894
0.1834	37.0	72483	0.1313	0.9853	0.9893	0.9873
0.1831	38.0	74442	0.1262	0.9877	0.9911	0.9894
0.1828	39.0	76401	0.1288	0.9877	0.9911	0.9894
0.1825	40.0	78360	0.1295	0.9926	0.9947	0.9936
0.1823	41.0	80319	0.1277	0.9902	0.9929	0.9915
0.182	42.0	82278	0.1265	0.9902	0.9929	0.9915
0.1818	43.0	84237	0.1273	0.9902	0.9929	0.9915

Framework versions

Transformers 4.26.0
PyTorch 1.14.0a0+410ce96
Datasets 2.9.0
Tokenizers 0.13.2
