20230829213636

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

Loss: 0.3345
Accuracy: 0.6731

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.005
train_batch_size: 16
eval_batch_size: 8
seed: 44
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 80.0
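
As a rough illustration of what the linear scheduler does with these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists no warmup setting) and takes the total step count from the Step column of the results table (35 steps per epoch over 80 epochs). The function name `linear_lr` and its constants are hypothetical, not taken from the training script.

```python
# Values taken from the hyperparameter list and the results table.
BASE_LR = 0.005
NUM_EPOCHS = 80
STEPS_PER_EPOCH = 35                         # Step column advances by 35 per epoch
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH   # 2800
WARMUP_STEPS = 0                             # assumption: the card does not mention warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate under a linear schedule: optional ramp-up, then decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 20, 40, 60, 80):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d} (step {step:4d}): lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.005, halves by epoch 40, and reaches zero at step 2800.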

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	35	0.3387	0.6346
No log	2.0	70	0.8919	0.4135
No log	3.0	105	0.9639	0.5962
No log	4.0	140	0.4387	0.5673
No log	5.0	175	0.4726	0.3846
No log	6.0	210	0.5719	0.4038
No log	7.0	245	0.4199	0.5962
No log	8.0	280	0.3522	0.625
No log	9.0	315	0.4153	0.4038
No log	10.0	350	0.5049	0.6346
No log	11.0	385	0.6337	0.4135
No log	12.0	420	0.4518	0.6346
No log	13.0	455	0.3821	0.6346
No log	14.0	490	0.7306	0.4038
0.5573	15.0	525	0.3550	0.625
0.5573	16.0	560	0.4895	0.375
0.5573	17.0	595	0.4166	0.4519
0.5573	18.0	630	0.3761	0.4904
0.5573	19.0	665	0.5975	0.3654
0.5573	20.0	700	0.3852	0.3942
0.5573	21.0	735	0.3488	0.5577
0.5573	22.0	770	0.3618	0.5
0.5573	23.0	805	0.5302	0.3942
0.5573	24.0	840	0.3431	0.5481
0.5573	25.0	875	0.4614	0.3942
0.5573	26.0	910	0.3930	0.4615
0.5573	27.0	945	0.7360	0.3654
0.5573	28.0	980	0.3691	0.5
0.4445	29.0	1015	0.4560	0.3942
0.4445	30.0	1050	0.3417	0.6346
0.4445	31.0	1085	0.4385	0.3846
0.4445	32.0	1120	0.3404	0.5962
0.4445	33.0	1155	0.3330	0.6346
0.4445	34.0	1190	0.3392	0.5481
0.4445	35.0	1225	0.3633	0.4519
0.4445	36.0	1260	0.3393	0.6058
0.4445	37.0	1295	0.3710	0.4423
0.4445	38.0	1330	0.4183	0.6346
0.4445	39.0	1365	0.3844	0.4135
0.4445	40.0	1400	0.4395	0.3846
0.4445	41.0	1435	0.7268	0.3654
0.4445	42.0	1470	0.4637	0.3942
0.4262	43.0	1505	0.3329	0.6346
0.4262	44.0	1540	0.3329	0.6154
0.4262	45.0	1575	0.4193	0.3846
0.4262	46.0	1610	0.3363	0.6154
0.4262	47.0	1645	0.3300	0.6538
0.4262	48.0	1680	0.3834	0.6346
0.4262	49.0	1715	0.3301	0.6346
0.4262	50.0	1750	0.3967	0.4231
0.4262	51.0	1785	0.4372	0.4038
0.4262	52.0	1820	0.3447	0.5288
0.4262	53.0	1855	0.4897	0.3942
0.4262	54.0	1890	0.3612	0.4423
0.4262	55.0	1925	0.3329	0.6346
0.4262	56.0	1960	0.3318	0.6731
0.4262	57.0	1995	0.3795	0.4327
0.3947	58.0	2030	0.3331	0.6827
0.3947	59.0	2065	0.3366	0.6346
0.3947	60.0	2100	0.3655	0.6346
0.3947	61.0	2135	0.3894	0.6346
0.3947	62.0	2170	0.3788	0.4327
0.3947	63.0	2205	0.4001	0.4135
0.3947	64.0	2240	0.3433	0.6346
0.3947	65.0	2275	0.3433	0.6346
0.3947	66.0	2310	0.3581	0.4327
0.3947	67.0	2345	0.3345	0.6731
0.3947	68.0	2380	0.3419	0.5769
0.3947	69.0	2415	0.3355	0.6346
0.3947	70.0	2450	0.3444	0.6346
0.3947	71.0	2485	0.3301	0.6346
0.3718	72.0	2520	0.3370	0.6346
0.3718	73.0	2555	0.3849	0.4231
0.3718	74.0	2590	0.3484	0.5
0.3718	75.0	2625	0.3336	0.6442
0.3718	76.0	2660	0.3313	0.6635
0.3718	77.0	2695	0.4030	0.4135
0.3718	78.0	2730	0.3389	0.5962
0.3718	79.0	2765	0.3336	0.6538
0.3718	80.0	2800	0.3345	0.6731
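
The headline metrics reported for this model (loss 0.3345, accuracy 0.6731) match the final epoch-80 row, not the best row in the table. The sketch below shows one way to pick a checkpoint from the logged results. The rows are an excerpt copied from the table above; the helper names are hypothetical, and breaking accuracy ties by lower validation loss is an assumption, not something the card specifies.

```python
# (epoch, validation_loss, accuracy) rows excerpted from the results table.
ROWS = [
    (1, 0.3387, 0.6346),
    (47, 0.3300, 0.6538),
    (56, 0.3318, 0.6731),
    (58, 0.3331, 0.6827),
    (67, 0.3345, 0.6731),
    (80, 0.3345, 0.6731),
]


def best_by_accuracy(rows):
    """Row with the highest accuracy; ties go to the lower validation loss."""
    return max(rows, key=lambda row: (row[2], -row[1]))


def best_by_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


if __name__ == "__main__":
    epoch, loss, acc = best_by_accuracy(ROWS)
    print(f"highest accuracy: epoch {epoch} (loss {loss}, accuracy {acc})")
    epoch, loss, acc = best_by_loss(ROWS)
    print(f"lowest validation loss: epoch {epoch} (loss {loss}, accuracy {acc})")
```

Across the full table, accuracy peaks at epoch 58 (0.6827) and validation loss is lowest at epoch 47 (0.3300), so the excerpt gives the same answers as all 80 rows would.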

Framework versions

Transformers 4.26.1
PyTorch 2.0.1+cu118
Datasets 2.12.0
Tokenizers 0.13.3
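
To check whether a local environment matches the versions listed above, something like the following sketch may help. The pip distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumptions about how each framework is packaged, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions copied from the list above; the distribution names on the left
# are assumptions about the corresponding pip packages.
EXPECTED = {
    "transformers": "4.26.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in version_mismatches(EXPECTED).items():
        print(f"{package}: expected {wanted}, found {found}")
```

An empty result means every listed package is installed at exactly the listed version. A differing local build tag (for example a CPU-only PyTorch wheel) is reported as a mismatch.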
