# nlewins/mt5-small-finetuned-ceb-to-en-tfD
This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset. It achieves the following results on the evaluation set:
- Train Loss: 2.0379
- Validation Loss: 3.0766
- Train Bleu: 8.3032
- Train Gen Len: 33.95
- Epoch: 64
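The Train Bleu figure above is a BLEU score computed over generated translations. As a rough illustration of the metric only (not the exact implementation used for this card, which likely relies on a library such as sacrebleu), a minimal sentence-level BLEU-4 can be sketched as:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Sentence-level BLEU in [0, 100]: geometric mean of clipped
    n-gram precisions times a brevity penalty."""
    c, r = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(c, n), ngrams(r, n)
        overlap = sum(min(cnt, ref[g]) for g, cnt in cand.items())
        total = max(sum(cand.values()), 1)
        # 1e-9 floor avoids log(0); a crude stand-in for proper smoothing.
        precisions.append(overlap / total if overlap else 1e-9)
    bp = 1.0 if len(c) > len(r) else math.exp(1 - len(r) / max(len(c), 1))
    return 100 * bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# A perfect match scores 100; partial overlap scores much lower.
print(bleu("the cat sat on the mat", "the cat sat on the mat"))
```

Real evaluations use corpus-level BLEU with smoothing, so treat this only as intuition for what a score of ~8 means: modest n-gram overlap with the references.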
## Model description
More information needed
## Intended uses & limitations
More information needed
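No usage guidance was provided with this card. As a hedged sketch only (assuming the checkpoint is hosted under this repo id and loads with the TensorFlow seq2seq classes shown; verify before relying on it), Cebuano-to-English translation might look like:

```python
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

def translate(text: str,
              model_id: str = "nlewins/mt5-small-finetuned-ceb-to-en-tfD") -> str:
    """Translate Cebuano text to English with the fine-tuned mT5 checkpoint."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="tf")
    # Train Gen Len in the results below hovers around 34-45 tokens,
    # so 64 new tokens is a comfortable cap.
    outputs = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

For repeated calls, load the tokenizer and model once outside the function rather than per invocation.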
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: AdamWeightDecay (beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False, weight_decay_rate=0.0001, decay=0.0)
- learning_rate: keras.optimizers.schedules.ExponentialDecay (initial_learning_rate=0.0001, decay_steps=10000, decay_rate=0.4, staircase=False)
- training_precision: float32
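With staircase=False, the ExponentialDecay schedule above decays the learning rate continuously as `initial_learning_rate * decay_rate ** (step / decay_steps)`. A plain-Python sketch of the values it produces:

```python
def lr_at(step: int,
          initial_learning_rate: float = 1e-4,
          decay_steps: int = 10_000,
          decay_rate: float = 0.4) -> float:
    """Continuous (staircase=False) exponential decay, as Keras computes it."""
    return initial_learning_rate * decay_rate ** (step / decay_steps)

# At step 0 the rate is the initial 1e-4; after one full decay
# period (10,000 steps) it has shrunk to 0.4x that, i.e. 4e-5.
print(lr_at(0), lr_at(10_000))
```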
### Training results
| Train Loss | Validation Loss | Train Bleu | Train Gen Len | Epoch |
|:----------:|:---------------:|:----------:|:-------------:|:-----:|
10.2106 | 5.0508 | 0.0137 | 446.0648 | 0 |
6.7509 | 4.4073 | 0.0139 | 429.4630 | 1 |
6.0227 | 4.0907 | 0.0365 | 350.5889 | 2 |
5.6141 | 3.9654 | 0.1205 | 188.5870 | 3 |
5.3489 | 3.9050 | 0.3894 | 87.8630 | 4 |
5.1413 | 3.8607 | 0.5328 | 58.3167 | 5 |
4.9827 | 3.8213 | 1.1469 | 34.4963 | 6 |
4.8325 | 3.7860 | 1.0320 | 36.2833 | 7 |
4.7153 | 3.7475 | 1.2818 | 42.4463 | 8 |
4.6216 | 3.7136 | 1.3223 | 44.1611 | 9 |
4.5247 | 3.6768 | 0.9889 | 55.2259 | 10 |
4.4525 | 3.6533 | 1.3691 | 49.3574 | 11 |
4.3604 | 3.6222 | 1.7751 | 40.1722 | 12 |
4.2940 | 3.5930 | 1.5683 | 49.0833 | 13 |
4.2214 | 3.5694 | 1.4980 | 45.8537 | 14 |
4.1482 | 3.5523 | 1.7241 | 44.6130 | 15 |
4.0861 | 3.5342 | 2.0267 | 39.1963 | 16 |
4.0126 | 3.5052 | 2.1943 | 40.3019 | 17 |
3.9553 | 3.4909 | 2.2466 | 42.2815 | 18 |
3.8973 | 3.4679 | 2.7648 | 34.0519 | 19 |
3.8397 | 3.4552 | 2.9543 | 38.1130 | 20 |
3.7934 | 3.4253 | 2.2250 | 48.2963 | 21 |
3.7257 | 3.4040 | 2.3509 | 45.2778 | 22 |
3.6643 | 3.3869 | 2.4464 | 46.6926 | 23 |
3.6209 | 3.3674 | 2.3087 | 48.7630 | 24 |
3.5603 | 3.3488 | 2.6122 | 43.2630 | 25 |
3.5105 | 3.3272 | 2.5595 | 46.8556 | 26 |
3.4645 | 3.3142 | 2.5999 | 47.6870 | 27 |
3.4201 | 3.3032 | 2.9416 | 45.0204 | 28 |
3.3736 | 3.2811 | 3.1105 | 43.2167 | 29 |
3.3306 | 3.2684 | 3.6797 | 41.7667 | 30 |
3.2658 | 3.2508 | 3.3509 | 49.2778 | 31 |
3.2222 | 3.2394 | 3.7258 | 44.6444 | 32 |
3.1719 | 3.2292 | 3.7031 | 45.4259 | 33 |
3.1219 | 3.2112 | 4.3785 | 38.5259 | 34 |
3.0806 | 3.2003 | 4.7949 | 38.2796 | 35 |
3.0467 | 3.1884 | 4.7402 | 39.1185 | 36 |
2.9929 | 3.1804 | 4.3355 | 42.3037 | 37 |
2.9471 | 3.1695 | 4.5699 | 41.0426 | 38 |
2.9078 | 3.1574 | 4.3787 | 44.4778 | 39 |
2.8716 | 3.1511 | 4.8370 | 39.7185 | 40 |
2.8198 | 3.1458 | 5.2962 | 35.9556 | 41 |
2.7842 | 3.1398 | 5.3283 | 38.1611 | 42 |
2.7432 | 3.1309 | 5.0445 | 38.8870 | 43 |
2.7043 | 3.1204 | 5.0695 | 40.3889 | 44 |
2.6696 | 3.1227 | 5.3477 | 43.0963 | 45 |
2.6259 | 3.1164 | 6.5346 | 35.9074 | 46 |
2.5843 | 3.1004 | 5.2047 | 44.6037 | 47 |
2.5448 | 3.1094 | 5.4300 | 37.8963 | 48 |
2.5129 | 3.0964 | 5.3997 | 41.5537 | 49 |
2.4697 | 3.0940 | 5.9519 | 38.7037 | 50 |
2.4409 | 3.0933 | 5.5973 | 41.1463 | 51 |
2.4077 | 3.0867 | 5.9751 | 40.1815 | 52 |
2.3705 | 3.0898 | 6.4699 | 36.8537 | 53 |
2.3397 | 3.0841 | 6.5144 | 38.2815 | 54 |
2.3118 | 3.0873 | 7.5425 | 33.0611 | 55 |
2.2658 | 3.0801 | 7.2862 | 35.8667 | 56 |
2.2407 | 3.0803 | 7.3595 | 34.6611 | 57 |
2.2027 | 3.0791 | 7.2377 | 35.7130 | 58 |
2.1805 | 3.0794 | 7.5672 | 35.1981 | 59 |
2.1504 | 3.0772 | 8.1746 | 33.9963 | 60 |
2.1208 | 3.0751 | 7.8803 | 35.0185 | 61 |
2.0915 | 3.0801 | 7.6175 | 37.2796 | 62 |
2.0573 | 3.0817 | 8.3303 | 33.8241 | 63 |
2.0379 | 3.0766 | 8.3032 | 33.95 | 64 |
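Note that the final epoch is not the best by validation loss: scanning the table, the minimum (3.0751) occurs at epoch 61, and validation loss is nearly flat after epoch ~55 while train loss keeps falling, a mild overfitting signal. Selecting the best checkpoint from the last few rows above can be sketched as:

```python
# (epoch, validation_loss) pairs copied from the last rows of the table above.
rows = [(60, 3.0772), (61, 3.0751), (62, 3.0801), (63, 3.0817), (64, 3.0766)]

# Pick the epoch with the lowest validation loss.
best_epoch, best_val = min(rows, key=lambda r: r[1])
print(best_epoch, best_val)  # epoch 61 has the lowest validation loss here
```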
### Framework versions
- Transformers 4.33.3
- TensorFlow 2.14.0
- Datasets 2.14.5
- Tokenizers 0.13.3
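To reproduce this environment, the versions above can be pinned in a requirements file (a sketch; exact pins may need adjusting for your Python version and platform):

```text
transformers==4.33.3
tensorflow==2.14.0
datasets==2.14.5
tokenizers==0.13.3
```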