# chile-gpt
This model is a fine-tuned version of [DeepESP/gpt2-spanish](https://huggingface.co/DeepESP/gpt2-spanish) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 9.4320
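For reference, a causal language model's evaluation loss can be converted to perplexity via `exp(loss)`; a minimal sketch of that arithmetic:

```python
import math

# Perplexity is exp(cross-entropy loss) for a causal language model.
eval_loss = 9.4320  # final evaluation loss reported above
print(f"Perplexity: {math.exp(eval_loss):.0f}")  # ~12480
```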
## Model description
More information needed
## Intended uses & limitations
More information needed
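Since this is a fine-tune of a Spanish GPT-2 checkpoint, it can presumably be used for Spanish text generation. A minimal usage sketch, assuming the model is published under the hypothetical hub id `your-username/chile-gpt`:

```python
from transformers import pipeline

# Hypothetical repo id; replace with the actual hub path of this checkpoint.
generator = pipeline("text-generation", model="your-username/chile-gpt")

output = generator("Había una vez en Chile", max_new_tokens=40, do_sample=True)
print(output[0]["generated_text"])
```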
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a reproduction sketch in code follows the list):
- learning_rate: 0.005
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 512
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 50
- mixed_precision_training: Native AMP
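A minimal sketch of how these settings map onto `transformers.TrainingArguments`; only the values listed above come from this card, while the output directory and evaluation strategy are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="chile-gpt",           # assumed output directory
    learning_rate=5e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=16,   # 32 * 16 = 512 effective train batch size
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    num_train_epochs=50,
    fp16=True,                        # native AMP mixed-precision training
    evaluation_strategy="epoch",      # assumption: the table reports one eval per epoch
)
```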
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 10.6676 | 0.98 | 6 | 9.5748 |
| 9.6237 | 1.98 | 12 | 9.2470 |
| 9.2815 | 2.98 | 18 | 8.8724 |
| 8.8097 | 3.98 | 24 | 8.3629 |
| 8.2296 | 4.98 | 30 | 7.8407 |
| 7.6891 | 5.98 | 36 | 7.4161 |
| 7.3013 | 6.98 | 42 | 7.1598 |
| 7.0671 | 7.98 | 48 | 7.0080 |
| 6.9404 | 8.98 | 54 | 6.9133 |
| 6.7543 | 9.98 | 60 | 6.7723 |
| 6.5845 | 10.98 | 66 | 6.6619 |
| 6.4193 | 11.98 | 72 | 6.5965 |
| 6.2554 | 12.98 | 78 | 6.5185 |
| 6.0993 | 13.98 | 84 | 6.4632 |
| 5.93 | 14.98 | 90 | 6.4155 |
| 5.7684 | 15.98 | 96 | 6.4183 |
| 5.6242 | 16.98 | 102 | 6.3981 |
| 5.4577 | 17.98 | 108 | 6.4609 |
| 5.2898 | 18.98 | 114 | 6.4577 |
| 5.1113 | 19.98 | 120 | 6.5617 |
| 4.9319 | 20.98 | 126 | 6.5827 |
| 4.7464 | 21.98 | 132 | 6.6961 |
| 4.5505 | 22.98 | 138 | 6.8359 |
| 4.341 | 23.98 | 144 | 6.9193 |
| 4.1324 | 24.98 | 150 | 7.0325 |
| 3.8938 | 25.98 | 156 | 7.1993 |
| 3.6691 | 26.98 | 162 | 7.3179 |
| 3.4316 | 27.98 | 168 | 7.4708 |
| 3.2041 | 28.98 | 174 | 7.5654 |
| 2.9614 | 29.98 | 180 | 7.7535 |
| 2.7189 | 30.98 | 186 | 7.8551 |
| 2.4944 | 31.98 | 192 | 8.0094 |
| 2.2624 | 32.98 | 198 | 8.0527 |
| 2.0292 | 33.98 | 204 | 8.1857 |
| 1.809 | 34.98 | 210 | 8.3468 |
| 1.597 | 35.98 | 216 | 8.4307 |
| 1.3849 | 36.98 | 222 | 8.6230 |
| 1.2081 | 37.98 | 228 | 8.6666 |
| 1.0273 | 38.98 | 234 | 8.7926 |
| 0.8661 | 39.98 | 240 | 8.8861 |
| 0.7308 | 40.98 | 246 | 8.9042 |
| 0.6189 | 41.98 | 252 | 8.9202 |
| 0.5335 | 42.98 | 258 | 9.0861 |
| 0.459 | 43.98 | 264 | 9.1198 |
| 0.3958 | 44.98 | 270 | 9.2129 |
| 0.3587 | 45.98 | 276 | 9.2434 |
| 0.3222 | 46.98 | 282 | 9.3005 |
| 0.2948 | 47.98 | 288 | 9.3961 |
| 0.2677 | 48.98 | 294 | 9.4605 |
| 0.2348 | 49.98 | 300 | 9.4320 |
### Framework versions
- Transformers 4.24.0
- Pytorch 1.13.0+rocm5.2
- Datasets 2.6.1
- Tokenizers 0.13.2
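A quick sketch for confirming that a local environment matches these versions:

```python
import transformers, torch, datasets, tokenizers

# Expected per this card: 4.24.0 / 1.13.0+rocm5.2 / 2.6.1 / 0.13.2
for name, module in [("Transformers", transformers), ("Pytorch", torch),
                     ("Datasets", datasets), ("Tokenizers", tokenizers)]:
    print(f"{name} {module.__version__}")
```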