aa15fd4b9e269f6768f85d427a3033bb

This model is a fine-tuned version of google/umt5-base on the de-it (German-Italian) configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

Loss: 2.1325
Data Size: 1.0
Epoch Runtime: 158.1838
Bleu: 7.2676
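
As a rough way to read the loss above: if it is the mean per-token cross-entropy in nats (an assumption; the card does not state how the loss is reduced), it can be converted to a perplexity by exponentiating. A minimal sketch:

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 2.1325

# Assumption: the loss is a mean per-token cross-entropy in nats,
# in which case perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity: {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 8.4.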

Model description

More information needed

Intended uses & limitations

More information needed
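
Pending fuller documentation, the checkpoint can presumably be loaded like any other seq2seq model in Transformers. The sketch below is not taken from the training code. It assumes the repository id contemmcm/aa15fd4b9e269f6768f85d427a3033bb, assumes that plain source text is passed without a task prefix, and uses a hypothetical helper name `translate`. The card does not state the translation direction or the preprocessing, so outputs should be checked before relying on them.

```python
MODEL_ID = "contemmcm/aa15fd4b9e269f6768f85d427a3033bb"


def translate(texts, max_new_tokens=128):
    """Hypothetical helper: translate a list of sentences with this checkpoint."""
    # Imported lazily so that defining the function does not require
    # torch/transformers to be installed or any weights to be downloaded.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    model.eval()

    # Assumption: raw source text, no task prefix. The card does not
    # document the input format used during fine-tuning.
    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `translate(["Das ist ein Buch."])` downloads the weights on first use. With an evaluation BLEU of about 7.3, translations are likely to be rough.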

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
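
The total batch sizes listed above follow from the per-device batch size and the number of devices. A small sketch of that arithmetic, assuming no gradient accumulation (the card does not list a gradient_accumulation_steps value):

```python
# Values taken from the hyperparameter list in this card.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4

# Assumption: no gradient accumulation, since the card lists none.
gradient_accumulation_steps = 1

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)
```

Both totals come out to 32, matching the values reported in the list.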

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	11.6234	0	13.2256	0.2972
No log	1	684	11.7676	0.0078	15.3219	0.3002
No log	2	1368	12.4365	0.0156	16.1003	0.3039
No log	3	2052	12.0688	0.0312	18.9124	0.2997
No log	4	2736	10.7945	0.0625	24.3546	0.2572
11.099	5	3420	6.9095	0.125	33.0252	0.3361
5.3897	6	4104	3.4165	0.25	50.3619	1.7199
3.8268	7	4788	2.7978	0.5	85.5902	3.5503
3.3088	8.0	5472	2.5462	1.0	156.3922	4.4769
3.0566	9.0	6156	2.4595	1.0	158.7401	4.8906
2.9327	10.0	6840	2.3955	1.0	156.9191	5.1621
2.8014	11.0	7524	2.3356	1.0	158.2888	5.4440
2.7264	12.0	8208	2.2981	1.0	157.3085	5.6917
2.6004	13.0	8892	2.2708	1.0	160.3785	5.9098
2.6119	14.0	9576	2.2506	1.0	158.7296	6.0467
2.4926	15.0	10260	2.2214	1.0	158.8821	6.2256
2.434	16.0	10944	2.2053	1.0	157.5861	6.2842
2.3756	17.0	11628	2.1938	1.0	159.5473	6.4503
2.3385	18.0	12312	2.1808	1.0	158.1535	6.5505
2.3189	19.0	12996	2.1685	1.0	158.8618	6.5684
2.2347	20.0	13680	2.1548	1.0	159.2732	6.6397
2.1471	21.0	14364	2.1555	1.0	160.8164	6.7127
2.1344	22.0	15048	2.1503	1.0	159.3493	6.8116
2.1269	23.0	15732	2.1489	1.0	158.3381	6.8459
2.0743	24.0	16416	2.1386	1.0	156.9267	6.9422
2.0148	25.0	17100	2.1297	1.0	157.7105	7.0030
1.9918	26.0	17784	2.1318	1.0	157.4208	7.0410
1.983	27.0	18468	2.1334	1.0	163.4428	7.0885
1.9478	28.0	19152	2.1261	1.0	158.6249	7.0952
1.9123	29.0	19836	2.1297	1.0	157.3881	7.1305
1.8438	30.0	20520	2.1306	1.0	158.8061	7.1907
1.8258	31.0	21204	2.1328	1.0	158.6794	7.2350
1.8598	32.0	21888	2.1325	1.0	158.1838	7.2676
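
The table can be scanned for the best checkpoint by either metric. The sketch below copies the epoch, validation loss and BLEU columns for the full-data epochs (8 to 32) by hand and picks the row with the lowest validation loss and the row with the highest BLEU. The variable names are illustrative only.

```python
# (epoch, validation_loss, bleu), copied from the rows with Data Size 1.0.
rows = [
    (8, 2.5462, 4.4769), (9, 2.4595, 4.8906), (10, 2.3955, 5.1621),
    (11, 2.3356, 5.4440), (12, 2.2981, 5.6917), (13, 2.2708, 5.9098),
    (14, 2.2506, 6.0467), (15, 2.2214, 6.2256), (16, 2.2053, 6.2842),
    (17, 2.1938, 6.4503), (18, 2.1808, 6.5505), (19, 2.1685, 6.5684),
    (20, 2.1548, 6.6397), (21, 2.1555, 6.7127), (22, 2.1503, 6.8116),
    (23, 2.1489, 6.8459), (24, 2.1386, 6.9422), (25, 2.1297, 7.0030),
    (26, 2.1318, 7.0410), (27, 2.1334, 7.0885), (28, 2.1261, 7.0952),
    (29, 2.1297, 7.1305), (30, 2.1306, 7.1907), (31, 2.1328, 7.2350),
    (32, 2.1325, 7.2676),
]

# Row with the lowest validation loss and row with the highest BLEU.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_bleu = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest BLEU:", best_by_bleu)
```

Validation loss bottoms out at epoch 28 (2.1261), while BLEU is still rising at epoch 32 (7.2676). The reported results correspond to the final logged epoch, not the epoch with the lowest validation loss.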

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Safetensors

Model size: 1.0B params
Tensor type: F32

Model tree for contemmcm/aa15fd4b9e269f6768f85d427a3033bb

Base model: google/umt5-base
