# aa15fd4b9e269f6768f85d427a3033bb
This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [de-it] dataset. It achieves the following results on the evaluation set:
- Loss: 2.1325
- Data size: 1.0 (fraction of the training set used)
- Epoch runtime: 158.1838
- BLEU: 7.2676
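No usage example is included on this card, so the snippet below is a minimal inference sketch using the standard transformers seq2seq API. The checkpoint ID is taken from this card; the expected input format (e.g., whether a task prefix is required) is not documented and is assumed to be plain German source text.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/aa15fd4b9e269f6768f85d427a3033bb"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Plain German input; whether the fine-tune expects a task prefix is an assumption.
inputs = tokenizer("Das Buch liegt auf dem Tisch.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```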
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
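For reference, the dataset named above can be loaded as follows. This is a sketch only; the exact split and preprocessing used for fine-tuning are not documented here.

```python
from datasets import load_dataset

# opus_books pairs are stored as {"id": ..., "translation": {"de": ..., "it": ...}}
books = load_dataset("Helsinki-NLP/opus_books", "de-it")
print(books["train"][0]["translation"])
```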
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch of the equivalent `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
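For illustration, here is a sketch of a `Seq2SeqTrainingArguments` configuration matching the values listed above. The `output_dir` and `predict_with_generate` settings are assumptions (the latter because BLEU is reported per epoch); everything else follows the list.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-de-it",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # x4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,   # x4 GPUs -> total eval batch size 32
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",            # AdamW, betas=(0.9, 0.999), eps=1e-08
    predict_with_generate=True,     # assumed, since BLEU is reported per epoch
)
```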
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | BLEU |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.6234 | 0 | 13.2256 | 0.2972 |
| No log | 1 | 684 | 11.7676 | 0.0078 | 15.3219 | 0.3002 |
| No log | 2 | 1368 | 12.4365 | 0.0156 | 16.1003 | 0.3039 |
| No log | 3 | 2052 | 12.0688 | 0.0312 | 18.9124 | 0.2997 |
| No log | 4 | 2736 | 10.7945 | 0.0625 | 24.3546 | 0.2572 |
| 11.099 | 5 | 3420 | 6.9095 | 0.125 | 33.0252 | 0.3361 |
| 5.3897 | 6 | 4104 | 3.4165 | 0.25 | 50.3619 | 1.7199 |
| 3.8268 | 7 | 4788 | 2.7978 | 0.5 | 85.5902 | 3.5503 |
| 3.3088 | 8 | 5472 | 2.5462 | 1.0 | 156.3922 | 4.4769 |
| 3.0566 | 9 | 6156 | 2.4595 | 1.0 | 158.7401 | 4.8906 |
| 2.9327 | 10 | 6840 | 2.3955 | 1.0 | 156.9191 | 5.1621 |
| 2.8014 | 11 | 7524 | 2.3356 | 1.0 | 158.2888 | 5.4440 |
| 2.7264 | 12 | 8208 | 2.2981 | 1.0 | 157.3085 | 5.6917 |
| 2.6004 | 13 | 8892 | 2.2708 | 1.0 | 160.3785 | 5.9098 |
| 2.6119 | 14 | 9576 | 2.2506 | 1.0 | 158.7296 | 6.0467 |
| 2.4926 | 15 | 10260 | 2.2214 | 1.0 | 158.8821 | 6.2256 |
| 2.434 | 16 | 10944 | 2.2053 | 1.0 | 157.5861 | 6.2842 |
| 2.3756 | 17 | 11628 | 2.1938 | 1.0 | 159.5473 | 6.4503 |
| 2.3385 | 18 | 12312 | 2.1808 | 1.0 | 158.1535 | 6.5505 |
| 2.3189 | 19 | 12996 | 2.1685 | 1.0 | 158.8618 | 6.5684 |
| 2.2347 | 20 | 13680 | 2.1548 | 1.0 | 159.2732 | 6.6397 |
| 2.1471 | 21 | 14364 | 2.1555 | 1.0 | 160.8164 | 6.7127 |
| 2.1344 | 22 | 15048 | 2.1503 | 1.0 | 159.3493 | 6.8116 |
| 2.1269 | 23 | 15732 | 2.1489 | 1.0 | 158.3381 | 6.8459 |
| 2.0743 | 24 | 16416 | 2.1386 | 1.0 | 156.9267 | 6.9422 |
| 2.0148 | 25 | 17100 | 2.1297 | 1.0 | 157.7105 | 7.0030 |
| 1.9918 | 26 | 17784 | 2.1318 | 1.0 | 157.4208 | 7.0410 |
| 1.983 | 27 | 18468 | 2.1334 | 1.0 | 163.4428 | 7.0885 |
| 1.9478 | 28 | 19152 | 2.1261 | 1.0 | 158.6249 | 7.0952 |
| 1.9123 | 29 | 19836 | 2.1297 | 1.0 | 157.3881 | 7.1305 |
| 1.8438 | 30 | 20520 | 2.1306 | 1.0 | 158.8061 | 7.1907 |
| 1.8258 | 31 | 21204 | 2.1328 | 1.0 | 158.6794 | 7.2350 |
| 1.8598 | 32 | 21888 | 2.1325 | 1.0 | 158.1838 | 7.2676 |
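The BLEU column is the kind of score typically computed with sacreBLEU via the `evaluate` library; the sketch below shows that computation under this assumption, with hypothetical prediction and reference strings. The exact metric configuration used for this card is not documented.

```python
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Il libro è sul tavolo."]        # hypothetical model output
references = [["Il libro è sopra il tavolo."]]  # hypothetical reference
print(bleu.compute(predictions=predictions, references=references)["score"])
```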
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1