# 3ea0ede778568b172acdd1c30d586250
This model is a fine-tuned version of google/umt5-base on the de-nl configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 2.2399
- Data Size: 1.0
- Epoch Runtime: 89.5701
- Bleu: 9.0155
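For a quick smoke test, the checkpoint can be loaded with the standard `transformers` seq2seq API. The sketch below assumes the model is published under the repo id `contemmcm/3ea0ede778568b172acdd1c30d586250` and uses plain greedy decoding; the card does not document any generation settings.

```python
# Minimal German -> Dutch translation sketch (repo id is an assumption).
MODEL_ID = "contemmcm/3ea0ede778568b172acdd1c30d586250"


def translate(text: str, max_new_tokens: int = 64) -> str:
    """Translate a German sentence to Dutch with greedy decoding."""
    # Lazy imports so the module can be inspected without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(translate("Der Himmel ist blau."))
```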
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
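Although the train/validation split used for fine-tuning is not documented, the source dataset itself can be inspected as follows. This is a sketch: the `de-nl` configuration name comes from the dataset id above, and the example layout shown is the usual opus_books schema (`{"translation": {"de": ..., "nl": ...}}`).

```python
# Sketch: load and inspect the de-nl configuration of opus_books.
DATASET_ID = "Helsinki-NLP/opus_books"


def load_de_nl():
    """Return the de-nl train split; opus_books ships only a train split."""
    # Lazy import so the module loads without the datasets library installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, "de-nl", split="train")


if __name__ == "__main__":
    ds = load_de_nl()
    print(ds[0]["translation"])  # {"de": "...", "nl": "..."}
```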
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
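The hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as sketched below. The per-device batch size of 8 combines with the 4 GPUs to give the total batch size of 32; `predict_with_generate` (needed for BLEU during evaluation) and the output directory name are assumptions not stated in the card.

```python
# Sketch: the listed hyperparameters as keyword arguments for the Trainer.
HPARAMS = dict(
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 devices = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 devices = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="constant",
    num_train_epochs=50,
)


def build_args(output_dir: str = "umt5-base-de-nl"):
    """Build Seq2SeqTrainingArguments from the card's hyperparameters."""
    # Lazy import so the dict above can be used without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: required to compute BLEU
        **HPARAMS,
    )
```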
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.7590 | 0 | 7.7775 | 0.2918 |
| No log | 1 | 390 | 11.4207 | 0.0078 | 9.5688 | 0.2437 |
| No log | 2 | 780 | 10.5088 | 0.0156 | 9.8202 | 0.2587 |
| No log | 3 | 1170 | 9.8258 | 0.0312 | 11.8070 | 0.3061 |
| No log | 4 | 1560 | 8.6899 | 0.0625 | 14.5774 | 0.2775 |
| 0.7945 | 5 | 1950 | 7.3758 | 0.125 | 19.7967 | 0.4291 |
| 1.3976 | 6 | 2340 | 4.5789 | 0.25 | 30.7220 | 2.1343 |
| 4.6232 | 7 | 2730 | 3.1818 | 0.5 | 48.7698 | 3.9774 |
| 3.5482 | 8 | 3120 | 2.7705 | 1.0 | 89.3951 | 5.5113 |
| 3.2315 | 9 | 3510 | 2.6142 | 1.0 | 89.0712 | 6.2093 |
| 3.0672 | 10 | 3900 | 2.5261 | 1.0 | 90.1843 | 6.6777 |
| 2.9518 | 11 | 4290 | 2.4727 | 1.0 | 88.6460 | 6.9571 |
| 2.8355 | 12 | 4680 | 2.4302 | 1.0 | 88.9321 | 7.2129 |
| 2.7882 | 13 | 5070 | 2.4019 | 1.0 | 89.4192 | 7.4252 |
| 2.7007 | 14 | 5460 | 2.3723 | 1.0 | 88.1363 | 7.6036 |
| 2.572 | 15 | 5850 | 2.3563 | 1.0 | 88.8658 | 7.7266 |
| 2.5441 | 16 | 6240 | 2.3282 | 1.0 | 87.2663 | 7.8832 |
| 2.5123 | 17 | 6630 | 2.3119 | 1.0 | 89.0349 | 7.9350 |
| 2.4375 | 18 | 7020 | 2.3041 | 1.0 | 88.9797 | 8.0609 |
| 2.3963 | 19 | 7410 | 2.2749 | 1.0 | 89.8775 | 8.1993 |
| 2.3644 | 20 | 7800 | 2.2755 | 1.0 | 88.7370 | 8.2685 |
| 2.3253 | 21 | 8190 | 2.2645 | 1.0 | 88.8354 | 8.4198 |
| 2.2549 | 22 | 8580 | 2.2536 | 1.0 | 89.4994 | 8.4095 |
| 2.2268 | 23 | 8970 | 2.2532 | 1.0 | 90.1168 | 8.4829 |
| 2.1798 | 24 | 9360 | 2.2398 | 1.0 | 89.9066 | 8.5593 |
| 2.1351 | 25 | 9750 | 2.2460 | 1.0 | 88.8773 | 8.6257 |
| 2.1097 | 26 | 10140 | 2.2371 | 1.0 | 90.3088 | 8.6773 |
| 2.0499 | 27 | 10530 | 2.2315 | 1.0 | 88.5118 | 8.7752 |
| 2.0556 | 28 | 10920 | 2.2307 | 1.0 | 88.4823 | 8.8357 |
| 2.0289 | 29 | 11310 | 2.2231 | 1.0 | 89.2264 | 8.8561 |
| 1.9845 | 30 | 11700 | 2.2203 | 1.0 | 89.0804 | 8.9314 |
| 1.9124 | 31 | 12090 | 2.2312 | 1.0 | 89.6350 | 8.9321 |
| 1.8824 | 32 | 12480 | 2.2291 | 1.0 | 88.9485 | 8.9577 |
| 1.8683 | 33 | 12870 | 2.2373 | 1.0 | 90.5498 | 9.0160 |
| 1.8397 | 34 | 13260 | 2.2399 | 1.0 | 89.5701 | 9.0155 |
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
## Model tree for contemmcm/3ea0ede778568b172acdd1c30d586250

- Base model: google/umt5-base