This model is a fine-tuned version of Qwen/Qwen2.5-0.5B-Instruct on the wikitext-103-raw-v1 configuration of the Salesforce/wikitext dataset. Validation loss and accuracy on the evaluation set are reported per step in the training results table.
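The training hyperparameters are not listed. In the training results below, "No log" means no training loss had been logged yet; the training loss first appears at step 500.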
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 0.0083 | 100 | 2.3376 | 0.5511 |
| No log | 0.0166 | 200 | 2.1736 | 0.5765 |
| No log | 0.0248 | 300 | 2.0679 | 0.5930 |
| No log | 0.0331 | 400 | 1.9839 | 0.6056 |
| 2.2761 | 0.0414 | 500 | 1.9611 | 0.6085 |
| 2.2761 | 0.0497 | 600 | 1.9054 | 0.6203 |
| 2.2761 | 0.0580 | 700 | 1.8838 | 0.6242 |
| 2.2761 | 0.0662 | 800 | 1.8403 | 0.6296 |
| 2.2761 | 0.0745 | 900 | 1.8235 | 0.6300 |
| 1.8887 | 0.0828 | 1000 | 1.7920 | 0.6351 |
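
The validation losses above are easier to interpret as perplexities. The following is a minimal sketch, assuming the reported validation loss is a mean per-token cross-entropy in nats (the card does not state this explicitly). The helper names `perplexity` and `steps_per_epoch` are illustrative and not part of any library. Because the epoch column is rounded to four decimals, the steps-per-epoch figure is only an estimate.

```python
import math

# (step, epoch, validation_loss) rows copied from the training results table.
ROWS = [
    (100, 0.0083, 2.3376),
    (200, 0.0166, 2.1736),
    (300, 0.0248, 2.0679),
    (400, 0.0331, 1.9839),
    (500, 0.0414, 1.9611),
    (600, 0.0497, 1.9054),
    (700, 0.0580, 1.8838),
    (800, 0.0662, 1.8403),
    (900, 0.0745, 1.8235),
    (1000, 0.0828, 1.7920),
]


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps per epoch from one (step, epoch) pair."""
    return step / epoch


if __name__ == "__main__":
    for step, epoch, loss in ROWS:
        print(f"step {step:>4}: loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
    last_step, last_epoch, _ = ROWS[-1]
    print(f"estimated steps per epoch: {steps_per_epoch(last_step, last_epoch):.0f}")
```

Under that assumption, the validation perplexity falls from roughly 10.4 at step 100 to roughly 6.0 at step 1000, and the logged run covers less than a tenth of one epoch.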