whisper-large-no-is-fo-fo_parl-60k-steps

This model is a fine-tuned version of davidilag/whisper-large-no-is-fo-100h-30k-steps on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5757
  • WER: 26.1232
  • CER: 13.6139
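
For context, here is a minimal sketch of how WER and CER figures like these can be computed with the Hugging Face `evaluate` library; the example strings are illustrative, not drawn from the evaluation set:

```python
import evaluate

# Load the word- and character-error-rate metrics.
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

references = ["hetta er ein royndartekstur"]   # hypothetical reference
predictions = ["hetta er ein roynda tekstur"]  # hypothetical prediction

# Both metrics return a fraction; multiply by 100 to match the
# percentage-style numbers reported in this card.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
cer = 100 * cer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}  CER: {cer:.4f}")
```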

Model description

More information needed

Intended uses & limitations

More information needed
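
Pending fuller documentation, the sketch below shows one plausible way to run inference with the Transformers ASR pipeline; the repo id and audio file name are assumptions, so substitute the actual Hub repo id or a local checkpoint path:

```python
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="davidilag/whisper-large-no-is-fo-fo_parl-3k-steps",  # assumed repo id
    torch_dtype=torch.bfloat16,  # optional; reduces memory on supported GPUs
    device_map="auto",
)

# Whisper decodes 30-second windows; chunking handles longer audio.
result = asr("sample.wav", chunk_length_s=30)  # "sample.wav" is a placeholder
print(result["text"])
```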

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 48
  • eval_batch_size: 20
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 192
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • training_steps: 3000
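
For reference, a sketch of Seq2SeqTrainingArguments mirroring the list above; the output directory is a placeholder, and only the values taken from the list are grounded in this card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-fo-parl",  # placeholder path
    learning_rate=1e-5,
    per_device_train_batch_size=48,
    per_device_eval_batch_size=20,
    gradient_accumulation_steps=4,  # effective total train batch size: 192
    seed=42,
    optim="adamw_torch_fused",      # AdamW, torch fused implementation
    lr_scheduler_type="linear",
    warmup_steps=300,
    max_steps=3000,
)
```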

Training results

| Training Loss | Epoch   | Step | Validation Loss | WER     | CER     |
|:-------------:|:-------:|:----:|:---------------:|:-------:|:-------:|
| 0.486         | 14.4928 | 500  | 0.5783          | 26.3128 | 13.5818 |
| 0.4159        | 28.9855 | 1000 | 0.5665          | 25.9526 | 13.4001 |
| 0.4045        | 43.4783 | 1500 | 0.5611          | 26.1232 | 13.4678 |
| 0.39          | 57.9710 | 2000 | 0.5983          | 26.0474 | 13.5141 |
| 0.3986        | 72.4638 | 2500 | 0.5706          | 26.1611 | 13.5533 |
| 0.3941        | 86.9565 | 3000 | 0.5757          | 26.1232 | 13.6139 |

Framework versions

  • Transformers 4.46.2
  • PyTorch 2.9.0+cu128
  • Datasets 3.0.1
  • Tokenizers 0.20.3