Collection: Whisper child-adult training data ratios for child ASR (5 items). Models that have all been trained with 30 hours of speech, but using different ratios of child to adult speech.
This model is a fine-tuned version of openai/whisper-large-v2 on the JASMIN-CGN dataset. At the final evaluation step (step 600), it reaches a validation loss of 0.4041 and a WER of 20.6394 on the evaluation set.
More information needed
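The hyperparameters used during training are not listed. The table below reports the training loss, validation loss, and word error rate (WER) at each evaluation step, taken every 25 training steps: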
| Training Loss | Epoch | Step | Validation Loss | WER (%) |
|---|---|---|---|---|
| 1.1164 | 0.1225 | 25 | 1.2201 | 38.0716 |
| 1.0913 | 0.2451 | 50 | 1.1869 | 37.5885 |
| 1.0308 | 0.3676 | 75 | 1.1215 | 36.4646 |
| 0.912 | 0.4902 | 100 | 1.0354 | 35.3105 |
| 0.8776 | 0.6127 | 125 | 0.9431 | 34.6496 |
| 0.7639 | 0.7353 | 150 | 0.8479 | 32.7541 |
| 0.7541 | 0.8578 | 175 | 0.7468 | 33.1332 |
| 0.6172 | 0.9804 | 200 | 0.6488 | 30.8485 |
| 0.6094 | 1.1029 | 225 | 0.5804 | 27.7217 |
| 0.5589 | 1.2255 | 250 | 0.5284 | 25.5578 |
| 0.5294 | 1.3480 | 275 | 0.4859 | 25.1552 |
| 0.5191 | 1.4706 | 300 | 0.4579 | 23.9172 |
| 0.5059 | 1.5931 | 325 | 0.4427 | 22.6021 |
| 0.4534 | 1.7157 | 350 | 0.4323 | 21.9512 |
| 0.4812 | 1.8382 | 375 | 0.4247 | 21.0924 |
| 0.4884 | 1.9608 | 400 | 0.4199 | 22.2565 |
| 0.4994 | 2.0833 | 425 | 0.4158 | 22.6188 |
| 0.4257 | 2.2059 | 450 | 0.4127 | 22.5652 |
| 0.4264 | 2.3284 | 475 | 0.4098 | 22.0452 |
| 0.4577 | 2.4510 | 500 | 0.4076 | 20.7065 |
| 0.4442 | 2.5735 | 525 | 0.4062 | 20.6629 |
| 0.4432 | 2.6961 | 550 | 0.4051 | 20.6394 |
| 0.4703 | 2.8186 | 575 | 0.4044 | 20.6428 |
| 0.4313 | 2.9412 | 600 | 0.4041 | 20.6394 |
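
To make the WER column easier to interpret, here is a minimal sketch of how a word error rate in percent can be computed, together with the relative WER change between the first and last logged checkpoints. The function names `word_error_rate` and `relative_reduction` are hypothetical, the Dutch example sentences are made up for illustration, and the sketch applies no text normalization, so it is not necessarily how the figures in the table were produced.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_reduction(first: float, last: float) -> float:
    """Return the relative decrease from `first` to `last`, in percent."""
    return 100.0 * (first - last) / first


# Illustrative sentences (not from JASMIN-CGN): one substitution and one
# deletion against a six-word reference.
example_wer = word_error_rate("de kat zit op de mat", "de kat zat op mat")
print(f"example WER: {example_wer:.2f}%")

# WER values copied from the table: step 25 and step 600.
drop = relative_reduction(38.0716, 20.6394)
print(f"relative WER reduction, step 25 to step 600: {drop:.1f}%")
```

Note that step 25 is the first logged checkpoint, not the base model before fine-tuning, so the relative reduction describes progress within this run only.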
Base model: openai/whisper-large-v2