router-mmBERT-base-3e-5-batch32

This model is a fine-tuned version of jhu-clsp/mmBERT-base on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how these metrics are typically computed follows the list):

  • Loss: 0.6479
  • Accuracy: 0.6212
  • Precision: 0.6206
  • Recall: 0.6212
  • F1: 0.6194
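
The card does not state how precision, recall, and F1 are averaged. The sketch below shows a typical `compute_metrics` function for the Hugging Face Trainer that would produce these four numbers; weighted averaging is an assumption, chosen because the recall column matches accuracy exactly, which holds for the weighted average.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    # eval_pred is (predictions, label_ids) as passed by the HF Trainer.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    # Weighted averaging is an assumption; the card does not document it.
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```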

Model description

More information needed

Intended uses & limitations

More information needed
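
No usage snippet is provided. Below is a minimal inference sketch, assuming the checkpoint is a standard sequence-classification fine-tune loadable via `AutoModelForSequenceClassification`; the label set and the routing semantics are not documented in this card.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: the checkpoint exposes a standard sequence-classification head.
model_id = "AmirMohseni/router-mmBERT-base-3e-5-batch32"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("Example query to route", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()
# Label names come from the checkpoint config; they are not described here.
print(pred, model.config.id2label.get(pred, pred))
```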

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reproducing them follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (torch fused, OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • num_epochs: 2
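
A minimal sketch mapping these settings onto `transformers.TrainingArguments`. The eval/logging cadence (every 50 steps, inferred from the results table below) is an assumption, as are the warmup settings, which are left at their defaults since the card does not report them.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="router-mmBERT-base-3e-5-batch32",
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch_fused",   # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=2,
    eval_strategy="steps",       # assumption: eval every 50 steps, per the table
    eval_steps=50,
    logging_steps=50,
)
```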

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
|:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
| 0.7063 | 0.0232 | 50 | 0.7010 | 0.5262 | 0.5706 | 0.5262 | 0.4763 |
| 0.7301 | 0.0465 | 100 | 0.6931 | 0.5765 | 0.5767 | 0.5765 | 0.5766 |
| 0.6609 | 0.0697 | 150 | 0.7301 | 0.4848 | 0.5680 | 0.4848 | 0.3430 |
| 0.6551 | 0.0929 | 200 | 0.7030 | 0.5395 | 0.6050 | 0.5395 | 0.4140 |
| 0.6929 | 0.1162 | 250 | 0.6861 | 0.5417 | 0.5651 | 0.5417 | 0.5212 |
| 0.666 | 0.1394 | 300 | 0.6632 | 0.6102 | 0.6094 | 0.6102 | 0.6080 |
| 0.6708 | 0.1626 | 350 | 0.7310 | 0.5671 | 0.6685 | 0.5671 | 0.4676 |
| 0.6805 | 0.1859 | 400 | 0.6653 | 0.5897 | 0.5944 | 0.5897 | 0.5891 |
| 0.6947 | 0.2091 | 450 | 0.7120 | 0.5157 | 0.5950 | 0.5157 | 0.4295 |
| 0.6248 | 0.2323 | 500 | 0.6713 | 0.5953 | 0.6370 | 0.5953 | 0.5453 |
| 0.6988 | 0.2556 | 550 | 0.6682 | 0.5848 | 0.6411 | 0.5848 | 0.5193 |
| 0.6843 | 0.2788 | 600 | 0.7089 | 0.5174 | 0.6174 | 0.5174 | 0.4230 |
| 0.7576 | 0.3020 | 650 | 0.7033 | 0.5130 | 0.6062 | 0.5130 | 0.4165 |
| 0.6426 | 0.3253 | 700 | 0.6584 | 0.5964 | 0.5974 | 0.5964 | 0.5966 |
| 0.6772 | 0.3485 | 750 | 0.6716 | 0.5682 | 0.5997 | 0.5682 | 0.5476 |
| 0.6944 | 0.3717 | 800 | 0.6593 | 0.6168 | 0.6175 | 0.6168 | 0.6170 |
| 0.6491 | 0.3950 | 850 | 0.6546 | 0.6113 | 0.6293 | 0.6113 | 0.5861 |
| 0.6491 | 0.4182 | 900 | 0.6616 | 0.5958 | 0.5986 | 0.5958 | 0.5958 |
| 0.6337 | 0.4414 | 950 | 0.6718 | 0.6085 | 0.6116 | 0.6085 | 0.6084 |
| 0.6789 | 0.4647 | 1000 | 0.6569 | 0.6140 | 0.6272 | 0.6140 | 0.5940 |
| 0.7058 | 0.4879 | 1050 | 0.6596 | 0.6041 | 0.6039 | 0.6041 | 0.6040 |
| 0.7226 | 0.5112 | 1100 | 0.6544 | 0.5991 | 0.6087 | 0.5991 | 0.5786 |
| 0.6477 | 0.5344 | 1150 | 0.6754 | 0.5511 | 0.5936 | 0.5511 | 0.5167 |
| 0.6556 | 0.5576 | 1200 | 0.6955 | 0.5461 | 0.5982 | 0.5461 | 0.5023 |
| 0.6757 | 0.5809 | 1250 | 0.6546 | 0.6030 | 0.6027 | 0.6030 | 0.6028 |
| 0.6885 | 0.6041 | 1300 | 0.6620 | 0.5919 | 0.6076 | 0.5919 | 0.5853 |
| 0.6325 | 0.6273 | 1350 | 0.6538 | 0.6057 | 0.6221 | 0.6057 | 0.5803 |
| 0.6451 | 0.6506 | 1400 | 0.6691 | 0.5693 | 0.6067 | 0.5693 | 0.5448 |
| 0.6791 | 0.6738 | 1450 | 0.6569 | 0.5980 | 0.6005 | 0.5980 | 0.5981 |
| 0.6814 | 0.6970 | 1500 | 0.6572 | 0.6074 | 0.6106 | 0.6074 | 0.6073 |
| 0.6363 | 0.7203 | 1550 | 0.6777 | 0.5748 | 0.5983 | 0.5748 | 0.5613 |
| 0.6725 | 0.7435 | 1600 | 0.6482 | 0.6173 | 0.6175 | 0.6173 | 0.6133 |
| 0.6086 | 0.7667 | 1650 | 0.6557 | 0.6052 | 0.6080 | 0.6052 | 0.6052 |
| 0.6532 | 0.7900 | 1700 | 0.6549 | 0.6080 | 0.6295 | 0.6080 | 0.5788 |
| 0.6432 | 0.8132 | 1750 | 0.6547 | 0.6085 | 0.6108 | 0.6085 | 0.5997 |
| 0.6259 | 0.8364 | 1800 | 0.6517 | 0.6091 | 0.6100 | 0.6091 | 0.6093 |
| 0.6557 | 0.8597 | 1850 | 0.6462 | 0.6151 | 0.6154 | 0.6151 | 0.6106 |
| 0.6279 | 0.8829 | 1900 | 0.6455 | 0.6118 | 0.6125 | 0.6118 | 0.6062 |
| 0.6886 | 0.9061 | 1950 | 0.6548 | 0.6124 | 0.6157 | 0.6124 | 0.6122 |
| 0.6182 | 0.9294 | 2000 | 0.6472 | 0.6184 | 0.6177 | 0.6184 | 0.6171 |
| 0.6937 | 0.9526 | 2050 | 0.6459 | 0.6113 | 0.6200 | 0.6113 | 0.5950 |
| 0.6525 | 0.9758 | 2100 | 0.6493 | 0.6146 | 0.6204 | 0.6146 | 0.6021 |
| 0.6354 | 0.9991 | 2150 | 0.6480 | 0.6146 | 0.6139 | 0.6146 | 0.6136 |
| 0.7016 | 1.0223 | 2200 | 0.6455 | 0.6251 | 0.6285 | 0.6251 | 0.6169 |
| 0.6634 | 1.0455 | 2250 | 0.6465 | 0.6157 | 0.6203 | 0.6157 | 0.6049 |
| 0.6256 | 1.0688 | 2300 | 0.6448 | 0.6201 | 0.6194 | 0.6201 | 0.6183 |
| 0.6227 | 1.0920 | 2350 | 0.6451 | 0.6195 | 0.6226 | 0.6195 | 0.6113 |
| 0.6402 | 1.1152 | 2400 | 0.6571 | 0.6085 | 0.6250 | 0.6085 | 0.5838 |
| 0.6157 | 1.1385 | 2450 | 0.6561 | 0.6057 | 0.6061 | 0.6057 | 0.6059 |
| 0.6129 | 1.1617 | 2500 | 0.6549 | 0.6201 | 0.6250 | 0.6201 | 0.6097 |
| 0.6632 | 1.1849 | 2550 | 0.6468 | 0.6140 | 0.6133 | 0.6140 | 0.6127 |
| 0.6002 | 1.2082 | 2600 | 0.6535 | 0.6074 | 0.6076 | 0.6074 | 0.6075 |
| 0.6406 | 1.2314 | 2650 | 0.6536 | 0.6024 | 0.6016 | 0.6024 | 0.6013 |
| 0.6015 | 1.2546 | 2700 | 0.6603 | 0.5997 | 0.6049 | 0.5997 | 0.5988 |
| 0.6212 | 1.2779 | 2750 | 0.6595 | 0.6251 | 0.6283 | 0.6251 | 0.6172 |
| 0.6146 | 1.3011 | 2800 | 0.6656 | 0.5875 | 0.6011 | 0.5875 | 0.5819 |
| 0.6407 | 1.3243 | 2850 | 0.6646 | 0.6063 | 0.6090 | 0.6063 | 0.6063 |
| 0.6172 | 1.3476 | 2900 | 0.6722 | 0.5964 | 0.6072 | 0.5964 | 0.5927 |
| 0.5796 | 1.3708 | 2950 | 0.6527 | 0.6201 | 0.6197 | 0.6201 | 0.6173 |
| 0.6513 | 1.3941 | 3000 | 0.6570 | 0.6080 | 0.6072 | 0.6080 | 0.6070 |
| 0.6471 | 1.4173 | 3050 | 0.6524 | 0.6245 | 0.6301 | 0.6245 | 0.6139 |
| 0.6176 | 1.4405 | 3100 | 0.6563 | 0.6289 | 0.6367 | 0.6289 | 0.6168 |
| 0.5867 | 1.4638 | 3150 | 0.6567 | 0.6218 | 0.6233 | 0.6218 | 0.6157 |
| 0.6221 | 1.4870 | 3200 | 0.6566 | 0.6102 | 0.6095 | 0.6102 | 0.6094 |
| 0.5836 | 1.5102 | 3250 | 0.6544 | 0.6063 | 0.6058 | 0.6063 | 0.6059 |
| 0.6173 | 1.5335 | 3300 | 0.6542 | 0.6041 | 0.6042 | 0.6041 | 0.6041 |
| 0.5963 | 1.5567 | 3350 | 0.6557 | 0.6234 | 0.6276 | 0.6234 | 0.6142 |
| 0.6362 | 1.5799 | 3400 | 0.6521 | 0.6223 | 0.6240 | 0.6223 | 0.6161 |
| 0.6366 | 1.6032 | 3450 | 0.6492 | 0.6234 | 0.6267 | 0.6234 | 0.6153 |
| 0.6035 | 1.6264 | 3500 | 0.6525 | 0.6074 | 0.6085 | 0.6074 | 0.6076 |
| 0.6701 | 1.6496 | 3550 | 0.6485 | 0.6223 | 0.6243 | 0.6223 | 0.6156 |
| 0.6376 | 1.6729 | 3600 | 0.6483 | 0.6207 | 0.6201 | 0.6207 | 0.6186 |
| 0.5751 | 1.6961 | 3650 | 0.6474 | 0.6223 | 0.6229 | 0.6223 | 0.6178 |
| 0.6204 | 1.7193 | 3700 | 0.6492 | 0.6201 | 0.6194 | 0.6201 | 0.6187 |
| 0.6822 | 1.7426 | 3750 | 0.6488 | 0.6190 | 0.6183 | 0.6190 | 0.6175 |
| 0.6743 | 1.7658 | 3800 | 0.6477 | 0.6223 | 0.6220 | 0.6223 | 0.6196 |
| 0.6085 | 1.7890 | 3850 | 0.6474 | 0.6251 | 0.6257 | 0.6251 | 0.6207 |
| 0.5896 | 1.8123 | 3900 | 0.6482 | 0.6195 | 0.6188 | 0.6195 | 0.6182 |
| 0.6382 | 1.8355 | 3950 | 0.6472 | 0.6218 | 0.6212 | 0.6218 | 0.6197 |
| 0.6346 | 1.8587 | 4000 | 0.6478 | 0.6212 | 0.6206 | 0.6212 | 0.6192 |
| 0.5711 | 1.8820 | 4050 | 0.6482 | 0.6218 | 0.6211 | 0.6218 | 0.6201 |
| 0.6398 | 1.9052 | 4100 | 0.6483 | 0.6223 | 0.6217 | 0.6223 | 0.6206 |
| 0.5947 | 1.9284 | 4150 | 0.6480 | 0.6207 | 0.6200 | 0.6207 | 0.6191 |
| 0.7037 | 1.9517 | 4200 | 0.6480 | 0.6218 | 0.6211 | 0.6218 | 0.6201 |
| 0.5602 | 1.9749 | 4250 | 0.6478 | 0.6223 | 0.6217 | 0.6223 | 0.6206 |
| 0.6186 | 1.9981 | 4300 | 0.6479 | 0.6212 | 0.6206 | 0.6212 | 0.6194 |

Framework versions

  • Transformers 4.57.1
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1