Update README.md

README.md

This model was trained with SFT.
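
The card states only that the model was trained with SFT and that a PEFT adapter was produced (see Framework versions below). As a point of reference, a minimal TRL `SFTTrainer` + LoRA sketch is shown here; the base model, dataset, and hyperparameters are placeholders, not this model's actual training configuration, which is not recorded in this card.

```python
# Hypothetical SFT + LoRA sketch with TRL; all names below are placeholders.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")  # placeholder dataset

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # placeholder base model
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-output"),
    peft_config=LoraConfig(r=16, lora_alpha=32),  # trains a PEFT adapter rather than full weights
)
trainer.train()
```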

## Evaluation

This model was loaded in 4-bit and evaluated with [lighteval](https://github.com/huggingface/lighteval).
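
A minimal sketch of that 4-bit loading setup with `transformers` and `peft` might look as follows; both repository IDs are placeholders, since the card does not name the base model.

```python
# Hypothetical sketch: load the base model in 4-bit and attach the PEFT adapter.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "base-org/base-model",  # placeholder base model ID
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "your-org/this-adapter")  # placeholder adapter ID
tokenizer = AutoTokenizer.from_pretrained("base-org/base-model")
```

Task names in the results table follow lighteval's `suite:task:num_fewshot` convention, so e.g. `leaderboard:mmlu:anatomy:5` is 5-shot MMLU anatomy.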

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---:|---|---:|
| all | | acc | 0.4450 | ± | 0.1503 |
| | | acc (LogProbCharNorm, ignore_first_space=True) | 0.7000 | ± | 0.1528 |
| | | acc (LogProbCharNorm, ignore_first_space=False) | 0.8000 | ± | 0.1333 |
| | | truthfulqa_mc1 | 0.4000 | ± | 0.1633 |
| | | truthfulqa_mc2 | 0.5256 | ± | 0.1573 |
| | | em (gsm8k_normalizer) | 0.4000 | ± | 0.1633 |
| leaderboard:arc:challenge:25 | | acc | 0.7000 | ± | 0.1528 |
| | | acc (LogProbCharNorm, ignore_first_space=True) | 0.7000 | ± | 0.1528 |
| leaderboard:gsm8k:5 | | em (gsm8k_normalizer) | 0.4000 | ± | 0.1633 |
| leaderboard:hellaswag:10 | | acc | 0.4000 | ± | 0.1633 |
| | | acc (LogProbCharNorm, ignore_first_space=False) | 0.8000 | ± | 0.1333 |
| leaderboard:mmlu:_average:5 | | acc | 0.4386 | ± | 0.1498 |
| leaderboard:mmlu:abstract_algebra:5 | | acc | 0.3000 | ± | 0.1528 |
| leaderboard:mmlu:anatomy:5 | | acc | 0.2000 | ± | 0.1333 |
| leaderboard:mmlu:astronomy:5 | | acc | 0.3000 | ± | 0.1528 |
| leaderboard:mmlu:business_ethics:5 | | acc | 0.3000 | ± | 0.1528 |
| leaderboard:mmlu:clinical_knowledge:5 | | acc | 0.7000 | ± | 0.1528 |
| leaderboard:mmlu:college_biology:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:college_chemistry:5 | | acc | 0.5000 | ± | 0.1667 |
| leaderboard:mmlu:college_computer_science:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:college_mathematics:5 | | acc | 0.6000 | ± | 0.1633 |
| leaderboard:mmlu:college_medicine:5 | | acc | 0.6000 | ± | 0.1633 |
| leaderboard:mmlu:college_physics:5 | | acc | 0.3000 | ± | 0.1528 |
| leaderboard:mmlu:computer_security:5 | | acc | 0.5000 | ± | 0.1667 |
| leaderboard:mmlu:conceptual_physics:5 | | acc | 0.2000 | ± | 0.1333 |
| leaderboard:mmlu:econometrics:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:electrical_engineering:5 | | acc | 0.7000 | ± | 0.1528 |
| leaderboard:mmlu:elementary_mathematics:5 | | acc | 0.1000 | ± | 0.1000 |
| leaderboard:mmlu:formal_logic:5 | | acc | 0.2000 | ± | 0.1333 |
| leaderboard:mmlu:global_facts:5 | | acc | 0.6000 | ± | 0.1633 |
| leaderboard:mmlu:high_school_biology:5 | | acc | 0.3000 | ± | 0.1528 |
| leaderboard:mmlu:high_school_chemistry:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:high_school_computer_science:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:high_school_european_history:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:high_school_geography:5 | | acc | 0.8000 | ± | 0.1333 |
| leaderboard:mmlu:high_school_government_and_politics:5 | | acc | 0.7000 | ± | 0.1528 |
| leaderboard:mmlu:high_school_macroeconomics:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:high_school_mathematics:5 | | acc | 0.1000 | ± | 0.1000 |
| leaderboard:mmlu:high_school_microeconomics:5 | | acc | 0.6000 | ± | 0.1633 |
| leaderboard:mmlu:high_school_physics:5 | | acc | 0.2000 | ± | 0.1333 |
| leaderboard:mmlu:high_school_psychology:5 | | acc | 0.7000 | ± | 0.1528 |
| leaderboard:mmlu:high_school_statistics:5 | | acc | 0.5000 | ± | 0.1667 |
| leaderboard:mmlu:high_school_us_history:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:high_school_world_history:5 | | acc | 0.9000 | ± | 0.1000 |
| leaderboard:mmlu:human_aging:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:human_sexuality:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:international_law:5 | | acc | 0.5000 | ± | 0.1667 |
| leaderboard:mmlu:jurisprudence:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:logical_fallacies:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:machine_learning:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:management:5 | | acc | 0.6000 | ± | 0.1633 |
| leaderboard:mmlu:marketing:5 | | acc | 0.5000 | ± | 0.1667 |
| leaderboard:mmlu:medical_genetics:5 | | acc | 0.7000 | ± | 0.1528 |
| leaderboard:mmlu:miscellaneous:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:moral_disputes:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:moral_scenarios:5 | | acc | 0.0000 | ± | 0.0000 |
| leaderboard:mmlu:nutrition:5 | | acc | 0.8000 | ± | 0.1333 |
| leaderboard:mmlu:philosophy:5 | | acc | 0.3000 | ± | 0.1528 |
| leaderboard:mmlu:prehistory:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:professional_accounting:5 | | acc | 0.1000 | ± | 0.1000 |
| leaderboard:mmlu:professional_law:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:professional_medicine:5 | | acc | 0.5000 | ± | 0.1667 |
| leaderboard:mmlu:professional_psychology:5 | | acc | 0.1000 | ± | 0.1000 |
| leaderboard:mmlu:public_relations:5 | | acc | 0.5000 | ± | 0.1667 |
| leaderboard:mmlu:security_studies:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:sociology:5 | | acc | 0.7000 | ± | 0.1528 |
| leaderboard:mmlu:us_foreign_policy:5 | | acc | 0.4000 | ± | 0.1633 |
| leaderboard:mmlu:virology:5 | | acc | 0.5000 | ± | 0.1667 |
| leaderboard:mmlu:world_religions:5 | | acc | 0.7000 | ± | 0.1528 |
| leaderboard:truthfulqa:mc:0 | | truthfulqa_mc1 | 0.4000 | ± | 0.1633 |
| | | truthfulqa_mc2 | 0.5256 | ± | 0.1573 |
| leaderboard:winogrande:5 | | acc | 0.6000 | ± | 0.1633 |

### Framework versions

- PEFT 0.17.1
- Tokenizers: 0.22.1

## Citations

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```

```bibtex
@misc{lighteval,
    author       = {Habib, Nathan and Fourrier, Clémentine and Kydlíček, Hynek and Wolf, Thomas and Tunstall, Lewis},
    title        = {LightEval: A lightweight framework for LLM evaluation},
    year         = {2023},
    version      = {0.11.0},
    url          = {https://github.com/huggingface/lighteval}
}
```