Tags: Text Generation · Transformers · Safetensors · xlstm · sft · trl · conversational · 🇪🇺 Region: EU
mrs83 committed · Commit 2c2ed15 · verified · 1 Parent(s): 679d842

Update README.md

Files changed (1): README.md (+37 −1)
README.md CHANGED
@@ -223,7 +223,43 @@ This model has been loaded in 4-bit and evaluated with [lighteval](https://githu
 - Tokenizers: 0.22.1
 
 ## Citations
-
+
+```bibtex
+@misc{beck2024xlstmextendedlongshortterm,
+  title={xLSTM: Extended Long Short-Term Memory},
+  author={Maximilian Beck and Korbinian Pöppel and Markus Spanring and Andreas Auer and Oleksandra Prudnikova and Michael Kopp and Günter Klambauer and Johannes Brandstetter and Sepp Hochreiter},
+  year={2024},
+  eprint={2405.04517},
+  archivePrefix={arXiv},
+  primaryClass={cs.LG},
+  url={https://arxiv.org/abs/2405.04517},
+}
+```
+
+```bibtex
+@misc{han2024parameterefficientfinetuninglargemodels,
+  title={Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey},
+  author={Zeyu Han and Chao Gao and Jinyang Liu and Jeff Zhang and Sai Qian Zhang},
+  year={2024},
+  eprint={2403.14608},
+  archivePrefix={arXiv},
+  primaryClass={cs.LG},
+  url={https://arxiv.org/abs/2403.14608},
+}
+```
+
+```bibtex
+@misc{liu2024doraweightdecomposedlowrankadaptation,
+  title={DoRA: Weight-Decomposed Low-Rank Adaptation},
+  author={Shih-Yang Liu and Chien-Yi Wang and Hongxu Yin and Pavlo Molchanov and Yu-Chiang Frank Wang and Kwang-Ting Cheng and Min-Hung Chen},
+  year={2024},
+  eprint={2402.09353},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL},
+  url={https://arxiv.org/abs/2402.09353},
+}
+```
+
 ```bibtex
 @misc{vonwerra2022trl,
   title = {{TRL: Transformer Reinforcement Learning}},