Update README.md
# Model Overview
Nemotron-4-Mini-Hindi-4B-Base is a base model pre-trained on a Hindi and English corpus. It is obtained by continuously pre-training Nemotron-Mini-4B-Base (Minitron-4B) exclusively on Hindi and English data (400B tokens) to create a strong base model for Hindi, English, and Hinglish. We make extensive use of synthetic data during the continuous pre-training stage. The base small language model (SLM) is optimized through distillation, pruning, and quantization for speed and on-device deployment.

Please refer to our [arXiv paper](https://arxiv.org/abs/2410.14815) for more details.

This model is for research and development only.

**Model Developer:** NVIDIA

**Model Dates:** Nemotron-4-Mini-Hindi-4B-Base was trained between June 2024 and September 2024.
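
The sketch below is purely illustrative: it shows one way to run plain-text continuation with the base model, assuming a Hugging Face Transformers-compatible export of the checkpoint is available. The model card lists NeMo as its library, so the repo id and loading path here are assumptions rather than documented usage.

```python
# Illustrative sketch only. Assumes the checkpoint is published in a
# Transformers-compatible format; the repo id below is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Nemotron-4-Mini-Hindi-4B-Base"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # a 4B-parameter model fits on a single modern GPU in bf16
    device_map="auto",
)

# Base (non-instruct) model: prompt it with plain text to continue.
prompt = "भारत की राजधानी"  # "The capital of India"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```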
## License
## Evaluation Results
*Zero-shot performance.* Evaluated using select Hindi datasets from the [Airavata Evaluation Framework](https://github.com/AI4Bharat/IndicInstruct) with additions:

| MMLU | ARC-C | ARC-E | HellaSwag | BoolQ |
| :------------- | :------------- | :------------- | :------------- | :------------- |