nexaml committed · Commit 3ab81f5 · verified · 1 Parent(s): ed47d3b

Update README.md

Files changed (1)
  1. README.md +17 -2
README.md CHANGED
@@ -1,4 +1,11 @@
  ---
+ language:
+ - multilingual
+ tags:
+ - deepseek
+ - vision-language
+ - ocr
+ - document-parse
  base_model:
  - deepseek-ai/DeepSeek-OCR
  ---
@@ -7,6 +14,15 @@ base_model:
  > [!NOTE]
  > Note currently only [NexaSDK](https://github.com/NexaAI/nexa-sdk) supports this model's GGUF.
 
+ ## Quickstart
+
+ 1. **Install [NexaSDK](https://github.com/NexaAI/nexa-sdk)**
+ 2. Run the model locally with one line of code:
+
+ ```bash
+ nexa infer NexaAI/DeepSeek-OCR-GGUF
+ ```
+
  ## Model Description
  **DeepSeek OCR** is a high-accuracy optical character recognition model built for extracting text from complex visual inputs such as documents, screenshots, receipts, and natural scenes.
  It combines vision-language modeling with efficient visual encoders to achieve superior recognition of multi-language and multi-layout text while remaining lightweight enough for edge or on-device deployment.
@@ -15,7 +31,7 @@ It combines vision-language modeling with efficient visual encoders to achieve s
  - **Multilingual OCR** — recognizes printed and handwritten text across major global languages.
  - **Document Layout Understanding** — preserves structure such as tables, paragraphs, and titles.
  - **Scene Text Recognition** — robust against lighting, distortion, and low-quality captures.
- - **Lightweight & Fast** — optimized for CPU, GPU, and NPU acceleration.
+ - **Lightweight & Fast** — optimized for CPU and GPU acceleration.
  - **End-to-End Pipeline** — supports image-to-text and structured JSON output.
 
  ## Use Cases
@@ -38,7 +54,6 @@ It combines vision-language modeling with efficient visual encoders to achieve s
  DeepSeek OCR can be integrated through:
  - Python API (`pip install deepseek-ocr`)
  - REST or gRPC endpoints for server deployment
- - On-device SDKs optimized for NPUs (via NexaSDK, OpenVINO, or TensorRT)
 
  ## License
  This model is released under the **Apache 2.0 License**, allowing commercial use, modification, and redistribution with attribution.
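
The README advertises an end-to-end pipeline with "structured JSON output". As a rough sketch of what consuming such output downstream could look like, here is a minimal Python example; note that the schema used (`blocks` with `type`, `text`, and `bbox` fields) is an assumption made for illustration, not the model's documented output format:

```python
import json

# Hypothetical structured OCR result. The schema (a list of layout
# blocks with recognized text and bounding boxes) is assumed for
# illustration only; it is NOT DeepSeek OCR's documented format.
sample_output = """
{
  "blocks": [
    {"type": "title", "text": "Invoice #1042", "bbox": [40, 20, 520, 60]},
    {"type": "paragraph", "text": "Total due: $128.50", "bbox": [40, 80, 300, 110]}
  ]
}
"""

def extract_text(raw: str) -> str:
    """Concatenate the recognized text of every block, in reading order."""
    result = json.loads(raw)
    return "\n".join(block["text"] for block in result["blocks"])

print(extract_text(sample_output))
```

Keeping layout blocks (rather than a flat string) is what makes downstream use cases like table extraction or document indexing practical.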