---
license: apache-2.0
base_model:
- Qwen/Qwen3-32B
---

## Qwen3-32B Abliterated Model

### Model Overview

Note: **all legal rights belong to Qwen!**

Qwen3-32B is a causal language model with the following specifications:

- **Type:** Causal Language Model
- **Training Stages:** Pretraining & Post-training
- **Parameters:** 32.8B (31.2B non-embedding)
- **Architecture:** 64 layers with 64 query attention heads and 8 key-value heads (GQA)
- **Context Length:** 32,768 tokens natively, expandable to 131,072 tokens with YaRN

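The extension beyond 32,768 tokens is not automatic. As a hedged illustration following the upstream Qwen3 guidance, YaRN is enabled through the `rope_scaling` entry of the model config; the exact keys can vary across `transformers` versions, and the checkpoint path below is a placeholder.

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-32B"  # placeholder: substitute the path of this abliterated checkpoint

config = AutoConfig.from_pretrained(model_id)
# Enable YaRN rope scaling to stretch the native 32,768-token context by 4x
# (to 131,072 tokens), following the upstream Qwen3 documentation.
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```
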
### Abliteration Process

This model was abliterated using a proportional-scaling technique, which applies a different abliteration strength to each layer based on that layer's refusal factor. The process used the following parameters (a sketch of the per-layer scaling follows the list below):

- `--proportional-scaling`: enables variable abliteration strength across layers
- `--max-scale-factor`: set to 2.25 to cap the maximum abliteration intensity

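For illustration only, the sketch below shows one plausible reading of proportional scaling: layers with larger refusal factors receive proportionally stronger ablation of a refusal direction, capped by the maximum scale factor. The function names, tensor shapes, and scaling rule are assumptions, not the exact implementation in Abliteration-by-Transformers.

```python
import torch

def proportional_scales(refusal_factors, max_scale_factor=2.25):
    # Map per-layer refusal factors to ablation strengths in (0, max_scale_factor],
    # so the layer with the strongest refusal signal gets the strongest edit.
    factors = torch.tensor(refusal_factors, dtype=torch.float32)
    return max_scale_factor * factors / factors.max()

def ablate_layer(weight, refusal_dir, scale):
    # Remove `scale` times each weight row's component along the refusal direction.
    direction = refusal_dir / refusal_dir.norm()
    projection = torch.outer(weight @ direction, direction)
    return weight - scale * projection
```

Applied layer by layer, `ablate_layer` would be called on each targeted weight matrix with the scale returned for that layer.
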
### Abliteration Results

- Abliterated with: *https://github.com/JanRoslein/Abliteration-by-Transformers.git*

The abliteration primarily affected specific layers rather than modifying all weights uniformly; under proportional scaling, the layers with the higher refusal factors received the larger changes.

### Performance Characteristics

The abliterated model demonstrates a balanced trade-off between reduced refusal behavior and quality preservation:

- **Refusal Behavior:** The model still refuses some harmful requests but is generally more open to responding to a wider range of prompts.
- **Quality Impact:** Some minor degradation in overall quality, particularly noticeable in:
  - Responses in less common languages
  - Nuanced reasoning tasks
  - Complex instruction following

### Recommended Use

This model represents a middle ground between safety and capability. It is suitable for:

- Research purposes where reduced refusal behavior is beneficial
- Applications where some safety guardrails are still desired
- Scenarios where slight quality degradation is acceptable

### Future Improvements

For optimal performance, this model would benefit from a full retraining phase to restore the weights affected by abliteration while maintaining the reduced refusal behavior.
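
If such a retraining ("healing") pass were attempted, one hedged sketch is a short supervised fine-tune on general instruction data; the dataset file, local checkpoint path, and hyperparameters below are placeholders, not a recipe shipped with this model.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "./qwen3-32b-abliterated"  # placeholder path to this checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Placeholder general-purpose instruction data with a "text" field.
dataset = load_dataset("json", data_files="general_instructions.jsonl")["train"]
tokenized = dataset.map(
    lambda example: tokenizer(example["text"], truncation=True, max_length=2048),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="qwen3-32b-abliterated-healed",
        num_train_epochs=1,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        learning_rate=1e-5,
        bf16=True,
    ),
    train_dataset=tokenized,
    # Causal-LM collator: copies input_ids into labels so the Trainer computes a loss.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```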