roslein
/

Qwen3-32B-abliterated


roslein committed on May 2

Commit 5c0b746 · verified · 1 Parent(s): 3953c0c

Update README.md

Files changed (1)

README.md +44 -3

README.md CHANGED

@@ -1,3 +1,44 @@
----
-license: cc-by-nc-4.0
----

+---
+license: apache-2.0
+base_model:
+- Qwen/Qwen3-32B
+---
+## Qwen3-32B Abliterated Model
+### Model Overview
+Note: **all legal rights belong to Qwen!**
+Qwen3-32B is a powerful causal language model with the following specifications:
+- **Type:** Causal Language Model
+- **Training Stages:** Pretraining & Post-training
+- **Parameters:** 32.8B (31.2B non-embedding)
+- **Architecture:** 64 layers with 64 query attention heads and 8 key-value heads (GQA)
+- **Context Length:** 32,768 tokens natively, expandable to 131,072 tokens with YaRN
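The head counts and context lengths above imply two ratios that are easy to check by hand: how many query heads share each key-value head under GQA, and how large an extension factor is needed to reach the longer context. The sketch below only does that arithmetic. The function and constant names are illustrative and are not taken from the model's configuration files.

```python
# Figures copied from the specification list above.
NUM_QUERY_HEADS = 64
NUM_KV_HEADS = 8
NATIVE_CONTEXT = 32_768
EXTENDED_CONTEXT = 131_072


def gqa_group_size(num_query_heads: int, num_kv_heads: int) -> int:
    """Number of query heads that share one key-value head under GQA."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("query heads must divide evenly across key-value heads")
    return num_query_heads // num_kv_heads


def context_scaling_factor(native: int, extended: int) -> float:
    """Ratio between the extended and native context lengths."""
    return extended / native


print(gqa_group_size(NUM_QUERY_HEADS, NUM_KV_HEADS))
print(context_scaling_factor(NATIVE_CONTEXT, EXTENDED_CONTEXT))
```

Each key-value head therefore serves 8 query heads, and reaching 131,072 tokens corresponds to a 4x extension of the native window.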
+### Abliteration Process
+This model was abliterated using the proportional scaling technique, which applies different abliteration strengths to different layers based on their refusal factors. The process used the following parameters:
+- `--proportional-scaling`: Enabled variable abliteration strength across layers
+- `--max-scale-factor`: Set to 2.25 to control the maximum abliteration intensity
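As a rough illustration of what proportional scaling could look like, the sketch below maps per-layer refusal factors to per-layer abliteration strengths and applies a scaled projection that removes a refusal direction from a vector. It is a minimal sketch under stated assumptions, not the tool's actual implementation: the linear mapping (the strongest layer receives the maximum scale factor), the function names `layer_scales` and `ablate`, and the example refusal factors are all hypothetical. Only the value 2.25 comes from the text above.

```python
import math


def layer_scales(refusal_factors, max_scale_factor=2.25):
    """Assumed rule: scale each layer linearly so that the layer with the
    largest refusal factor gets max_scale_factor and the rest get less."""
    peak = max(refusal_factors)
    if peak <= 0:
        return [0.0 for _ in refusal_factors]
    return [max_scale_factor * max(f, 0.0) / peak for f in refusal_factors]


def ablate(vector, direction, scale):
    """Subtract scale * (projection of vector onto direction) from vector.

    With scale == 1.0 the component along direction is removed exactly;
    larger values overshoot, smaller values remove only part of it.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        return list(vector)
    unit = [d / norm for d in direction]
    component = sum(v * u for v, u in zip(vector, unit))
    return [v - scale * component * u for v, u in zip(vector, unit)]


# Hypothetical refusal factors for three layers (not measured values).
factors = [0.2, 0.4, 0.8]
scales = layer_scales(factors)
print(scales)

# Removing a direction along the first axis with scale 1.0.
print(ablate([3.0, 4.0], [1.0, 0.0], 1.0))
```

Under this assumed rule, layers with weaker refusal signals are modified less, which is one plausible way to limit quality loss while still reducing refusals.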
+### Abliteration Results
+- Abliterated using: https://github.com/JanRoslein/Abliteration-by-Transformers.git
+The abliteration process primarily affected specific layers, with the most significant changes in:
+### Performance Characteristics
+The abliterated model demonstrates a balanced trade-off between reduced refusal behavior and quality preservation:
+- Refusal Behavior: The model still refuses some harmful requests but is generally more open to responding to a wider range of prompts
+- Quality Impact: Some minor degradation in overall quality, particularly noticeable in:
+  - Responses in less common languages
+  - Nuanced reasoning tasks
+  - Complex instruction following
+### Recommended Use
+This model represents a middle ground between safety and capability. It's suitable for:
+- Research purposes where reduced refusal behavior is beneficial
+- Applications where some safety guardrails are still desired
+- Scenarios where slight quality degradation is acceptable
+### Future Improvements
+For optimal performance, this model would benefit from a full retraining phase to restore the weights affected by abliteration while maintaining the reduced refusal behavior.