You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Claude 3.7 Sonnet Reasoning - Llama 3.2 Fine-tuned Model

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on the rahmanazhar/claude-3.7-sonnet-reasoning dataset. It's designed to mimic Claude 3.7's reasoning capabilities, particularly the "thinking out loud" process that Claude uses to solve complex problems.

Model Details

Base Model: Llama-3.2-3B-Instruct
Training Type: LoRA fine-tuning (Parameter-Efficient Fine-Tuning)
Dataset: claude-3.7-sonnet-reasoning
Training Samples: 189 high-quality reasoning examples
Training Method: Supervised Fine-Tuning (SFT)
Context Length: 4096 tokens
LoRA Parameters:
- r=16
- lora_alpha=32
- lora_dropout=0.05
- target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]

About the Dataset

The dataset contains 189 examples of Claude 3.7 Sonnet's reasoning process, captured in its "" tags. These examples showcase Claude's step-by-step reasoning approach to various questions and problems across multiple domains, including:

Programming
Mathematics
Philosophy
Logic
Critical thinking
Problem-solving

Model Capabilities

This model has been fine-tuned to emulate Claude 3.7's reasoning style, specifically:

Step-by-step thinking: Breaking down complex problems into manageable pieces
Self-questioning: Posing questions to guide the reasoning process
Consideration of alternatives: Exploring multiple approaches to problems
Structured analysis: Methodically working through problems with clear organization

Example Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "rahmanazhar/claude-3.7-sonnet-reasoning-finetuned" # Llama 3.2 based model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example prompt
prompt = "[INST] Is the fear of death rational, or is it primarily driven by the unknown? [/INST]"

# Generate response
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    inputs.input_ids,
    max_length=1024,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1
)

# Decode the response
response = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(response)

Limitations

The training dataset is limited to 189 examples, so the model has less variety in its reasoning patterns compared to the original Claude model.
The model might generate reasoning that appears plausible but contains factual errors or logical fallacies.
As with all language models, it may produce biased or harmful content in certain contexts.
Performance depends on hardware capabilities due to the 3B parameter size of the base model.

Training Process

The model was trained using LoRA (Low-Rank Adaptation) fine-tuning, which allows for efficient adaptation of the base model without modifying all parameters. This approach preserves the general capabilities of Llama 3.2 while adapting its reasoning style to match Claude 3.7.

Running the Training

This repository includes scripts for fine-tuning the model on your own hardware:

Standard Training (Any Platform)

# Setup environment and train the model
./run.sh finetune

# Test the model with a custom prompt
./run.sh test "Your prompt here"

Mac with Apple Silicon (M1/M2/M3)

This repository includes optimized support for training on Mac with Metal acceleration:

# Run with Metal GPU acceleration on Apple Silicon
./run_mac.sh finetune

# Test the model with Metal acceleration
./run_mac.sh test "Your prompt here"

The run_mac.sh script automatically:

Configures optimal Metal Performance Shaders (MPS) settings
Sets environment variables to improve compatibility
Verifies PyTorch is properly configured for Metal
Adapts training parameters for better performance on Apple Silicon

Hardware Requirements

GPU (CUDA): NVIDIA GPU with at least 8GB VRAM recommended
Apple Silicon: M1/M2/M3 Mac with at least 16GB RAM recommended
CPU-only: Possible but very slow, 32GB+ RAM recommended

Software Requirements

Python 3.8+
PyTorch 2.0.0+ (with CUDA or MPS support)
All dependencies listed in requirements.txt

Acknowledgements

Thanks to Meta for the base model
Thanks to rahmanazhar for creating and sharing the Claude 3.7 reasoning dataset
Thanks to Anthropic for creating Claude 3.7 Sonnet

Downloads last month: -; Downloads are not tracked for this model. How to track

Model tree for rahmanazhar/meta-claude-3.7-finetuned

Base model

meta-llama/Llama-3.2-3B-Instruct

Adapter

(561)

this model

Dataset used to train rahmanazhar/meta-claude-3.7-finetuned

Evaluation results

ROUGE-L
self-reported

0.850
Semantic Similarity
self-reported

0.920

Metadata error: specify a dataset to view leaderboard