Claude 3.7 Sonnet Reasoning - Llama 3.2 Fine-tuned Model

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on the rahmanazhar/claude-3.7-sonnet-reasoning dataset. It's designed to mimic Claude 3.7's reasoning capabilities, particularly the "thinking out loud" process that Claude uses to solve complex problems.

Model Details

  • Base Model: Llama-3.2-3B-Instruct
  • Training Type: LoRA fine-tuning (Parameter-Efficient Fine-Tuning)
  • Dataset: claude-3.7-sonnet-reasoning
  • Training Samples: 189 high-quality reasoning examples
  • Training Method: Supervised Fine-Tuning (SFT)
  • Context Length: 4096 tokens
  • LoRA Parameters:
    • r=16
    • lora_alpha=32
    • lora_dropout=0.05
    • target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
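
For reference, the hyperparameters above correspond roughly to the following PEFT configuration (a minimal sketch; the actual training scripts in this repository may differ in detail):

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# LoRA configuration matching the hyperparameters listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable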

About the Dataset

The dataset contains 189 examples of Claude 3.7 Sonnet's reasoning process, captured within its thinking tags. These examples showcase Claude's step-by-step reasoning approach to various questions and problems across multiple domains, including:

  • Programming
  • Mathematics
  • Philosophy
  • Logic
  • Critical thinking
  • Problem-solving
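
To browse the examples yourself, the dataset can be loaded with the Hugging Face datasets library (a short sketch; field names depend on the dataset's actual schema):

from datasets import load_dataset

# Load the reasoning dataset from the Hugging Face Hub
dataset = load_dataset("rahmanazhar/claude-3.7-sonnet-reasoning", split="train")

print(len(dataset))  # 189 examples
print(dataset[0])    # inspect one record; field names depend on the dataset schema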

Model Capabilities

This model has been fine-tuned to emulate Claude 3.7's reasoning style, specifically:

  1. Step-by-step thinking: Breaking down complex problems into manageable pieces
  2. Self-questioning: Posing questions to guide the reasoning process
  3. Consideration of alternatives: Exploring multiple approaches to problems
  4. Structured analysis: Methodically working through problems with clear organization

Example Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "rahmanazhar/claude-3.7-sonnet-reasoning-finetuned" # Llama 3.2 based model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example prompt
prompt = "[INST] Is the fear of death rational, or is it primarily driven by the unknown? [/INST]"

# Generate response (sampling must be enabled for temperature/top_p to take effect)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,  # pass input_ids and attention_mask together
    max_length=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1
)

# Decode the response (special tokens are kept so the thinking tags stay visible)
response = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(response)
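
Because the fine-tuning was done with LoRA, the weights can also be applied as a PEFT adapter on top of the base model instead of loading a merged checkpoint (a sketch that assumes the repository ships the adapter files; the repository id is the one used in the example above):

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(base, "rahmanazhar/claude-3.7-sonnet-reasoning-finetuned")

# Optionally fold the adapter weights into the base model for faster inference
model = model.merge_and_unload()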

Limitations

  • The training dataset is limited to 189 examples, so the model has less variety in its reasoning patterns compared to the original Claude model.
  • The model might generate reasoning that appears plausible but contains factual errors or logical fallacies.
  • As with all language models, it may produce biased or harmful content in certain contexts.
  • Inference and training speed depend on available hardware, given the 3B-parameter size of the base model.

Training Process

The model was trained using LoRA (Low-Rank Adaptation) fine-tuning, which allows for efficient adaptation of the base model without modifying all parameters. This approach preserves the general capabilities of Llama 3.2 while adapting its reasoning style to match Claude 3.7.
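
In outline, the supervised fine-tuning step looks roughly like the following (a sketch using TRL's SFTTrainer; the exact signature varies across trl versions, and the actual scripts in this repository set their own hyperparameters and prompt formatting):

from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

dataset = load_dataset("rahmanazhar/claude-3.7-sonnet-reasoning", split="train")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

# Same LoRA settings as listed in Model Details
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=lora_config,
    tokenizer=tokenizer,
    dataset_text_field="text",  # assumed column name; adjust to the dataset schema
    max_seq_length=4096,
    args=TrainingArguments(output_dir="outputs", num_train_epochs=3,
                           per_device_train_batch_size=1, learning_rate=2e-4),
)
trainer.train()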

Running the Training

This repository includes scripts for fine-tuning the model on your own hardware:

Standard Training (Any Platform)

# Setup environment and train the model
./run.sh finetune

# Test the model with a custom prompt
./run.sh test "Your prompt here"

Mac with Apple Silicon (M1/M2/M3)

This repository includes optimized support for training on Mac with Metal acceleration:

# Run with Metal GPU acceleration on Apple Silicon
./run_mac.sh finetune

# Test the model with Metal acceleration
./run_mac.sh test "Your prompt here"

The run_mac.sh script automatically:

  • Configures optimal Metal Performance Shaders (MPS) settings
  • Sets environment variables to improve compatibility
  • Verifies PyTorch is properly configured for Metal
  • Adapts training parameters for better performance on Apple Silicon
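
Before launching a run on Apple Silicon, it can help to confirm that PyTorch was built with the Metal backend (a quick check, assuming PyTorch 2.x):

import torch

# Check that the Metal (MPS) backend is available on this machine
if torch.backends.mps.is_available():
    device = torch.device("mps")
    print("MPS is available; training and inference can use the Apple GPU.")
else:
    device = torch.device("cpu")
    print("MPS is not available; falling back to CPU.")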

Hardware Requirements

  • GPU (CUDA): NVIDIA GPU with at least 8GB VRAM recommended
  • Apple Silicon: M1/M2/M3 Mac with at least 16GB RAM recommended
  • CPU-only: Possible but very slow, 32GB+ RAM recommended

Software Requirements

  • Python 3.8+
  • PyTorch 2.0.0+ (with CUDA or MPS support)
  • All dependencies listed in requirements.txt

Acknowledgements

  • Thanks to Meta for the base model
  • Thanks to rahmanazhar for creating and sharing the Claude 3.7 reasoning dataset
  • Thanks to Anthropic for creating Claude 3.7 Sonnet