---
title: Gdrive Pdf Translator Mcp
emoji: ⚡
colorFrom: gray
colorTo: yellow
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: true
license: mit
short_description: Translate PDFs in GDrive.Preserves layouts, translate images
tags:
- building-mcp-track-enterprise
- mcp-server
- google-drive
- pdf-translation
- claude-mcp
- agentic-workflow
- document-automation
- layout-preservation
- vision-ai
- anthropic
- modal
- gradio
---
# ⚡ GDrive PDF Translator MCP
> **Intelligent PDF translation system** that preserves complex layouts and translates text in images. Claude MCP orchestrates a multi-pass pipeline combining layout analysis, context-aware translation, and vision AI. Full Google Drive workflow automation included.
**Built for the Anthropic MCP 1st Birthday Hackathon, Track 1: Enterprise MCP Servers**
### Social Media Announcement
[LinkedIn](https://www.linkedin.com/posts/hervengisse_mcp-agenticai-googledrive-activity-7401037520680988674-XCw-?utm_source=share&utm_medium=member_desktop&rcm=ACoAADzHZK8BNqjyQYj0pXM2_1DndQT7iSRSkyY)
---
## Demo Video
You can find it in `./assets/demo.mp4`.
**Watch Claude Desktop MCP in action**: Search → Download → Translate → Upload to Drive, all in one command!
[Google Drive link to the video](https://drive.google.com/file/d/1072WWEbLgX5upj6DJw10_MgFWapJR-59/view?usp=sharing)
---
## Key Features
### **One-Command Google Drive Workflow**
```text
"Translate my Q3 report from Drive to French and save it back"
Claude MCP automatically:
• Searches your Drive for "Q3 report"
• Downloads the PDF (OAuth authenticated)
• Analyzes content (20 pages, technical diagrams)
• Translates text + images (multi-pass pipeline)
• Uploads result to Drive (same folder or custom location)
• Provides direct link to translated document
```
### **Layout Preservation**
- Maintains original document structure
- Preserves fonts, sizes, positions, spacing
- Handles complex layouts: tables, columns, headers, footers
- Perfect for academic papers, contracts, technical reports
### **Image Text Translation**
- Detects text in diagrams, charts, screenshots
- Uses vision OCR + translation
- Overlays translated text while preserving image layout
- Handles medical diagrams, flowcharts, architectural plans
### **Context-Aware Translation**
- Generates document summary for terminology consistency
- Multi-pass pipeline: context analysis → text → images → layout rebuild
- Handles technical jargon, domain-specific terminology
- Dual output: monolingual + bilingual side-by-side
### 🌍 **11 Languages Supported**
English • French • Spanish • German • Italian • Portuguese • Chinese • Japanese • Korean • Russian • Arabic
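
As a rough illustration, user-facing language names could be mapped to short codes before a translation request is sent. The language list comes from this README; the ISO 639-1 codes and the names `SUPPORTED_LANGUAGES` and `resolve_language` are assumptions for this sketch and may not match what `server.py` actually uses.

```python
# Hypothetical lookup table: the 11 language names come from this README;
# the ISO 639-1 codes are an assumption about what the backend accepts.
SUPPORTED_LANGUAGES = {
    "english": "en", "french": "fr", "spanish": "es", "german": "de",
    "italian": "it", "portuguese": "pt", "chinese": "zh", "japanese": "ja",
    "korean": "ko", "russian": "ru", "arabic": "ar",
}


def resolve_language(name_or_code: str) -> str:
    """Return a language code for a name ("French") or a code ("fr")."""
    key = name_or_code.strip().lower()
    if key in SUPPORTED_LANGUAGES:
        return SUPPORTED_LANGUAGES[key]
    if key in SUPPORTED_LANGUAGES.values():
        return key
    supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
    raise ValueError(f"Unsupported language {name_or_code!r}. Supported: {supported}")
```

Accepting both names and codes keeps natural-language requests such as "translate to German" and explicit codes such as `de` on the same path.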
### **Limits (Test Phase)**
- **Gradio Web Interface**: 20 pages max per PDF
- **Claude Desktop MCP**: 10 pages max (timeout constraints)
- Automatic validation prevents token abuse
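
The page-limit check described above could look roughly like the sketch below. The limits (20 and 10) and the error wording come from this README; the names `PAGE_LIMITS` and `validate_page_count` are hypothetical, and the real check in `app.py` or `server.py` may differ.

```python
# Limits taken from the "Limits (Test Phase)" list; the names are hypothetical.
PAGE_LIMITS = {"gradio": 20, "mcp": 10}


def validate_page_count(num_pages: int, interface: str) -> None:
    """Raise ValueError if the PDF exceeds the limit for the given interface."""
    if interface not in PAGE_LIMITS:
        raise ValueError(f"Unknown interface: {interface!r}")
    if num_pages < 1:
        raise ValueError("PDF has no pages.")
    limit = PAGE_LIMITS[interface]
    if num_pages > limit:
        raise ValueError(
            f"PDF has {num_pages} pages. Maximum allowed: {limit} pages"
        )
```

Running a check like this before any model call means an oversized PDF is rejected without spending tokens.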
## How It Works
### Translation Pipeline
```
┌─────────────────────────────────────────────────────────────┐
│ TRANSLATION PIPELINE │
└─────────────────────────────────────────────────────────────┘
1️⃣ RECEIVE & ANALYZE
└── Receive PDF file (local or Google Drive)
└── Extract document structure
└── Generate context summary for consistency
└── Generate glossary for consistency
2️⃣ PASS 1: TEXT TRANSLATION
└── Extract text blocks with positions
└── Translate page by page (preserving context)
└── Maintain fonts, sizes, and layout
└── Output: Mono (translated) + Dual (bilingual)
3️⃣ PASS 2: IMAGE TRANSLATION
└── Detect images containing text
└── OCR extraction using vision model
└── Translate detected text
└── Overlay translated text on images
└── Output: *_images_translated.pdf
4️⃣ OUTPUT FILES
└── mono.images_translated.pdf
└── dual.images_translated.pdf
```
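The first two stages of the pipeline above can be sketched as follows: a shared context is built once and then reused for every page, and the text pass returns both a monolingual and a bilingual result. This is a simplified illustration, not the BabelDOC implementation. `DocumentContext`, `translate_pages`, and `toy_translate` are hypothetical names, and the toy translator stands in for the LLM call. The image pass is not covered.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DocumentContext:
    """Shared context built once and reused for every page (step 1)."""
    summary: str
    glossary: dict[str, str] = field(default_factory=dict)


# Hypothetical signature: (page_text, context) -> translated_text.
TranslateFn = Callable[[str, DocumentContext], str]


def translate_pages(
    pages: list[str], context: DocumentContext, translate: TranslateFn
) -> tuple[list[str], list[tuple[str, str]]]:
    """Pass 1: translate page by page, returning mono and dual outputs."""
    mono = [translate(page, context) for page in pages]
    dual = list(zip(pages, mono))  # each original page next to its translation
    return mono, dual


def toy_translate(text: str, context: DocumentContext) -> str:
    # Stand-in for the LLM call: apply glossary terms word by word.
    return " ".join(context.glossary.get(word, word) for word in text.split())


ctx = DocumentContext(summary="Quarterly report", glossary={"revenue": "revenus"})
mono, dual = translate_pages(["revenue grew", "net revenue"], ctx, toy_translate)
print(mono)  # ['revenus grew', 'net revenus']
```

Passing the same context object to every page is what keeps a term such as "revenue" translated the same way throughout the document.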
### MCP Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ CLAUDE DESKTOP (MCP Client) │
│ "Translate my Q3 report to French and save to Drive" │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ↓
┌─────────────────────────────────────────────────────────────┐
│ MCP SERVER (server.py) │
│ Tools: │
│ • search_gdrive(query) │
│ • download_from_gdrive(file_id) │
│ • translate_pdf(source, target_lang) │
│ • upload_to_gdrive(file_path, folder_id) │
│ • translate_and_upload() [all-in-one] │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ↓
┌─────────────────────────────────────────────────────────────┐
│ MODAL SERVERLESS (BabelDOC) │
│ • Layout-preserving PDF translation │
│ • Nebius AI (GPT-OSS-120B for text) │
└─────────────────────────────────────────────────────────────┘
```
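One way to picture the all-in-one tool is as a chain of the four individual tools listed in the diagram. The sketch below uses the tool names from the diagram but passes them in as plain callables so it runs without Google Drive or Modal. The return types (a list of dicts with an `"id"` key, local file paths, a link string) are assumptions, not the actual `server.py` signatures.

```python
from typing import Callable, Optional


def translate_and_upload(
    query: str,
    target_lang: str,
    *,
    search_gdrive: Callable[[str], list[dict]],
    download_from_gdrive: Callable[[str], str],
    translate_pdf: Callable[[str, str], str],
    upload_to_gdrive: Callable[[str, Optional[str]], str],
    folder_id: Optional[str] = None,
) -> str:
    """Chain the four tools and return a link to the uploaded translation."""
    matches = search_gdrive(query)
    if not matches:
        raise LookupError(f"No Drive file matches {query!r}")
    # Assumption: each search hit is a dict with at least an "id" key.
    local_pdf = download_from_gdrive(matches[0]["id"])
    translated_pdf = translate_pdf(local_pdf, target_lang)
    return upload_to_gdrive(translated_pdf, folder_id)
```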
---
## Technologies Used
- **Translation Engine**: BabelDOC for layout-preserving PDF translation
- **Text LLM**: Nebius AI Studio (GPT-OSS-120B)
- **Vision Model**: Qwen2.5-VL-72B-Instruct for image text translation
- **MCP Framework**: FastMCP for Claude Desktop integration
- **Cloud Platform**: Modal for serverless deployment
- **Storage Integration**: Google Drive API with OAuth 2.0
- **UI Framework**: Gradio 6.0.1 for web interface
- **Language**: Python 3.11
---
## Use Cases
### Academic Research
- Translate research papers while preserving citations, formulas, diagrams
- Maintain scientific notation and mathematical expressions
- Keep figure captions and references intact
### Legal Documents
- Translate contracts with preserved formatting
- Maintain clause structure and numbering
- Keep signatures and stamps in place
### Technical Documentation
- Translate manuals with diagrams and flowcharts
- Preserve code snippets and technical specifications
- Keep architectural diagrams readable
### Business Reports
- Translate quarterly reports with charts and graphs
- Maintain table formatting and data visualization
- Preserve branding elements and logos
---
## Sample Commands (Claude Desktop MCP)
### Basic Translation
```
"Translate this PDF to French"
(with file attached)
```
### Google Drive Integration
```
"List my PDF files in Google Drive"
"Search for 'quarterly report' in my Drive"
"Translate the Q3 report from my Drive to German"
"Translate report.pdf and save it to my Translations folder"
```
### All-in-One Workflow
```
"Find my cardiovascular research paper in Drive,
translate it to Spanish, and save the result back"
```
### Check & Manage
```
"What languages are supported?"
"Check if the translation service is running"
"Show me my Google Drive folders"
```
---
## Setup Guide
### For End Users (Gradio Web Interface)
1. **Visit the Space**: [https://huggingface.co/spaces/MCP-1st-Birthday/gdrive-pdf-translator-mcp](https://huggingface.co/spaces/MCP-1st-Birthday/gdrive-pdf-translator-mcp)
2. **Upload a PDF** or **paste a public Google Drive link**
3. **Select target language**
4. **Click "Translate PDF"**
5. **Download** the translated document
**Note**: The web interface supports public Google Drive links only (no OAuth).
### For Claude Desktop (Full MCP Experience)
#### Prerequisites
- Claude Desktop installed
- Python 3.11+
- Google Cloud account (for Drive OAuth)
- Modal account (backend already deployed)
#### Step 1: Google Cloud Setup
1. Create project on [Google Cloud Console](https://console.cloud.google.com/)
2. Enable **Google Drive API**
3. Create **OAuth 2.0 credentials** (Desktop app)
4. Download `gcp-oauth.keys.json`
5. Save to: `C:\Users\YourName\Downloads\gcp-oauth.keys.json`
#### Step 2: Clone MCP Server
```bash
git clone https://huggingface.co/spaces/MCP-1st-Birthday/gdrive-pdf-translator-mcp
cd gdrive-pdf-translator-mcp
pip install -r requirements.txt
```
#### Step 3: Configure Claude Desktop
**Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
**macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
```json
{
  "mcpServers": {
    "babeldocs": {
      "command": "C:\\Python311\\python.exe",
      "args": [
        "C:\\path\\to\\gdrive-pdf-translator-mcp\\server.py"
      ],
      "env": {
        "BABELDOCS_MODAL_URL": "https://h-xml--mcp1stann-babeldocs",
        "GDRIVE_OAUTH_CREDENTIALS": "C:\\Users\\YourName\\Downloads\\gcp-oauth.keys.json"
      }
    }
  }
}
```
**Important**: Use double backslashes `\\` in Windows paths.
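
If hand-escaping backslashes is error-prone, the same configuration can be generated with Python's `json` module, which escapes them automatically. This is an optional helper sketch: `build_config` is a hypothetical name, the paths are the placeholders from the example above, and the Modal URL is left as a placeholder to fill in.

```python
import json


def build_config(python_exe: str, server_py: str, modal_url: str,
                 oauth_keys: str) -> str:
    """Return claude_desktop_config.json text for the "babeldocs" server."""
    config = {
        "mcpServers": {
            "babeldocs": {
                "command": python_exe,
                "args": [server_py],
                "env": {
                    "BABELDOCS_MODAL_URL": modal_url,
                    "GDRIVE_OAUTH_CREDENTIALS": oauth_keys,
                },
            }
        }
    }
    # json.dumps writes each backslash as \\ so the file stays valid JSON.
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Raw strings (r"...") let the Windows paths be typed with single backslashes.
    print(build_config(
        python_exe=r"C:\Python311\python.exe",
        server_py=r"C:\path\to\gdrive-pdf-translator-mcp\server.py",
        modal_url="<your BABELDOCS_MODAL_URL>",
        oauth_keys=r"C:\Users\YourName\Downloads\gcp-oauth.keys.json",
    ))
```

Paste the printed output into `claude_desktop_config.json`, merging it with any `mcpServers` entries that already exist.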
#### Step 4: Restart Claude Desktop
Close **all** Claude Desktop windows, wait 5 seconds, then reopen.
#### Step 5: First Use
On first Google Drive access, a browser will open for OAuth authorization. Sign in and allow access.
---
## Project Structure
```
gdrive-pdf-translator-mcp/
├── app.py # Gradio web application
├── server.py # MCP Server for Claude Desktop
├── modal_deploy.py # Modal serverless deployment
├── requirements.txt # Python dependencies
├── .env.example # Environment variables template
├── __init__.py # Module marker
└── README.md # This file
```
---
## Troubleshooting
### Gradio Web Interface
| Problem | Solution |
|---------|----------|
| Interface stuck on "Starting..." | Hard refresh (Ctrl+F5 or Cmd+Shift+R) |
| Google Drive download fails | Ensure the link is **public** (no OAuth in web version) |
| Translation timeout | PDF may exceed 20-page limit |
| "Invalid Google Drive URL" | Use format: `https://drive.google.com/file/d/...` |
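
To check a link against the expected `https://drive.google.com/file/d/...` format before submitting it, a small validator along these lines can help. The regular expression and the name `extract_file_id` are assumptions for this sketch; the app's own validation may accept other link shapes.

```python
import re
from typing import Optional

# Matches the file ID in links of the form https://drive.google.com/file/d/<id>/...
_FILE_ID_RE = re.compile(r"^https://drive\.google\.com/file/d/([A-Za-z0-9_-]+)")


def extract_file_id(url: str) -> Optional[str]:
    """Return the Drive file ID from a share link, or None if the URL is invalid."""
    match = _FILE_ID_RE.match(url.strip())
    return match.group(1) if match else None


demo = "https://drive.google.com/file/d/1072WWEbLgX5upj6DJw10_MgFWapJR-59/view?usp=sharing"
print(extract_file_id(demo))  # 1072WWEbLgX5upj6DJw10_MgFWapJR-59
```
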
### Claude Desktop MCP
| Problem | Solution |
|---------|----------|
| MCP server not appearing | **Restart Claude Desktop completely** (close all windows) |
| "BABELDOCS_MODAL_URL required" | Check `claude_desktop_config.json` env section |
| Google Drive auth fails | Delete `gdrive_token.json` and re-authorize |
| Translation timeout | PDF may exceed 10-page MCP limit |
| Tools not showing | Click hammer icon 🔨 in Claude Desktop |
### Common Errors
```
- "PDF has X pages. Maximum allowed: 20 pages" (Gradio)
- "PDF has X pages. Maximum allowed: 10 pages" (MCP)
→ Test phase limits. Split your PDF, or use the web interface for files of up to 20 pages.
- "BABELDOCS_MODAL_URL required"
→ Environment variable missing. Check configuration.
```
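The missing-variable error can be reproduced with a fail-fast check like the one below, which is a sketch of the general pattern rather than the code in `server.py`. `require_env` is a hypothetical helper name.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return a non-empty environment variable or raise a clear error."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} required")
    return value
```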
---
## Roadmap
- [ ] Increase page limits after test phase
- [ ] Add glossary/terminology management in Claude Desktop
- [ ] Translation memory for consistency
- [ ] Integration with more cloud storage (Dropbox, OneDrive)
---
## Links
- **Live Demo**: [https://huggingface.co/spaces/MCP-1st-Birthday/gdrive-pdf-translator-mcp](https://huggingface.co/spaces/MCP-1st-Birthday/gdrive-pdf-translator-mcp)
- **MCP Server Code**: Available in this Space's Files tab (`server.py`)
- **Modal Backend (`BABELDOCS_MODAL_URL`)**: [Gist](https://gist.github.com/h-mbl/b0ec9ecc17d61fb3049759d7b2152ca2)
- **BabelDOC**: [https://github.com/funstory-ai/BabelDOC](https://github.com/funstory-ai/BabelDOC)
---
## Acknowledgments
- [BabelDOC](https://github.com/funstory-ai/BabelDOC) - Core PDF translation engine with layout preservation
- [manga-image-translator](https://github.com/zyddnys/manga-image-translator) - Image text detection and translation toolkit
- [Modal](https://modal.com) - Serverless infrastructure
- [Nebius AI](https://studio.nebius.com) - LLM & Vision APIs
- [FastMCP](https://github.com/jlowin/fastmcp) - MCP server framework
- [Gradio](https://gradio.app) - Web interface
- [Anthropic Claude](https://anthropic.com) - Model Context Protocol (MCP)
- [Hugging Face](https://huggingface.co) - Hosting platform
---
## License
This project is licensed under the MIT License.
---
Built by **herve** for the Anthropic MCP 1st Birthday Hackathon 2025