Python: Auto-Translate Content
Automate translation of your programmatic SEO content into multiple languages. Preserve SEO terms and HTML structure, and maintain quality at scale.
Overview
Automate content translation for multilingual programmatic SEO sites. This example sends a post's title, HTML body, meta description, and slug to a chat model in a single request and asks it to translate them while keeping SEO terms in English and leaving the HTML structure intact.
The Code
import json
import os
from typing import Optional

from openai import OpenAI

# Initialize OpenAI client
client = OpenAI(
    api_key=os.environ.get('OPENAI_API_KEY'),
    base_url=os.environ.get('OPENAI_API_BASE')  # Optional: for xAI, local models
)


def translate_content(
    title: str,
    content_html: str,
    meta_description: str,
    slug: str,
    target_lang: str = 'de'
) -> Optional[dict]:
    """
    Translate blog post content from English to the target language.
    Preserves HTML structure and keeps SEO terms in English.

    Args:
        title: Post title in English
        content_html: HTML content in English
        meta_description: Meta description in English
        slug: URL slug (will be translated)
        target_lang: Target language code (default: 'de' for German)

    Returns:
        Dict with translated title, content_html, meta_description, and slug,
        or None if the API call fails or the response cannot be parsed.
    """
    lang_names = {
        'de': 'German',
        'en': 'English',
        'fr': 'French',
        'es': 'Spanish'
    }
    # Fall back to the raw code for languages not listed above
    target_lang_name = lang_names.get(target_lang, target_lang)

    # System prompt for translation
    system_prompt = f"""You are a professional translator specializing in SEO and technical content.
Translate the provided blog post content from English to {target_lang_name}.

KEEP THESE TERMS IN ENGLISH (commonly used in {target_lang_name} tech/SEO industry):
- SEO, pSEO, Programmatic SEO, Keywords, Long-tail Keywords
- Content, Content Marketing, Ranking, Rankings, SERP
- Crawling, Crawl Budget, Crawler, Indexing, Index
- Backlinks, Link Building, CTR, Bounce Rate
- Landing Page, CMS, API, URL, HTML, CSS, JavaScript
- Template, Meta Tags, Meta Description, Schema Markup
- Core Web Vitals, E-E-A-T, Canonical, Redirect, Sitemap
- Browser, Cache, Hard Refresh, Soft Refresh, Cookies
- Any brand names (Google, Chrome, Firefox, Safari, etc.)

Rules:
1. Preserve ALL HTML tags and structure exactly
2. Only translate the text content, not HTML attributes
3. Keep the English terms listed above - they're standard in {target_lang_name} tech writing
4. ACTUALLY TRANSLATE the slug to {target_lang_name} words (not just append -{target_lang})
   Example (German): "browser-hard-refresh-chrome" → "browser-cache-leeren-chrome"
5. Maintain the same tone and style
6. Ensure meta_description stays under 160 characters

Return a JSON object with these keys:
- title: translated title
- content_html: translated HTML content (preserve all tags)
- meta_description: translated meta description (max 160 chars)
- slug: TRANSLATED URL slug with {target_lang_name} words (lowercase, hyphens, no special chars)"""

    user_prompt = f"""Translate this blog post to {target_lang_name}:

TITLE: {title}
META_DESCRIPTION: {meta_description}
SLUG: {slug}
CONTENT_HTML:
{content_html}"""

    try:
        response = client.chat.completions.create(
            model=os.environ.get('OPENAI_API_MODEL', 'gpt-4o-mini'),
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt}
            ],
            # Must not exceed the model's output limit (16,384 tokens for gpt-4o-mini)
            max_completion_tokens=16000
        )
        raw_content = response.choices[0].message.content

        # Clean up response - remove markdown code blocks if present
        content = raw_content.strip()
        if content.startswith("```json"):
            content = content[7:]
        elif content.startswith("```"):
            content = content[3:]
        if content.endswith("```"):
            content = content[:-3]
        content = content.strip()

        result = json.loads(content)

        # Validate required fields
        required = ['title', 'content_html', 'meta_description', 'slug']
        for field in required:
            if field not in result:
                raise ValueError(f"Translation missing required field: {field}")
        return result
    except json.JSONDecodeError as e:
        print(f"Failed to parse translation JSON: {e}")
        return None
    except Exception as e:
        # Covers API errors, an empty response, and the ValueError raised above
        print(f"Translation failed: {e}")
        return None


# Example usage
english_content = {
    "title": "How to Clear Browser Cache in Chrome",
    # Sample markup; any HTML fragment works here
    "content_html": (
        "<h1>How to Clear Browser Cache</h1>\n"
        "<p>Clearing your browser cache can help resolve loading issues.</p>\n"
    ),
    "meta_description": "Learn how to clear browser cache in Chrome to fix loading problems.",
    "slug": "clear-browser-cache-chrome"
}

result = translate_content(**english_content, target_lang='de')
if result:
    print(f"Translated title: {result['title']}")
    print(f"Translated slug: {result['slug']}")
else:
    print("Translation failed")
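The function returns whatever the model produced once the four keys are present; it does not confirm that the tags survived or that the meta description fits. A minimal sketch of a post-check follows, using only the standard library. The names TagCollector, tag_sequence, and check_translation are hypothetical helpers, not part of the example above, and comparing attributes exactly is an assumption that follows from the prompt's "do not translate HTML attributes" rule.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the sequence of opening and closing tags in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Attributes are recorded too, because the prompt forbids translating them.
        self.tags.append(("open", tag, tuple(attrs)))

    def handle_endtag(self, tag):
        self.tags.append(("close", tag))


def tag_sequence(html: str) -> list:
    """Return the ordered list of tags found in an HTML fragment."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def check_translation(source_html: str, result: dict, max_meta_len: int = 160) -> list:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if tag_sequence(source_html) != tag_sequence(result["content_html"]):
        problems.append("HTML tag structure differs from the source")
    meta_len = len(result["meta_description"])
    if meta_len > max_meta_len:
        problems.append(f"meta_description is {meta_len} chars (limit {max_meta_len})")
    return problems
```

Calling check_translation(english_content["content_html"], result) after a successful translation gives a list that can drive a retry or a manual review queue.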
Key Features
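- SEO Term Preservation: The prompt lists the technical terms to keep in English
- HTML Structure: The model is instructed to preserve all HTML tags and attributes
- Slug Translation: URL slugs are translated into target-language words rather than suffixed with a language code
- Meta Description: The prompt caps the translated description at 160 characters
- Error Handling: API and JSON parsing failures are caught and the function returns None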
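The slug format ("lowercase, hyphens, no special chars") is only requested in the prompt, and translated slugs can come back with umlauts or accents. One way to enforce the format afterwards is sketched below. The helper normalize_slug and the GERMAN_MAP table are hypothetical additions, and the transliteration choices (for example ä to ae) are an assumption suited to German; other languages may want different rules.

```python
import re
import unicodedata

# Common German transliterations; extend for other target languages as needed.
GERMAN_MAP = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def normalize_slug(slug: str) -> str:
    """Reduce a translated slug to lowercase ASCII words joined by hyphens."""
    slug = slug.strip().lower().translate(GERMAN_MAP)
    # Decompose accented characters (é becomes e plus a combining mark),
    # then drop everything that is not plain ASCII.
    slug = unicodedata.normalize("NFKD", slug)
    slug = slug.encode("ascii", "ignore").decode("ascii")
    # Collapse any run of characters other than a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    return slug.strip("-")
```

Applying it as result["slug"] = normalize_slug(result["slug"]) keeps URLs predictable even when the model ignores part of the instruction.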
Scaling Tips
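- Batch translate multiple pages in one run rather than one at a time
- Cache translations, keyed by source content and target language, to avoid re-translating unchanged pages
- Use async/await for concurrent translations
- Monitor translation quality by reviewing a sample of translated pages
- Set up retry logic for failed translations (the function returns None on failure)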
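Several of these tips can be combined in a small wrapper. The sketch below caches results on disk, retries with exponential backoff when the translation function returns None, and runs several pages concurrently. It uses a thread pool instead of async/await because the client in the example is synchronous. The names cache_key, translate_cached, and translate_many are hypothetical, as are the cache layout and the default retry settings; translate_fn is expected to have the same signature as translate_content.

```python
import hashlib
import json
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List, Optional


def cache_key(post: dict, target_lang: str) -> str:
    """Hash the source fields plus the language so edited posts are re-translated."""
    payload = json.dumps(post, sort_keys=True, ensure_ascii=False) + "|" + target_lang
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def translate_cached(
    post: dict,
    target_lang: str,
    translate_fn: Callable[..., Optional[dict]],
    cache_dir: Path,
    retries: int = 3,
    base_delay: float = 1.0,
) -> Optional[dict]:
    """Return a cached translation if one exists, otherwise translate with retries."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(post, target_lang)}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))

    for attempt in range(retries):
        result = translate_fn(**post, target_lang=target_lang)
        if result is not None:
            path.write_text(json.dumps(result, ensure_ascii=False), encoding="utf-8")
            return result
        if attempt < retries - 1:
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            time.sleep(base_delay * 2 ** attempt)
    return None


def translate_many(
    posts: List[dict],
    target_lang: str,
    translate_fn: Callable[..., Optional[dict]],
    cache_dir: Path,
    max_workers: int = 4,
) -> List[Optional[dict]]:
    """Translate posts concurrently; results keep the order of the input list."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(
            pool.map(
                lambda post: translate_cached(post, target_lang, translate_fn, cache_dir),
                posts,
            )
        )
```

With the example above, translate_many([english_content], "de", translate_content, Path(".translation_cache")) would translate each post at most once per content version. Keep max_workers low enough to stay within the provider's rate limits.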