Internationalization (i18n) Guide

This document covers internationalization implementation, translation workflows, and management tools for the Waldur HomePort project.

Developer Guide

Basic Translation Usage

Use the translate function to mark strings for translation:

import { translate } from '@waldur/i18n';

// Basic usage
const message = translate('User details');

// With variables
const welcomeMessage = translate('Welcome, {name}!', { name: user.name });
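
To make the placeholder syntax concrete, here is a minimal sketch of how `{name}`-style interpolation could work. The `interpolate` helper and the `TemplateValues` type are hypothetical names used for illustration; this is not the actual `@waldur/i18n` implementation, which also looks up the translated string first.

```typescript
// Hypothetical helper: NOT the real @waldur/i18n implementation.
type TemplateValues = Record<string, string | number>;

function interpolate(template: string, values: TemplateValues = {}): string {
  // Replace each {key} with its value; unknown placeholders are left untouched
  return template.replace(/\{(\w+)\}/g, (placeholder: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key)
      ? String(values[key])
      : placeholder,
  );
}

const greeting = interpolate('Welcome, {name}!', { name: 'Alice' });
// greeting === 'Welcome, Alice!'
```

Leaving unknown placeholders in place, rather than substituting an empty string, is a design choice of this sketch: it makes a missing variable visible in the UI during development.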

JSX Integration

Rendering JSX Elements Inside Translated Strings

Use formatJsxTemplate for embedding React components:

import { translate, formatJsxTemplate } from '@waldur/i18n';

const message = translate('Opened by {user} at {date}', {
  user: <IssueUser item={item} />,
  date: formatDate(item.created),
}, formatJsxTemplate);

Use formatJsx for translatable text with embedded links:

import { translate, formatJsx } from '@waldur/i18n';

const message = translate(
  'You have not added any SSH keys to your <Link>profile</Link>.',
  { Link: (s) => <Link state="profile.keys">{s}</Link> },
  formatJsx,
);
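
The idea behind tag-based formatting can be sketched without React. In the example below, `formatTags` and `TagRenderer` are hypothetical names, and the parsing strategy (a single non-nested `<Tag>...</Tag>` pattern) is an assumption; the real `formatJsx` may handle more cases. The sketch splits the translated string into plain text and rendered segments, so a translator can move the tag anywhere in the sentence.

```typescript
// Hypothetical sketch: NOT the real formatJsx from @waldur/i18n.
type TagRenderer<T> = (inner: string) => T;

function formatTags<T>(
  template: string,
  renderers: Record<string, TagRenderer<T>>,
): Array<string | T> {
  const parts: Array<string | T> = [];
  // Matches <Tag>inner text</Tag>; nested tags are out of scope for this sketch
  const tagPattern = /<(\w+)>(.*?)<\/\1>/g;
  let cursor = 0;
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(template)) !== null) {
    const [whole, tag, inner] = match;
    const render = renderers[tag];
    if (!render) continue; // unknown tag: stays in the output as plain text
    if (match.index > cursor) parts.push(template.slice(cursor, match.index));
    parts.push(render(inner));
    cursor = match.index + whole.length;
  }
  if (cursor < template.length) parts.push(template.slice(cursor));
  return parts;
}

// With React, the renderer would return a JSX element instead of a string
const rendered = formatTags(
  'You have not added any SSH keys to your <Link>profile</Link>.',
  { Link: (s) => `[${s}](profile.keys)` },
);
```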

Translation Guidelines

✅ Do:

  • Always mark complete sentences for translation
  • Use descriptive variable names in templates
  • Provide context when the same word has different meanings
  • Use proper capitalization in source strings

❌ Don't:

  • Combine string fragments at runtime
  • Translate technical terms that are universally understood
  • Use overly complex nested interpolations
  • Ignore cultural adaptation needs

// ❌ Bad - fragments can't be properly translated
const message = translate('User') + ' ' + translate('saved');

// ✅ Good - complete sentence with context
const message = translate('User {name} saved successfully', { name });

Translation Extraction

Extract translatable strings with rich context for better translations:

yarn i18n:extract

Features:

  • Scans all .ts and .tsx files in src/
  • Finds translate() function calls and extracts string literals
  • Updates template.json with rich context information
  • UI Element Detection: Identifies buttons, forms, titles, error messages
  • Variable Analysis: Detects variable types (number, date, string, url)
  • Text Characteristics: Analyzes sentence structure, markup, length
  • Semantic Context: Captures React components and function names
  • Translator Notes: Generates contextual guidance automatically
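
The core of the extraction step, finding `translate()` calls and collecting their string literals, can be sketched as follows. This is an illustration under assumptions: `extractTranslateLiterals` is a hypothetical name, the regex-based approach is a simplification, and the real extractor (`extractLiteralsFromFiles.cjs`) may parse the code differently.

```typescript
// Hypothetical sketch: the real extraction tool may work differently.
function extractTranslateLiterals(source: string): string[] {
  // Matches translate('...') or translate("...") with a plain string literal
  // as the first argument; escape sequences are kept as written, not decoded
  const callPattern = /\btranslate\(\s*(['"])((?:\\.|(?!\1)[^\\])*)\1/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = callPattern.exec(source)) !== null) {
    const literal = match[2];
    if (found.indexOf(literal) === -1) found.push(literal); // de-duplicate
  }
  return found;
}

const sampleSource = `
  const a = translate('User details');
  const b = translate("Welcome, {name}!", { name: user.name });
  const c = translate('User details');
`;
const extracted = extractTranslateLiterals(sampleSource);
// extracted: ['User details', 'Welcome, {name}!']
```

A literal-only matcher like this cannot see strings built at runtime, which is one more reason to avoid combining fragments.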

Example output:

{
  "Save": {
    "message": "Save",
    "context": {
      "primary_ui_type": "submit_button",
      "ui_types": ["submit_button", "action_button"],
      "is_user_facing": true,
      "text_characteristics": {
        "length": 4,
        "wordCount": 1,
        "isSentence": false,
        "startsWithCapital": true
      },
      "usage_count": 53
    },
    "translator_notes": [
      "This text appears on a button. Keep it short and action-oriented."
    ]
  }
}
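
The `text_characteristics` fields in this output can be reproduced with a few string checks. The sketch below is an assumption about how such values might be derived; `describeText` is a hypothetical name and the sentence heuristic in particular is a guess, not the extractor's actual rule.

```typescript
// Hypothetical sketch of how text_characteristics could be computed.
interface TextCharacteristics {
  length: number;
  wordCount: number;
  isSentence: boolean;
  startsWithCapital: boolean;
}

function describeText(text: string): TextCharacteristics {
  const trimmed = text.trim();
  const words = trimmed === '' ? [] : trimmed.split(/\s+/);
  return {
    length: text.length,
    wordCount: words.length,
    // Assumption: a "sentence" has several words and ends with ., ! or ?
    isSentence: words.length > 1 && /[.!?]$/.test(trimmed),
    startsWithCapital: /^[A-Z]/.test(trimmed),
  };
}

const saveInfo = describeText('Save');
// saveInfo: { length: 4, wordCount: 1, isSentence: false, startsWithCapital: true }
```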

Translation Management Tools

Available Commands

# Core workflow commands
yarn i18n:extract             # Extract translations from code to template.json
yarn i18n:check               # Full analysis of translation usage and quality
yarn i18n:clean               # Remove deprecated translations from locale files

# Analysis and validation
yarn i18n:check:dry           # Quick summary without detailed lists
yarn i18n:check:locales       # Focus on locale file analysis only
yarn i18n:validate            # CI/CD validation with configurable thresholds

# Language-specific analysis
yarn i18n:analyze <lang>      # Unified language analyzer (e.g., yarn i18n:analyze et)
                              # Supports 27+ languages with specialized analyzers

# LLM-powered improvements
yarn i18n:llm:prepare         # Generate LLM prompts for translation improvements

Comprehensive Analysis

Analyze translation usage and quality issues:

# Full analysis with detailed output
yarn i18n:check

# Quick summary without detailed lists
yarn i18n:check:dry

# Focus on locale file analysis only
yarn i18n:check:locales

Reports provided:

  • Unused translations: Keys in template.json not found in code
  • Missing translations: Hardcoded strings that should use translate()
  • Locale coverage: Translation completeness for each language
  • Deprecated translations: Translations in locale files but not used in code
  • Empty translations: Locale entries that need content
  • Context information: Where strings are used and in what context
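
As an illustration of the locale coverage and empty translation reports, the sketch below compares the keys of a template against one locale file. The names `measureCoverage`, `LocaleCoverage`, and `LocaleMessages` are hypothetical, and the rounding and definitions are assumptions; `yarn i18n:check:locales` may compute its figures differently.

```typescript
// Hypothetical sketch: the real analysis tool may compute these differently.
type LocaleMessages = Record<string, string>;

interface LocaleCoverage {
  coveragePercent: number; // share of template keys with a non-empty translation
  untranslatedKeys: string[]; // template keys absent from the locale file
  emptyKeys: string[]; // template keys present in the locale file but blank
}

function measureCoverage(templateKeys: string[], messages: LocaleMessages): LocaleCoverage {
  const untranslatedKeys: string[] = [];
  const emptyKeys: string[] = [];
  for (const key of templateKeys) {
    if (!Object.prototype.hasOwnProperty.call(messages, key)) {
      untranslatedKeys.push(key);
    } else if (messages[key].trim() === '') {
      emptyKeys.push(key);
    }
  }
  const translated = templateKeys.length - untranslatedKeys.length - emptyKeys.length;
  const coveragePercent =
    templateKeys.length === 0 ? 100 : Math.round((translated / templateKeys.length) * 100);
  return { coveragePercent, untranslatedKeys, emptyKeys };
}

const nbCoverage = measureCoverage(
  ['Save', 'Cancel', 'Delete', 'Edit'],
  { Save: 'Lagre', Cancel: '', Delete: 'Slett' },
);
// nbCoverage.coveragePercent === 50
```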

Translation Cleanup

Remove deprecated translations from locale files:

# Remove deprecated translations from all locale files
yarn i18n:clean

What it does:

  • Compares locale files against the current template.json (generated from code)
  • Removes translations that exist in locale files but are no longer used in the codebase
  • Preserves all valid translations that are still referenced in the code
  • Processes all locale files (*.json) in the /locales/ directory

Safety features:

  • Only removes translations that don't exist in the generated template
  • Maintains proper JSON formatting and structure
  • Shows detailed report of what was removed from each file
  • Preserves all currently used translations
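
The cleanup rule described above amounts to a key filter. The following sketch shows that rule in isolation; `removeDeprecatedKeys` and `CleanupResult` are hypothetical names, and file reading, backups, and reporting are left out. It is not the code of `removeDeprecatedTranslations.cjs`.

```typescript
// Hypothetical sketch of the cleanup rule; file I/O is intentionally omitted.
interface CleanupResult {
  cleaned: Record<string, string>;
  removedKeys: string[];
}

function removeDeprecatedKeys(
  templateKeys: string[],
  messages: Record<string, string>,
): CleanupResult {
  const known = new Set(templateKeys);
  const cleaned: Record<string, string> = {};
  const removedKeys: string[] = [];
  for (const key of Object.keys(messages)) {
    if (known.has(key)) {
      cleaned[key] = messages[key]; // still referenced in code: keep
    } else {
      removedKeys.push(key); // no longer in the template: report and drop
    }
  }
  return { cleaned, removedKeys };
}

const cleanup = removeDeprecatedKeys(
  ['Save', 'Cancel'],
  { Save: 'Lagre', 'API secret code:': 'Old value' },
);
// cleanup.removedKeys: ['API secret code:']
```

Returning a new object instead of mutating the input keeps the original locale data available for a report or a backup.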

Example output:

🧹 Removing deprecated translations from locale files...

📋 Found 4703 keys in template.json
🌍 Processing 29 locale files...

🗑️  nb.json: Found 199 deprecated keys
   - API secret code:
   - Accept booking request
   - Add offering endpoint
   - All plans comes with
   - Allowed file types
   ... and 194 more
✅ Updated nb.json

📊 Summary:
   Total deprecated keys removed: 4201
   Files processed: 29
   Files with changes: 25

CI/CD Validation

Validate translation quality in build pipelines:

# Basic validation with warnings
yarn i18n:validate

# Strict validation that fails CI on issues
yarn i18n:validate --fail-on-unused --fail-on-missing

# Custom thresholds
yarn i18n:validate --max-unused 10 --max-missing 50

# Warning-only mode for gradual adoption
yarn i18n:validate --warn-only

Command line options:

  • --max-unused N: Maximum allowed unused translations (default: 50)
  • --max-missing N: Maximum allowed missing translations (default: 100)
  • --fail-on-unused: Fail CI if unused translations exceed limit
  • --fail-on-missing: Fail CI if missing translations exceed limit
  • --warn-only: Show warnings but don't fail CI
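
How these options might combine into a pass or fail decision is sketched below. This is an interpretation of the option descriptions, not the logic of `validateTranslations.cjs`: the names `ValidateFlags` and `decideExitCode` are hypothetical, and two details are assumptions, namely that `--warn-only` overrides the fail flags and that "exceed" means strictly greater than the limit.

```typescript
// Hypothetical sketch: an interpretation of the documented options.
interface ValidateFlags {
  maxUnused: number; // --max-unused (documented default: 50)
  maxMissing: number; // --max-missing (documented default: 100)
  failOnUnused: boolean; // --fail-on-unused
  failOnMissing: boolean; // --fail-on-missing
  warnOnly: boolean; // --warn-only
}

const DEFAULT_FLAGS: ValidateFlags = {
  maxUnused: 50,
  maxMissing: 100,
  failOnUnused: false,
  failOnMissing: false,
  warnOnly: false,
};

function decideExitCode(unused: number, missing: number, flags: ValidateFlags): 0 | 1 {
  if (flags.warnOnly) return 0; // assumption: warn-only always wins
  const unusedFails = flags.failOnUnused && unused > flags.maxUnused;
  const missingFails = flags.failOnMissing && missing > flags.maxMissing;
  return unusedFails || missingFails ? 1 : 0;
}

const strictFlags: ValidateFlags = { ...DEFAULT_FLAGS, failOnUnused: true, failOnMissing: true };
const exitCode = decideExitCode(60, 10, strictFlags);
// exitCode === 1 because 60 unused translations exceed the limit of 50
```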

Enhanced Context Analysis

UI Element Detection

The enhanced extraction automatically detects UI context:

Button Types:

  • button, submit_button, action_button, delete_button
  • cancel_button, primary_button

Form Elements:

  • input_field, select_field, textarea
  • label, field_label, placeholder

Messages:

  • error_message, success_message, warning_message
  • notification_message, tooltip

Navigation:

  • title, page_title, section_title
  • menu_item, link, navigation_link

Variable Type Analysis

Automatically infers variable types for better translation guidance:

Detected Types:

  • number: count, total, amount, size, index
  • date: date, time, created, updated, expires
  • string: name, title, label, text, description
  • url: url, link, href, path
  • email: email, mail
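
A name-based inference like the one listed above could look roughly like this. The keyword lists are copied from the list; everything else is an assumption, including the names `inferVariableKind` and `KIND_KEYWORDS`, the substring matching, the order in which types are tried, and the fallback to `string`.

```typescript
// Hypothetical sketch: keyword lists come from this guide, the matching
// strategy and precedence are assumptions.
type VariableKind = 'number' | 'date' | 'string' | 'url' | 'email';

const KIND_KEYWORDS: Array<[VariableKind, string[]]> = [
  ['email', ['email', 'mail']],
  ['url', ['url', 'link', 'href', 'path']],
  ['number', ['count', 'total', 'amount', 'size', 'index']],
  ['date', ['date', 'time', 'created', 'updated', 'expires']],
  ['string', ['name', 'title', 'label', 'text', 'description']],
];

function inferVariableKind(variableName: string): VariableKind {
  const lower = variableName.toLowerCase();
  for (const [kind, keywords] of KIND_KEYWORDS) {
    if (keywords.some((keyword) => lower.indexOf(keyword) !== -1)) return kind;
  }
  return 'string'; // assumed fallback when no keyword matches
}

const inferred = ['count', 'userName', 'createdAt', 'homepageUrl'].map(inferVariableKind);
// inferred: ['number', 'string', 'date', 'url']
```

Substring matching is crude (a variable called `runtime` would be classified as a date here), so the inferred type is best treated as a hint for translators rather than a guarantee.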

Example with variables:

{
  "User {name} created {count} items": {
    "context": {
      "variables": {
        "name": {"type": "string", "position": 5},
        "count": {"type": "number", "position": 20}
      }
    },
    "translator_notes": [
      "Contains variables: name, count. Variable types: string, number.",
      "Variable 'count' may need plural handling in your language."
    ]
  }
}
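
The `position` values in this example are consistent with the zero-based index of each placeholder's opening brace. A minimal sketch that reproduces them, using the hypothetical names `findPlaceholders` and `PlaceholderInfo` (the extractor's own implementation may differ):

```typescript
// Hypothetical sketch: locates {variable} placeholders and their offsets.
interface PlaceholderInfo {
  variable: string;
  position: number; // zero-based index of the opening brace
}

function findPlaceholders(template: string): PlaceholderInfo[] {
  const pattern = /\{(\w+)\}/g;
  const result: PlaceholderInfo[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(template)) !== null) {
    result.push({ variable: match[1], position: match.index });
  }
  return result;
}

const placeholders = findPlaceholders('User {name} created {count} items');
// placeholders: [{ variable: 'name', position: 5 }, { variable: 'count', position: 20 }]
```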

Automatic Translator Notes

The system generates contextual guidance:

  • UI Context: "This text appears on a button. Keep it short and action-oriented."
  • Variable Handling: "Contains variables that may need reordering in your language."
  • Markup Preservation: "Contains HTML/JSX markup. Preserve all tags and structure."
  • Text Format: "Original text is in ALL CAPS. Consider if this emphasis is appropriate."
  • Grammar: "This is a question. Ensure question format is appropriate in your language."
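
Several of these notes can be derived from the source text alone. The sketch below shows one possible set of rules; `buildTranslatorNotes` is a hypothetical name and the detection rules are assumptions. The button note is omitted because it depends on where the string is used, not on the text itself.

```typescript
// Hypothetical sketch: text-only rules; UI context notes are not covered.
function buildTranslatorNotes(text: string): string[] {
  const notes: string[] = [];
  // Ignore placeholders and tags when checking for capital letters
  const letters = text.replace(/\{\w+\}|<[^>]+>/g, '').replace(/[^A-Za-z]/g, '');
  if (/\{\w+\}/.test(text)) {
    notes.push('Contains variables that may need reordering in your language.');
  }
  if (/<\/?\w+[^>]*>/.test(text)) {
    notes.push('Contains HTML/JSX markup. Preserve all tags and structure.');
  }
  if (letters.length > 1 && letters === letters.toUpperCase()) {
    notes.push('Original text is in ALL CAPS. Consider if this emphasis is appropriate.');
  }
  if (text.trim().slice(-1) === '?') {
    notes.push('This is a question. Ensure question format is appropriate in your language.');
  }
  return notes;
}

const capsNotes = buildTranslatorNotes('ARE YOU SURE?');
// capsNotes contains the ALL CAPS note and the question note
```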

Translation Workflow

Local Translation Management

This project uses a local-first approach to translation management. All translations are managed directly in the repository without external translation services.

Key Benefits

  • Direct Control: All translations are in the repository and version-controlled
  • Enhanced Context: Translators have access to rich context information from enhanced extraction
  • Quality Tools: Built-in analysis tools for translation quality and completeness
  • No Dependencies: No reliance on external translation platforms or services

Development Workflow

  1. Development: Use translate() function in code
  2. Extraction: Run yarn i18n:extract to update template.json with rich context
  3. Translation: Translators work directly with locale files in /locales/ directory using enhanced context information
  4. Validation: Run yarn i18n:check to verify quality and completeness
  5. Review: Use context-aware analysis tools to improve translation quality

Translation Process

For New Strings

  1. Developer adds translate('New string') in code
  2. Run yarn i18n:extract to add to template.json with rich context
  3. Translator adds translation to relevant locale files (e.g., /locales/nb.json)
  4. Use yarn i18n:check:locales to verify completeness

Quality Assurance

  1. Regular cleanup: Monthly yarn i18n:clean to remove deprecated translations from locale files
  2. Coverage monitoring: yarn i18n:check:locales for translation completeness
  3. Context validation: Rich context is provided by default in template.json
  4. Cultural review: Native speaker validation of context-aware translations
  5. Direct locale editing: Translators work directly with locale files in /locales/ directory

Recommended cleanup sequence:

  1. Run yarn i18n:extract to update template.json with current code translations
  2. Run yarn i18n:clean to remove deprecated translations from locale files
  3. Run yarn i18n:check to verify the cleanup was successful

Continuous Integration

Because all translations live in the repository, validation can run as an ordinary CI/CD job.

Add to .gitlab-ci.yml:

translation-check:
  stage: test
  script:
    - yarn install
    - yarn i18n:validate --max-unused 20 --warn-only
  only:
    - merge_requests
    - develop

Best Practices

For Developers

  1. Complete Sentences: Always translate complete, meaningful units

// ❌ Bad
const message = translate("User") + " " + translate("saved");

// ✅ Good
const message = translate("User saved successfully");

  2. Meaningful Context: Provide context for translators

// ❌ Ambiguous
translate("Save")

// ✅ Better with context from UI element detection
// The enhanced extraction will automatically detect this is a button
<SaveButton>{translate("Save")}</SaveButton>

  3. Variable Naming: Use descriptive variable names

// ❌ Unclear
translate("Welcome {x}", { x: name })

// ✅ Clear
translate("Welcome {userName}", { userName: name })

For Translators

  1. Use Enhanced Context: Leverage UI type information and translator notes
  2. Preserve Markup: Keep all HTML/JSX tags intact
  3. Consider Variables: Pay attention to variable types and positioning
  4. Cultural Adaptation: Adapt tone and formality to target culture
  5. Consistency: Use established terminology from context analysis
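
Points 2 and 3 can be checked mechanically before a translation is committed. The helper below is a sketch under assumptions: `missingPlaceholders` is a hypothetical name, it is not part of the project's tooling, and it only recognizes simple `{variable}` placeholders and attribute-free tags.

```typescript
// Hypothetical helper: reports source placeholders and tags that a
// translation dropped or renamed.
function missingPlaceholders(source: string, translation: string): string[] {
  const pattern = /\{\w+\}|<\/?\w+>/g;
  const expected: string[] = source.match(pattern) || [];
  return expected.filter((token) => translation.indexOf(token) === -1);
}

const dropped = missingPlaceholders('Opened by {user} at {date}', 'Avas {user}');
// dropped: ['{date}']
```

The check is order-independent on purpose, since translators often need to reorder variables to fit the target language.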

For Maintainers

  1. Regular Cleanup: Use yarn i18n:clean monthly
  2. Monitor Quality: Track translation coverage and issues
  3. Enhanced Extraction: Use enhanced extraction for new features
  4. Documentation: Keep translation guidelines updated

CI/CD Integration

GitLab CI Integration

# Translation validation
translation-check:
  stage: test
  script:
    - yarn install
    - yarn i18n:validate --max-unused 20 --warn-only
  only:
    - merge_requests
    - develop

# Enhanced extraction for releases (already default)
translation-extract:
  stage: build
  script:
    - yarn i18n:extract
  artifacts:
    paths:
      - template.json
  only:
    - tags

Pre-commit Hooks

Add to .husky/pre-commit:

#!/usr/bin/env sh
. "$(dirname -- "$0")/_/husky.sh"

# Check for translation issues
yarn i18n:validate --warn-only

Troubleshooting

Common Issues

"Found more used translations than in template"

  • Run yarn i18n:extract to update the template with new strings

"Too many false positives for missing translations"

  • Adjust ignorePatterns in checkTranslations.cjs
  • Review context where strings appear

"Backup files accumulating"

  • Cleanup script creates timestamped backups
  • Remove old backups manually: rm template.backup.*.json

"Extraction taking too long"

  • The extraction tool includes comprehensive analysis which may take longer on large codebases
  • Consider running extraction only when translation strings have been modified

Configuration

Customize detection rules in checkTranslations.cjs:

const CONFIG = {
  // Ignore patterns for non-translatable strings
  ignorePatterns: [
    /^[a-z_]+$/,           // snake_case variables
    /^\d+$/,               // numbers
    /^[A-Z_]+$/,           // CONSTANTS
    /^https?:\/\//,        // URLs
    /^\/api\//,            // API paths
  ],

  // UI string detection patterns
  uiStringPatterns: [
    /^[A-Z][a-z].*[.!?]?$/,  // Sentences
    /^[A-Z][a-z]+ [a-z]/,    // Multi-word phrases
  ],

  minStringLength: 3,
  maxStringLength: 200,
};
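
To see how these settings interact, the sketch below applies them to candidate strings. The config values are the ones shown above; the function name `looksLikeUiString`, the `DetectionConfig` type, and the order of the checks are assumptions, and `checkTranslations.cjs` may apply its rules differently.

```typescript
// Hypothetical sketch of applying the detection config to one string.
interface DetectionConfig {
  ignorePatterns: RegExp[];
  uiStringPatterns: RegExp[];
  minStringLength: number;
  maxStringLength: number;
}

const detectionConfig: DetectionConfig = {
  ignorePatterns: [/^[a-z_]+$/, /^\d+$/, /^[A-Z_]+$/, /^https?:\/\//, /^\/api\//],
  uiStringPatterns: [/^[A-Z][a-z].*[.!?]?$/, /^[A-Z][a-z]+ [a-z]/],
  minStringLength: 3,
  maxStringLength: 200,
};

function looksLikeUiString(candidate: string, config: DetectionConfig): boolean {
  if (candidate.length < config.minStringLength) return false;
  if (candidate.length > config.maxStringLength) return false;
  // Assumption: ignore patterns take precedence over UI patterns
  if (config.ignorePatterns.some((pattern) => pattern.test(candidate))) return false;
  return config.uiStringPatterns.some((pattern) => pattern.test(candidate));
}

const flagged = ['Save changes', 'created_at', 'MAX_SIZE', 'OK'].filter((candidate) =>
  looksLikeUiString(candidate, detectionConfig),
);
// flagged: ['Save changes']
```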

Getting Help

  1. View detailed analysis: yarn i18n:check
  2. Test changes safely: Use --dry-run flags
  3. Check configuration: Review ignore patterns and limits
  4. Gradual adoption: Use --warn-only during transition

File Structure

├── template.json                          # Main translation template
├── template-enhanced.json                 # Alternative output file (when using --output option)
├── template.backup.*.json                 # Automatic backups
└── locales/                               # Translated files and tools
    ├── en.json
    ├── de.json
    ├── nb.json
    ├── ...
    └── tools/                               # Translation management tools
        ├── extractLiteralsFromFiles.cjs     # Translation extraction with rich context
        ├── checkTranslations.cjs            # Analysis and detection tool
        ├── removeDeprecatedTranslations.cjs # Remove deprecated translations from locale files
        ├── validateTranslations.cjs         # CI/CD validation tool
        ├── analyzeLanguageTranslations.cjs  # Unified language analyzer (27+ languages)
        ├── analyze*Translations.cjs         # Individual language analyzers
        └── simpleLLMProcessor.cjs           # LLM prompt generation tool

Advanced Features

Context-Aware Translation Analysis

The enhanced tools can analyze specific locale quality using the unified analyzer:

# Use unified analyzer for any supported language
yarn i18n:analyze <language_code>

# Examples:
yarn i18n:analyze et       # Estonian
yarn i18n:analyze ru       # Russian
yarn i18n:analyze nb       # Norwegian
yarn i18n:analyze de       # German
yarn i18n:analyze es       # Spanish
yarn i18n:analyze fr       # French
yarn i18n:analyze bg       # Bulgarian
yarn i18n:analyze sl       # Slovenian
yarn i18n:analyze el       # Greek

# See all available languages
yarn i18n:analyze --help

Supported Languages: The analyzer supports 27+ languages including European (German, Spanish, French, Italian, Polish, Czech, Estonian, Lithuanian, Latvian, Dutch, Danish, Swedish, Finnish, Norwegian, Slovenian, Bulgarian, Greek), Slavic (Russian, Ukrainian), Middle Eastern (Arabic, Persian), Asian (Thai, Bengali), and Turkic (Azerbaijani, Kyrgyz) languages.

LLM-Powered Translation Improvements

Generate focused prompts for large language model (LLM) tools such as Claude to improve translations:

# Generate focused improvement prompts for all languages
yarn i18n:llm:prepare

Features:

  • Categorized Prompts: Groups translations by type (missing, problematic, buttons, errors, titles)
  • Batch Processing: ~30 translations per prompt to avoid context overload
  • Direct Editing: Prompts designed for LLMs to directly edit locale files
  • Git Versioning: Changes tracked automatically through git
  • Context-Rich: Uses enhanced template analysis for better translation quality
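
The batch processing feature is plain chunking. A minimal sketch, assuming a fixed batch size of 30 (the guide says approximately 30) and using the hypothetical name `chunkIntoBatches`; `simpleLLMProcessor.cjs` may group translations differently:

```typescript
// Hypothetical sketch: split translation keys into prompt-sized batches.
function chunkIntoBatches<T>(items: T[], batchSize: number = 30): T[][] {
  if (batchSize < 1) throw new RangeError('batchSize must be at least 1');
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// 65 placeholder keys produce batches of 30, 30 and 5
const sampleKeys: string[] = [];
for (let i = 0; i < 65; i++) sampleKeys.push('key-' + i);
const promptBatches = chunkIntoBatches(sampleKeys);
```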

Generated Structure:

locales/llm-prompts/
├── et-README.md                     # Estonian prompts overview
├── et-missing-improvements.txt      # Missing translations prompt
├── et-problematic-improvements.txt  # Quality issues prompt
├── et-buttons-improvements.txt      # Button text improvements
├── et-errors-improvements.txt       # Error message improvements
├── et-titles-improvements.txt       # Title/heading improvements
└── ...                             # Other languages (et, ru, etc.)

Workflow:

  1. Generate prompts: yarn i18n:llm:prepare
  2. Copy prompt: From relevant .txt file in locales/llm-prompts/
  3. Paste to LLM: In Claude Code or similar tool
  4. Direct editing: LLM updates locales/{lang}.json directly
  5. Review changes: Use git to see what was modified
  6. Commit improvements: Standard git workflow

Recommended Order:

  1. missing - Adds missing translations
  2. problematic - Fixes obvious quality issues
  3. errors - Improves user experience
  4. buttons - Enhances UI consistency
  5. titles - Polishes terminology

This workflow provides:

  • UI-specific translation recommendations: Button text, form labels, navigation elements
  • Cultural adaptation suggestions: Appropriate formality levels and local conventions
  • Technical terminology optimization: Native terms vs. loanwords, compound words
  • Error message tone analysis: Clarity and user-friendliness
  • Language-specific grammar checks: Case agreement, verb forms, number handling

Estonian-Specific Analysis

The Estonian analyzer provides specialized checks for:

  • Case Agreement: Estonian has 14 cases - the tool checks for proper case usage with variables
  • Imperative Forms: Verifies action buttons use appropriate Estonian imperative (da-infinitive)
  • Number-Noun Agreement: Estonian has complex singular/plural rules with numbers
  • Compound Words: Suggests Estonian compound construction for technical terms
  • Error Message Clarity: Ensures error messages are clear and user-friendly
  • Native Terminology: Identifies loanwords that could be replaced with native Estonian terms
  • Sentence Case: Checks that titles use Estonian sentence case, not English title case

Russian-Specific Analysis

The Russian analyzer focuses on:

  • Verb Forms: Imperative mood and aspect selection for action buttons
  • Formality Levels: Appropriate formal register for error messages and professional contexts
  • Technical Terminology: Balance between international terms and Russian equivalents
  • Error Message Clarity: Clear and appropriate error message phrasing

Norwegian-Specific Analysis

The Norwegian analyzer provides specialized checks for:

  • Imperative Forms: Norwegian button actions use appropriate imperative forms (lagre, slett, rediger)
  • Compound Words: Norwegian preference for compound word construction over separate words
  • Gender Agreement: Proper use of Norwegian gender articles (en/ei/et) with nouns
  • Error Message Clarity: Clear and appropriate Norwegian error message phrasing
  • Native Terminology: Identifies English loanwords that could be replaced with native Norwegian terms
  • Sentence Case: Ensures Norwegian sentence case capitalization rather than English title case

Multi-Language Analysis Support

The unified analyzer provides specialized analysis for 27+ languages, each with language-specific grammar and cultural checks:

Germanic Languages:

  • German (de): Noun capitalization, compound words, formal/informal address (Sie/du), case system
  • Dutch (nl): V2 rule compliance, diminutive forms, modal particles
  • Swedish (sv): Definite articles, pitch accent, en/ett gender system
  • Danish (da): Definite article suffixes, stød considerations

Romance Languages:

  • Spanish (es): Gender agreement, accent marks, formal/informal address (usted/tú)
  • French (fr): Elision, liaison, gender agreement, accent marks
  • Italian (it): Gender agreement, double consonants, formal address (Lei/tu)

Slavic Languages:

  • Polish (pl): 7-case system, diacriticals, aspect usage
  • Czech (cs): Complex gender system, háček usage, consonant clusters
  • Bulgarian (bg): Cyrillic script, definite article postfixes, aspect usage
  • Slovenian (sl): 6-case system, dual number forms (rare among modern Slavic languages)
  • Ukrainian (uk): Cyrillic specifics, apostrophe usage, avoiding Russianisms

Baltic and Finno-Ugric Languages:

  • Estonian (et): 14-case system, imperative forms, compound words, error message clarity
  • Lithuanian (lt): 7-case system, dual forms, diminutives
  • Latvian (lv): 6-case system, palatalization, long vowels

Other European:

  • Greek (el): 4-case system, monotonic accents, Greek alphabet
  • Finnish (fi): 15-case system, vowel harmony, consonant gradation

Non-European:

  • Arabic (ar): RTL text, dual forms, complex number agreement
  • Persian (fa): RTL text, ezafe construction, formal register
  • Thai (th): No word spacing, classifiers, tone considerations
  • Bengali (bn): Three formality levels, conjunct consonants
  • Azerbaijani (az): Vowel harmony, 6-case system
  • Kyrgyz (ky): Vowel harmony, Cyrillic variants

Custom Translation Validation

Create custom validation rules for your specific needs:

// Custom validator example
const validator = new TranslationValidator({
  maxUnusedTranslations: 10,
  failOnMissing: true,
  customRules: [
    {
      pattern: /button/,
      rule: 'Keep button text under 20 characters'
    }
  ]
});

This comprehensive internationalization system ensures high-quality translations while providing developers and translators with the tools and context needed for effective multilingual applications.