From: Jiang Xin Date: Wed, 25 Feb 2026 06:19:25 +0000 (+0800) Subject: l10n: docs: add review instructions in AGENTS.md X-Git-Tag: v2.54.0~1^2~14 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=6f8e885fd374d0133ae619fe7e327a5c56151705;p=thirdparty%2Fgit.git l10n: docs: add review instructions in AGENTS.md Add a new "Reviewing po/XX.po" section to po/AGENTS.md that provides comprehensive guidance for AI agents to review translation files. Translation diffs lose context, especially for multi-line msgid and msgstr entries. Some LLMs ignore context and cannot evaluate translations accurately; others rely on scripts to search for context in source files, making the review process time-consuming. To address this, git-po-helper implements the compare subcommand, which extracts new or modified translations with full context (complete msgid/msgstr pairs), significantly improving review efficiency. A limitation is that the extracted content lacks other already-translated content for reference, which may affect terminology consistency. This is mitigated by including a glossary in the PO file header. git-po-helper-generated review files include the header entry and glossary (if present) by default. The review workflow leverages git-po-helper subcommands: - git-po-helper compare: Extract new or changed entries between two versions of a PO file into a valid PO file for review. Supports multiple modes: * Compare HEAD with the working tree (local changes) * Compare a commit's parent with the commit (--commit) * Compare a commit with the working tree (--since) * Compare two arbitrary revisions (-r) - git-po-helper msg-select: Split large review files into smaller batches by entry index range for manageable review sessions. Supports range formats like "-50" (first 50), "51-100", "101-" (to end). Evaluation with the Qwen model: git-po-helper agent-run review --commit 2000abefba --agent qwen Benchmark results: | Metric | Value | |------------------|----------------------------------| | Turns | 22 | | Input tokens | 537263 | | Output tokens | 4397 | | API duration | 167.84 s | | Review score | 96/100 | | Total entries | 63 | | With issues | 4 (1 critical, 2 major, 1 minor) | Signed-off-by: Jiang Xin --- diff --git a/po/AGENTS.md b/po/AGENTS.md index 69de4dafdd..feb90d4d24 100644 --- a/po/AGENTS.md +++ b/po/AGENTS.md @@ -10,6 +10,7 @@ most commonly used housekeeping tasks: 1. Generating or updating po/git.pot 2. Updating po/XX.po 3. Translating po/XX.po +4. Reviewing translation quality ## Background knowledge for localization workflows @@ -669,6 +670,202 @@ step 8 after step 6. ``` +### Task 4: Review translation quality + +Review may target the full `po/XX.po`, a specific commit, or changes since a +commit. When asked to review, follow the steps below. + +**Workflow**: Follow steps in order. Do **NOT** use `git show`, `git diff`, +`git format-patch`, or similar to get changes—they break PO context; use **only** +`git-po-helper compare` for extraction. Without `git-po-helper`, refuse the task. +Steps 3→4→5→6→7 loop: after step 6, **always** go to step 7 (back to step 3). +The **only** ways to step 8 are when step 4 finds `po/review-todo.json` missing +or empty (no batch left to review), or when step 1 finds `po/review-result.json` +already present. + +1. **Check for existing review (resume support)**: Evaluate the following in order: + + - If `po/review-input.po` does **not** exist, proceed to step 2 (Extract + entries) for a fresh start. + - Else If `po/review-result.json` exists, go to step 8 (only after loop exits). + - Else If `po/review-done.json` exists, go to step 6 (Rename result). + - Else if `po/review-todo.json` exists, go to step 5 (Review the current + batch). + - Else go to step 3 (Prepare one batch). + +2. **Extract entries**: Run `git-po-helper compare` with the desired range and + redirect the output to `po/review-input.po`. See "Comparing PO files for + translation and review" under git-po-helper for options. + +3. **Prepare one batch**: Batching keeps each run small so the model can + complete review within limited context. **Directly execute** the script + below—it is authoritative; do not reimplement. + + ```shell + review_one_batch () { + min_batch_size=${1:-100} + INPUT_PO="po/review-input.po" + PENDING="po/review-pending.po" + TODO="po/review-todo.json" + DONE="po/review-done.json" + BATCH_FILE="po/review-batch.txt" + + if test ! -f "$INPUT_PO" + then + rm -f "$TODO" + echo >&2 "cannot find $INPUT_PO, nothing for review" + return 1 + fi + if test ! -f "$PENDING" || test "$INPUT_PO" -nt "$PENDING" + then + rm -f "$BATCH_FILE" "$TODO" "$DONE" + rm -f po/review-result*.json + cp "$INPUT_PO" "$PENDING" + fi + + ENTRY_COUNT=$(grep -c '^msgid ' "$PENDING" 2>/dev/null || echo 0) + ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0)) + if test "$ENTRY_COUNT" -eq 0 + then + rm -f "$TODO" + echo >&2 "No entries left for review" + return 1 + fi + + if test "$ENTRY_COUNT" -gt $min_batch_size + then + if test "$ENTRY_COUNT" -gt $((min_batch_size * 8)) + then + NUM=$((min_batch_size * 2)) + elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4)) + then + NUM=$((min_batch_size + min_batch_size / 2)) + else + NUM=$min_batch_size + fi + else + NUM=$ENTRY_COUNT + fi + + BATCH=$(cat "$BATCH_FILE" 2>/dev/null || echo 0) + BATCH=$((BATCH + 1)) + echo "$BATCH" >"$BATCH_FILE" + + git-po-helper msg-select --json --head "$NUM" -o "$TODO" "$PENDING" + git-po-helper msg-select --since "$((NUM + 1))" -o "${PENDING}.tmp" "$PENDING" + mv "${PENDING}.tmp" "$PENDING" + echo "Processing batch $BATCH ($NUM entries out of $ENTRY_COUNT)" + } + # The parameter controls batch size; reduce if the batch file is too large. + review_one_batch 100 + ``` + +4. **Check todo file**: If `po/review-todo.json` does not exist or is empty, + review is complete; go to step 8 (only after loop exits). Otherwise proceed to + step 5. + +5. **Review the current batch**: Review translations in `po/review-todo.json` + and write findings to `po/review-done.json` as follows: + - Use "Background knowledge for localization workflows" for PO/JSON structure, + placeholders, and terminology. + - If `header_comment` includes a glossary, follow it for consistency. + - Do **not** review the header (`header_comment`, `header_meta`). + - For every other entry, check the entry's `msgstr` **array** (translation + forms) against `msgid` / `msgid_plural` using the "Quality checklist" above. + - Write JSON per "Review result JSON format" below; use `{"issues": []}` when + there are no issues. **Always** write `po/review-done.json`—it marks the + batch complete. + +6. **Rename result**: Rename `po/review-done.json` to `po/review-result-.json`, + where N is the value in `po/review-batch.txt` (the batch just completed). + Run the script below: + + ```shell + review_rename_result () { + TODO="po/review-todo.json" + DONE="po/review-done.json" + BATCH_FILE="po/review-batch.txt" + if test -f "$DONE" + then + N=$(cat "$BATCH_FILE" 2>/dev/null) || { echo "ERROR: $BATCH_FILE not found." >&2; return 1; } + mv "$DONE" "po/review-result-$N.json" + echo "Renamed to po/review-result-$N.json" + fi + rm -f "$TODO" + } + review_rename_result + ``` + +7. **Loop**: **MUST** return to step 3 (Prepare one batch) and repeat the cycle. + Do **not** skip this step or go to step 8. Step 8 is reached **only** when + step 4 finds `po/review-todo.json` missing or empty. + +8. **Only after loop exits**: **Directly execute** the command below. It merges + results, applies suggestions, and displays the report. The process ends here. + + ```shell + git-po-helper agent-run review --report po + ``` + + **Do not** run cleanup or delete intermediate files. Keep them for inspection + or resumption. + +**Review result JSON format**: + +The **Review result JSON** format defines the structure for translation +review reports. For each entry with translation issues, create an issue +object as follows: + +- Copy the original entry's `msgid`, optional `msgid_plural`, and optional + `msgstr` array (original translation forms) into the issue object. Use the + same shape as GETTEXT JSON: `msgstr` is **always a JSON array** when present + (one element singular, multiple for plural). +- Write a summary of all issues found for this entry in `description`. +- Set `score` according to the severity of issues found for this entry, + from 0 to 3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues). + **Lower score means more severe issues.** +- Place the suggested translation in **`suggest_msgstr`** as a **JSON array**: + one string for singular, multiple strings for plural forms in order. This is + required for `git-po-helper` to apply suggestions. +- Include only entries with issues (score less than 3). When no issues are + found in the batch, write `{"issues": []}`. + +Example review result (with issues): + +```json +{ + "issues": [ + { + "msgid": "commit", + "msgstr": ["委托"], + "score": 0, + "description": "Terminology error: 'commit' should be translated as '提交'", + "suggest_msgstr": ["提交"] + }, + { + "msgid": "repository", + "msgid_plural": "repositories", + "msgstr": ["版本库", "版本库"], + "score": 2, + "description": "Consistency issue: suggest using '仓库' consistently", + "suggest_msgstr": ["仓库", "仓库"] + } + ] +} +``` + +Field descriptions for each issue object (element of the `issues` array): + +- `msgid` (and optional `msgid_plural` for plural entries): Original source text. +- `msgstr` (optional): JSON array of original translation forms (same meaning as + in GETTEXT JSON entries). +- `suggest_msgstr`: JSON array of suggested translation forms; **must be an + array** (e.g. `["提交"]` for singular). Plural entries use multiple elements + in order. +- `score`: 0–3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues). +- `description`: Brief summary of the issue. + + ## Human translators remain in control Git translation is human-driven; language team leaders and contributors are