public inbox for git@vger.kernel.org 
From: Jiang Xin <worldhello.net@gmail•com>
To: Junio C Hamano <gitster@pobox•com>, Johannes Sixt <j6t@kdbg•org>,
	Git List <git@vger•kernel.org>
Cc: "Jiang Xin" <worldhello.net@gmail•com>,
	"Alexander Shopov" <ash@kambanaria•org>,
	"Mikel Forcada" <mikel.forcada@gmail•com>,
	"Ralf Thielow" <ralf.thielow@gmail•com>,
	"Jean-Noël Avila" <jn.avila@free•fr>,
	"Bagas Sanjaya" <bagasdotme@gmail•com>,
	"Dimitriy Ryazantcev" <DJm00n@mail•ru>,
	"Peter Krefting" <peter@softwolves•pp.se>,
	"Emir SARI" <bitigchi@me•com>, "Arkadii Yakovets" <ark@cho•red>,
	"Vũ Tiến Hưng" <newcomerminecraft@gmail•com>,
	"Teng Long" <dyroneteng@gmail•com>,
	"Yi-Jyun Pan" <pan93412@gmail•com>
Subject: [PATCH v4 5/5] docs(l10n): add AI agent instructions to review translations
Date: Tue, 17 Mar 2026 07:54:49 +0800
Message-ID: <9388b8e9f471c68e7d08ff9f2ccf1d699f5079c1.1773704908.git.worldhello.net@gmail.com>
In-Reply-To: <cover.1773704908.git.worldhello.net@gmail.com>

Add a new "Task 4: Review translation quality" section to po/AGENTS.md
that guides AI agents through reviewing translation files.

Translation diffs lose context, especially for multi-line msgid and
msgstr entries. Some LLMs ignore context and cannot evaluate
translations accurately; others rely on scripts to search for context
in source files, making the review process time-consuming. To address
this, git-po-helper implements the compare subcommand, which extracts
new or modified translations with full context (complete msgid/msgstr
pairs), significantly improving review efficiency.

One limitation is that the extracted file omits the other,
already-translated entries, so the reviewer has less to go on when
checking terminology consistency. Including a glossary in the PO file
header mitigates this: review files generated by git-po-helper include
the header entry and the glossary (if present) by default.

The review workflow leverages git-po-helper subcommands:

- git-po-helper compare: Extract new or changed entries between two
  versions of a PO file into a valid PO file for review. Supports
  multiple modes:

  * Compare HEAD with the working tree (local changes)
  * Compare a commit's parent with the commit (--commit)
  * Compare a commit with the working tree (--since)
  * Compare two arbitrary revisions (-r)

- git-po-helper msg-select: Split large review files into smaller
  batches by entry index range for manageable review sessions. Supports
  range formats like "-50" (first 50), "51-100", "101-" (to end).

Evaluation with the Qwen model:

    git-po-helper agent-run review --commit 2000abefba --agent qwen

Benchmark results:

    | Metric           | Value                            |
    |------------------|----------------------------------|
    | Turns            | 22                               |
    | Input tokens     | 537263                           |
    | Output tokens    | 4397                             |
    | API duration     | 167.84 s                         |
    | Review score     | 96/100                           |
    | Total entries    | 63                               |
    | With issues      | 4 (1 critical, 2 major, 1 minor) |

Signed-off-by: Jiang Xin <worldhello.net@gmail•com>
---
 po/AGENTS.md | 197 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 197 insertions(+)

diff --git a/po/AGENTS.md b/po/AGENTS.md
index 65017624f7..e9a6ffc7f1 100644
--- a/po/AGENTS.md
+++ b/po/AGENTS.md
@@ -10,6 +10,7 @@ most commonly used housekeeping tasks:
 1. Generating or updating po/git.pot
 2. Updating po/XX.po
 3. Translating po/XX.po
+4. Reviewing translation quality
 
 
 ## Background knowledge for localization workflows
@@ -664,6 +665,202 @@ step 8 after step 6.
    ```
 
 
+### Task 4: Review translation quality
+
+Review may target the full `po/XX.po`, a specific commit, or changes since a
+commit. When asked to review, follow the steps below.
+
+**Workflow**: Follow steps in order. Do **NOT** use `git show`, `git diff`,
+`git format-patch`, or similar to get changes—they break PO context; use **only**
+`git-po-helper compare` for extraction. Without `git-po-helper`, refuse the task.
+Steps 3→4→5→6→7 loop: after step 6, **always** go to step 7 (back to step 3).
+The **only** ways to step 8 are when step 4 finds `po/review-todo.json` missing
+or empty (no batch left to review), or when step 1 finds `po/review-result.json`
+already present.
+
+1. **Check for existing review (resume support)**: Evaluate the following in order:
+
+   - If `po/review-input.po` does **not** exist, proceed to step 2 (Extract
+     entries) for a fresh start.
+   - Else if `po/review-result.json` exists, go to step 8 (only after loop exits).
+   - Else if `po/review-done.json` exists, go to step 6 (Rename result).
+   - Else if `po/review-todo.json` exists, go to step 5 (Review the current
+     batch).
+   - Else go to step 3 (Prepare one batch).
+
+2. **Extract entries**: Run `git-po-helper compare` with the desired range and
+   redirect the output to `po/review-input.po`. See "Comparing PO files for
+   translation and review" under git-po-helper for options.
+
+3. **Prepare one batch**: Batching keeps each run small so the model can
+   complete review within limited context. **Directly execute** the script
+   below—it is authoritative; do not reimplement.
+
+   ```shell
+   review_one_batch () {
+       min_batch_size=${1:-100}
+       INPUT_PO="po/review-input.po"
+       PENDING="po/review-pending.po"
+       TODO="po/review-todo.json"
+       DONE="po/review-done.json"
+       BATCH_FILE="po/review-batch.txt"
+
+       if test ! -f "$INPUT_PO"
+       then
+           rm -f "$TODO"
+           echo >&2 "cannot find $INPUT_PO, nothing for review"
+           return 1
+       fi
+       if test ! -f "$PENDING" || test "$INPUT_PO" -nt "$PENDING"
+       then
+           rm -f "$BATCH_FILE" "$TODO" "$DONE"
+           rm -f po/review-result*.json
+           cp "$INPUT_PO" "$PENDING"
+       fi
+
+       ENTRY_COUNT=$(grep -c '^msgid ' "$PENDING" 2>/dev/null || true)
+       ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0))
+       if test "$ENTRY_COUNT" -eq 0
+       then
+           rm -f "$TODO"
+           echo >&2 "No entries left for review"
+           return 1
+       fi
+
+       if test "$ENTRY_COUNT" -gt $min_batch_size
+       then
+           if test "$ENTRY_COUNT" -gt $((min_batch_size * 8))
+           then
+               NUM=$((min_batch_size * 2))
+           elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4))
+           then
+               NUM=$((min_batch_size + min_batch_size / 2))
+           else
+               NUM=$min_batch_size
+           fi
+       else
+           NUM=$ENTRY_COUNT
+       fi
+
+       BATCH=$(cat "$BATCH_FILE" 2>/dev/null || echo 0)
+       BATCH=$((BATCH + 1))
+       echo "$BATCH" >"$BATCH_FILE"
+
+       git-po-helper msg-select --json --head "$NUM" -o "$TODO" "$PENDING"
+       git-po-helper msg-select --since "$((NUM + 1))" -o "${PENDING}.tmp" "$PENDING"
+       mv "${PENDING}.tmp" "$PENDING"
+       echo "Processing batch $BATCH ($NUM entries out of $ENTRY_COUNT)"
+   }
+   # The parameter controls batch size; reduce if the batch file is too large.
+   review_one_batch 100
+   ```
+
+4. **Check todo file**: If `po/review-todo.json` does not exist or is empty,
+   review is complete; go to step 8 (only after loop exits). Otherwise proceed to
+   step 5.
+
+5. **Review the current batch**: Review translations in `po/review-todo.json`
+   and write findings to `po/review-done.json` as follows:
+   - Use "Background knowledge for localization workflows" for PO/JSON structure,
+     placeholders, and terminology.
+   - If `header_comment` includes a glossary, follow it for consistency.
+   - Do **not** review the header (`header_comment`, `header_meta`).
+   - For every other entry, check the entry's `msgstr` **array** (translation
+     forms) against `msgid` / `msgid_plural` using the "Quality checklist" above.
+   - Write JSON per "Review result JSON format" below; use `{"issues": []}` when
+     there are no issues. **Always** write `po/review-done.json`—it marks the
+     batch complete.
+
+6. **Rename result**: Rename `po/review-done.json` to `po/review-result-<N>.json`,
+   where N is the value in `po/review-batch.txt` (the batch just completed).
+   Run the script below:
+
+   ```shell
+   review_rename_result () {
+       TODO="po/review-todo.json"
+       DONE="po/review-done.json"
+       BATCH_FILE="po/review-batch.txt"
+       if test -f "$DONE"
+       then
+           N=$(cat "$BATCH_FILE" 2>/dev/null) || { echo "ERROR: $BATCH_FILE not found." >&2; return 1; }
+           mv "$DONE" "po/review-result-$N.json"
+           echo "Renamed to po/review-result-$N.json"
+       fi
+       rm -f "$TODO"
+   }
+   review_rename_result
+   ```
+
+7. **Loop**: **MUST** return to step 3 (Prepare one batch) and repeat the cycle.
+   Do **not** skip this step or go to step 8. Step 8 is reached **only** when
+   step 4 finds `po/review-todo.json` missing or empty.
+
+8. **Only after loop exits**: **Directly execute** the command below. It merges
+   results, applies suggestions, and displays the report. The process ends here.
+
+   ```shell
+   git-po-helper agent-run report
+   ```
+
+   **Do not** run cleanup or delete intermediate files. Keep them for inspection
+   or resumption.
+
+**Review result JSON format**:
+
+The **Review result JSON** format defines the structure for translation
+review reports. For each entry with translation issues, create an issue
+object as follows:
+
+- Copy the original entry's `msgid`, optional `msgid_plural`, and optional
+  `msgstr` array (original translation forms) into the issue object. Use the
+  same shape as GETTEXT JSON: `msgstr` is **always a JSON array** when present
+  (one element singular, multiple for plural).
+- Write a summary of all issues found for this entry in `description`.
+- Set `score` according to the severity of issues found for this entry,
+  from 0 to 3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues).
+  **Lower score means more severe issues.**
+- Place the suggested translation in **`suggest_msgstr`** as a **JSON array**:
+  one string for singular, multiple strings for plural forms in order. This is
+  required for `git-po-helper` to apply suggestions.
+- Include only entries with issues (score less than 3). When no issues are
+  found in the batch, write `{"issues": []}`.
+
+Example review result (with issues):
+
+```json
+{
+  "issues": [
+    {
+      "msgid": "commit",
+      "msgstr": ["委托"],
+      "score": 0,
+      "description": "Terminology error: 'commit' should be translated as '提交'",
+      "suggest_msgstr": ["提交"]
+    },
+    {
+      "msgid": "repository",
+      "msgid_plural": "repositories",
+      "msgstr": ["版本库", "版本库"],
+      "score": 2,
+      "description": "Consistency issue: suggest using '仓库' consistently",
+      "suggest_msgstr": ["仓库", "仓库"]
+    }
+  ]
+}
+```
+
+Field descriptions for each issue object (element of the `issues` array):
+
+- `msgid` (and optional `msgid_plural` for plural entries): Original source text.
+- `msgstr` (optional): JSON array of original translation forms (same meaning as
+  in GETTEXT JSON entries).
+- `suggest_msgstr`: JSON array of suggested translation forms; **must be an
+  array** (e.g. `["提交"]` for singular). Plural entries use multiple elements
+  in order.
+- `score`: 0–3 (0 = critical; 1 = major; 2 = minor; 3 = perfect, no issues).
+- `description`: Brief summary of the issue.
+
+
 ## Human translators remain in control
 
 Git translation is human-driven; language team leaders and contributors are
-- 
2.53.0.rc2.20.g532543fa46
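
The resume check in step 1 of Task 4 above amounts to a first-match
decision over which intermediate files exist. A minimal sketch of that
decision follows. `review_next_step` is a hypothetical helper name, and
the directory argument is an assumption added for illustration; neither
is part of the patch or of git-po-helper.

```shell
# Hypothetical helper: print the step number to continue from,
# following the order of checks in step 1 of Task 4.
# The optional argument is the directory holding the review files.
review_next_step () {
    dir=${1:-po}

    if test ! -f "$dir/review-input.po"
    then
        echo 2    # fresh start: extract entries
    elif test -f "$dir/review-result.json"
    then
        echo 8    # results already merged: run the report
    elif test -f "$dir/review-done.json"
    then
        echo 6    # batch reviewed but not yet renamed
    elif test -f "$dir/review-todo.json"
    then
        echo 5    # batch prepared but not yet reviewed
    else
        echo 3    # prepare the next batch
    fi
}

review_next_step po
```

The order of the checks matters: `po/review-done.json` is tested before
`po/review-todo.json` because both exist between steps 5 and 6.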

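The batch-size rule inside `review_one_batch` in step 3 above can be
isolated as pure arithmetic, which makes it easier to see how the batch
grows with the number of pending entries. The sketch below copies only
that rule. `review_batch_size` is a hypothetical name used for
illustration and is not defined by the patch.

```shell
# Hypothetical helper: mirrors the NUM selection in review_one_batch.
# $1 = number of pending entries, $2 = minimum batch size (default 100).
review_batch_size () {
    entry_count=$1
    min_batch_size=${2:-100}

    if test "$entry_count" -le "$min_batch_size"
    then
        # Small input: review everything that is left in one batch.
        echo "$entry_count"
    elif test "$entry_count" -gt $((min_batch_size * 8))
    then
        # Very large input: double the batch size.
        echo $((min_batch_size * 2))
    elif test "$entry_count" -gt $((min_batch_size * 4))
    then
        # Large input: one and a half times the batch size.
        echo $((min_batch_size + min_batch_size / 2))
    else
        echo "$min_batch_size"
    fi
}

review_batch_size 63     # prints 63
review_batch_size 250    # prints 100
review_batch_size 450    # prints 150
review_batch_size 900    # prints 200
```

With the default of 100, the 63-entry review quoted in the benchmark
would fit in a single batch.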
