Kagi Translate hampered by Anthropic Claude bug

FranziKay

This is actually a long-standing problem that I have reported to Kagi before: Anthropic's Claude cannot output curly quotes, especially the German closing quotation mark.
My attached screenshot (Kagi Translate is hampered by AI Anthropic Claude’s inability to produce the correct German closing quotation mark. “ UTF-8 code U+201C.png) shows why this is a problem for me and why I need it fixed. I come across it daily in my work.
If I let Kagi Assistant do translation work, I always make sure the back end goes to OpenAI's GPT, not to Anthropic's Claude, which is sad, because in my experience Claude does a better job, including on high-quality translations.

As a fix, I suggest that Kagi Proofread's "Best" setting not use Claude, at least not for German. My earlier analysis points to Claude's output mode as the place where the error occurs.

Closing double quotation mark (“): Unicode code point U+201C

Bug: Incorrect Handling of German Quotation Marks in Tool use and Claude App #561
https://github.com/anthropics/anthropic-sdk-typescript/issues/561

Correct German quotation marks: „…“ (U+201E for opening, U+201C for closing, as per German typographic standards). The mark ” (U+201D) is the English closing quotation mark, not the German one.

Giorgio

This should be fixed, apologies for taking a while to get this done.

FranziKay

Thank you for fixing it. I am grateful. 😃

FranziKay

The Anthropic bug is still there.
Claude from Anthropic has had this problem for years:
It changes “
“ U+201C LEFT DOUBLE QUOTATION MARK: double turned comma quotation mark
into this "
" U+0022 QUOTATION MARK: double quote

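To confirm which character actually arrived, the code points can be inspected directly instead of being judged by eye. This is a minimal Python sketch using only the standard library; `describe_chars` is a hypothetical helper name, and the two characters are written as escapes so that the example itself is not affected by any quote normalization.

```python
import unicodedata


def describe_chars(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' label per character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in text]


# Escapes keep the example independent of how this file is stored or pasted.
curly = "\u201C"     # the character that was pasted
straight = "\u0022"  # the character that came back

print(describe_chars(curly))
print(describe_chars(straight))
```
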
Here is the reply from Claude 4.7 Opus (reasoning):
You're right, and this is a well-documented issue that has affected Claude and Claude Code for an extended period.

What's happening

The behavior you describe — turning " (U+201C LEFT DOUBLE QUOTATION MARK) into " (U+0022 QUOTATION MARK) — is part of a broader pattern where non-ASCII typographic characters get silently normalized to their ASCII equivalents [²][³]. The same thing happens with:

U+201D (right double quotation mark) → U+0022 [³]
U+2019 (right single quotation mark) → U+0027 [⁸]
U+201E (German low opening quote) and other typographic quotes [⁶]
German umlauts (ä, ö, ü, ß) getting replaced with ASCII digraphs (ae, oe, ue, ss) [⁵]

Where the problem appears to live

Based on investigation by users, the root cause does not appear to be me generating the wrong characters. Rather:

Claude appears to generate the correct Unicode punctuation internally, but something in Anthropic's API response pipeline normalizes these characters in the stream — without signaling this to either the user or the model [¹].
In Claude Code specifically, the Edit, Write, and Update tools' string-matching/normalization layer conflates U+201E / U+201C / U+201D with U+0022 [³][⁶].
This is particularly destructive when editing JSON files or doing OCR/document conversion, where exact character fidelity matters [²][³].
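
One way to check whether this kind of silent normalization happened to a given text is to compare what was sent with what came back. The following is a minimal Python sketch; `find_normalized` is a hypothetical helper, not part of any Anthropic or Kagi tooling. It assumes both strings have the same length, so it does not cover replacements that change the length of the text, such as an umlaut becoming a two-letter digraph.

```python
def find_normalized(original: str, received: str) -> list[tuple[int, str, str]]:
    """List (index, original_char, received_char) where a non-ASCII
    character in the original was replaced by an ASCII character."""
    if len(original) != len(received):
        # Assumption of this sketch: a positional comparison only makes
        # sense when no characters were inserted or dropped.
        raise ValueError("texts differ in length; compare them another way")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(original, received))
        if a != b and ord(a) > 127 and ord(b) < 128
    ]


sent = "\u201EHallo\u201C"  # German quotes around the word
got = '"Hallo"'             # what a normalizing layer might return
for index, before, after in find_normalized(sent, got):
    print(f"position {index}: U+{ord(before):04X} became U+{ord(after):04X}")
```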

Real-world impact

It makes Claude essentially unusable for OCR of documents into structured data where original punctuation must be preserved [²].
German-language users have to run post-processing scripts like fix_quotes.py after every interaction, which defeats the purpose of AI-assisted writing [⁷].
In the Claude web app, producing German closing quotation marks (U+201C, which is the opening quote in English typography) has been reported as effectively impossible [⁴].
The umlaut-substitution regression has been ongoing since December 2025 [⁵].

What I can and can't do about it

Honestly: if the normalization is happening in the transport/tool layer rather than in my token generation, then no amount of me "trying harder" in a single response will fix it — the characters get rewritten after I emit them [¹]. If you're hitting this in Claude Code specifically, the workarounds people have used include:

Writing the file via a shell command (e.g., printf or a heredoc) with explicit \u201C escapes, bypassing the Edit/Write tool normalization.
Using base64-encoded content that gets decoded by a shell command.
Post-processing with a sed/python script.
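
The post-processing workaround might look like the following minimal Python sketch. `to_german_quotes` is a hypothetical helper (it is not the fix_quotes.py mentioned above). It assumes that every straight double quote in the text is a prose quotation mark, that quotes are balanced, and that they are not nested, so it should not be run over JSON or source code. The German marks are written as explicit escapes, in the spirit of the first workaround.

```python
OPENING = "\u201E"  # DOUBLE LOW-9 QUOTATION MARK, German opening quote
CLOSING = "\u201C"  # LEFT DOUBLE QUOTATION MARK, German closing quote


def to_german_quotes(text: str) -> str:
    """Replace straight double quotes with alternating German quotes.

    Assumption: quotes are balanced and never nested, so the first
    straight quote opens, the second closes, and so on.
    """
    out = []
    opening_next = True
    for ch in text:
        if ch == '"':
            out.append(OPENING if opening_next else CLOSING)
            opening_next = not opening_next
        else:
            out.append(ch)
    return "".join(out)


fixed = to_german_quotes('Sie sagte: "Guten Tag."')
# Print the result in escaped form so the code points stay visible
# even if the terminal or a later tool alters the characters.
print(fixed.encode("unicode_escape").decode("ascii"))
```

A simple alternation like this breaks on unbalanced or nested quotes, so the output still deserves a quick review before it is used.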

I can't promise that this very reply preserves the U+201C you pasted, for exactly the reasons above. If you tell me the specific context (API, Claude Code, web app) and what you're trying to accomplish, I can suggest the least-broken workaround.

[¹]: the-curious-case-claudes-quotes.md - GitHub (26%)
[²]: [Bug] Claude models normalize Unicode quotation marks to ASCII ... (19%)
[³]: [BUG] Write and Update tools normalize unicode double ... - GitHub (15%)
[⁴]: Bug: Incorrect Handling of German Quotation Marks in Tool use ... - GitHub (10%)
[⁵]: Claude Code replaces German umlauts with ASCII substitutes for 3+ ... (9%)
[⁶]: Edit tool corrupts non-ASCII Unicode characters (typographic quotes ... (9%)
[⁷]: [BUG] German typographic quotes get incorrectly auto-corrected (9%)
[⁸]: [BUG] Claude cannot reliably output non-ASCII Unicode ... - GitHub (3%)