Item4074: TOC and heading anchors not multi-byte aware
Priority: Normal
Current State: No Action Required
Released In:
Target Release: n/a
Applies To: Engine
Component: I18N
Branches:
Reported in TWiki:Support/JapaneseHeadersInIE6:
Our TWiki site is mostly in English, but we want to have a few pages in Japanese (we are a Japanese company). With this in mind we have set our CharSet to UTF-8. We have written a topic in Japanese and it works fine in IE7 and Firefox, but it renders very poorly in IE6 (the fonts are enormous, the Japanese text is not displayed, and so on).
We have narrowed the problem down to the header at the top of the topic. The TWikiML reads as follows:
---+ トトロシステムのご紹介
The corresponding HTML generated by TWiki contains an anchor at this point:
<h1><a name="トトロシステムのご紹�"></a> トトロシステムのご紹介 </h1>
Note that the final character in the name of the anchor seems to have been corrupted in some way. If we remove this character from the heading, the page renders fine in IE6.
Does anyone know what is going wrong here? Or is there a way to stop headings (---+ etc.) from generating anchors in the HTML?
It looks like the second byte of a multi-byte character is cut off when applying the length limit to anchor names.
--
TWiki:Main/PeterThoeny - 15 May 2007
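The truncation described above can be reproduced outside TWiki. The following is a minimal Python sketch, not TWiki's actual code: the 32-byte limit and the function names `truncate_bytes` and `truncate_chars` are assumptions made for illustration, since this item does not state the real limit. Under those assumptions, the heading from the report is 33 bytes in UTF-8 (11 characters of 3 bytes each), so a byte-wise cut leaves the last character incomplete, while a character-aware cut drops it whole.

```python
# Heading text taken from the report above.
heading = "トトロシステムのご紹介"

# ASSUMPTION: a 32-byte cap on anchor names; the real limit is not stated in this item.
ANCHOR_LIMIT = 32


def truncate_bytes(text: str, limit: int = ANCHOR_LIMIT) -> bytes:
    """Cut the UTF-8 encoding at a byte offset, ignoring character boundaries."""
    return text.encode("utf-8")[:limit]


def truncate_chars(text: str, limit: int = ANCHOR_LIMIT) -> bytes:
    """Cut at the last whole character that still fits within the byte limit."""
    clipped = text.encode("utf-8")[:limit]
    # errors="ignore" discards the incomplete trailing byte sequence, if any.
    return clipped.decode("utf-8", errors="ignore").encode("utf-8")


broken = truncate_bytes(heading)
safe = truncate_chars(heading)

print(len(heading.encode("utf-8")))              # total byte length of the heading
print(broken.decode("utf-8", errors="replace"))  # ends in a replacement character
print(safe.decode("utf-8"))                      # ends on a complete character
```

The byte-wise result is not valid UTF-8 on its own, which would explain why a strict or older renderer such as IE6 handles the page badly, while more tolerant browsers simply show a replacement character.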
Analysis sounds credible. Another I18N issue (I set the Component accordingly).
We really need someone with a vested interest in this area to help out.
CC
Also reported in TWiki:Codev/UtfAnchorError, with a patch to fix it.
--
PTh
This issue is no longer applicable after the changes to the anchor-management code for Item1448: wide characters are now removed altogether before the anchor is made unique, so a length limit can no longer split a multi-byte character. Changing state to "No Action Required".
--
MichaelTempest - 27 Jun 2010
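As a rough illustration of why removing wide characters makes the problem disappear, here is a minimal Python sketch. It is not the Item1448 code: the function name `make_anchor`, the 32-character limit, the `anchor` fallback for headings that end up empty, and the numeric suffix used for uniqueness are all hypothetical choices made for this example.

```python
import re


def make_anchor(heading: str, seen: set, limit: int = 32) -> str:
    """Build an ASCII-only anchor name that is unique within `seen`."""
    # Drop every non-ASCII (wide) character before anything else happens.
    ascii_only = heading.encode("ascii", errors="ignore").decode("ascii")
    # Collapse runs of other characters into single underscores.
    base = re.sub(r"[^A-Za-z0-9]+", "_", ascii_only).strip("_")
    # Truncation is now safe: every remaining character is exactly one byte.
    base = base[:limit] or "anchor"  # hypothetical fallback for empty names
    name = base
    n = 1
    while name in seen:  # hypothetical uniqueness scheme: numeric suffix
        name = f"{base}_{n}"
        n += 1
    seen.add(name)
    return name


seen = set()
print(make_anchor("トトロシステムのご紹介", seen))
print(make_anchor("トトロシステムのご紹介", seen))
print(make_anchor("Totoro システム intro", seen))
```

Because the name contains only single-byte characters by the time it is shortened, no cut can land inside a character. The trade-off is that a heading written entirely in Japanese no longer contributes any readable text to its anchor.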