
Item4419: I18N: Non latin characters in field names break forms

Priority: Normal
Current State: Closed
Released In: 2.0.0
Target Release: major
Applies To: Engine
Component: I18N
Branches: master
Reported By: TWiki:Main.AndreyTkachenko
Waiting For:
Last Change By: MichaelDaum
My TWiki is configured to use UTF-8 encoding.

I defined a form template:

| *Name* | *Type* | *Size* | *Values* | *Tooltip message* | *Attributes* |
| Аннотация | text | 85 | | | M |

New topic body:

%META:TOPICINFO{author="FalKo" date="1185536256" format="1.1" version="1.1"}%

-- Main.FalKo - 27 Jul 2007

%META:FORM{name="ItemTemplate"}%
%META:FIELD{name="A" attributes="M" title="Aннотация" value="123"}%
%META:PREFERENCE{name="VIEW_TEMPLATE" title="VIEW_TEMPLATE" type="Set" value="ItemView"}%

When I try to save a topic that uses this form template, the FORMFIELD meta data is not saved in my new topic.

Then I tried a name that mixes Cyrillic and Latin characters, like this: "FАннотация". When I submit the form, I get the error report "required field "FАннотация" can't be empty".

I then tried replacing the first letter, Cyrillic "А", with Latin "A". That works: the field is saved correctly, but the name attribute is not "Summary"; it is only the Latin character 'A' from the mixed "Aннотация".

Why is the name attribute taken from the displayed name and not from the link?

-- TWiki:Main/AndreyTkachenko - 27 Jul 2007

I changed the encoding to KOI8-R, but that did not solve the problem.

-- TWiki:Main.AndreyTkachenko - 27 Jul 2007

Changed attribution to I18N (internationalisation) and headline.


As I said, this problem does not depend on the UTF-8 encoding.

It would be more correct to say that non-Latin characters are processed incorrectly.

-- TWiki:Main.AndreyTkachenko - 08 Aug 2007

Andrey, please contact TWiki:Main.RichardDonkin directly on this issue; I'm afraid he is really our only I18N resource currently.

-- TWiki:Main.SteffenPoulsen - 20 Aug 2007

Possibly related to Bugs:Item2032.

-- TWiki:Main.SteffenPoulsen - 20 Aug 2007

It's not related to Bugs:Item2032, and it's not about UTF-8 characters!

-- TWiki:Main.AndreyTkachenko - 21 Aug 2007

You stated that you were using utf8 encoding, this confused me. Sorry for that, I hope you will forgive my ignorance.

-- TWiki:Main.SteffenPoulsen - 21 Aug 2007

Andrey, this is not the same as Bugs:Item2032 - however, if you had read that page you'd have seen a link to TWiki:Codev.InternationalCharactersInFormFields, which is the same issue. That bug is complex to fix for all languages/scripts, but attached to that page is a patch that may help you - just a matter of commenting out one line in the file's _cleanField routine. Note that this patch is for a much older version of TWiki, so it probably won't apply directly - so just comment out the line that looks like:

   $text =~ s/[^A-Za-z0-9_\.]//go;

In more recent versions of TWiki, the routine is called fieldTitle2FieldName, and has the same function (see SVNget:lib/TWiki/ Unfortunately, in the refactoring someone managed to remove the I18N TODO comment which highlighted this issue.

One side effect of this change would be that your field titles would be used unchanged as field names. I'm not sure why this is a bad thing, but normally TWiki uses field names (i.e. internally) that are the field title with all non-alphanumeric characters removed. So any form field data created with this line commented out may not be accessible if you later install a version of TWiki where this line has not been commented-out - so be careful!
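
To make the effect of that substitution concrete, here is a minimal Python sketch of the same character filter. It is only an illustration, not the TWiki code: `title_to_name` is a hypothetical stand-in for the title-to-name step, and the real routine may do more than this single substitution.

```python
import re

# Python rendering of the Perl substitution quoted above:
#   $text =~ s/[^A-Za-z0-9_\.]//go;
# Every character outside A-Z, a-z, 0-9, "_" and "." is deleted.
_NOT_ASCII_NAME_CHAR = re.compile(r"[^A-Za-z0-9_.]")


def title_to_name(title: str) -> str:
    """Hypothetical stand-in for the title-to-name step (assumption)."""
    return _NOT_ASCII_NAME_CHAR.sub("", title)


# The last sample starts with a Latin "A" followed by Cyrillic letters.
samples = ["Summary", "Аннотация", "FАннотация", "A" + "ннотация"]

for title in samples:
    print(repr(title), "->", repr(title_to_name(title)))
```

Under this filter a purely Cyrillic title collapses to an empty name, while a title with a leading Latin letter keeps only that letter, which is consistent with the name="A" seen in the META:FIELD line of the original report.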

This bug is independent of language and character set incidentally, but is most serious for languages that don't use Latin character sets.

To anyone else reading this page - the patch suggested here and linked from TWiki:Codev.InternationalCharactersInFormFields was rejected by PeterThoeny so is not generally a great idea. Also, it needs more work to cleanly support East Asian languages such as Chinese and Japanese, though it does work for those too.

Next Action: other TWiki developers to comment on whether it's OK to make field name same as field title, including upward compatibility issues etc. Some move in this direction is really required to make this code work at all for I18N.

-- TWiki:Main.RichardDonkin - 23 Aug 2007

Can some other developers comment on whether my proposed Next Action above is OK? It may break upward compatibility; suggestions welcome.

-- RichardDonkin - 11 Sep 2007

Confirmed in 1.2beta2 too. The form isn't saved when the form is defined as:
| *Name*  | *Type* | *Size* | *Values* | *Tooltip message* | *Attributes* |
| [[Summary][Аннотация]]  | text  | 85  | | | M |
Note that the field name is ASCII only (Summary); only the display text is UTF-8. Also, when the form is defined as:
| *Name*  | *Type* | *Size* | *Values* | *Tooltip message* | *Attributes* |
| [[Summary][Sumar]] | text  | 85  | | | M |
i.e. all ASCII, the form data gets saved as Sumar (i.e. as the display-text value) and not as Summary. The raw topic data:
%META:TOPICINFO{author="BaseUserMapping_333" comment="" date="1435145102" format="1.1" reprev="1" version="1"}%
-- Main.AdminUser - 24 Jun 2015

%META:FIELD{name="Sumar" title="[[Summary][Sumar]]" value="wedwed"}%

-- JozefMojzis
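
The report above turns on which half of `[[Summary][Sumar]]` ends up as the field name. As a rough illustration only (this is not Foswiki code, and `split_name_cell` is a hypothetical helper), the following Python sketch separates the link target from the display label of a Name cell, so the two candidate names can be compared side by side.

```python
import re
from typing import Tuple

# Matches a TWiki double-bracket link of the form [[target][label]].
_BRACKET_LINK = re.compile(r"^\s*\[\[([^\]]*)\]\[([^\]]*)\]\]\s*$")


def split_name_cell(cell: str) -> Tuple[str, str]:
    """Return (target, label) for a Name cell (hypothetical helper).

    A plain cell without brackets serves as both target and label.
    """
    match = _BRACKET_LINK.match(cell)
    if match:
        return match.group(1), match.group(2)
    text = cell.strip()
    return text, text


target, label = split_name_cell("[[Summary][Sumar]]")
print("target:", target)  # the name the reporter expected to be stored
print("label:", label)    # the name that was stored in META:FIELD
```

With the Cyrillic label from the first table, the target would still be the ASCII string Summary, which is why choosing the target over the label matters for this issue.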

Good catch!

-- Main.CrawfordCurrie - 24 Jun 2015 - 13:48

This one causes major incompatibilities between 1.x and 2.x content, resulting in potential data loss: Item14256

-- MichaelDaum - 19 Dec 2016

ItemTemplate

Summary I18N: Non latin characters in field names break forms
ReportedBy TWiki:Main.AndreyTkachenko
SVN Range TWiki-4.1.2, Thu, 19 Jul 2007, build 14438
AppliesTo Engine
Component I18N
Priority Normal
CurrentState Closed
Checkins distro:bfe3afe3e63c
TargetRelease major
ReleasedIn 2.0.0
CheckinsOnBranches master
masterCheckins distro:bfe3afe3e63c
Topic revision: r16 - 19 Dec 2016, MichaelDaum