
Item13575: WysiwygPlugin converts entities to utf-8, then cannot save topic on iso-8859-1 store

Priority: Urgent
Current State: Closed
Released In: 2.0.1
Target Release: patch
Applies To: Engine
Component: WysiwygPlugin
Branches: master Item13525
Reported By: GeorgeClark
Waiting For:
Last Change By: GeorgeClark
Configure Foswiki with {Store}{Encoding} = 'iso-8859-1'; and, using the plain text editor, create a simple topic with
Entity &le;

Edit and save in the text editor works fine, and &le; is preserved.

Edit and save in Wysiwyg, and the
&le;
is converted to
≤
and then the save fails with:

| 2015-07-28T14:30:38-04:00 error | "\x{2264}" does not map to iso-8859-1 at /usr/lib/x86_64-linux-gnu/perl5/5.20/Encode.pm line 159. at /usr/lib/x86_64-linux-gnu/perl5/5.20/Encode.pm line 159.
Encode::encode("iso-8859-1", "%META:TOPICINFO{author=\"BaseUserMapping_333\" date=\"1438108238"..., 1) called at /var/www/data/Foswiki-iso88/lib/Foswiki/Store.pm line 276 ... |

When running on an iso-8859-1 store, WysiwygPlugin must not convert entities into their Unicode character equivalents, or Foswiki will be unable to save the topic.
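
The failure can be reproduced outside Foswiki. The following is a minimal sketch, assuming only the core Encode module; the variable names are illustrative and not taken from Foswiki. It encodes a string containing U+2264 to iso-8859-1 with Encode::FB_CROAK, the same check the store uses in the trace above.

```perl
use strict;
use warnings;
use Encode ();

# "Entity" followed by U+2264, the character Wysiwyg substitutes for &le;.
my $topic_text = "Entity \x{2264}";

# Work on a copy: with a non-coderef CHECK, Encode may modify its input.
my $copy  = $topic_text;
my $bytes = eval { Encode::encode( 'iso-8859-1', $copy, Encode::FB_CROAK ) };
my $error = $@;

# U+2264 has no iso-8859-1 representation, so the encode call dies.
print defined $bytes ? "encoded OK\n" : "save would fail: $error";
```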

This issue is similar to the one encountered with formfields in Item13573.

-- GeorgeClark - 28 Jul 2015

The issue is that the store is not forgiving. If any character that cannot be represented in the {Store}{Encoding} sneaks into a topic, it croaks. Encode 2.12 and later (perl 5.8.8) accept a subroutine ref to process characters that cannot be converted. A possible fix is below, but it has a problem.

  1. HTML::Entities::encode_entities() is supposed to return a correctly encoded string, and it works fine in a test script. But when used in this context, it only returns the bare numeric encoding, without any &# ; wrapping.
  2. It is also documented as using named entities if available, unless the alternate encode_entities_numeric is imported and called. However, in the example below it always returns numerics.

Jomo caught my error. The sub is passed the ordinal of the character, not the character.
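
The difference between the two calls can be seen in isolation. This is a small sketch, assuming HTML::Entities is installed; the variable names are illustrative. It passes first the ordinal and then the character to encode_entities().

```perl
use strict;
use warnings;
use HTML::Entities ();

my $ordinal = 0x2264;    # the kind of value Encode hands to the fallback sub

# Passing the ordinal itself: the string holds only ASCII digits,
# so encode_entities() finds nothing to escape.
my $from_ordinal = HTML::Entities::encode_entities("$ordinal");

# Converting to a character first lets the entity lookup work.
my $from_char = HTML::Entities::encode_entities( chr $ordinal );

print "from ordinal: $from_ordinal\n";
print "from char:    $from_char\n";
```

The first result is just the decimal digits of the ordinal, which matches the "bare numeric encoding" seen in both observations above.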

diff --git a/core/lib/Foswiki/Store.pm b/core/lib/Foswiki/Store.pm
index 1aea868..7ca0c4b 100644
--- a/core/lib/Foswiki/Store.pm
+++ b/core/lib/Foswiki/Store.pm
@@ -60,6 +60,7 @@ use Assert;
 
 use Foswiki          ();
 use Foswiki::Sandbox ();
+use HTML::Entities;
 
 BEGIN {
     if ( $Foswiki::cfg{UseLocale} ) {
@@ -274,7 +275,7 @@ sub encode {
     return $_[0] unless defined $_[0];
     my $s = $_[0];
     return Encode::encode( $Foswiki::cfg{Store}{Encoding} || 'utf-8',
-        $s, Encode::FB_CROAK );
+         $s,  sub{ HTML::Entities::encode_entities(chr(shift)) });
 }
 
 1;
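
The effect of the patched call can be tried standalone. This is a sketch only: store_encode is a hypothetical stand-in for Foswiki::Store::encode with the encoding fixed to iso-8859-1 rather than read from $Foswiki::cfg.

```perl
use strict;
use warnings;
use Encode         ();
use HTML::Entities ();

# Hypothetical stand-in for Foswiki::Store::encode (encoding hard-coded).
sub store_encode {
    my ($s) = @_;
    return $s unless defined $s;

    # The coderef receives the ordinal of each character that cannot be
    # mapped, so it is converted with chr() before entity-encoding.
    return Encode::encode( 'iso-8859-1', $s,
        sub { HTML::Entities::encode_entities( chr(shift) ) } );
}

# U+00E9 maps to iso-8859-1 directly; U+2264 goes through the fallback.
my $stored = store_encode("caf\x{e9} \x{2264} 5");
print "$stored\n";
```

Characters that the target encoding can represent are encoded as usual; only the unmappable ones reach the fallback sub.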

-- GeorgeClark - 28 Jul 2015

Your preliminary fix here had a positive and a negative impact on MultiTopicSavePlugin. See my updates on Item13573. The plugin stops crashing, but I now think anything entity-encoded is changed. It seems the conversion is not symmetrical between reading field values and storing them when the topic is saved.

-- KennethLavrsen - 29 Jul 2015
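
One way such an asymmetry can arise is sketched below. This is an illustration under the assumption that reading simply decodes the stored bytes; it is not a trace of what MultiTopicSavePlugin actually does. The names are illustrative.

```perl
use strict;
use warnings;
use Encode         ();
use HTML::Entities ();

# Same fallback as in the preliminary fix.
my $fallback = sub { HTML::Entities::encode_entities( chr(shift) ) };

my $original = "x \x{2264} y";

# Writing: the unmappable character becomes an entity.
my $bytes = Encode::encode( 'iso-8859-1', $original, $fallback );

# Reading: a plain decode has no way to know the entity was once a character.
my $restored = Encode::decode( 'iso-8859-1', $bytes );

print $restored eq $original
  ? "round trip preserved the text\n"
  : "round trip changed the text\n";
```

After the round trip, the text holds the literal entity rather than the character, and it cannot be told apart from an entity the author typed deliberately.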

LFTM

-- Main.CrawfordCurrie - 30 Jul 2015 - 14:30

 

ItemTemplate

Summary WysiwygPlugin converts entities to utf-8, then cannot save topic on iso-8859-1 store
ReportedBy GeorgeClark
Codebase 2.0.0
SVN Range
AppliesTo Engine
Component WysiwygPlugin
Priority Urgent
CurrentState Closed
WaitingFor
Checkins distro:7e7e1118ddd6
TargetRelease patch
ReleasedIn 2.0.1
CheckinsOnBranches master Item13525
trunkCheckins
masterCheckins distro:7e7e1118ddd6
ItemBranchCheckins distro:7e7e1118ddd6
Release01x01Checkins
Topic revision: r8 - 03 Aug 2015, GeorgeClark