Item9126: Wrong encoding in CompareRevisionsAddOn
Priority: Normal
Current State: Closed
Released In: 1.1.0
Target Release: minor
I have {Languages}{'pt-br'}{Enabled} ticked, {Site}{Locale} is pt_BR.utf8 and {Site}{CharSet} is utf-8.
Everything works fine except when I enable CompareRevisionsAddOn and look at a topic's history: all "special" characters appear to be double-encoded. The problem is that $element->as_HTML() is called without an $entities argument (it is passed as undef), which makes HTML::Element encode all "unsafe" characters. From the HTML::Element documentation (1):
Returns a string representing in HTML the element and its descendants. The optional argument $entities specifies a string of the entities to encode. For compatibility with previous versions, specify '<>&' here. If omitted or undef, all unsafe characters are encoded as HTML entities. See HTML::Entities for details. If passed an empty string, no entities are encoded.
I changed the call from:
return $element->as_HTML( undef, undef, {} );
to:
return $element->as_HTML( q|'"<>%&|, undef, {} );
I took the "dangerous" characters from the "safe" encoding. With that change, everything worked as expected.
Any concerns about committing this change?
--
GilmarSantosJr - 08 Jun 2010
After more reading, I implemented this change (relative to trunk):
$ git diff
diff --git a/CompareRevisionsAddOn/lib/Foswiki/Contrib/CompareRevisionsAddOn/Compare.pm b/CompareRevisionsAddOn/lib/Foswiki/Contrib/CompareRevisionsAddOn/Com
index ed0a7e3..6d31949 100755
--- a/CompareRevisionsAddOn/lib/Foswiki/Contrib/CompareRevisionsAddOn/Compare.pm
+++ b/CompareRevisionsAddOn/lib/Foswiki/Contrib/CompareRevisionsAddOn/Compare.pm
@@ -322,6 +322,10 @@ sub _getTree {
my $tree = new HTML::TreeBuilder;
$tree->implicit_body_p_tag(1);
$tree->p_strict(1);
+ if ( $Foswiki::cfg{UseLocale} ) {
+ require Encode;
+ $text = Encode::decode( $Foswiki::cfg{Site}{CharSet}, $text );
+ }
$tree->parse($text);
$tree->eof;
$tree->elementify;
And it worked, without the change described in my previous comment. With the first solution, the parse() method prints lots of messages to STDERR about parsing undecoded UTF-8 strings. This solution works with no warnings.
So, what is the best fix? Any other suggestion?
--
GilmarSantosJr - 08 Jun 2010
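The double-encoding described above can be reproduced outside Foswiki. The following is only a minimal sketch, using core Encode alone: encode_high_chars is a hypothetical stand-in for the "encode all unsafe characters" pass, not the actual HTML::Entities or HTML::Element code, and the sample string is an assumption chosen for illustration. It shows why decoding the site's bytes before parsing yields one entity per character instead of one per byte.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for an "encode all unsafe characters" pass:
# every character above 0x7F becomes a numeric entity.
# (Assumption for illustration; this is not the HTML::Entities code.)
sub encode_high_chars {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $text;
}

# "c with cedilla" as the two raw UTF-8 bytes a utf-8 site would hand over.
my $bytes = "\xC3\xA7";

# Undecoded: each byte is treated as a separate character.
my $from_bytes = encode_high_chars($bytes);

# Decoded first: the two bytes become one character, U+00E7.
my $chars      = Encode::decode( 'utf-8', $bytes );
my $from_chars = encode_high_chars($chars);

print length($bytes), " -> $from_bytes\n";    # 2 -> &#195;&#167;
print length($chars), " -> $from_chars\n";    # 1 -> &#231;
```

The first result is what a browser renders as two wrong characters; the second is the single intended character. This matches the behaviour reported for the decode-before-parse change, under the assumptions stated above.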