
Item13563: Crashes in Search, and for cUID mapped from utf8 WikiName

Priority: Urgent
Current State: Closed
Released In: 2.0.1
Target Release: patch
Applies To: Engine
Component: I18N, SEARCH, UserMapping
Branches: master Item13525
Reported By: GeorgeClark
Waiting For:
Last Change By: JozefMojzis

... was crashing attempting to view Sandbox/WebHome. To recreate locally:

  • Register a user with an umlaut in the name, for example MikeMüller.
  • Create a new topic in Sandbox as that user.
  • ... so far so good.
  • Log out.
  • View the same topic. Crashes.
  • View Sandbox. Crashes.
  • Log back in as MikeMüller ... everything looks good again.

There appear to be multiple crashes happening.

[Mon Jul 27 16:04:45 2015] view: Malformed UTF-8 character (unexpected end of string) in substitution (s///) at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 876.
[Mon Jul 27 16:04:45 2015] view: Operation "pattern match (m//)" returns its argument for non-Unicode code point 0x2CB25CA7 at /var/www/data/Foswiki-2.0.1-RC1/lib/ line 3123.

And with ASSERT enabled, viewing Sandbox:

Could not perform search. Error was: Malformed UTF-8 character (unexpected end of string) in substitution (s///) at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 876.
at /usr/share/perl5/CGI/ line 353.
CGI::Carp::realdie("Malformed UTF-8 character (unexpected end of string) in subst"...) called at /usr/share/perl5/CGI/ line 443
CGI::Carp::die("Malformed UTF-8 character (unexpected end of string) in subst"...) called at /var/www/data/Foswiki-2.0.1-RC1/lib/ line 15
Assert::__ANON__("Malformed UTF-8 character (unexpected end of string) in subst"...) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 876
Foswiki::Render::renderRevisionInfo(Foswiki::Render=HASH(0x3c80528), Foswiki::Meta=HASH(0x4046460), 1, "\$wikiname") called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 970
Foswiki::Search::__ANON__("wikiname", Foswiki::Meta=HASH(0x4046460), undef) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 1224
Foswiki::Search::formatResult(Foswiki::Search=HASH(0x3e22ac0), "
\x{a}"..., Foswiki::Meta=HASH(0x4046460), "-- MikeM\x{fc}ller - 27 Jul 2015", HASH(0x3c66678), HASH(0x3c66ea0), HASH(0x3c71dd8)) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 1054
Foswiki::Search::formatResults(Foswiki::Search=HASH(0x3e22ac0), Foswiki::Search::Node=HASH(0x3fbbb28), Foswiki::Iterator::PagerIterator=HASH(0x405f568), HASH(0x3f20450)) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 437
Foswiki::Search::searchWeb(Foswiki::Search=HASH(0x3e22ac0), "nonoise", "on", "_RAW", "\x{a}\x{9}\".*\"\x{a}\x{9}type=\"regex\"\x{a}\x{9}nonoise=\"on\"\x{a}\x{9}order=\"modified\"\x{a}\x{9}reverse"..., "_DEFAULT", ".*", "pager", "on", ...) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/Macros/ line 39
Foswiki::__ANON__() called at /usr/share/perl5/ line 416
eval {...} called at /usr/share/perl5/ line 408
Error::subs::try(CODE(0x3e06cb8), HASH(0x39b2070)) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/Macros/ line 60
Foswiki::SEARCH(Foswiki=HASH(0x29bfec0), Foswiki::Attrs=HASH(0x3e35f00), Foswiki::Meta=HASH(0x361ba30)) called at /var/www/data/Foswiki-2.0.1-RC1/lib/ line 3435
Foswiki::_expandMacroOnTopicRendering(Foswiki=HASH(0x29bfec0), "SEARCH", "\x{a}\x{9}\".*\"\x{a}\x{9}type=\"regex\"\x{a}\x{9}nonoise=\"on\"\x{a}\x{9}order=\"modified\"\x{a}\x{9}reverse"..., Foswiki::Meta=HASH(0x361ba30)) called at /var/www/data/Foswiki-2.0.1-RC1/lib/ line 3297
Foswiki::_processMacros(Foswiki=HASH(0x29bfec0), "---+!! %MAKETEXT{\"Welcome to the [_1] web\" args=\"%WEB%\"}"..., CODE(0x284b270), Foswiki::Meta=HASH(0x361ba30), 16) called at /var/www/data/Foswiki-2.0.1-RC1/lib/ line 3094
Foswiki::innerExpandMacros(Foswiki=HASH(0x29bfec0), SCALAR(0x285e048), Foswiki::Meta=HASH(0x361ba30)) called at /var/www/data/Foswiki-2.0.1-RC1/lib/ line 3620
Foswiki::expandMacros(Foswiki=HASH(0x29bfec0), "---+!! %MAKETEXT{\"Welcome to the [_1] web\" args=\"%WEB%\"}"..., Foswiki::Meta=HASH(0x361ba30)) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 3353
Foswiki::Meta::expandMacros(Foswiki::Meta=HASH(0x361ba30), "---+!! %MAKETEXT{\"Welcome to the [_1] web\" args=\"%WEB%\"}"...) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/UI/ line 479
Foswiki::UI::View::_prepare("---+!! %MAKETEXT{\"Welcome to the [_1] web\" args=\"%WEB%\"}"..., Foswiki::Meta=HASH(0x361ba30), 0) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/UI/ line 459
Foswiki::UI::View::view(Foswiki=HASH(0x29bfec0)) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 374
Foswiki::UI::__ANON__() called at /usr/share/perl5/ line 416
eval {...} called at /usr/share/perl5/ line 408
Error::subs::try(CODE(0x18f29f8), HASH(0x29bf908)) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 500
Foswiki::UI::_execute(Foswiki::Request=HASH(0x28f3430), CODE(0x213f7e8), "view", 1) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/ line 326
Foswiki::UI::handleRequest(Foswiki::Request=HASH(0x28f3430)) called at /var/www/data/Foswiki-2.0.1-RC1/lib/Foswiki/Engine/ line 99
Foswiki::Engine::CGI::run(Foswiki::Engine::CGI=HASH(0x2254c90)) called at view line 29.

-- GeorgeClark - 27 Jul 2015

Maybe a red herring, but I'm wondering if there is something happening in the conversion of Wikiname to cUID and back to Wikiname.

-- GeorgeClark - 27 Jul 2015

The following fix seemed to make everything work fine locally here. Bogus fix:
diff --git a/TopicUserMappingContrib/lib/Foswiki/Users/ b/TopicUserMappingContrib/lib/Foswiki/Users/
index bd7ebad..fa305ff 100755
--- a/TopicUserMappingContrib/lib/Foswiki/Users/
+++ b/TopicUserMappingContrib/lib/Foswiki/Users/
@@ -201,12 +201,9 @@ sub getLoginName {
     # Remove the mapping id in case this is a subclass
     $login =~ s/$this->{mapping_id}// if $this->{mapping_id};
-    use bytes;
     # Reverse the encoding used to generate cUIDs in login2cUID
-    # use bytes to ignore character encoding
     $login =~ s/_([0-9a-f][0-9a-f])/chr(hex($1))/gei;
-    no bytes;
     return unless _userReallyExists( $this, $login );
     return unless ( $cUID eq $this->login2cUID($login) );
diff --git a/core/lib/Foswiki/ b/core/lib/Foswiki/
index ec70dd3..3327b70 100644
--- a/core/lib/Foswiki/
+++ b/core/lib/Foswiki/
@@ -842,9 +842,7 @@ m/\$(?:year|ye|wikiusername|wikiname|week|we|web|wday|username|tz|topic|time|sec
                 $user =~ s/^[A-Z][A-Za-z]+Mapping_//;
                 #and then xform any escaped chars.
-                use bytes;
                 $user =~ s/_([0-9a-f][0-9a-f])/chr(hex($1))/ge;
-                no bytes;
             $wun ||= $user;
             $wn  ||= $user;
diff --git a/core/lib/Foswiki/ b/core/lib/Foswiki/
index 77f892e..378ab63 100644
--- a/core/lib/Foswiki/
+++ b/core/lib/Foswiki/
@@ -401,11 +401,7 @@ sub mapLogin2cUID {
     my $cUID = shift;
     ASSERT( defined($cUID) ) if DEBUG;
-    # use bytes to ignore character encoding
-    use bytes;
     $cUID =~ s/([^a-zA-Z0-9])/'_'.sprintf('%02x', ord($1))/ge;
-    no bytes;
     return $cUID;
-- GeorgeClark - 27 Jul 2015

I would use:
s/(\P{ascii})/sprintf "_%06x", unpack "U0U*", $1/ge;
and back with
s/_(\p{hex}{6})/pack "U0U*", hex($1)/ge;

-- JozefMojzis - 27 Jul 2015
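
To make the proposal above concrete, here is a minimal sketch of the same idea in Python (used only for illustration; the Foswiki code is Perl). The function names are hypothetical. Each non-ASCII character is replaced by `_` plus its code point as six hex digits, and the reverse substitution turns six hex digits back into one character.

```python
import re


def escape_codepoints(text: str) -> str:
    """Replace each non-ASCII character with _ plus its code point as 6 hex digits."""
    return re.sub(r"[^\x00-\x7f]", lambda m: "_%06x" % ord(m.group()), text)


def unescape_codepoints(text: str) -> str:
    """Reverse escape_codepoints: _ plus 6 hex digits becomes one character."""
    return re.sub(r"_([0-9a-fA-F]{6})", lambda m: chr(int(m.group(1), 16)), text)


sample = "A\U0001d5bcB"  # the A𝖼B example, with U+1D5BC written as an escape
escaped = escape_codepoints(sample)
print(escaped)
print(unescape_codepoints(escaped) == sample)
```

Like the two substitutions it mirrors, this sketch is ambiguous for plain ASCII input that already contains an underscore followed by six hex digits.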

After the IRC discussion with gac410: the above isn't correct. We need to encode each Unicode character as a sequence of _xx (2 hex digit) escapes. E.g. for high Unicode characters like A𝖼B, we should get not "A_01d5bcB", as the above produces, but A_01_d5_bcB.

-- JozefMojzis - 27 Jul 2015

And I tested JoeA𝖼B as a WikiName in registration with my patch above. It works, but the "Hi [WikiName]" link in the left bar is corrupt. It shows: Hi, Joe A_1d 5bc B

-- GeorgeClark - 27 Jul 2015

The "use bytes" is definitely correct. It should encode the UNICODE "ü" to _c3_bc. Proof:
perl -CSDA -e '$x=$ARGV[0]; use bytes; print $x =~ s/([^a-zA-Z0-9])/"_".sprintf("%02x", ord($1))/ger;' "AüB"
produces A_c3_bcB

-- JozefMojzis - 27 Jul 2015
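
The byte-wise escaping that the one-liner demonstrates can be sketched as a round trip. This is an illustration in Python rather than the Foswiki Perl code; `login_to_cuid` and `cuid_to_login` are hypothetical names, and the sketch assumes the login is first encoded as UTF-8 so that each byte gets its own `_xx` escape.

```python
import re


def login_to_cuid(login: str) -> str:
    """Escape every UTF-8 byte outside [a-zA-Z0-9] as _xx."""
    out = []
    for byte in login.encode("utf-8"):
        ch = chr(byte)
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            out.append("_%02x" % byte)
    return "".join(out)


def cuid_to_login(cuid: str) -> str:
    """Turn each _xx escape back into a byte, then decode the bytes as UTF-8."""
    raw = re.sub(
        rb"_([0-9a-fA-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        cuid.encode("ascii"),
    )
    return raw.decode("utf-8")


cuid = login_to_cuid("A\u00fcB")  # AüB
print(cuid)
print(cuid_to_login(cuid) == "A\u00fcB")
```

The round trip only holds if both directions agree on the byte encoding, which is the property the rest of this discussion turns on.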

IMHO the source of the problem is that the ü got converted to _fc (i.e. ISO-8859-1) and not to _c3_bc (UTF-8). Somewhere Perl didn't treat the ü as Unicode but as ISO-8859-1 (we aren't under "use feature qw(unicode_strings);").

-- JozefMojzis - 27 Jul 2015
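
The difference between the two encodings, and why a stray _fc causes trouble later, can be reproduced outside Foswiki. The following Python sketch (illustrative only; the helper names are hypothetical) escapes the same name from its ISO-8859-1 bytes and from its UTF-8 bytes, then tries to read both results back as UTF-8.

```python
import re
from typing import Optional


def escape_bytes(raw: bytes) -> str:
    """Escape every byte outside [a-zA-Z0-9] as _xx."""
    return "".join(
        chr(b) if chr(b).isascii() and chr(b).isalnum() else "_%02x" % b
        for b in raw
    )


def unescape_as_utf8(cuid: str) -> Optional[str]:
    """Turn _xx escapes back into bytes and decode them as UTF-8.

    Returns None when the resulting bytes are not valid UTF-8.
    """
    raw = re.sub(
        rb"_([0-9a-fA-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        cuid.encode("ascii"),
    )
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


name = "MikeM\u00fcller"  # MikeMüller
from_utf8 = escape_bytes(name.encode("utf-8"))
from_latin1 = escape_bytes(name.encode("latin-1"))

print(from_utf8)
print(from_latin1)
print(unescape_as_utf8(from_utf8))
print(unescape_as_utf8(from_latin1))
```

In this sketch the ISO-8859-1 variant unescapes to a lone 0xfc byte, which is not valid UTF-8. That is consistent with the "Malformed UTF-8 character" errors in the log above, though the sketch does not show where in Foswiki the wrong encoding is introduced.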

I've confirmed locally. In the three modules I touched in my bogus fix above, I added: use feature 'unicode_strings'; and everything seems to work just fine.

-- Main.GeorgeClark - 27 Jul 2015 - 23:32

No... not confirmed. The unicode_strings feature fixes up the true utf8 name JoeA𝖼B, but using an umlaut in a name still ends up with crashes in search.

-- GeorgeClark - 27 Jul 2015

Jomo suggested that we edit the .txt file containing the cUID, changing _fc (non-utf8 umlaut) to _c3_bc ... and that fixes the crash. So why is _fc getting into the utf8 internals?

-- GeorgeClark - 28 Jul 2015

IMHO(!) the source of the problem is the session storage, CGI::Session. It uses Data::Dumper serialization by default, and the AUTHUSER (now Unicode) is stored there. But when it is read back, it isn't interpreted as Unicode (like our LSC before "use utf8;" was added at its top), so the AUTHUSER gets interpreted as ISO-8859-1.

Two different solutions.

1. Replace Data::Dumper with Storable (also a core module):
diff --git a/core/lib/Foswiki/ b/core/lib/Foswiki/
index fe1fa2e..4e4f010 100644
--- a/core/lib/Foswiki/
+++ b/core/lib/Foswiki/
@@ -1096,7 +1096,7 @@ sub _loadCreateCGISession {
           oct(777) );
     my $newsess = Foswiki::LoginManager::Session->new(
-        undef, $sid,
+        "driver:File;serializer:Storable", $sid,
             Directory => $sessionDir,
             UMask     => $Foswiki::cfg{Session}{filePermission}
2. Or upgrade the AUTHUSER to utf8 after the read:
diff --git a/core/lib/Foswiki/ b/core/lib/Foswiki/
index fe1fa2e..a7c0857 100644
--- a/core/lib/Foswiki/
+++ b/core/lib/Foswiki/
@@ -367,6 +367,8 @@ sub loadSession {
             my $sessionUser = Foswiki::Sandbox::untaintUnchecked(
                 $this->{_cgisession}->param('AUTHUSER') );
+            utf8::upgrade($sessionUser) if $sessionUser;
             _trace( $this, "AUTHUSER from session is $sessionUser" )
               if defined $sessionUser;

Maybe CDot will come up with some better solution.

-- JozefMojzis - 28 Jul 2015

Also, none of the above has been thoroughly tested.

-- JozefMojzis - 28 Jul 2015

FIX WORKS! Changing the session file to Storable works fine. As he mentioned on IRC, unless we use Storable, the upgrade would have to be done for each variable stored in the session.

CDot, not checking this one in. Needs your review. Thanks

-- GeorgeClark - 29 Jul 2015

Patch file attached.
  • Changes the session serializer to Storable.
  • Rewrites the session removal code to use the CGI::Session::find() function instead of parsing files ourselves.

Unit tests all pass.

-- GeorgeClark - 29 Jul 2015

Looks OK to me but shouldn't you report that bug upstream to CGI::Session?

-- Main.CrawfordCurrie - 29 Jul 2015 - 10:54

If someone needs the content of the binary session file (Storable) in the old Data::Dumper format, they can use the following one-liner:
perl -MPath::Tiny -MStorable=thaw -MData::Dumper -E 'say Dumper(thaw(path($_)->slurp_raw)) for @ARGV'
It dumps the content of the Storable session files given as arguments, e.g. for the example session file:
perl -MPath::Tiny -MStorable=thaw -MData::Dumper -E 'say Dumper(thaw(path($_)->slurp_raw)) for @ARGV' ./working/tmp/cgisess_6369bdd5d9263e534d9f4a21d96ed1ba
$VAR1 = {
          '_SESSION_ID' => '6369bdd5d9263e534d9f4a21d96ed1ba',
          'FOSWIKISTRIKEONE' => '817fe30837d89f7a6031cb31286356db',
          '_SESSION_ATIME' => 1449760757,
          '_SESSION_REMOTE_ADDR' => '',
          'VALID_ACTIONS' => {
                               '5eaded1d6a8eca9d22431b56de726970' => 1449764357,
                               'c62ed6bf3229bc4db767ecef24447aaa' => 1449764357
                             },
          '_SESSION_CTIME' => 1449760757
        };

If you need exactly the original format, the following dumps the Storable data exactly as it was in the old Dumper format:
perl -MPath::Tiny -MStorable=thaw -MData::Dumper -E 'say Data::Dumper->new([thaw(path($_)->slurp_raw)],[q{D}])->Indent(0)->Purity(1)->Useqq(0)->Deepcopy(0)->Quotekeys(1)->Terse(0)->Dump . q{;$D} for @ARGV'
It prints, for example:
$D = {'_SESSION_ATIME' => 1449760757,'_SESSION_REMOTE_ADDR' => '','VALID_ACTIONS' => {'c62ed6bf3229bc4db767ecef24447aaa' => 1449764357,'5eaded1d6a8eca9d22431b56de726970' => 1449764357},'FOSWIKISTRIKEONE' => '817fe30837d89f7a6031cb31286356db','_SESSION_ID' => '6369bdd5d9263e534d9f4a21d96ed1ba','_SESSION_CTIME' => 1449760757};;$D
To extract a single value, e.g. _SESSION_ATIME, it is possible to use:
perl -MPath::Tiny -MStorable=thaw -MData::Dumper -E 'say thaw(path($_)->slurp_raw)->{_SESSION_ATIME} for @ARGV'

-- Main.JozefMojzis - 10 Dec 2015 - 15:54


Summary Crashes in Search, and for cUID mapped from utf8 WikiName
ReportedBy GeorgeClark
Codebase 2.0.0, trunk
SVN Range
AppliesTo Engine
Component I18N, SEARCH, UserMapping
Priority Urgent
CurrentState Closed
Checkins distro:8eeffefcc1bd
TargetRelease patch
ReleasedIn 2.0.1
CheckinsOnBranches master Item13525
masterCheckins distro:8eeffefcc1bd
ItemBranchCheckins distro:8eeffefcc1bd
Topic attachments
Attachment: utf8-session-support.patch
Size: 2 K
Date: 29 Jul 2015 - 03:02
Who: GeorgeClark
Comment: Patch to LoginManager to convert to Storable, and implement cleanup using CGI::Session::find()
Topic revision: r16 - 10 Dec 2015, JozefMojzis - This page was cached on 23 Jan 2017 - 08:26.

The copyright of the content on this website is held by the contributing authors, except where stated elsewhere. See Copyright Statement. Creative Commons License