Item13817: EditRowPlugin performance issues with tables > 100 rows.

pencil
Priority: Urgent
Current State: Closed
Released In: 2.0.3
Target Release: patch
Applies To: Extension
Component: EditRowPlugin
Branches: master
Reported By: JacobChamplin
Waiting For:
Last Change By: GeorgeClark
We just installed a 2.0.2 and tried to used the EditRowPlugin instead of the EditTablePlugin. However, our users complained that certain pages wouldn't render. Turns out is any table with > 100 rows, was taking so long to render that the users gave up. Going back to EditTablePlugin solved this problem. From best we can tell its not a certain # of rows, but EditRowPlugin seems to get exponentially slower with each row added.

System is OpenSuSE 13.1. 32 cores, 32GB RAM. Perl 5.18.1 . I don't think its system speed. It also doesn't appear to be the data in the table, we have tried with one character per cell.

-- JacobChamplin - 14 Oct 2015

Confirmed, marking urgent.
(master)$ time ./view topic=Litterbox.TestBigEditTable > /dev/null 

real    0m0.574s
user    0m0.544s
sys     0m0.028s

(master)$ time ./view topic=Litterbox.TestBigEditRow > /dev/null 

real    0m9.154s
user    0m9.028s
sys     0m0.136s

-- GeorgeClark - 15 Oct 2015

Here's what jumps out ... rendering a page with approx. 400 rows in a table.

# spent 4.83s (59.0ms+4.77) within Foswiki::Plugins::EditRowPlugin::Table::getParams which was called 3209 times, avg 1.50ms/call: 
# 3208 times (59.0ms+4.77s) by Foswiki::Plugins::EditRowPlugin::Editor::text::jQueryMetadata at line 31 of /var/www/foswiki/distro/core/lib/Foswiki/Plugins/EditRowPlugin/Editor/text.pm, avg 1.50ms/call 
# once (9µs+989µs) by Foswiki::Plugins::EditRowPlugin::TableCell::render at line 138 of /var/www/foswiki/distro/core/lib/Foswiki/Plugins/EditRowPlugin/TableCell.pm

getParams is calling Foswiki::Func::getRevisionInfo(), which re-reads the topic from Store, ... 3209 time.

That's about 1/2 of the total time, the other big component is the ControlWikiWordPlugin, almost all of it down in putBackprotected, in the regex substituting for the protected block markers. It looks like the regex scanning for markers is a huge inefficiency. Definitely better ways to put back blocks / protected.

The same page with EditTablePlugin enabled, ControlWikiWordPlugin is approx. 3.5ms, so there is some interaction going on there.

-- GeorgeClark - 15 Oct 2015

Table is 400 rows:
%EDITTABLE{}%   (or %EDITROW{}%)
| *A* | *B* | *C* | *D* | *E* | *F* | *G* | *H* |
| A | B | C | D | E | F | G | H |
| A | B | C | D | G | F | G | H |

-- GeorgeClark - 15 Oct 2015

Just wanted to throw in that the page we had trouble with also has 700 revisions. (Yes we have been running the same wiki since the Twiki days when the logo was actually Twiki from Battlestar Galactica. smile I hadn't given much thought to this being related to revisions.

-- JacobChamplin - 15 Oct 2015

Crawford, the following cache of the topic object seems to be a significant help, but I have no idea whether or not this is the right / best place to cache the topic object. Gave up on the cache approach. It seems as though the topicObject for the table being processed should have been loaded earlier, and passed as an object rather than avoiding the load by caching the data. I'm wondering instead of calling getWeb() and getTopic(), those should be replaced by a getTopicObject(), which would return a loaded topic object.

-- GeorgeClark - 16 Oct 2015

Yes, topicObject is already loaded. Just need to expose it and use the Meta::getRevisionInfo instead of Func. Much better. Unit tests pass.

This fix seems to work quite a bit better. Rendering time of a 400 row table
real    0m0.504s
user    0m0.476s
sys     0m0.028s

And the fix:
diff --git a/EditRowPlugin/lib/Foswiki/Plugins/EditRowPlugin/Table.pm b/EditRowPlugin/lib/Foswiki/Plugins/EditRowPlugin/Table.pm
index ca85055..e7f31b3 100644
--- a/EditRowPlugin/lib/Foswiki/Plugins/EditRowPlugin/Table.pm
+++ b/EditRowPlugin/lib/Foswiki/Plugins/EditRowPlugin/Table.pm
@@ -124,6 +124,11 @@ sub getTopic {
     return $this->{meta}->topic();
 }
 
+sub getTopicObject {
+    my $this = shift;
+    return $this->{meta};
+}
+
 # Calculate row labels
 sub _assignLabels {
     my $this  = shift;
@@ -455,11 +460,12 @@ sub getParams {
 
     $prefix ||= '';
 
-    # Get the active (most recent) version number for the topic with this table
-    my @ri = Foswiki::Func::getRevisionInfo( $this->getWeb, $this->getTopic );
+    my $meta = $this->getTopicObject();
+    my $info = $meta->getRevisionInfo();
+
     return (
         "${prefix}topic"   => $this->getWeb . '.' . $this->getTopic,
-        "${prefix}version" => "$ri[2]_$ri[0]",
+        "${prefix}version" => "$info->{version}_$info->{date}",
         "${prefix}table"   => $this->getID()
     );
 }

-- GeorgeClark - 16 Oct 2015

George, I took your patch this morning to try and see if it helps. It doesn't help my particular case. So I tried to follow in your foot steps and created your exact 400 row table. Ran the timings...

# EditTablePlugin
time ./view topic=Sandbox.JacobChamplinSandbox > /dev/null

real   0m3.237s
user   0m3.188s
sys   0m0.044s

# EditRowPlugin - Patched
time ./view topic=Sandbox.JacobChamplinSandbox > /dev/null
^C
real   7m33.246s
user   7m32.720s
sys   0m0.088s

As you can see I had to kill the command after 7 minutes. So we aren't comparing apples to apples. I then decided to run a strace, and my box hangs...

open("/srv/www/htdocs/wiki/data/Sandbox/JacobChamplinSandbox.txt", O_RDONLY) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7fffcd71f4c0) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(4, 0, SEEK_CUR)                   = 0
fstat(4, {st_mode=S_IFREG|0644, st_size=13815, ...}) = 0
fcntl(4, F_SETFD, FD_CLOEXEC)           = 0
fstat(4, {st_mode=S_IFREG|0644, st_size=13815, ...}) = 0
read(4, "%META:TOPICINFO{author=\"wjacobc\""..., 8192) = 8192
read(4, " F | G | H |\n| A | B | C | D | G"..., 8192) = 5623
read(4, "", 8192)                       = 0
close(4)                                = 0
stat("/srv/www/htdocs/wiki/data/Sandbox/JacobChamplinSandbox.txt,v", {st_mode=S_IFREG|0644, st_size=62061, ...}) = 0
mmap(NULL, 1794048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30e188000
mremap(0x7ff30e188000, 1794048, 2240512, MREMAP_MAYMOVE) = 0x7ff30df65000
mmap(NULL, 1794048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30e188000
mmap(NULL, 1794048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30ddaf000
mmap(NULL, 1794048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30dbf9000
munmap(0x7ff30ddaf000, 1794048)         = 0
mremap(0x7ff30dbf9000, 1794048, 2240512, MREMAP_MAYMOVE) = 0x7ff30dbf9000
brk(0x3d2e000)                          = 0x3d2e000
mmap(NULL, 2240512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30d9d6000
brk(0x3ee4000)                          = 0x3ee4000
brk(0x4099000)                          = 0x4099000
brk(0x424e000)                          = 0x424e000
brk(0x4404000)                          = 0x4404000
brk(0x45b9000)                          = 0x45b9000
brk(0x476e000)                          = 0x476e000
brk(0x4923000)                          = 0x4923000
brk(0x4ad9000)                          = 0x4ad9000
brk(0x4c8e000)                          = 0x4c8e000
mmap(NULL, 2240512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30d7b3000
brk(0x4923000)                          = 0x4923000
brk(0x4ad9000)                          = 0x4ad9000
brk(0x4c8e000)                          = 0x4c8e000
munmap(0x7ff30d7b3000, 2240512)         = 0
brk(0x4e43000)                          = 0x4e43000
brk(0x4ff8000)                          = 0x4ff8000
brk(0x51ae000)                          = 0x51ae000
brk(0x538f000)                          = 0x538f000
brk(0x5571000)                          = 0x5571000
mmap(NULL, 2465792, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30d77c000
brk(0x5752000)                          = 0x5752000
brk(0x5934000)                          = 0x5934000
mmap(NULL, 2469888, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30d521000
brk(0x5b17000)                          = 0x5b17000
mmap(NULL, 2469888, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30d2c6000
brk(0x5cfa000)                          = 0x5cfa000
mmap(NULL, 2473984, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30d06a000
brk(0x5d39000)                          = 0x5d39000
mmap(NULL, 2473984, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30ce0e000
brk(0x5752000)                          = 0x5752000
brk(0x5935000)                          = 0x5935000
mmap(NULL, 2473984, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff30cbb2000
munmap(0x7ff30ce0e000, 2473984)         = 0
brk(0x5b91000)                          = 0x5b91000
munmap(0x7ff30cbb2000, 2473984)         = 0
brk(0x5dec000)                          = 0x5dec000
brk(0x6030000)                          = 0x6030000
brk(0x6242000)                          = 0x6242000
^C--- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---

So it appears my box hangs on "munmap" with the EditRowPlugin, and the cpu stays pegged at 100% until I kill. So your patch looks good for performance, but I have something else going on here.

-- JacobChamplin - 16 Oct 2015

After much debugging this appears to be regular expression related. If we comment out Render.pm lines 350, and 352. Then the page seems now loads in 44s. If I comment out more regex in Render.pm like line 324 it goes down to 25s. So something is going on in my environment that is making these perl regex super slow, and the more text in the wiki page the slower it gets. We also took the value in the $text variable and put it into a file and ran the regex on line 350 against it from the command line and its instantaneous. So I think perl might be fine but something about that regex in the foswiki enviroment...

I came across this bug: https://rt.perl.org/Public/Bug/Display.html?id=66852

Which suggest that perl 5.18 may have a performance problem with regex over utf-8. George what version of perl are you running?

-- JacobChamplin - 16 Oct 2015

Last one I swear... My colleague wrote a test:

our $REMARKER = "\0";
my $text;
{
    local $/;
    open my $fh, '<', "/tmp/table.txt" or die "can't open file: $!";
    $text = <$fh>;
}
#utf8::upgrade($text);
$text =~ s/&(#x?[0-9a-f]+);/$REMARKER$1;/gi;    # "&#123;"
print $text;

This code on Perl 5.18.1 is 500x slower when the text is utf-8. I can't really try another perl version because of the OpenSuSE dependancy graph.

-- JacobChamplin - 16 Oct 2015

Hi Jacob, I run perlbrew locally, so I can test pretty much any version. Is there anything specific about /tmp/table.txt? I used that simple | a | b | c | ... table, 10,000 lines, and ran the above script. The difference is not significant. I moved to Benchmark.pl to compare performance:
  • a: Runs the above code with uft8::upgrade ... un-commented.
  • b: Runs the above code, with #utf8::upgrade... commented out.
$ perl test-utf8.pl 
v5.18.4
    Rate    a    b
a 75.9/s   -- -89%
b  708/s 834%   --
$ perlbrew switch 5.20.2
$ perl test-utf8.pl 
v5.20.2
    Rate    a    b
a 76.1/s   -- -89%
b  692/s 809%   --

So it shows a considerable difference between utf-8 and non-utf-8 performance, but the difference between perl 5.18.4 and Perl 5.20.2 is insignificant. By the way, perl 5.8.8 is much worse.
v5.8.8
    Rate      a      b
a 5.90/s     --   -99%
b  647/s 10863%     --

-- GeorgeClark - 16 Oct 2015

I have attached our table.txt. So the difference is its not just wiki syntax. We grabbed the $text variable contents from debugging Render.pm and put them into a text file for the test. The difference is that his appears it is parsing the HTML produced from the wiki text of the 400 row table. If I have time today I might setup perlbrew, maybe its our 5.18.1 version exactly. Maybe its some OpenSuSE patch to perl. Who knows, looks like this one is going to get deep fast. Thanks for your continued effort, even though it appears to be our environment.

-- JacobChamplin - 19 Oct 2015

I have run my perl version tests using our HTML table.txt file and get the following results...

#Perl 5.8.9 (Perlbrew)
> time perl ./test-utf8.pl > /dev/null 
real   0m0.050s
user   0m0.048s
sys     0m0.001s

#Perl 5.16.3 (Perlbrew)
> time perl ./test-utf8.pl > /dev/null 
real   0m0.031s
user   0m0.030s
sys     0m0.001s

# Perl 5.18.1 (OpenSuSE 13.1)
>time perl ./test-utf8.pl > /dev/null 
real   0m12.572s
user   0m12.560s
sys     0m0.008s

#Perl 5.18.4 (Perlbrew)
> time perl ./test-utf8.pl > /dev/null 
real   0m12.504s
user   0m12.498s
sys     0m0.001s

#Perl 5.20.3 (Perlbrew)
> time perl ./test-utf8.pl > /dev/null 
real   0m0.028s
user   0m0.025s
sys   0m0.003s

#Perl 5.23.3 (Perlbrew)
> time perl ./test-utf8.pl > /dev/null 
real   0m0.023s
user   0m0.023s
sys     0m0.000s

My conclusion remains that something is wrong with Perl 5.18.X with UTF-8, and these regular expressions.

-- JacobChamplin - 19 Oct 2015

The following patch to Foswiki::Render and EditRowPlugin greatly reduces the render time, though there are still significant issues. For a table with 400 rows and 8 columns
  • Perl 5.20.2: 0m1.97 seconds
  • Perl 5.18.4: 11m17.921s
  • Perl 5.18.4: 2m24.405s (patched)

The issue addressed below. ERP generates all HTML using double-quotes, and then has to entity-encode them. By changing generation to use single quotes, then we don't have to entity encode them, greatly reducing processing.

In table.txt file attached, there are 21,572 html entities, of which only 2496 are not double-quotes &#34; So this patch eliminates 19,076 entities from the rendered HTML.
diff --git a/EditRowPlugin/lib/Foswiki/Plugins/EditRowPlugin.pm b/EditRowPlugin/lib/Foswiki/Plugins/EditRowPlugin.pm
index f033aa4..793bb11 100644
--- a/EditRowPlugin/lib/Foswiki/Plugins/EditRowPlugin.pm
+++ b/EditRowPlugin/lib/Foswiki/Plugins/EditRowPlugin.pm
@@ -22,8 +22,8 @@ BEGIN {
                 # SMELL: make the sort conditional on DEBUG
                 foreach my $k ( sort keys %$attrs ) {
                     my $v = $attrs->{$k};
-                    $v =~ s/([&<>"\x8b\x9b'])/'&#'.ord($1).';'/ge;
-                    $html .= " $k=\"$v\"";
+                    $v =~ s/([&<>\x8b\x9b'])/'&#'.ord($1).';'/ge;
+                    $html .= " $k='$v'";
                 }
             }
             $innerHTML = '' unless defined $innerHTML;


diff --git a/core/lib/Foswiki/Render.pm b/core/lib/Foswiki/Render.pm
index abbc266..a8c7062 100644
--- a/core/lib/Foswiki/Render.pm
+++ b/core/lib/Foswiki/Render.pm
@@ -598,8 +598,8 @@ sub html {
             my $v = $attrs->{$k};
 
             # This is what CGI encodes, so....
-            $v =~ s/([&<>"\x8b\x9b'])/'&#'.ord($1).';'/ge;
-            $html .= " $k=\"$v\"";
+            $v =~ s/([&<>\x8b\x9b'])/'&#'.ord($1).';'/ge;
+            $html .= " $k='$v'";
         }
     }
     $innerHTML = '' unless defined $innerHTML;

-- GeorgeClark - 19 Oct 2015

Not sure what else I can do here. Marking task Waiting for Release.

-- GeorgeClark - 26 Oct 2015
 
I Attachment Action Size Date Who Comment
table.txttxt table.txt manage 377 K 19 Oct 2015 - 13:20 JacobChamplin  
Topic revision: r16 - 16 Nov 2015, GeorgeClark
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere. See Copyright Statement. Creative Commons License    Legal Imprint    Privacy Policy