NYTProf Performance Profile   « line view »
For ./view
  Run on Fri Jul 31 18:42:36 2015
Reported on Fri Jul 31 18:48:16 2015

Filename: /var/www/foswikidev/core/lib/Foswiki/Query/HoistREs.pm
Statements: Executed 1693 statements in 4.87ms
Subroutines
Calls  P  F  Exclusive Time  Inclusive Time  Subroutine
   40  1  1          1.06ms          1.86ms  Foswiki::Query::HoistREs::_hoistEQ
   40  1  1           668µs          3.18ms  Foswiki::Query::HoistREs::hoist
   40  1  1           542µs           542µs  Foswiki::Query::HoistREs::_hoistDOT
   40  1  1           355µs          2.51ms  Foswiki::Query::HoistREs::_hoistAND
   40  1  1           302µs          2.16ms  Foswiki::Query::HoistREs::_hoistOR
   80  2  1           253µs           253µs  Foswiki::Query::HoistREs::_hoistConstant
    1  1  1            13µs            25µs  Foswiki::Query::HoistREs::BEGIN@52
    1  1  1            10µs            56µs  Foswiki::Query::HoistREs::BEGIN@58
    1  1  1             9µs            13µs  Foswiki::Query::HoistREs::BEGIN@53
    1  1  1             4µs             4µs  Foswiki::Query::HoistREs::BEGIN@55
    1  1  1             3µs             3µs  Foswiki::Query::HoistREs::BEGIN@56
    0  0  0              0s              0s  Foswiki::Query::HoistREs::_monTerm
    0  0  0              0s              0s  Foswiki::Query::HoistREs::_monitor
Line | Statements | Time on line | Calls | Time in subs | Code
1# See bottom of file for license and copyright information
2
3=begin TML
4
5---+ package Foswiki::Query::HoistREs
6
7Static functions to extract regular expressions from queries. The REs can
8be used in caching stores that use the Foswiki standard inline meta-data
9representation to pre-filter topic lists for more efficient query matching.
10
11See =Store/QueryAlgorithms/BruteForce.pm= for an example of usage.
12
13Note that this hoisting is very crude. At this point of time the
14functions don't attempt to do anything complicated, like re-ordering
15the query. They simply hoist up expressions on either side of an AND,
16where the expressions apply to a single domain.
17
18The ideal would be to rewrite the query for AND/OR evaluation i.e. an
19expression of the form (A and B) or (C and D). However this is
20complicated by the fact that there are three search domains (the web
21name, the topic name, and the topic text) that may be freely
22intermixed in the query, but cannot be mixed in the generated search
23expressions. The problem becomes one of rewriting the query to
24separate these three sets. For example, a query such as:
25
26name='Topic' OR Field='maes' OR web='Trash'
27
28requires three searches. We have to filter on name='Topic', and
29separately filter on Field='maes' and then union the sets.
30
31This gets complicated when the sets are intermixed; for example,
32
33(name='Topic' OR Field='maes') AND (web='Trash' OR Maes="field")
34
35Because the Field= terms on each side of the AND could potentially
36match any topic, we can't usefully hoist the name= or web= sub-terms.
37We can, however, hoist the Field subqueries. Now, what happens when we
38have an expression like this?
39
40(name='Topic' OR Field='maes') AND (web='Trash')
41
42Obviously we can pre-filter on the web='Trash' term, but we can't
43filter on name="Topic" because it is part of an OR.
44
45If you think I'm making this too complicated, please feel free to
46implement your own superior heuristics!
47
48=cut
49
50package Foswiki::Query::HoistREs;
51
52226µs238µs
# spent 25µs (13+12) within Foswiki::Query::HoistREs::BEGIN@52 which was called: # once (13µs+12µs) by Foswiki::Store::Interfaces::QueryAlgorithm::BEGIN@17 at line 52
use strict;
# spent 25µs making 1 call to Foswiki::Query::HoistREs::BEGIN@52 # spent 12µs making 1 call to strict::import
53222µs217µs
# spent 13µs (9+4) within Foswiki::Query::HoistREs::BEGIN@53 which was called: # once (9µs+4µs) by Foswiki::Store::Interfaces::QueryAlgorithm::BEGIN@17 at line 53
use warnings;
# spent 13µs making 1 call to Foswiki::Query::HoistREs::BEGIN@53 # spent 4µs making 1 call to warnings::import
54
55222µs14µs
# spent 4µs within Foswiki::Query::HoistREs::BEGIN@55 which was called: # once (4µs+0s) by Foswiki::Store::Interfaces::QueryAlgorithm::BEGIN@17 at line 55
use Foswiki::Infix::Node ();
# spent 4µs making 1 call to Foswiki::Query::HoistREs::BEGIN@55
56222µs13µs
# spent 3µs within Foswiki::Query::HoistREs::BEGIN@56 which was called: # once (3µs+0s) by Foswiki::Store::Interfaces::QueryAlgorithm::BEGIN@17 at line 56
use Foswiki::Query::Node ();
# spent 3µs making 1 call to Foswiki::Query::HoistREs::BEGIN@56
57
5821.64ms2102µs
# spent 56µs (10+46) within Foswiki::Query::HoistREs::BEGIN@58 which was called: # once (10µs+46µs) by Foswiki::Store::Interfaces::QueryAlgorithm::BEGIN@17 at line 58
use constant MONITOR_HOIST => 0;
# spent 56µs making 1 call to Foswiki::Query::HoistREs::BEGIN@58 # spent 46µs making 1 call to constant::import
59
601300nsour $indent = 0;
61
62sub _monitor {
63 my @p = map { ref($_) ? $_->stringify() : $_ } @_;
64 print STDERR ( ' ' x $indent ) . join( ' ', @p ) . "\n";
65}
66
67=begin TML
68
69---++ StaticMethod hoist($query) -> \%regex_lists
70
71Main entry point for the hoister.
72
73Returns a hash where the keys are the aspects to be tested
74(web|name|text) and the AND terms represented as lists of regexes,
75each of which is one OR term.
76
77There are also keys named "(web|name|text)_source" where the list
78contains what the user entered for that term.
79
80=cut
81
82
# spent 3.18ms (668µs+2.51) within Foswiki::Query::HoistREs::hoist which was called 40 times, avg 80µs/call: # 40 times (668µs+2.51ms) by Foswiki::Store::QueryAlgorithms::BruteForce::_webQuery at line 111 of /var/www/foswikidev/core/lib/Foswiki/Store/QueryAlgorithms/BruteForce.pm, avg 80µs/call
sub hoist {
834017µs my $node = shift;
84409µs my %collation;
85
86 # Gather up all the terms applicable to a particular field
8740162µs402.51ms my @terms = _hoistAND($node);
# spent 2.51ms making 40 calls to Foswiki::Query::HoistREs::_hoistAND, avg 63µs/call
884075µs foreach my $term (@terms) {
894090µs push( @{ $collation{ $term->{field} } }, $term->{regex} );
9040127µs push( @{ $collation{ $term->{field} . '_source' } }, $term->{source} );
91 }
92
93 #use Data::Dumper;
94 #print STDERR "--- hoisted: ".Dumper(%collation)."\n" if MONITOR_HOIST;
9540165µs return \%collation;
96}
97
98# Used for MONITOR_HOIST
99sub _monTerm {
100 my $term = shift;
101 return "$term->{field} => /$term->{regex}/";
102}
103
104# Each collection object in the result contains the field the regex is for, a
105# regex string, and the source string that the user entered. e.g.
106# {
107# field => 'web|name|text',
108# regex => 'Web.*'
109# source => 'Web*'
110# }
111
# spent 2.51ms (355µs+2.16) within Foswiki::Query::HoistREs::_hoistAND which was called 40 times, avg 63µs/call: # 40 times (355µs+2.16ms) by Foswiki::Query::HoistREs::hoist at line 87, avg 63µs/call
sub _hoistAND {
1124019µs my $node = shift;
113
1144039µs return () unless ref( $node->{op} );
115
1164033µs if ( $node->{op}->{name} eq '(' ) {
117 return _hoistAND( $node->{params}[0] );
118 }
119
1204031µs if ( $node->{op}->{name} eq 'and' ) {
121
122 # An 'and' conjunction yields a set of individual expressions,
123 # each of which must match the data
124 my @list = @{ $node->{params} };
125 $indent++;
126 my @collect = _hoistAND( shift(@list) );
127 while ( scalar(@list) ) {
128 my $term = _hoistOR( shift @list );
129 next unless $term;
130 push( @collect, $term );
131 }
132 $indent--;
133 _monitor( "hoistAND ", $node,
134 join( ', ', map { _monTerm($_) } @collect ) )
135 if MONITOR_HOIST;
136 return @collect;
137 }
138 else {
13940110µs402.16ms my $or = _hoistOR($node);
# spent 2.16ms making 40 calls to Foswiki::Query::HoistREs::_hoistOR, avg 54µs/call
14040104µs return ($or) if $or;
141 }
142
143 _monitor( "hoistAND ", $node, " FAILED" ) if MONITOR_HOIST;
144 return ();
145}
146
147# depth 1; we can handle a sequence of ORs, which we collapse into
148# a common regular expression when they apply to the same field.
149
# spent 2.16ms (302µs+1.86) within Foswiki::Query::HoistREs::_hoistOR which was called 40 times, avg 54µs/call: # 40 times (302µs+1.86ms) by Foswiki::Query::HoistREs::_hoistAND at line 139, avg 54µs/call
sub _hoistOR {
150409µs my $node = shift;
151
1524034µs return unless ref( $node->{op} );
153
1544037µs if ( $node->{op}->{name} eq '(' ) {
155 return _hoistOR( $node->{params}[0] );
156 }
157
1584032µs if ( $node->{op}->{name} eq 'or' ) {
159 my @list = @{ $node->{params} };
160 $indent++;
161 my %collection;
162 while ( scalar(@list) ) {
163 my $term = _hoistEQ( shift(@list) );
164
165 # If we fail to hoist the subexpression then it can't
166 # be expressed using simple regexes. In this event we can't
167 # account for this term in a top-level and, so we have
168 # to abort the entire hoist.
169 unless ($term) {
170 %collection = ();
171 last;
172 }
173 my $collect = $collection{ $term->{field} };
174 if ($collect) {
175
176 # Combine with previous
177 $collect->{regex} .= '|' . $term->{regex};
178 $collect->{source} .= ',' . $term->{source};
179 }
180 else {
181 $collection{ $term->{field} } = $term;
182 }
183 }
184 $indent--;
185 _monitor( "hoistOR ", $node,
186 join( ', ', map { _monTerm($_) } values %collection ) )
187 if MONITOR_HOIST;
188
189 # At this point we have collected terms for all the domains, and
190 # if there is only one we can just return it. However if the
191 # expression involved more than one domain, we have a "mixed or"
192 # and we can't hoist.
193 if ( scalar( keys %collection ) == 1 ) {
194 return ( values(%collection) )[0];
195 }
196 }
197 else {
19840189µs401.86ms return _hoistEQ($node);
# spent 1.86ms making 40 calls to Foswiki::Query::HoistREs::_hoistEQ, avg 46µs/call
199 }
200
201 _monitor( "hoistOR ", $node, " FAILED" ) if MONITOR_HOIST;
202 return;
203}
204
2051500nsour $PHOLD = "\000RHS\001";
206
207# depth 2: can handle = and ~ expressions
208
# spent 1.86ms (1.06+795µs) within Foswiki::Query::HoistREs::_hoistEQ which was called 40 times, avg 46µs/call: # 40 times (1.06ms+795µs) by Foswiki::Query::HoistREs::_hoistOR at line 198, avg 46µs/call
sub _hoistEQ {
2094040µs my $node = shift;
210
2114038µs return unless ref( $node->{op} );
212
2134027µs if ( $node->{op}->{name} eq '(' ) {
214 return _hoistEQ( $node->{params}[0] );
215 }
216
217 # $PHOLD is a placeholder for the RHS term in the regex
2184041µs if ( $node->{op}->{name} eq '=' ) {
2194050µs $indent++;
22040145µs40542µs my $lhs = _hoistDOT( $node->{params}[0] );
# spent 542µs making 40 calls to Foswiki::Query::HoistREs::_hoistDOT, avg 14µs/call
2214084µs40179µs my $rhs = _hoistConstant( $node->{params}[1] );
# spent 179µs making 40 calls to Foswiki::Query::HoistREs::_hoistConstant, avg 4µs/call
2224013µs $indent--;
2234021µs if ( $lhs && defined $rhs ) {
2244045µs $rhs = quotemeta($rhs);
22540204µs $lhs->{regex} =~ s/$PHOLD/$rhs/g;
2264093µs4074µs $lhs->{source} = _hoistConstant( $node->{params}[1] );
# spent 74µs making 40 calls to Foswiki::Query::HoistREs::_hoistConstant, avg 2µs/call
227 _monitor( "hoistEQ ", $node, " =>" ) if MONITOR_HOIST;
22840112µs return $lhs;
229 }
230
231 # = is symmetric, so try the other order
232 $indent++;
233 $lhs = _hoistDOT( $node->{params}[1] );
234 $rhs = _hoistConstant( $node->{params}[0] );
235 $indent--;
236 if ( $lhs && defined $rhs ) {
237 $rhs = quotemeta($rhs);
238 $lhs->{regex} =~ s/$PHOLD/$rhs/g;
239 $lhs->{source} = _hoistConstant( $node->{params}[0] );
240 _monitor( "hoistEQ ", $node, " <=" )
241 if MONITOR_HOIST;
242 return $lhs;
243 }
244 }
245 elsif ( $node->{op}->{name} eq '~' ) {
246 $indent++;
247 my $lhs = _hoistDOT( $node->{params}[0] );
248 my $rhs = _hoistConstant( $node->{params}[1] );
249 $indent--;
250 if ( $lhs && defined $rhs ) {
251 $rhs = quotemeta($rhs);
252 $rhs =~ s/\\\?/./g;
253 $rhs =~ s/\\\*/.*/g;
254 $lhs->{regex} =~ s/$PHOLD/$rhs/g;
255 $lhs->{source} = _hoistConstant( $node->{params}[1] );
256 _monitor( "hoistEQ ", $node, " ~" )
257 if MONITOR_HOIST;
258 return $lhs;
259 }
260 }
261 elsif ( $node->{op}->{name} eq '=~' ) {
262 $indent++;
263 my $lhs = _hoistDOT( $node->{params}[0] );
264 my $rhs = _hoistConstant( $node->{params}[1] );
265 $indent--;
266 if ( $lhs && defined $rhs ) {
267
268# Need to detect whether it's a field or in the text; if it's a field, remove the ^$ chars,
269# or if there are no ^$, add .*'s if they are not present
270 if ( $lhs->{regex} ne $PHOLD ) {
271 if ( ( not( $rhs =~ m/^\^/ ) )
272 and ( not( $rhs =~ m/^\.\*/ ) ) )
273 {
274 $rhs = '.*' . $rhs;
275 }
276
277 if ( ( not( $rhs =~ m/\$$/ ) )
278 and ( not( $rhs =~ m/\.\*$/ ) ) )
279 {
280 $rhs = $rhs . '.*';
281 }
282
283 #if we're embedding the regex into another, then remove the ^'s
284 $rhs =~ s/^\^//;
285 $rhs =~ s/\$$//;
286 }
287 $lhs->{regex} =~ s/$PHOLD/$rhs/g;
288 $lhs->{source} = _hoistConstant( $node->{params}[1] );
289 _monitor( "hoistEQ ", $node, " =~" )
290 if MONITOR_HOIST;
291 return $lhs;
292 }
293 }
294
295 _monitor( "hoistEQ ", $node, " FAILED" ) if MONITOR_HOIST;
296 return;
297}
298
299# Expecting a (root level) field access expression. This must be of the form
300# <name>
301# or
302# <rootfield>.<name>
303# <rootfield> may be aliased
304
# spent 542µs within Foswiki::Query::HoistREs::_hoistDOT which was called 40 times, avg 14µs/call: # 40 times (542µs+0s) by Foswiki::Query::HoistREs::_hoistEQ at line 220, avg 14µs/call
sub _hoistDOT {
3054020µs my $node = shift;
306
3074028µs if ( ref( $node->{op} ) && $node->{op}->{name} eq '(' ) {
308 return _hoistDOT( $node->{params}[0] );
309 }
310
31140101µs if ( ref( $node->{op} ) && $node->{op}->{name} eq '.' ) {
312 my $lhs = $node->{params}[0];
313 my $rhs = $node->{params}[1];
314 if ( !ref( $lhs->{op} )
315 && !ref( $rhs->{op} )
316 && $lhs->{op} eq Foswiki::Infix::Node::NAME
317 && $rhs->{op} eq Foswiki::Infix::Node::NAME )
318 {
319 $lhs = $lhs->{params}[0];
320 $rhs = $rhs->{params}[0];
321 if ( $Foswiki::Query::Node::aliases{$lhs} ) {
322 $lhs = $Foswiki::Query::Node::aliases{$lhs};
323 }
324 if ( $lhs =~ m/^META:/ ) {
325
326 _monitor( "hoist DOT ", $node, " => $rhs" )
327 if MONITOR_HOIST;
328
329 # $PHOLD is a placeholder for the RHS term
330 return {
331 field => 'text',
332 regex => '^%'
333 . $lhs
334 . '\\{.*\\b'
335 . $rhs
336 . "=\\\"$PHOLD\\\""
337 };
338 }
339
340 # Otherwise assume the term before the dot is the form name
341 if ( $rhs eq 'text' ) {
342
343 _monitor( "hoist DOT ", $node, " => formname" )
344 if MONITOR_HOIST;
345
346 # Special case for the text body
347 return { field => 'text', regex => $PHOLD };
348 }
349 else {
350 _monitor( "hoist DOT ", $node, " => fieldname" )
351 if MONITOR_HOIST;
352 return {
353 field => 'text',
354 regex =>
355"^%META:FIELD\\{name=\\\"$rhs\\\".*\\bvalue=\\\"$PHOLD\\\""
356 };
357 }
358
359 }
360 }
361 elsif ( !ref( $node->{op} ) && $node->{op} eq Foswiki::Infix::Node::NAME ) {
3624083µs if ( $node->{params}[0] eq 'name' ) {
363
364 # Special case for the topic name
365 _monitor( "hoist DOT ", $node, " => topic" )
366 if MONITOR_HOIST;
367 return { field => 'name', regex => $PHOLD };
368 }
369 elsif ( $node->{params}[0] eq 'web' ) {
370
371 # Special case for the web name
372 _monitor( "hoist DOT ", $node, " => web" )
373 if MONITOR_HOIST;
374 return { field => 'web', regex => $PHOLD };
375 }
376 elsif ( $node->{params}[0] eq 'text' ) {
377
378 # Special case for the text body
379 _monitor( "hoist DOT ", $node, " => text" )
380 if MONITOR_HOIST;
381 return { field => 'text', regex => $PHOLD };
382 }
383 else {
384 _monitor( "hoist DOT ", $node, " => field" )
385 if MONITOR_HOIST;
386 return {
38740347µs field => 'text',
388 regex =>
389"^%META:FIELD\\{name=\\\"$node->{params}[0]\\\".*\\bvalue=\\\"$PHOLD\\\""
390 };
391 }
392 }
393
394 _monitor( "hoistDOT ", $node, " FAILED" ) if MONITOR_HOIST;
395 return;
396}
397
398# Expecting a constant
399
# spent 253µs within Foswiki::Query::HoistREs::_hoistConstant which was called 80 times, avg 3µs/call: # 40 times (179µs+0s) by Foswiki::Query::HoistREs::_hoistEQ at line 221, avg 4µs/call # 40 times (74µs+0s) by Foswiki::Query::HoistREs::_hoistEQ at line 226, avg 2µs/call
sub _hoistConstant {
4008029µs my $node = shift;
401
40280105µs if (
403 !ref( $node->{op} )
404 && ( $node->{op} eq Foswiki::Infix::Node::STRING
405 || $node->{op} eq Foswiki::Infix::Node::NUMBER )
406 )
407 {
408 _monitor( "hoist CONST ", $node, " => $node->{params}[0]" )
409 if MONITOR_HOIST;
41080227µs return $node->{params}[0];
411 }
412 return;
413}
414
41513µs1;
416__END__
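
The listing above translates '~' wildcard patterns into regexes and collapses same-field OR terms into one alternation. The following is a minimal standalone sketch of those two steps, not the module's own code. The names wildcard_to_regex and merge_or_terms are hypothetical helpers introduced here for illustration; they mirror the '~' branch of _hoistEQ and the merge loop of _hoistOR, and the sample terms are built by hand rather than parsed from a real query.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a '~' wildcard pattern into a regex fragment,
# following the same three steps as the '~' branch of _hoistEQ.
sub wildcard_to_regex {
    my ($pattern) = @_;
    my $re = quotemeta($pattern);    # escape every non-word character
    $re =~ s/\\\?/./g;               # an escaped '?' becomes "any one character"
    $re =~ s/\\\*/.*/g;              # an escaped '*' becomes "any run of characters"
    return $re;
}

# Hypothetical helper: merge the terms of an OR by field, as _hoistOR does.
# Returns the single merged term, or nothing when the OR mixes domains.
sub merge_or_terms {
    my @terms = @_;
    my %by_field;
    foreach my $term (@terms) {
        my $seen = $by_field{ $term->{field} };
        if ($seen) {
            # Same field seen before: extend the alternation
            $seen->{regex}  .= '|' . $term->{regex};
            $seen->{source} .= ',' . $term->{source};
        }
        else {
            # Store a copy so the caller's hash is not modified
            $by_field{ $term->{field} } = { %{$term} };
        }
    }
    return unless scalar( keys %by_field ) == 1;    # "mixed or": cannot hoist
    return ( values %by_field )[0];
}

# Roughly what name~'Web*' OR name~'Test?' would contribute
my $merged = merge_or_terms(
    { field => 'name', regex => wildcard_to_regex('Web*'),  source => 'Web*' },
    { field => 'name', regex => wildcard_to_regex('Test?'), source => 'Test?' },
);

# prints: name: /Web.*|Test./ from 'Web*,Test?'
print "$merged->{field}: /$merged->{regex}/ from '$merged->{source}'\n";
```

Passing terms for two different fields (for example one 'name' term and one 'web' term) makes merge_or_terms return nothing, which corresponds to the "mixed or" case where the listing gives up on hoisting.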