Item2274: Optimise query searches with a constant expression (such as "1")
Priority: Normal
Current State: Closed
Released In: 1.1.0
Target Release: minor
Applies To: Engine
Component:
Branches:
To find topics whose name match a specific pattern, e.g. *Contrib, *AddOn,
a
SEARCH using a regex against the filenames is faster and more robust than a
SEARCH of type query using a bogus "1" query as it is used in
That's because every type-query opens and parses
every topic in the hit set parsing its meta data using a
$meta->reload()
and
$meta->getLoadedRev()
(see
Foswiki::QueryAlgorithms::BruteForce
). If the
getLoadedRev()
fails, the topic is removed from the hit set ... which is quite fragile actually.
Sven, any comments before I rework them to good'ol'
SEARCH{".*(Contrib|AddOn)$" type="regex" scope="topic"}?
--
MichaelDaum - 19 Oct 2009
yup. Don't do that.
What we want to do is make query search the faster algo - and '1' query is a nop.
specifically,
%SEARCH{
"1"
topic="*Skin"
scope="topic"
type="query"
means that type-query should simply list
all topics that are in the
topic
glob list..
if we're only talking about trunk, then definitely don't - I'll use this task to make sure its true - looking at the code, you're right, it goes through a very convoluted and pointless lot of loading - if you want to change it in the release branch (i've not looked hard at the code paths there) - go for it.
Similarly SEARCH{"name~'*Form'"
should also not need to load the Meta obj. - clearly, thankyou for raising this.
Mind you - I'm not sure how much it gains (i guess if you have alot of Plugins, but only list the first 2?), as the permissions check, sorting and formatting then goes and loads the meta anyway -
have you got some benchmarks that show it?
adding Crawford into the feedback list - as I'm not sure if there's any sense in listing a topic on which
getLoadedRev()
fails
--
SvenDowideit - 20 Oct 2009
Sorry, I don't have benchmarks. This specific query opens the file while it doesn't need so. That's obviously slower than just reading a directory object. Not sure if optimizing search-type-query for this special case cuts it. I wonder if WEBTOPICLIST has got an
include/exclude
pair of filter parameters.
I've found out about this because I was looking at my system's InstalledPlugins and was missing some contribs in the list that were obviously installed but not shown. The reason was that
Meta::load()
(or was it
getLoadedRev()
) failed on a TOPICINFO containing a
version="$Rev$"
. So
Foswiki::Store::QueryAlgorithms::BruteForce::query()
kicked it out of the result list. That's right at the point where all the topics get meta-loaded.
--
MichaelDaum - 20 Oct 2009
mmm, ok, so the reason you've reported this task is actually due to a bug in the query algo, that I suspect is actually there in the 1.0 release stream too -
getLoadedRev
looks like it fails on a dev setup - and the possible performance issue is a side issue.
I can confirm that in my trunk test with the benchmarking i started to set up to see if you had a case has the missing results issue
basically, i think that in the case where I use
"1"
, all the topics need to be loaded some time - it doesn't matter (yet) if thats in the query algo, in the security check, or during the format stage. Once the paging work is releasable, then its a real thing.
BUT it is also quite plausible that we're doing way more than we should.
- basic expectation - if a Meta object has been loaded for a topic, it should not be loaded again (it looks like this expectation has not been met, and so there is indeed a pointless multiple re-reading of the same topic file (and version))
I'll split this into 2 tasks later on when I've stopped making test code to show myself what is going on.
--
SvenDowideit - 20 Oct 2009
well. Commit
distro:59beb0971d0e on
Tasks.Item4795 repairs the symptoms that lead to this bug report. I will now use this report to fix the serious lack of
fastpath optimisation for
search='1'
--
SvenDowideit - 14 Nov 2009
OK, if the commit solves the immediate problem I guess this is an optimisation problem now, and therefore no longer Urgent, so downgrading to Normal for further work.
--
CrawfordCurrie - 06 May 2010
with Crawford's
simplify()
and
evaluatesToConstant()
methods, Query Search now is massivly faster for
SEARCHs that only render a partial set - paging, or limit - by delaying the reading of topics until they are sorted/rendered
During the result set phase 1 refactoring, I also added a topic meta cache, so we don't keep re-loading topics, so this task is good to go (most of the work was done in other tasks)
--
SvenDowideit - 18 May 2010
this revealed a bug, and i broke the build
--
SvenDowideit - 27 May 2010