Feature Proposal: Bring together the SEARCH visions

Motivation

There are far too many FeatureRequests covering small parts of TOM and QUERY functionality, all of which will be implemented by completing the rearchitecting we've been beavering away at.

Description and Documentation

Pluggable Query Parsers

Foswiki now has a sufficiently complete Query::Node structure to represent the legacy regex, word and keyword SEARCH types. That means we are ready to move away from having the user specify the code engine that does the search. Instead, they specify a Parser, which converts their entered query into a Query::Node.

Create a mapping mechanism so that any Extension can add a new Parser.

Parsers output Query::Nodes and can define their own Node types, though that clearly requires more implementation work by the extension developer (e.g. a parser for SQL).
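The parser mapping described above could look roughly like the following. This is a language-neutral sketch written in Python, not Foswiki code: the `Node` class, the `PARSERS` table, the operator names and `register_parser` are all hypothetical stand-ins, and the word parser ignores legacy details such as excluded words.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a Query::Node: an operator plus its operands."""
    op: str
    params: List[object] = field(default_factory=list)


# Maps a SEARCH type name to a function that turns a query string into a Node.
PARSERS: Dict[str, Callable[[str], Node]] = {}


def register_parser(search_type: str, parser: Callable[[str], Node]) -> None:
    """What an extension would call to add its own parser (assumed API)."""
    PARSERS[search_type] = parser


def parse_regex(query: str) -> Node:
    # Legacy regex search: the whole string is one pattern matched against the text.
    return Node("match_regex", ["text", query])


def parse_word(query: str) -> Node:
    # Simplified legacy word search: every whitespace-separated word must be present.
    terms = [Node("contains_word", ["text", word]) for word in query.split()]
    return terms[0] if len(terms) == 1 else Node("and", terms)


register_parser("regex", parse_regex)
register_parser("word", parse_word)


def parse(search_type: str, query: str) -> Node:
    """Look up the parser for the requested type and run it."""
    if search_type not in PARSERS:
        raise ValueError(f"no parser registered for type '{search_type}'")
    return PARSERS[search_type](query)
```

The point of the design is that everything downstream only ever sees Nodes, so the evaluation side never needs to know which parser produced them.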

Pluggable Evaluation Engines

To match these pluggable Parsers, we need to be able to register Evaluation engines for Nodes, with ways to prefer one engine over another (because more than one will be able to serve the same result).
    • OK, so this is hard, as it depends not just on what the operation is but also on its context: the grep algorithm may well be faster at regex matching the raw_text, but the BruteForce query engine is likely to be better at regex matching a particular formfield.

Custom Operations (i.e. non-core ones) will need to be in a different namespace to be obvious and debuggable.
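One way to think about the engine preference problem is to let each engine report an estimated cost for an operation in a given context, and pick the cheapest engine that can serve it. The sketch below is Python used as neutral pseudocode; the `Engine` class, the operation names and the cost numbers are invented for illustration and say nothing about how the real engines perform.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Engine:
    name: str
    # Returns an estimated cost for (operation, target), or None if unsupported.
    cost: Callable[[str, str], Optional[int]]


def grep_cost(op: str, target: str) -> Optional[int]:
    # Assumption: a grep-style engine only does regex matching,
    # and is only cheap when the target is the raw topic text.
    if op != "match_regex":
        return None
    return 1 if target == "text" else 10


def brute_force_cost(op: str, target: str) -> Optional[int]:
    # Assumption: the brute-force engine handles everything at a flat cost.
    return 5


ENGINES: List[Engine] = [
    Engine("grep", grep_cost),
    Engine("BruteForce", brute_force_cost),
]


def choose_engine(op: str, target: str) -> Engine:
    """Pick the engine with the lowest estimated cost for this operation and context."""
    usable = []
    for engine in ENGINES:
        estimate = engine.cost(op, target)
        if estimate is not None:
            usable.append((estimate, engine))
    if not usable:
        raise LookupError(f"no engine can evaluate {op} on {target}")
    return min(usable, key=lambda pair: pair[0])[1]
```

With these made-up costs, `choose_engine("match_regex", "text")` selects grep, while the same operation on a form field falls through to BruteForce.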

Move Scope into the query.

In the regex and word SEARCH types, we have scope=name,topic,all. These are redundant, and pretty misleading when you try to read them into query searches, where the scope is expressed in the query itself:
  • "text ~ '*Something*' OR topic ~ '*Something*'"

On the other hand, the current query implies a query over a predefined 'ResultSet' that we can call webs, i.e. it is possible to consider the above query as shorthand for:
  • "webs[text ~ '*Something*' OR topic ~ '*Something*']"

giving us the opportunity to define queries on other ResultSets, including non-topic based ones:

eg.
  • "PreviouslySavedResult[text ~ '*Something*']"
  • "attachments[user='SvenDowideit']"
  • log[action='save' AND '12/12/2005'<date<'12/12/2008']
  • "webs[text ~ '*Something*' OR topic ~ '*Something*']"
  • web='Sandbox' AND revisions[author='JoeBloggs']
    • search for all revisions saved by 'JoeBloggs' in topics that are currently in the Sandbox web
  • revisions[web='Sandbox' AND author='JoeBloggs']
    • search for all revisions saved by 'JoeBloggs' in topics that were ever in the Sandbox web (much more computationally expensive)
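To make the shorthand concrete, here is a minimal Python sketch of splitting a query into its ResultSet name and constraint, falling back to webs when no outer ResultSet is given. The function name `split_scope` and the quoting rules are assumptions; a real implementation would live in the query parser rather than in string handling like this.

```python
import re
from typing import Tuple


def split_scope(query: str, default: str = "webs") -> Tuple[str, str]:
    """Return (resultset_name, constraint) for a query string.

    A query is treated as scoped only if it starts with NAME[ and that
    bracket closes at the very end of the string.
    """
    q = query.strip()
    m = re.match(r"([A-Za-z_]\w*)\s*\[", q)
    if m and q.endswith("]"):
        depth = 0
        quote = None
        for i in range(m.end() - 1, len(q)):
            ch = q[i]
            if quote:
                # Inside a quoted string: brackets do not count.
                if ch == quote:
                    quote = None
            elif ch in "'\"":
                quote = ch
            elif ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
                if depth == 0:
                    if i == len(q) - 1:
                        return m.group(1), q[m.end():i].strip()
                    break  # closed early, e.g. a[x] OR b[y]
    return default, q
```

Under these rules `web='Sandbox' AND revisions[author='JoeBloggs']` stays an unscoped query over webs, with `revisions[...]` left as a sub-expression for the parser to handle.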

Complete the ResultSets abstraction

ResultSets become ordered lists of TOM addresses, which can be linked directly to the Object Cache. ResultSets themselves don't contain Objects but rather links to Objects, so that the lifetime of parsed or complexly retrieved information is independent of possibly short-lived lists.

At the same time, ResultSets become save/name-able and reusable.

eg.
  • %SEARCH{type="query" "saveme[constraint]"}%
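The separation between ResultSets and the objects they point at might be sketched as follows. This is Python used for illustration only: `ObjectCache`, `ResultSet`, `SAVED` and the address strings such as 'Sandbox.TopicA' are hypothetical and do not reflect an existing Foswiki API or the final TOM address syntax.

```python
from typing import Callable, Dict, Iterator, List


class ObjectCache:
    """Hypothetical cache: loads an object the first time its address is requested."""

    def __init__(self, loader: Callable[[str], dict]) -> None:
        self._loader = loader
        self._objects: Dict[str, dict] = {}
        self.loads = 0  # counts real loads, to show that objects are shared

    def get(self, address: str) -> dict:
        if address not in self._objects:
            self._objects[address] = self._loader(address)
            self.loads += 1
        return self._objects[address]


class ResultSet:
    """An ordered list of addresses; objects are resolved via the cache on demand."""

    def __init__(self, addresses: List[str], cache: ObjectCache) -> None:
        self.addresses = list(addresses)
        self._cache = cache

    def __iter__(self) -> Iterator[dict]:
        return (self._cache.get(address) for address in self.addresses)


# Named ResultSets that later queries can refer to, e.g. saveme[constraint].
SAVED: Dict[str, ResultSet] = {}


def save_result(name: str, result: ResultSet) -> None:
    SAVED[name] = result
```

Because a ResultSet only holds addresses, discarding a short-lived list does not throw away the parsed objects, and two ResultSets that overlap share the same cached objects.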

Additionally, developers will need to be able to define built-in ResultSets in code:
  • log[action='save' AND '12/12/2005'<date<'12/12/2008']
  • tags[name='silly']
  • SQLDB1[SELECT * FROM evil LIMIT 12]
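As a rough illustration of developer-defined ResultSets, the Python sketch below registers a provider function under a name and filters its rows with a predicate. The decorator `result_set`, the `PROVIDERS` table and the tag data are invented for this example, and the predicate stands in for whatever the parsed constraint would become.

```python
from typing import Callable, Dict, Iterable, List

# Maps a ResultSet name to a function that yields its rows.
PROVIDERS: Dict[str, Callable[[], Iterable[dict]]] = {}


def result_set(name: str):
    """Decorator an extension could use to publish a named ResultSet (assumed API)."""
    def decorator(fn: Callable[[], Iterable[dict]]):
        PROVIDERS[name] = fn
        return fn
    return decorator


# Toy in-memory data standing in for a real tag store.
TAG_ROWS = [
    {"name": "silly", "topic": "Sandbox.TopicA"},
    {"name": "serious", "topic": "Sandbox.TopicB"},
]


@result_set("tags")
def tags() -> Iterable[dict]:
    return TAG_ROWS


def query(name: str, predicate: Callable[[dict], bool]) -> List[dict]:
    """Evaluate a constraint (here a plain predicate) over a named ResultSet."""
    if name not in PROVIDERS:
        raise KeyError(f"unknown ResultSet '{name}'")
    return [row for row in PROVIDERS[name]() if predicate(row)]
```

In this sketch, `tags[name='silly']` corresponds to `query("tags", lambda row: row["name"] == "silly")`.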

Revision specific TOM address

We're still pondering the right way for a user to see and specify a particular revision of an object. Some discussion can be found at QueryAcrossTopicRevisions.

Pluggable Formatters

Once you have a set of results of 'TOM' addresses / objects, you need a way to format them, which is where the Pluggable FORMAT engine I began to migrate out of SEARCH comes in. Right now it's biased towards different MACROs being able to supply functions to call when a particular $format operator is found in the format string.

This needs to be extended to support the 'Nodes' in the ResultSet.
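The token-to-function idea behind a pluggable formatter can be sketched like this. It is a Python illustration, not the existing FORMAT code: `FORMATTERS`, `register_token` and `render` are hypothetical names, and the items are plain dictionaries rather than real result Nodes.

```python
import re
from typing import Callable, Dict

# Maps a $token name to a function that renders it for one result item.
FORMATTERS: Dict[str, Callable[[dict], str]] = {}


def register_token(name: str, fn: Callable[[dict], str]) -> None:
    """What a macro or extension would call to supply a token handler (assumed API)."""
    FORMATTERS[name] = fn


register_token("topic", lambda item: item["topic"])
register_token("web", lambda item: item["web"])

_TOKEN = re.compile(r"\$([A-Za-z]+)")


def render(fmt: str, item: dict) -> str:
    """Expand every registered $token in fmt for a single result item."""
    def expand(match):
        handler = FORMATTERS.get(match.group(1))
        # Unknown tokens are left untouched so that other handlers can still see them.
        return handler(item) if handler else match.group(0)
    return _TOKEN.sub(expand, fmt)
```

Supporting other kinds of results would then mean registering handlers that know how to read those results, without changing `render` itself.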

Add grouping and filtering operators

For many of these Queries, you may want only the first() and last(), or only a unique(). For example:
  • list all the unique names of topics a particular user has ever saved - unique(name, revisions[author='JoeBloggs'])
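The intended behaviour of these operators can be shown with a small Python sketch over made-up revision records. The data and the function signatures are assumptions for illustration; only the operator names come from the proposal.

```python
from typing import List, Optional

# Toy revision records standing in for a revisions ResultSet.
REVISIONS = [
    {"name": "TopicA", "rev": 1, "author": "JoeBloggs"},
    {"name": "TopicA", "rev": 2, "author": "JoeBloggs"},
    {"name": "TopicB", "rev": 1, "author": "SvenDowideit"},
    {"name": "TopicC", "rev": 1, "author": "JoeBloggs"},
]


def unique(field: str, rows: List[dict]) -> list:
    """Distinct values of one field, keeping the order of first appearance."""
    seen = set()
    values = []
    for row in rows:
        value = row[field]
        if value not in seen:
            seen.add(value)
            values.append(value)
    return values


def first(rows: List[dict]) -> Optional[dict]:
    return rows[0] if rows else None


def last(rows: List[dict]) -> Optional[dict]:
    return rows[-1] if rows else None


# Equivalent of revisions[author='JoeBloggs'] over the toy data.
by_joe = [row for row in REVISIONS if row["author"] == "JoeBloggs"]
```

Here `unique("name", by_joe)` collapses the two TopicA revisions into a single name, which is the effect the example query above is after.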

Changes intended:

  1. Convert the Search::Parser for regex, word and keyword SEARCH into a proper parser that outputs Query::Nodes, removing the Search::Node placeholder.
  2. Add Configure support, and Foswiki::Search support, for a hash of parsers such as $Foswiki::cfg{QueryParsers}{query} = "Foswiki::Query::Parser"; etc., so that adding a new parser is a single line in a Config.spec.
  3. Design a mapping from Query::Nodes to evaluation engines.

.... more, I lost track

Impact

%EDITTABLE{format="|label,1|text,70|" changerows="off"}%
WhatDoesItAffect: %WHATDOESITAFFECT%

Implementation

  • Contributors: Lots and lots of people, as this dates back to pre-dawn times.

Discussion

Gosh, that is a lot of work.

-- SvenDowideit - 16 Oct 2010

We shall conquer the world :-) As you know, I agree 100% with the vision. And I think pluggable Query + Eval engines are very interesting, I just wonder if they need to be delivered with 2.0? Perhaps if we get some experience with ResultSets and pluggable formatters (and the rest), we would be in a really solid position to introduce that stuff (and pluggable ResultSet implementations) in a 2.1 release. As I assume 'pluggable' also means publishing the API.

I am just nervous after the enormous dev cycle we had for 1.1, and the multi-month freezes of trunk... I am confident that six months between feature freezes would give us better quality releases.

Looking forward to helping break & un-break trunk :-)

-- PaulHarvey - 16 Oct 2010

yes, I'm not expecting that it is all done by 2.0 - I wanted to try to tie it all together, so we can figure out what we need now, what is actually redundant, and all move in the same direction.

-- SvenDowideit - 16 Oct 2010

Removing myself as committed developer - I never committed to this, as I recall, though I do support it.

-- CrawfordCurrie - 24 Feb 2012

Setting to parked. Developers no longer active.
Topic revision: r9 - 12 Jan 2016, GeorgeClark