Feature Proposal: Search results pagination

Foswiki currently lacks a way to control a set of search results. I would like to be able to show results 1-10, then 11-20, etc.

Motivation

Presenting a subset of search results is useful in general, but it is particularly well suited to Foswiki applications: for instance, to show blog entries or comments across pages, a subset of results in table format, only 5 topics per page in bookview, etc.

Implementation

Tasks.Item3931

Visual schema for specification:

pagination.png

We can discern these entities: Collection, Result set, Sorted result list, Display page

Collection

All searchable items: topics, form fields, table rows, etc.

(Raw) result set

Unsorted. For performance this set might be cached (so a new sorted result list can be created from the set). The maximum number of results is defined by SEARCH parameter limit.

Properties:
  • items
It would be tremendously powerful if we could merge search results from different plugins and core. For instance, a combination of TagMePlugin search results with core search results.

Sorted result list

When no sort is specified, the default sort is used. For performance this list might also be cached.

Properties:
  • (sorted) items
    • Follows from this: item count

(Display) pages params for SEARCH and FOREACH

Proposed parameters for pages: pagesize and showpage. Users should not be bothered with implementation specific terms like offset.

If either the pagesize or the showpage param is set, paging is enabled.

Properties:
  • pagesize: the number of items displayed per 'page'
    • When no pagesize is given, the page size is $Foswiki::cfg{Search}{DefaultPageSize}
    • When pagesize > item count, the page size is item count
    • if page size is smaller than 1, the SEARCH becomes a NOP
  • showpage: the first page that is displayed
    • When no showpage is given, the start page is 1 (using one-based index, so array position 0)
    • When start page is larger than the number of pages, showpage is the last page
    • if showpage is smaller than 1, the SEARCH becomes a NOP, or as Kenneth suggests, shows pages in reverse (this may be useful for some apps)
  • pagerformat - defines the page decoration - default to use skin's tmpl file.
  • pager ="on"
    • defaults to "off"
    • can be set to "on" / "auto", or TML?
  • need to work out when to turn on the automatic addition of a pager: should it just be whacked onto the end of the last footer by default, or what?
At this point the effect of mixing pager with limit is ill-defined. Let's see what the result is, then decide if it's 'ok'.
  • groupby defaults to web, but can be set to none to allow paging of mixed web results without getting per web headers&footers
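
The pagesize and showpage rules above can be sketched as a small function. This is only an illustration in Python, not the Foswiki implementation: the function name page_slice is hypothetical, DEFAULT_PAGE_SIZE stands in for $Foswiki::cfg{Search}{DefaultPageSize} with an assumed value of 25, and the "show pages in reverse" alternative for showpage smaller than 1 is not modelled.

```python
import math

# Stand-in for $Foswiki::cfg{Search}{DefaultPageSize}; the value 25 is an assumption.
DEFAULT_PAGE_SIZE = 25


def page_slice(items, pagesize=None, showpage=None):
    """Return the items of the requested page, following the rules listed above."""
    if pagesize is None and showpage is None:
        return list(items)  # neither param set: paging is not enabled
    if pagesize is None:
        pagesize = DEFAULT_PAGE_SIZE
    if showpage is None:
        showpage = 1  # one-based index, so array position 0
    if pagesize < 1 or showpage < 1:
        return []  # the SEARCH becomes a NOP
    if not items:
        return []
    pagesize = min(pagesize, len(items))  # pagesize > item count
    pages = math.ceil(len(items) / pagesize)
    showpage = min(showpage, pages)  # past the end: show the last page
    start = (showpage - 1) * pagesize
    return list(items[start:start + pagesize])
```

With 23 items and pagesize 10, showpage 3 and showpage 9 both return the last three items, and a pagesize of 0 returns nothing.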

format, header and footer variables for rendering

  • $currentpage, $nextpage, $previouspage - the current page number, next and prev page
  • $numberofpages - total number of pages
  • $pagesize - number of elements on each page
  • $previouspageurl, $nextpageurl
  • $pager - uses the pagerformat setting to render a pager - also has a TMPL definition default, or 'off' to disable.
  • and more...
at this point, $ntopics shows the number of topics in that page - so pagesize, unless there are fewer elements available.
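
As a rough illustration of how these variables relate to each other, the sketch below computes them from an item count and expands them in a format string. The helper names pager_vars and expand are hypothetical, pagesize is assumed to be at least 1, and the choice of an empty string when there is no previous or next page is an assumption, since the spec does not say what those variables hold at the ends.

```python
import math


def pager_vars(item_count, pagesize, showpage):
    """Compute the paging variables for one page (pagesize must be >= 1)."""
    numberofpages = max(1, math.ceil(item_count / pagesize))
    current = min(max(showpage, 1), numberofpages)
    first_item = (current - 1) * pagesize
    return {
        "currentpage": current,
        "numberofpages": numberofpages,
        "pagesize": pagesize,
        # Assumption: empty string when there is no previous/next page.
        "previouspage": current - 1 if current > 1 else "",
        "nextpage": current + 1 if current < numberofpages else "",
        # Number of topics on this page: pagesize, unless fewer are left.
        "ntopics": max(0, min(pagesize, item_count - first_item)),
    }


def expand(fmt, values):
    """Replace $name tokens in fmt with their values."""
    # Longest names first, as a precaution against one name being a prefix of another.
    for name in sorted(values, key=len, reverse=True):
        fmt = fmt.replace("$" + name, str(values[name]))
    return fmt
```

For example, expand("Page $currentpage of $numberofpages ($ntopics topics)", pager_vars(23, 10, 3)) gives "Page 3 of 3 (3 topics)".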

Page decoration

  • Display of the start and end of the page. For example: "Results: 1-10".
  • Links to pages (as page numbers). The current page is not clickable.
  • Link to previous page. Is disabled or not shown on the first page.
  • Link to next page. Is disabled or not shown on the last page.
  • need to work out how to allow the user to customize the location of the pagination - perhaps add a $pager() format operator.
See also: Search Pagination Pattern (Yahoo! Design Pattern Library)
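
A minimal sketch of the decoration described above, written in Python for illustration. The function name render_pager, the link labels, and the ?showpage=N link target are all assumptions; a real pager would come from the skin's tmpl file.

```python
def render_pager(item_count, pagesize, current, url_for=lambda p: f"?showpage={p}"):
    """Render 'Results: a-b', a previous link, page numbers and a next link."""
    pages = -(-item_count // pagesize)  # ceiling division
    first = (current - 1) * pagesize + 1
    last = min(current * pagesize, item_count)
    parts = [f"Results: {first}-{last}"]
    if current > 1:  # not shown on the first page
        parts.append(f"[[{url_for(current - 1)}][Previous]]")
    for p in range(1, pages + 1):
        # The current page is plain text, all other pages are links.
        parts.append(str(p) if p == current else f"[[{url_for(p)}][{p}]]")
    if current < pages:  # not shown on the last page
        parts.append(f"[[{url_for(current + 1)}][Next]]")
    return " ".join(parts)
```

For 23 results at 10 per page, the first page renders as "Results: 1-10 1 [[?showpage=2][2]] [[?showpage=3][3]] [[?showpage=2][Next]]".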

-- ArthurClemens - 22 Apr 2007

Discussion

This can be realized relatively easily with a new offset="10" parameter. Combined with limit="20" you can do a paginated search.

For speed, the resulting list of topics of the first search could be cached and re-used in follow-up searches that have an offset.

-- PeterThoeny - 01 Dec 2006

To provide interface feedback it would be useful to know if there are any more results, so a Next button can be shown.

-- ArthurClemens - 01 Dec 2006

It is easy to determine the total number of topics. Spec-wise, not sure how that info can be returned by a %SEARCH{}%.

-- PeterThoeny - 02 Dec 2006

I proposed the exact same a year ago in SearchOrderAndLimitBehavour#Proposal.

I just called it start instead of offset. But note the proposal that a negative start/offset would start from the other end and resolve some additional needs.

An up-to-date rewrite of my old proposal would then be:

As an enhancement that is 100% backwards compatible, we create an additional SEARCH option called "offset", which is the first hit in the list of found topics to show. This offset value can be positive, which means it counts from the beginning, or negative, which means it counts from the end. The default must be 0, which makes it backwards compatible with Cairo.

Examples:
  • Show the latest 10 bug reports in ascending order: order="modified" reverse="off" offset="-10"
  • Show the latest 10 bug reports before the last 10. order="modified" reverse="off" offset="-20" limit="10"
  • Show all bugs. Browse 20 at a time. Newest first order="modified" reverse="on" offset="0" limit="20" and then you browse by having a link that contains a twiki variable which increases by 20. This use is actually so common that the search itself should have the feature of setting a variable that can be used for this purpose.
-- KennethLavrsen - 02 Dec 2006
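
The offset semantics proposed above (positive counts from the beginning, negative from the end, default 0) can be sketched in a few lines of Python. The function name apply_offset is hypothetical, and hits is assumed to be the already sorted hit list.

```python
def apply_offset(hits, offset=0, limit=None):
    """Return the window of hits selected by offset and limit."""
    # A negative offset counts from the end of the list.
    start = offset if offset >= 0 else max(len(hits) + offset, 0)
    end = len(hits) if limit is None else start + limit
    return hits[start:end]
```

With 30 hits in ascending order, offset=-10 returns hits 21 to 30, offset=-20 with limit=10 returns hits 11 to 20, and offset=0 with limit=20 returns the first 20.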

And then you would require running the search again for the next, say 10 items? If you have a 1000 item search, you are wasting a lot of time.

Maybe we should move to a scheme as in FormQueryPlugin where the search is done once and can be rendered in different ways over and over again. (YetAnotherFormQueryPlugin is 100% compatible with Cairo search, but I have not gotten around to adding new Dakar search features, by the way.)

-- ThomasWeigert - 03 Dec 2006

Kenneth: I do not understand the need for a negative offset. I think it is not necessary if we apply the offset after the limit; this way there is no need to compensate for the reverse flag.

Thomas: That is what I meant by "for speed, the resulting list of topics of the first search could be cached and re-used in follow-up searches that have an offset." Technically, the resulting topic list of a search can be cached (cache ID based on user name and all search parameters except offset parameter), and cache is re-used if same search is applied (with same/different offset) within a certain timeframe (say 10 min.)

-- PeterThoeny - 05 Dec 2006

As an example, pagination of search results has been part of the BlogPlugin for quite some time now. It is based on the skip and limit parameter of the DBCachePlugin. Here is a rewrite of TWiki's WebChanges. All kinds of pages are paginated with "next" and "prev" links showing up at the bottom and the top, but only if there are next or previous items. More examples: the frontpage showing the 5 most recent postings; category pages showing the 5 most recent postings in that category; the blog author pages; each posting is double linked; news feeds are double linked. There's an implementation of NEXTDOC and PREVDOC in the BlogPlugin (making use of the DBCacheContrib API) that compute the double linkage used for paginating forward and backward which could be externalized to be reused in other applications.

Essential to pagination is the proper integration of sort and limit. As far as I remember, SEARCH first limits the search result and sorts the remaining hit set afterwards, which will result in incorrect pagination. So there might be some more "gotchas" using SEARCH for pagination. Another one is that a SEARCH spanning multiple webs does not merge all results from all webs properly. Sorting is done per web only and then everything is concatenated.

-- MichaelDaum - 05 Dec 2006
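
The sort-versus-limit gotcha can be shown with a toy example in Python. Both function names are hypothetical; the point is only the order of the two operations.

```python
def paginate_limit_then_sort(topics, skip, limit):
    """The problematic order: cut the window out of the unsorted hits, then sort it."""
    return sorted(topics[skip:skip + limit])


def paginate_sort_then_limit(topics, skip, limit):
    """The order pagination needs: sort everything, then cut the window."""
    return sorted(topics)[skip:skip + limit]


topics = ["Delta", "Alpha", "Echo", "Bravo", "Charlie"]
print(paginate_limit_then_sort(topics, 0, 2))  # ['Alpha', 'Delta']
print(paginate_sort_then_limit(topics, 0, 2))  # ['Alpha', 'Bravo']
```

In the first variant "Bravo" never appears on page one even though it sorts second, which is the incorrect pagination described above.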

This feature request is listed for some time now in TWikiFeature04x02. It looks like all are in agreement to add a pagination feature to SEARCH. It just needs a clearly defined spec and a person driving it.

-- PeterThoeny - 26 Mar 2007

I'd prefer short and descriptive parameters. Google uses a simple start parameter for the hit number, not page number, and a hardcoded size parameter. How about something like poffset="50" psize="25"?

Suggestion for implementation when pagination is used: For speed, do the complete search the first time and cache the list of topics for the next pages. A hash string can be built from the full search string (with all parameters, excluding header and format parameters) and user name. (Something similar has been done for the HeadlinesPlugin)

Example:
  • WikiGuest does a search for %SEARCH{ "faq" scope="topic" web="all, -Sandbox" nonoise="on" format="| $topic: $summary |" poffset="%URLPARAM{poffset}%" psize="25" }%.
  • Do the search, build the list of topics
  • Build a hash string of Main.WikiGuest and "faq" scope="topic" web="all, -Sandbox" nonoise="on".
  • Use the hash string as the file name of the cache; in it store the topic list.
  • For poffset other than 0:
    • build the hash string
    • if a cache file with the same hash string exists and is not older than 15 min, use the cache
-- PeterThoeny - 22 Apr 2007
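
The caching steps above could look roughly like the following Python sketch. Everything named here is an assumption made for illustration: the functions cache_key and cached_topics, the "search" key for the search string, the use of JSON for the cache file, and run_search as a stand-in for the real search. MD5 is used only to derive a file name.

```python
import hashlib
import json
import os
import time

MAX_AGE = 15 * 60  # seconds; cache files older than this are ignored
IGNORED = {"format", "header", "poffset"}  # params that do not change the topic list


def cache_key(user, params):
    """Build the hash string from the user name and the relevant search params."""
    relevant = sorted((k, v) for k, v in params.items() if k not in IGNORED)
    return hashlib.md5(json.dumps([user, relevant]).encode("utf-8")).hexdigest()


def cached_topics(cache_dir, user, params, run_search, now=None):
    """Return the topic list, re-using a fresh cache file for poffset other than 0."""
    now = time.time() if now is None else now
    path = os.path.join(cache_dir, cache_key(user, params))
    offset = int(params.get("poffset") or 0)
    if offset != 0 and os.path.exists(path) and now - os.path.getmtime(path) <= MAX_AGE:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    topics = run_search(params)  # first page, or cache missing/stale: do the search
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(topics, fh)
    return topics
```

Because poffset is left out of the key, every page of the same search by the same user maps to the same cache file.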

How about skip and limit instead of poffset and psize. That's what DBCachePlugin uses.

-- MichaelDaum - 24 Apr 2007

poffset, psize and skip force you to think in individual results instead of in pages. When the end goal is to show results in pages of n size, we shouldn't force the user of SEARCH to think in numbers.
limit is already taken.

-- ArthurClemens - 24 Apr 2007

Which users are you talking about: the TWikiApplication developer or the visitor to your wiki? The first should be able to think either way. The latter shouldn't have to think about poffsets, psizes or pagesizes at all.

-- MichaelDaum - 24 Apr 2007

The users of SEARCH: so I am talking about the public interface of the search function. I think pagesize and showpage are all you need. But perhaps you can give usage examples where fine-grained control is necessary and can't be done with these 2 parameters?

-- ArthurClemens - 24 Apr 2007

Well, using skip and limit (or psize) you could skip fragments of a page: the last half of page one and the first half of page two ... but I am not sure if that is of much use.

-- MichaelDaum - 24 Apr 2007
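
The relation between the two parameter styles discussed here can be made concrete with a short Python sketch (both function names are hypothetical). Every pagesize/showpage pair maps to a skip/limit pair, but not the other way round.

```python
def to_skip_limit(pagesize, showpage):
    """Translate page-based parameters into item-based ones."""
    return (showpage - 1) * pagesize, pagesize


def to_page(skip, limit):
    """Translate item-based parameters into page-based ones, if possible."""
    if skip % limit:
        return None  # the window straddles two pages; no page equivalent
    return limit, skip // limit + 1
```

For instance, to_skip_limit(10, 3) gives (20, 10), while to_page(5, 10), the "last half of page one and first half of page two" case, has no page-based equivalent.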

I worked on this a long time ago, then something happened that prevented me from committing it (I think that there was a problem with multiple searches on the same page, or something like that, followed by a major refactoring of the codebase, or I just forgot about it).

For what it's worth, I'm attaching the resulting patch, against a pre-Dakar version of TWiki. It uses "start" and "step" to define the initial element and how many rows to display.

Hope it helps.

-- RafaelAlvarez - 24 Apr 2007

I need this functionality asap for one of my FoswikiApplications - but i don't know from the above discussion what the final syntax is to be.

-- SvenDowideit - 21 Jan 2009

Follow the original spec.

-- ArthurClemens - 21 Jan 2009

Here's a poor man's implementation of the basic functionality (pagesize and showpage parameters, and $items for format and header). Caveats: it still won't work properly across webs, behavior is undefined when combined with the limit parameter, and make sure you specify a sort clause.

www-data@r17311:~/community.reefsimple.org/foswiki$ svn diff lib/Foswiki/Search.pm
Index: lib/Foswiki/Search.pm
===================================================================
--- lib/Foswiki/Search.pm       (revision 600)
+++ lib/Foswiki/Search.pm       (working copy)
@@ -370,6 +370,8 @@
     my $doMultiple    = Foswiki::isTrue( $params{multiple} );
     my $nonoise       = Foswiki::isTrue( $params{nonoise} );
     my $noEmpty       = Foswiki::isTrue( $params{noempty}, $nonoise );
+    my $pagesize      = $params{pagesize} || undef;
+    my $showpage      = $params{showpage} || 1;                # 1-based system; 0 is not a valid page number

     # Note: a defined header overrides noheader
     my $noHeader = !defined($header)
@@ -773,6 +775,15 @@
         # output the list of topics in $web
         my $ntopics    = 0;
         my $headerDone = $noHeader;
+       my $items = scalar @topicList;
+       if ( defined $pagesize ) {
+           my $startPage = ($showpage-1) * $pagesize;
+           my $endPage = $startPage + $pagesize - 1;
+           # don't go off the end and create extra undef entries
+           $startPage = $#topicList if $startPage > $#topicList;
+           $endPage = $#topicList if $endPage > $#topicList;
+           @topicList = @topicList[ $startPage .. $endPage ];
+       }
         foreach my $topic (@topicList) {
             my $forceRendering = 0;
             unless ( exists( $topicInfo->{$topic} ) ) {
@@ -862,6 +873,7 @@
                     $out =~ s/\$isodate/$isoDate/gs;
                     $out =~ s/\$rev/$revNum/gs;
                     $out =~ s/\$wikiusername/$wikiusername/ges;
+                   $out =~ s/\$items/$items/gs;

                     my $wikiname = $users->getWikiName($cUID);
                     $wikiname = 'UnknownUser' unless defined $wikiname;
@@ -987,6 +999,7 @@
                       || '\#FF00FF';
                     $beforeText =~ s/%WEBBGCOLOR%/$thisWebBGColor/go;
                     $beforeText =~ s/%WEB%/$web/go;
+                   $beforeText =~ s/\$items/$items/gs;
                     $beforeText =
                       $session->handleCommonTags( $beforeText, $web, $topic );
                     if ( defined $callback ) {

-- WillNorris - 08 Jun 2009

this proposal has no specification for the paging controls.

-- WillNorris - 08 Jun 2009

my incomplete thoughts were to add a $pager, $currentpage, $numberofpages, $pageSize that could be added to header, format or footer, which would have a sane default in a tmpl (so skins can override them), and then a pager param so users can make their own.

the pager can probably be generated using the new (in trunk) %!FORMAT{}%, which also suggests that in addition to FORMAT taking a comma separated list, it could also take a range... (eg %FORMAT{"A...Z" format=" * $value"}%)

additional hopes - to have an alpha pager :-)

assuming we are going to make a 1.1 for feature complete this month, this, FORMAT and Search Acceleration would probably be my main contribution.

-- SvenDowideit - 10 Jun 2009

I can make some UI specs / drafts if no one has already.

-- CarloSchulz - 10 Jun 2009

here's my current use case:

pager-usecase.png

-- WillNorris - 10 Jun 2009

Carlo - yes please :-) anything that can get us thinking will be very useful :-)

-- SvenDowideit - 11 Jun 2009

Here's a quick mockup of how paginated search results could look (including more nicely displayed results) :-).

-- CarloSchulz - 15 Jun 2009

See also Pagination Gallery: Examples And Good Practices.

-- MichaelDaum - 16 Jun 2009

I've just committed a naive implementation of pagesize and showpage to trunk, but the work we're doing makes it clear to me that we shouldn't assume that we know the total number of items, as partial evaluation of a query should be encouraged.

There isn't a pager implementation yet.

more unit tests are welcomed

-- SvenDowideit - 31 Aug 2009

While counting all items is probably not a good idea for the current grep-based backend, it is a pretty natural piece of information for other more decent search engines.

-- MichaelDaum - 08 Mar 2010

I agree, and just yesterday I was thinking about how to pull the pieces together to at least provide 'estimated' counts for the current brute force back-ends - and it's pretty much unlikely to happen at the same time as improving their speed by only evaluating as few results as we want to display. But it's likely that I'll add a method to the resultset, which by default will either return undef, or lie.

I'm adding in a simplistic pager, with automagic showpage setting via an (admittedly ugly) urlparam based on an md5 hash of the raw SEARCH params (so that we have contextless paging for multiple SEARCHes at one time).

once I've pushed that and the default pager tmpl out, I'm looking forward to you UI people making it look good. (still a few days before i commit)

-- SvenDowideit - 08 Mar 2010
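
The idea of a per-SEARCH urlparam derived from an md5 hash can be sketched as follows in Python. The function names, the "SEARCH" prefix and the exact parameter naming are assumptions for illustration and may differ from what was committed.

```python
import hashlib
from urllib.parse import parse_qs


def paging_param(raw_search_params, prefix="SEARCH"):
    """Derive a URL parameter name that is unique to one SEARCH's raw params."""
    digest = hashlib.md5(raw_search_params.encode("utf-8")).hexdigest()
    return prefix + digest


def showpage_from_query(query_string, raw_search_params):
    """Read the page number for this SEARCH from the query string; default to 1."""
    values = parse_qs(query_string).get(paging_param(raw_search_params))
    if values and values[0].isdigit():
        return int(values[0])
    return 1
```

Two different SEARCHes on the same topic get different parameter names, so paging one of them leaves the other on its current page.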

I'm using SiteChanges as my first non-trivial search to add paging to, and it's quite obvious that in this context, rendering based on limit and one web at a time is very unhelpful, especially as the simplistic pager is added to each footer (one at the end of each web shown).

so... I'm thinking that I'll investigate adding a seperatewebresults="off" option, where on is the default (and how foswiki has rendered multi-web results for a decade), and off will probably be the default if paging is on. This may well be too much work to unravel tho :-(

  • simple_pager_in_footer_limit.png

-- SvenDowideit - 09 Mar 2010

I support seperatewebresults="off". Which could be a group="web" (default) and group="none" (site changes).

-- ArthurClemens - 09 Mar 2010

I have implemented a non-public option groupby which defaults to web, but allows none - specifically for the SiteChanges topic. The real implementation for this is dependent on the SupportMultiKeySorting feature.

from here on, it's docco, tests and bug fixes for me, and UI work for Arthur

-- SvenDowideit - 01 Apr 2010
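
The difference between groupby web and groupby none can be illustrated with a small Python sketch. The render function, its output lines and the "all webs" label are hypothetical; the sketch only shows why per-web grouping produces one header and footer (and so one pager) per web.

```python
def render(results, group_by="web"):
    """Render (web, topic) results with one header/footer per group."""
    if group_by == "none":
        groups = {"all webs": list(results)}  # one mixed group for paging
    else:
        groups = {}
        for web, topic in results:
            groups.setdefault(web, []).append((web, topic))
    lines = []
    for name, hits in groups.items():
        lines.append(f"header: {name}")
        lines.extend(f"{web}.{topic}" for web, topic in hits)
        lines.append(f"footer: {len(hits)} topics")
    return lines
```

With results from two webs, the default prints two header/footer pairs, while group_by="none" prints a single pair around the mixed list in its original order.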

This has been released in 1.1.0 - though clearly could do with many improvements.

-- SvenDowideit - 18 Apr 2012
| Attachment | Size | Date | Who | Comment |
| PagingSearch.patch | 8 K | 24 Apr 2007 - 19:04 | RafaelAlvarez | |
| pager-usecase.png | 636 K | 10 Jun 2009 - 19:10 | WillNorris | |
| pagination.png | 102 K | 22 Apr 2007 - 18:08 | ArthurClemens | Visual schema for specification |
| search_results_with_pagination_1.png | 58 K | 15 Jun 2009 - 22:27 | CarloSchulz | |
| search_results_with_pagination_2.png | 58 K | 15 Jun 2009 - 22:28 | CarloSchulz | |
| simple_pager_in_footer_limit.png | 212 K | 09 Mar 2010 - 08:21 | SvenDowideit | |
Topic revision: r27 - 18 Apr 2012, SvenDowideit