Item8460: .changes inaccurate

Priority: Urgent
Current State: Closed
Released In: 1.1.0
Target Release: minor
Applies To: Engine
Component:
Branches:
Reported By: MichaelDaum
Waiting For:
Last Change By: KennethLavrsen
Sometimes there is a need for quite accurate change notifications. However, the current Foswiki::Func::eachChangeSince API is imprecise by design. It is implemented by maintaining one .changes file per web, which logs only events considered significant, i.e. calls to store->saveTopic(). This makes sense from the point of view that many small-step changes by the same author must not trigger a burst of mail notifications or inflate the topic's version number.

This seems to work fine for mailnotify and statistics.

It does not work for other services that need a precise, up-to-the-minute picture of what changed. This is the case for full-text indexers that perform delta indexing every 15 minutes or so. Any topic change that slips under the radar of Foswiki's change detection will not be updated in the search index.

The reason is that minor changes do not use store->saveTopic() but store->repRev(). Alas, the latter does not call recordChange. Compared to saveTopic, the repRev method (a) reuses a revision number to accumulate small-step saves, but also (b) does not record the change in .changes.
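
The difference can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyStore, save_topic, rep_rev and changed_since are hypothetical names invented for illustration and are not the Foswiki::Store implementation. It shows how a minor save that reuses the revision and skips the change log becomes invisible to anything that reads that log.

```perl
use strict;
use warnings;

# Toy model only: ToyStore and its methods are hypothetical stand-ins,
# not the Foswiki::Store implementation.
package ToyStore;

sub new { return bless { rev => {}, changes => [] }, shift }

# Full save: bumps the revision and appends an entry to the change log.
sub save_topic {
    my ( $self, $topic, $time ) = @_;
    my $rev = ++$self->{rev}{$topic};
    push @{ $self->{changes} },
      { topic => $topic, rev => $rev, time => $time };
    return $rev;
}

# Minor save: reuses the current revision and, as described above,
# leaves the change log untouched.
sub rep_rev {
    my ( $self, $topic, $time ) = @_;
    return $self->{rev}{$topic} ||= 1;
}

# Distinct topics with a logged change at or after $since.
sub changed_since {
    my ( $self, $since ) = @_;
    my %seen;
    my @recent = grep { $_->{time} >= $since } @{ $self->{changes} };
    my @topics = grep { !$seen{$_}++ } map { $_->{topic} } @recent;
    return @topics;
}

package main;

my $store = ToyStore->new;
$store->save_topic( 'WebHome', 100 );
$store->rep_rev( 'WebHome', 200 );    # edited again, but not logged

my @delta = $store->changed_since(150);
print scalar(@delta), " topic(s) reported since t=150\n";
```

In this model an indexer asking for changes since t=150 is told nothing changed, although the topic was edited at t=200.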

The best you can do now to get precise web changes is code along these lines:

my $since = ...; # epoch secs
my $session = $Foswiki::Plugins::SESSION;

my @topics = Foswiki::Func::getTopicList($web);

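foreach my $topic (@topics) {

  # skip topics whose latest revision predates the last indexing run
  my $time = $session->getApproxRevTime($web, $topic);
  next if $time < $since;

  # the indexer's own routine; named to avoid Perl's builtin index()
  indexTopic($web, $topic);
}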

So my question is: why don't we record changes in repRev, and what consequences would doing so have for other subsystems?

-- MichaelDaum - 03 Feb 2010

I did think about this back when I coded the eachChangeSince API, but my thinking ran like this:
  • I personally believe that every last little detailed change needs to be recorded, forever
  • the .changes text impl is choked (it uses a text file; any sensible impl will use a DB)
  • because the file is choked, it needs to focus on important changes, and minimise "noise"
Note that the same issues apply to log files; I implemented a "level" parameter on the API there, so that the receiver could choose whether to log the event or not. The .changes logger needs to do the same sort of thing (indeed, it may be able to reuse the logger impl.)
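
One way to picture a "level" on change records is sketched below. This is an assumption-laden illustration, not the logger API or the .changes file format: the level names (minor, major), the record layout, and the functions record_change and changes_since are all hypothetical. The idea is that every change is recorded with its level, and each receiver picks the minimum level it cares about.

```perl
use strict;
use warnings;

# Hypothetical sketch: the level names and record layout are assumptions,
# not the format of Foswiki's .changes file.
my %RANK = ( minor => 0, major => 1 );
my @changes;

# Every change is recorded together with its level.
sub record_change {
    my ( $topic, $level, $time ) = @_;
    die "unknown level '$level'" unless exists $RANK{$level};
    push @changes, { topic => $topic, level => $level, time => $time };
    return;
}

# The receiver chooses the minimum level it cares about.
sub changes_since {
    my ( $since, $min_level ) = @_;
    my @wanted = grep {
        $_->{time} >= $since
          && $RANK{ $_->{level} } >= $RANK{$min_level}
    } @changes;
    return @wanted;
}

record_change( 'WebHome', 'major', 100 );
record_change( 'WebHome', 'minor', 160 );
record_change( 'SandBox', 'minor', 170 );

my @for_mail    = changes_since( 150, 'major' );  # notifier skips minor edits
my @for_indexer = changes_since( 150, 'minor' );  # indexer sees everything

printf "mail: %d, indexer: %d\n", scalar(@for_mail), scalar(@for_indexer);
```

With this split, a notifier can stay quiet about small-step saves while an indexer still learns about them.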

Note also that changes to attachments are not even recorded, except as a side-effect of the change to the referring topic.

-- CrawfordCurrie - 03 Feb 2010

For sure, saving a topic 15 times during a one-hour working session by the same user should not create 15 revisions of the topic. The repRev feature is brilliant and works well.

And the total change is recorded.

So the problem you are trying to solve is the problem with the full text indexer.

We should take care not to destroy something that works well when we resolve the text indexing problem.

-- KennethLavrsen - 03 Feb 2010

Kenneth, just in case I wasn't clear enough: there are two aspects of repRev under consideration where it differs from saveTopic():

  1. it does not create a new revision ... we want to keep it that way.
  2. it does not record a change to the topic by writing it to the .changes file per web ... that's my concern and why it renders .changes quite useless for indexers.

For that reason, the result of Foswiki::Func::eachChangeSince() did not behave as expected. I am using the above workaround, which relies on unofficial internal APIs. DBCacheContrib contains similar code for the use cases outlined at the top.

-- MichaelDaum - 03 Feb 2010

Cool. We agree then.

-- KennethLavrsen - 17 Mar 2010

Given the importance of full-text indexing, I think Urgent status is justified.

Looking back at the records for a couple of my sites, I'm not concerned that adding repRev to .changes will flood the logs; it's not as common as you might have thought. While a DB implementation of .changes is a very desirable enhancement, it is not a requirement for fulfilling this task. Confirmed.

-- CrawfordCurrie - 29 Mar 2010

  1. Audited all calls to logging, to ensure consistent interpretation of 'minor' and 'dontlog' options
  2. Extended change recording to all data-modifying ops in the store, including repRev, attachment, and web writes

-- CrawfordCurrie - 06 May 2010
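
The effect of recording every data-modifying operation can be sketched as follows. This is a minimal illustration under assumptions, not the store code that was committed: the operation names, the record fields (op, minor, attachment) and the function record_change here are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch: operation names and the record layout are
# illustrative assumptions, not the Foswiki store API.
my @changes;

sub record_change {
    my (%entry) = @_;
    push @changes, {%entry};
    return;
}

# Each data-modifying operation ends by recording what it did.
my %ops = (
    save => sub {
        my ($topic) = @_;
        record_change( topic => $topic, op => 'save' );
    },
    reprev => sub {
        my ($topic) = @_;
        record_change( topic => $topic, op => 'reprev', minor => 1 );
    },
    attach => sub {
        my ( $topic, $file ) = @_;
        record_change( topic => $topic, op => 'attach', attachment => $file );
    },
);

$ops{save}->('WebHome');
$ops{reprev}->('WebHome');
$ops{attach}->( 'WebHome', 'diagram.png' );

# A delta indexer only needs the distinct topics that were touched.
my %touched;
$touched{ $_->{topic} } = 1 for @changes;
print join( ', ', sort keys %touched ), ' (', scalar(@changes), " entries)\n";
```

Because minor saves carry a flag in this sketch, a consumer that only wants significant changes could still filter them out.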
 
Topic revision: r12 - 04 Oct 2010, KennethLavrsen