Item8460: .changes inaccurate
Priority: Urgent
Current State: Closed
Released In: 1.1.0
Target Release: minor
Applies To: Engine
Component:
Branches:
Sometimes there's a need to get quite accurate change notifications. However, the current Foswiki::Func::eachChangeSince API is designed to be rather imprecise. It is implemented by maintaining a .changes file per web which logs anything considered worth it, i.e. a call to store->saveTopic(). This makes total sense from the point of view that lots of small-step changes by the same author must not create a flood of mail notifications, or increase the topic's version number dramatically.
This seems to work fine for mailnotify and statistics. It does not work out for other services that need a precise up-to-the-minute picture of what changed. This is the case for fulltext indexers that perform delta indexing every 15 minutes or so. Each topic change that slips under the radar of Foswiki's change detection will inevitably not be updated in the search index.
The reason is that minor changes do not use store->saveTopic() but use store->repRev() .... alas the latter does not call recordChange. Compared to saveTopic, the repRev method (a) reuses a revision number to accumulate small-step saves but also (b) does not record the change in .changes.
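To make the effect concrete, here is a minimal sketch in plain Perl. It does not use Foswiki at all: ToyStore, its fields and changedSince are hypothetical names invented for illustration, and the two save methods only mimic the behaviour described above (saveTopic logs a change, repRev does not).

```perl
use strict;
use warnings;

# Toy in-memory model, NOT Foswiki's real store. All names are hypothetical.
package ToyStore;

sub new {
    my ($class) = @_;
    my $this = { rev => {}, text => {}, changes => [] };
    return bless $this, $class;
}

# Full save: creates a new revision AND appends to the change log.
sub saveTopic {
    my ( $this, $topic, $text, $time ) = @_;
    $this->{rev}{$topic}++;
    $this->{text}{$topic} = $text;
    push @{ $this->{changes} }, { topic => $topic, time => $time };
    return $this->{rev}{$topic};
}

# Minor save: reuses the current revision and, as described in this report,
# writes nothing to the change log.
sub repRev {
    my ( $this, $topic, $text, $time ) = @_;
    $this->{text}{$topic} = $text;
    return $this->{rev}{$topic};
}

# Names of topics with a logged change at or after $since.
sub changedSince {
    my ( $this, $since ) = @_;
    my %seen;
    my @topics =
      grep { !$seen{$_}++ }
      map  { $_->{topic} }
      grep { $_->{time} >= $since } @{ $this->{changes} };
    return @topics;
}

package main;

my $store = ToyStore->new();
$store->saveTopic( 'WebHome',   'first draft', 100 );
$store->saveTopic( 'WebNotify', 'subscribers', 110 );

my $last_index_run = 200;
$store->repRev( 'WebHome', 'first draft, typo fixed', 250 );    # not logged
$store->saveTopic( 'WebNotify', 'more subscribers', 260 );      # logged

my @to_reindex = $store->changedSince($last_index_run);
print "reindex: @to_reindex\n";
```

In this toy run the delta indexer is told to reindex only WebNotify, although the text of WebHome also changed after the last run.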
The best you can do now to get precise web changes is code along these lines:
my $since = ...; # epoch secs
my $session = $Foswiki::Plugins::SESSION;
my @topics = Foswiki::Func::getTopicList($web);
foreach my $topic (@topics) {
    # skip topics whose latest revision predates $since
    my $time = $session->getApproxRevTime($web, $topic);
    next if $time < $since;
    index($web, $topic); # placeholder for the indexer's own routine
}
So my question is: why don't we record changes in repRev, and what consequences would that have for other subsystems?
--
MichaelDaum - 03 Feb 2010
I did think about this back when I coded the eachChangeSince API, but my thinking ran like this:
- I personally believe that every last little detailed change needs to be recorded, forever
- the .changes text impl is choked (it uses a text file; any sensible impl will use a DB)
- because the file is choked, it needs to focus on important changes, and minimise "noise"
Note that the same issues apply to log files; I implemented a "level" parameter on the API there, so that the receiver could choose whether to log the event or not. The .changes logger needs to do the same sort of thing (indeed, it may be able to reuse the logger impl.)
Note also that changes to attachments are not even recorded, except as a side-effect of the change to the referring topic.
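The idea of letting the receiver choose, as with the logger's "level" parameter, could look roughly like the sketch below. It is plain Perl and independent of Foswiki: record_change, changes_since, the minor flag and the include_minor option are all hypothetical names, not part of any existing API.

```perl
use strict;
use warnings;

# Hypothetical in-memory change log; the real .changes file format differs.
my @changes;

# Record every change, tagging small-step saves as minor instead of dropping them.
sub record_change {
    my (%args) = @_;
    push @changes,
      {
        topic => $args{topic},
        time  => $args{time},
        minor => $args{minor} ? 1 : 0,
      };
    return;
}

# Names of topics changed at or after $since. Minor changes are returned
# only when the caller passes include_minor => 1.
sub changes_since {
    my ( $since, %opts ) = @_;
    my %seen;
    my @topics =
      grep { !$seen{$_}++ }
      map  { $_->{topic} }
      grep { $_->{time} >= $since && ( $opts{include_minor} || !$_->{minor} ) }
      @changes;
    return @topics;
}

record_change( topic => 'ProjectPlan',  time => 300 );                # full save
record_change( topic => 'ProjectPlan',  time => 320, minor => 1 );    # repRev-style save
record_change( topic => 'MeetingNotes', time => 330, minor => 1 );    # repRev-style save

my @for_mail    = changes_since(250);                         # notifier: major changes only
my @for_indexer = changes_since( 250, include_minor => 1 );   # indexer: everything

print "mail:    @for_mail\n";
print "indexer: @for_indexer\n";
```

With this split, a notifier keeps seeing only ProjectPlan, while an indexer that asks for minor changes also picks up MeetingNotes.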
--
CrawfordCurrie - 03 Feb 2010
For sure saving a topic 15 times during a one hour working session done by the same user should not create 15 revisions of the topic. The repRev feature is brilliant and works well.
And the total change is recorded.
So the problem you are trying to solve is the problem with the full text indexer.
We should take care not to destroy something that works well when we resolve the text indexing problem.
--
KennethLavrsen - 03 Feb 2010
Kenneth, just in case I wasn't clear enough: there are two aspects of repRev under consideration, where it differs from saveTopic():
- it does not create a new revision ... we want to keep it that way.
- it does not record a change to the topic by writing it to the .changes file per web ... that's my concern, and it is why .changes is quite useless for indexers.
The result of Foswiki::Func::eachChangeSince() did not behave as expected for that reason. I am using the above workaround, which relies on unofficial internal APIs. Similar code exists in DBCacheContrib for the use cases outlined at the top.
--
MichaelDaum - 03 Feb 2010
Cool. We agree then.
--
KennethLavrsen - 17 Mar 2010
Given the importance of full-text indexing, I think Urgent status is justified.
Looking back at the records for a couple of my sites, I'm not overly concerned that adding repRev to .changes will flood the logs; it's not as common as you might have thought. While a DB implementation of .changes is a very desirable enhancement, it is not a requirement to fulfil this task. Confirmed.
--
CrawfordCurrie - 29 Mar 2010
- Audited all calls to logging, to ensure consistent interpretation of 'minor' and 'dontlog' options
- Extended change recording to all data-modifying ops in the store, including reprev and attachment and web writes
--
CrawfordCurrie - 06 May 2010