You are here: Foswiki>Tasks Web>Item12965 (06 Nov 2018, MichaelDaum)Edit Attach

Item12965: Cache misses can create deadlocks if there is a collision

pencil
Priority: Urgent
Current State: Closed
Released In: 2.0.0
Target Release: major
Applies To: Engine
Component: PageCache
Branches: master
Reported By: FredTarbox
Waiting For:
Last Change By: MichaelDaum
When a single user account is shared among a large number of users, such as WikiGuest, and caching is turned on, deadlocks are created when the server is under load.

To force this behavior, set PageCache::getPage to always return undef, then put the server under load. I used JMeter to repeatedly load the same page until the error occurred, 10 times per second. It still took a while to show up.

I'm presuming this is caused by two WikiGuests getting cache misses on the same page at the same time, forcing two renders. There could be other causes.

Turning caching on for a wiki for which there was formerly no caching creates a flurry of cache misses that could make this problem more prevalent than in a mature cached wiki.

-- FredTarbox - 15 Jul 2014

Could you please provide some details about your cache configuration as well as the kind of page you are testing against, just to make sure there's no other layer causing a deadlock.

Could you please enable TRACE in Foswiki::PageCache and Foswiki::PageCache::DBI and provide the logs here? Thanks.

-- MichaelDaum - 16 Jul 2014

The logs hit 15 MB before the first deadlock showed up, even though that only took about 30 seconds. Also, they're too detailed for me to scrub all private info from them. Can we arrange an email? I'll try to hang out in the IRC room.

-- FredTarbox - 16 Jul 2014

Settings:
$Foswiki::cfg{Cache}{Enabled} = 1;
$Foswiki::cfg{Cache}{WebDependencies} = 'WebRss, WebAtom, WebTopicList, WebIndex, WebSearch, WebSearchAdvanced';
$Foswiki::cfg{Cache}{DependencyFilter} = '$Foswiki::cfg{SystemWebName}\\..*|$Foswiki::cfg{TrashWebName}\\..*|.*Template$|TWiki\\..*';
$Foswiki::cfg{CacheManager} = 'Foswiki::Cache::FileCache';
$Foswiki::cfg{MetaCacheManager} = 'Foswiki::Cache::DB_File';
$Foswiki::cfg{Cache}{RootDir} = '$Foswiki::cfg{WorkingDir}/tmp/cache';
$Foswiki::cfg{Cache}{DBFile} = '$Foswiki::cfg{WorkingDir}/tmp/foswiki_db';
$Foswiki::cfg{Cache}{NameSpace} = '$Foswiki::cfg{DefaultUrlHost}';
$Foswiki::cfg{Cache}{MaxSize} = 1000;
$Foswiki::cfg{Cache}{Servers} = '127.0.0.1:11211';
$Foswiki::cfg{Cache}{Debug} = 0;
$Foswiki::cfg{Cache}{Implementation} = 'Foswiki::PageCache::DBI::MySQL';
$Foswiki::cfg{Cache}{DBI}{TablePrefix} = 'foswiki_cache';
$Foswiki::cfg{Cache}{DBI}{DSN} = '';
$Foswiki::cfg{Cache}{DBI}{Username} = '';
$Foswiki::cfg{Cache}{DBI}{Password} = MIME::Base64::decode_base64('');
$Foswiki::cfg{Cache}{DBI}{SQLite}{Filename} = '$Foswiki::cfg{WorkingDir}/sqlite.db';
$Foswiki::cfg{Cache}{DBI}{MySQL}{Host} = 'localhost';
$Foswiki::cfg{Cache}{DBI}{MySQL}{Port} = '3306';
$Foswiki::cfg{Cache}{DBI}{MySQL}{Database} = 'Foswiki';
$Foswiki::cfg{Cache}{DBI}{MySQL}{Username} = 'foswiki';
$Foswiki::cfg{Cache}{DBI}{MySQL}{Password} = MIME::Base64::decode_base64('scrubbed');
$Foswiki::cfg{Cache}{DBI}{PostgreSQL}{Port} = '';
$Foswiki::cfg{Cache}{DBI}{PostgreSQL}{Database} = '';
$Foswiki::cfg{Cache}{DBI}{PostgreSQL}{Username} = '';
$Foswiki::cfg{Cache}{DBI}{PostgreSQL}{Password} = MIME::Base64::decode_base64('');

-- FredTarbox - 16 Jul 2014

I also had similar troubles with sqlite, although it's not as good about producing errors for them. My mysql setup is the default.

-- FredTarbox - 16 Jul 2014

Did you test a vanilla Foswiki or are there any additional plugins installed?

Which page did you use for load-testing?

Also: please specify the way you've set up your HTTP server:

  • Which one are you using?
  • CGI or FCGI
  • number of parallel Foswiki backends in case you are using FCGI

You mentioned on IRC that you changed getPage to always return undef. Have you got a patch of your local changes to Foswiki::PageCache:*?

Btw, I've been running 5 parallel users against a Foswiki storing cache meta in sqlite for a while now without problems.

For load-testing the page-cache with all parallel users generating new cache data on the same page all the time I've applied this patch:

--- lib/Foswiki/PageCache/DBI.pm        (revision 17816)
+++ lib/Foswiki/PageCache/DBI.pm        (working copy)
@@ -220,6 +220,9 @@
         $variation->{data} = '' unless defined $variation->{data};
     }
 
+### load testing 
return undef;
+### load testing
     return $variation;
 }

Also no problems using sqlite. Will try mysql later as I can't install DBD::mysql that easily in my plenv env on the test laptop.

-- MichaelDaum - 17 Jul 2014

I'm really sorry, my company drafted me for some extracurricular stuff. I'll try to wrap up all this stuff for you on Sunday.

-- FredTarbox - 17 Jul 2014

Loaded and enabled plugins and contribs:

    AttachContentPlugin    
   AutoCompletePlugin    
   AutoTemplatePlugin    
   BreadCrumbsPlugin    
   ClassificationPlugin    
   CommentPlugin    
   CompareRevisionsAddOn    
   DBCacheContrib    
   DBCachePlugin    
   EditRowPlugin    
   EditSyntaxPlugin    
   FamFamFamContrib    
   FastCGIEngineContrib    
   FilterPlugin    
   FlexFormPlugin    
   FlexWebListPlugin    
   HistoryPlugin    
   HomePagePlugin    
   ImageGalleryPlugin    
   ImagePlugin    
   InfiniteScrollContrib    
   InterwikiPlugin    
   JQGridPlugin    
   JQMomentContrib    
   JQSerialPagerContrib    
   JQTwistyContrib    
   JQueryPlugin    
   JSCalendarContrib    
   LdapContrib    
   MailerContrib    
   MediaWikiTablePlugin    
   MediaWikiToFoswikiContrib    
   MimeIconPlugin    
   NatEditPlugin    
   NatSkin    
   NatSkinPlugin    
   NewUserPlugin    
   PatchFoswikiContrib    
   PatchItem11983Contrib    
   PatchItem12285Contrib    
   PatternSkin    
   PlainFileStoreContrib    
   PreferencesPlugin    
   PublishPlugin    
   RedDotPlugin    
   RenderListPlugin    
   RenderPlugin    
   SetVariablePlugin    
   SlideShowPlugin    
   SmiliesPlugin    
   SolrPlugin    
   SpreadSheetPlugin    
   StringifierContrib    
   SubscribePlugin    
   TablePlugin    
   TagCloudPlugin    
   TinyMCEPlugin    
   TipsContrib    
   TopicInteractionPlugin    
   TopicUserMappingContrib    
   TwistyPlugin    
   UpdateAttachmentsPlugin    
   UpdateInfoPlugin    
   WebLinkPlugin    
   WikiWorkbenchContrib    
   WysiwygPlugin    
   YahooGridsAddOn    
   YahooUserInterfaceContrib

Server: apache 2.2.22 using mod_fcgid Locked at 6 processes. (min and max)

I'm using Internal/IntentionallyBlank as my target page, which is (surprise) intentionally blank. I'm unable to duplicate the deadlocks on any page in System, including System/TwistyPlugin which looked promising because of the large number of %INCLUDEs in it. So, um... either System is excepted in some way, or maybe it has something to do with my WebMenu and/or WebLinks? I'll take direction on that.

We do have a rest call built into our UI that routes to a plugin which in turn taps our main sql database for the tiniest bit of "live" info useful to our users, no more than 10-20 characters max. But that's being called on the System pages in addition to the rest of the wiki.

The change to return undef is my only change to pagecache, and I only made the change because I could not recreate the error without it. It was a stab in the dark, it's possible the error it creates isn't my original error, but it at least produces the same error message:
Foswiki detected an internal error - please check your Foswiki logs and webserver logs for more information.

DBD::mysql::st execute failed: Deadlock found when trying to get lock; try restarting transaction

The logs reveal
stderr: DBD::mysql::st execute failed: Deadlock found when trying to get lock; try restarting transaction at /var/www/foswiki.telenetwork.com/lib/Foswiki/PageCache/DBI.pm line 368.
But the surrounding log entries are malformed, so I'm not sure what's causing it. (Finally showed up again)

-- FredTarbox - 20 Jul 2014

Can you repro the deadlock while all extras are disabled, i.e. stuff also interacting with the DB?

-- MichaelDaum - 21 Jul 2014

I disabled dbcacheplugin and turned off ldap based security, then disabled all plugins and could not reproduce the deadlock. I re-enabled NatSkinPlugin (the page was rather silly looking without it) and nothing else, in particular not jquery plugin, and the deadlock returned, despite the page being pretty ugly. Without jquery plugin, none of my ui changes are in effect, they're just meaningless text now.

-- FredTarbox - 21 Jul 2014

You mentioned some DB related customizations based on NatSkin, right? Try to disable them while still using NatSkinPlugin.

-- MichaelDaum - 23 Jul 2014

No, I've just checked. No changes to NatSkin's code. However! I did make a change to automatic linking. It only links to a topic if you have view rights to that topic. This makes NatSkin loading WebMenu fire off a huge number of security checks when rendering a page, since WebMenu is nothing but links. When I disable the view rights check, I don't get deadlocks. Or at least, I didn't after the 60 seconds I let it run. Might've gotten lucky.

-- FredTarbox - 28 Jul 2014

Fred, as long as you are narrowing down and given the fact that I wasn't able to repro the error, I'll have to lower the prio of this bug report from Urgent to Normal. Please feel free to up the prio again as soon as you are sure where the error stems from in your setup. Thanks again for excessively testing the upcoming Foswiki code. This is much appreciated.

-- MichaelDaum - 29 Jul 2014

With the security check I added disabled and a page that has 95 distinct tests of view rights like so:
         %IF{"'SomeWeb.SomePage' allows 'view'" then="YES!" else="NO!"}%
I get deadlocks. That's 95 different pages, not the same page 95 times, just to be clear.

-- FredTarbox - 29 Jul 2014

Could you please attach your LocalSite.cfg (passwds removed)?

-- MichaelDaum - 29 Jul 2014

Attached.

-- FredTarbox - 29 Jul 2014

Okay, I've been reading it top down. Some changes:

  1. remove rest from {AuthScripts}
  2. switch back {Store}{Implementation} to RcsWrap
  3. switch {PasswordManager} back to HtPasswdUser ... your user mapper isn't ldap'ed yet anyway

-- MichaelDaum - 30 Jul 2014

Done. I've also disabled all plugins this time, and now I've gotten a deadlock on a blank page:
%JQREQUIRE{"superfish, ui, blockui, farscroll, scrollto, imagetooltip, autosuggest, ui::tooltip, cookie, fontawesome" warn="off"}% %WEBCOMPONENT{"HtmlHead"}%
%USERACTIONS{ guest="
$login$sep$register
" format="
$account$sep$logout
" }%
%NATWEBLOGO{height="50"}%
%WEBCOMPONENT{"SideBar"}%
%WEBCOMPONENT{"TopicActions"}%
You are here: %BREADCRUMBS{separator=" » " newtopicformat="$topic"}%

%WEBCOMPONENT{"BottomBar"}%
This page was cached on 30 Jul 2014 - 09:09. Get a fresh version here.
^
%WEBCOMPONENT{"HtmlBottom"}%
Nothing resolves because all the plugins are disabled. On the bright side, this means that I have an error log that's free of confidential information, attached.

-- FredTarbox - 30 Jul 2014

Error confirmed.

Here's a patch.

Please try.

-- MichaelDaum - 05 Aug 2014
 

ItemTemplate edit

Summary Cache misses can create deadlocks if there is a collision
ReportedBy FredTarbox
Codebase trunk
SVN Range
AppliesTo Engine
Component PageCache
Priority Urgent
CurrentState Closed
WaitingFor
Checkins distro:b20daa54f110
TargetRelease major
ReleasedIn 2.0.0
CheckinsOnBranches master
trunkCheckins
masterCheckins distro:b20daa54f110
ItemBranchCheckins
Release01x01Checkins
I Attachment Action Size Date Who Comment
LocalSite.cfgcfg LocalSite.cfg manage 53 K 29 Jul 2014 - 18:10 FredTarbox LocalSite.cfg of deadlocking Foswiki.
PageCache.patchpatch PageCache.patch manage 14 K 05 Aug 2014 - 14:56 MichaelDaum  
apache.loglog apache.log manage 1 MB 30 Jul 2014 - 14:43 FredTarbox Apache error log with trace turned on
mysql.loglog mysql.log manage 47 K 30 Jul 2014 - 14:44 FredTarbox InnoDB logs showing deadlocks.
Topic revision: r22 - 06 Nov 2018, MichaelDaum
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere. See Copyright Statement. Creative Commons License    Legal Imprint    Privacy Policy