Feature Proposal: Explicitly control the storage location of temporary files used by Foswiki

Motivation

Tasks.Item10408 has exposed an issue that arises when multiple Foswiki installations are hosted on the same server: collisions in /tmp between files with different owners were causing failures.

It is not possible to resolve this using the $ENV{TMPDIR} setting, because File::Spec and File::Temp ignore the variable when taint checking is enabled and its value is tainted. The solution is to ensure that the temporary-directory environment variables used on the various platforms are all untainted.
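A minimal sketch of the untainting step, assuming the variables of interest are TMPDIR, TEMP and TMP. The helper name untaint_tmp_env and the accepted character set are hypothetical; this is not existing Foswiki code, and a real implementation would need to choose a pattern suited to the platforms it supports.

```perl
use strict;
use warnings;

# Hypothetical helper: untaint the temp-dir environment variables so that
# File::Spec->tmpdir() and File::Temp can honour them under taint mode.
# Returns the names of the variables that were accepted.
sub untaint_tmp_env {
    my @accepted;
    for my $name (qw(TMPDIR TEMP TMP)) {
        next unless defined $ENV{$name};

        # Assumption: only a conservative set of path characters is allowed.
        if ( $ENV{$name} =~ m{\A([\w/\\:.~ -]+)\z} ) {
            $ENV{$name} = $1;    # a regex capture is not tainted
            push @accepted, $name;
        }
        else {
            delete $ENV{$name};    # drop values we are not willing to trust
        }
    }
    return @accepted;
}
```

The helper would need to run before the first call to File::Spec->tmpdir(), since some versions of File::Spec cache the result.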

See also Tasks.Item9233 for Windows temporary file issues.

Description and Documentation

Foswiki has a deprecated / hidden temporary file location - $Foswiki::cfg{TempfileDir} that is documented as retained for possible use by plugins.

This proposal is to:

  • Define {TempfileDir} in Foswiki.spec as an expert parameter. Default: $Foswiki::cfg{WorkingDir}/tmp (no change from the current default).
  • Add a TempfileDir.pm checker that ensures a sane default. Suggest alternatives from the environment variables if available.
    • Suggest $Foswiki::cfg{WorkingDir}/requestTmp as another alternative.
  • Update any modules using File::Temp and File::Spec to use a configurable directory.
  • Use a temporary root for each use of temporary files. These would be implemented as constants in their respective modules rather than adding more config variables. For example, {TempfileDir}/sessions for CGI session files created by LoginManager, {TempfileDir}/meta for files created during attachment handling, etc.
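The per-use roots described in the last bullet could be sketched as follows. The constant names, the tmp_root() helper, and the hash ref standing in for \%Foswiki::cfg are all hypothetical; only the {TempfileDir} setting and the sessions and meta subdirectory names come from the proposal.

```perl
use strict;
use warnings;
use File::Spec ();
use File::Path qw(make_path);

# Hypothetical per-module constants, one per kind of temporary file.
use constant SESSION_SUBDIR => 'sessions';
use constant META_SUBDIR    => 'meta';

# Build (and create if needed) the root directory for one use of temp files.
# $cfg is a hash ref standing in for \%Foswiki::cfg (assumption).
sub tmp_root {
    my ( $cfg, $subdir ) = @_;
    my $root = File::Spec->catdir( $cfg->{TempfileDir}, $subdir );
    make_path( $root, { mode => 0700 } ) unless -d $root;
    return $root;
}
```

A caller such as LoginManager would then ask for tmp_root( \%Foswiki::cfg, SESSION_SUBDIR ) rather than building the path itself.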

Searching for references to File::Spec and File::Temp - the following modules appear to use temporary files.
  • Foswiki::Store::VC::Handler::mkTmpFilename (This does not appear to be referenced anywhere, is it a dead function? The only place it is referenced is in the RCS unit tests, and in the GitPlugin store.)
  • Foswiki::Sandbox::sysCommand() Cache to capture STDERR. Uses {WorkingDir}/tmp explicitly. Update to use =File::Spec->tmpdir()=
  • Foswiki::Plugins::EmptyPlugin Contains examples of using File::Temp
  • Foswiki::Meta::attach() Temporary storage during attachment processing. Already uses File::Temp
  • Foswiki::Configure::Package Stores extension files fetched from the repository. Already uses File::Temp
  • Foswiki::Configure::Util Storage for expanded archive files. Already uses File::Temp
  • Foswiki::Cache (temporary file storage is already explicitly set in the configuration. $Foswiki::cfg{Cache}{RootDir} = '$Foswiki::cfg{WorkingDir}/cache'; )
  • rcs uses the environment variable TMPDIR provided that it is not tainted.

Searching for explicit use of cfg{WorkingDir}, all of the current usage in core + default extensions appears to be consistent with the intended use.
  • {WorkingDir}/cache used by page cache DBI for file storage
  • {WorkingDir}/htpasswd.lock used by Users::HtPasswdUser as the lock file.
  • {WorkingDir}/sqlite.db also used by page cache.
  • {WorkingDir}/registration_approvals used by UI::Register
  • {WorkingDir}/tmp used by:
    • Foswiki::LoginManager for session files, and session/ip map files.
  • {WorkingDir}/work_areas used by Foswiki::Store for plugin persistent file storage.
  • {WorkingDir}/languages.cache used by Foswiki::I18N

Other temporary file usage

The CGI code has its own temporary file implementation, used primarily to hold files during upload. This change currently does not apply to CGI or the Foswiki::Request::Upload code.

See CPAN:CGI

Examples

Impact

%WHATDOESITAFFECT%

Implementation

-- Contributors: GeorgeClark - 25 Feb 2011

Discussion

I'm curious about when to use File::Spec rather than just blah/blah to maintain portability... for example, $Foswiki::cfg{Cache}{RootDir} = '$Foswiki::cfg{WorkingDir}/tmp/cache'; could be rewritten as File::Spec->catdir($Foswiki::cfg{WorkingDir}, 'tmp', 'cache')
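To illustrate the difference, here is a small sketch that calls the Unix and Win32 flavours of File::Spec directly so both results can be compared on one machine. The relative path parts are made up for the example; ordinary code would call File::Spec->catdir() and let Perl select the flavour for the running platform.

```perl
use strict;
use warnings;
use File::Spec::Unix  ();
use File::Spec::Win32 ();

# Illustrative path components only.
my @parts = ( 'working', 'tmp', 'cache' );

# Same call, platform-specific separators in the result.
my $unix  = File::Spec::Unix->catdir(@parts);     # working/tmp/cache
my $win32 = File::Spec::Win32->catdir(@parts);    # working\tmp\cache

print "$unix\n$win32\n";
```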

-- PaulHarvey - 26 Feb 2011

we should always use File::Spec - that way, the code is more likely to work with fewer modifications on platforms we're not using.. like, mmm, someone had tmwiki running on VMS once, and another on some s/360 etc - and who knows what will happen in future...

ok, additionally, if you can figure out how to get rcs to use the set dir, rather than /tmp, quite a few admins will thank you - as that has made them cry a few times.

-- SvenDowideit - 26 Feb 2011

I'd forgotten about rcs, but according to some man pages I've found searching around:

Temporary files are created in the directory containing the working file, and also in the temporary directory (see TMPDIR under ENVIRONMENT ). ... TMPDIR

Name of the temporary directory. If not set, the environment variables TMP and TEMP are inspected instead and the first value found is taken; if none of them are set, a host-dependent default is used, typically /tmp.

Does this not work? Could we set TMPDIR prior to invoking RCS commands and make sure the sandbox passes it through into the environment? I pulled down the rcs source to take a look:
       if (!s
                &&  !(s = cgetenv("TMPDIR"))    /* Unix tradition */
                &&  !(s = cgetenv("TMP"))       /* DOS tradition */
                &&  !(s = cgetenv("TEMP"))      /* another DOS tradition */
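
A rough sketch of the "set TMPDIR before invoking the command" idea, in Perl. The helper run_with_tmpdir is hypothetical and does not go through Foswiki::Sandbox; whether the sandbox passes the variable through to rcs is exactly the open question here.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command with TMPDIR pointing at $tmpdir and
# return whatever the command writes to STDOUT.
sub run_with_tmpdir {
    my ( $tmpdir, @cmd ) = @_;

    # local restores the previous value (or absence) when the sub returns.
    local $ENV{TMPDIR} = $tmpdir;

    # List-form pipe open avoids the shell; the child inherits %ENV.
    open( my $out, '-|', @cmd ) or die "Cannot run $cmd[0]: $!";
    my $output = do { local $/; <$out> } // '';
    close $out;
    return $output;
}
```

For example, run_with_tmpdir( $dir, 'rlog', $file ) would run rlog with TMPDIR set for that one call only.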

-- GeorgeClark - 26 Feb 2011

/tmp (and by implication File::Spec->tmpdir()) is usually defined as being "a scratch area which you can use to hold files and directories for short periods of time" and "cleared whenever the system is 'booted up' and by the system administrator when the directory gets full". Most of us regard /tmp as a relatively small, server-specific, transitional partition that can be cleared as and when we feel like it. Because /tmp is local, we tend to regard it as "fast".

Does the way we use working/tmp (ignoring configure) correspond to this view?
  • Originally working/tmp started out as a home for session files (which is how it got the name tmp, because these files were moved there from /tmp). Session files are not true temp files, because they (can) persist well beyond the end of process activation / request handling.
    • Killing session files arbitrarily would (1) force users to log in again and (2) cause loss of session variables.
  • It also serves as the home of the ip2sid map - used on very, very few installs, I suspect, but a persistent file and definitely not /tmp material.
    • Killing ip2sid arbitrarily would require connected users to log in again.
  • Next came passthru files, which were closer to "true" tmp files, in that they have a strictly defined life cycle.
    • Killing passthru files could break requests, especially authentication.
So what we have is close to /tmp but not quite the same; it's not really a scratch area, it's a managed storage area.

So, what about other uses of /tmp? George captured them:
  1. Foswiki::Store::VC::Handler::mkTmpFilename is used on Windows only, IIRC, for very short-lived files created during checkin
  2. Foswiki::Sandbox::sysCommand() Cache to capture STDERR
  3. Foswiki::Plugins::EmptyPlugin Contains examples of using File::Temp
  4. Foswiki::Meta::attach() Temporary storage during attachment processing
  5. Foswiki::Cache (temporary file storage is already explicitly set in the configuration. $Foswiki::cfg{Cache}{RootDir} = '$Foswiki::cfg{WorkingDir}/tmp/cache'; )
1, 2 and 4 are temp files held only for the duration of a single request - true temp files that can be purged almost as soon as they are closed. Arbitrary deletion isn't going to do them any favours. But they are all server-local and need to be fast (which is why they were left in /tmp). 3 and 5 I'm not so sure about, but in general:
  • Foswiki uses /tmp as fast, local, request-specific store. Files created there are not expected to live beyond the end of a request, and are specific to a single request.
  • working/tmp on the other hand is for longer-lived, Foswiki-managed files that are expected to persist over many requests.
Can these two file types coexist in a single directory? I'm not so sure. If we need to provide a cushion for /tmp then I'd prefer (subject to someone persuading me otherwise) to add working/request_tmp.

-- CrawfordCurrie - 26 Feb 2011

working/request_tmp sounds fine. I wonder whether renaming working/tmp to session_tmp, at least for new installations, might make sense. I guess I'm guilty of not reading the README in working/tmp, but I had just "assumed" that anything named tmp would be for any temporary file use.

Okay, how about separating temporary files into explicit "life of session" working/session_tmp and "life of request" working/request_tmp directories, and defining them with two expert configuration parameters - {sessionTmp} and {requestTmp}? This way the two classes of transient storage are documented in the configuration, and can be modified to accommodate requirements on shared hosting or other unique installations.
  • In Foswiki.pm,
    • if sessionTmp is undefined, default to working/tmp. Upgraded sites then would not have any loss of session data.
    • if requestTmp is undefined, guess per the current rules, such as using File::Spec.
  • In configure
    • If sessionTmp is undefined, the checker determines whether working/tmp exists and contains anything other than the README. If yes, set it to working/tmp; otherwise create the working/session_tmp directory and use that for session files.
    • If requestTmp is undefined, the checker can guess using the File::Spec tmpdir setting, or as appropriate for the platform. This way there is no significant change for simple installations. Sites with multiple Foswiki installations under different users, or with other unique requirements, can set the expert parameter.
The modules identified above would then use the configured sessionTmp or requestTmp explicitly in all temp file requests (and set it into the environment for rcs). Document the use of requestTmp in EmptyPlugin.pm.
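
The Foswiki.pm defaulting rules above could be sketched roughly as follows. The function name apply_tmp_defaults and the hash ref standing in for \%Foswiki::cfg are hypothetical, and sessionTmp / requestTmp are the proposed parameter names, not existing settings.

```perl
use strict;
use warnings;
use File::Spec ();

# Fill in defaults for the two proposed expert settings.
# $cfg stands in for \%Foswiki::cfg (assumption).
sub apply_tmp_defaults {
    my ($cfg) = @_;

    # Upgraded sites keep their session data: fall back to working/tmp.
    $cfg->{sessionTmp} = File::Spec->catdir( $cfg->{WorkingDir}, 'tmp' )
      unless defined $cfg->{sessionTmp};

    # No explicit setting: guess per the current rules via File::Spec.
    $cfg->{requestTmp} = File::Spec->tmpdir()
      unless defined $cfg->{requestTmp};

    return $cfg;
}
```

An explicitly configured value is never overwritten, so a site that sets either parameter keeps full control.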

-- GeorgeClark - 27 Feb 2011

The title of this topic implies that explicit control of the temp directory is the best solution. Is there anything inherently wrong with using /tmp if the file name is sufficiently unique to avoid conflicts between installs? One solution is to add something like the current time to the name, as I suggested in Tasks.Item10408. Is there some other benefit that requires a more substantial fix?

-- RaymondLutz - 10 Mar 2011

The cleanup scripts might well be simplified by putting the various types of temp files in separate directories.

Temporary files are a tricky business - intuitive approaches such as using PIDs or adding times have multiple dangerous failure modes. Use File::Temp; use file handles (and never temp file names). See http://perldoc.perl.org/File/Temp.html (read the whole thing, especially the warnings), the Security::Temporary Files section of the camel book, and open( .., '+>', undef) for some basic information.
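
As an illustration of the handle-only style, here is a minimal sketch. The function names are made up for the example; the calls themselves are the standard open with an undef filename and File::Temp's tempfile() in scalar context.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Round-trip data through an anonymous temporary file. The file never has a
# name that another process could guess or race against.
sub scratch_roundtrip {
    my ($data) = @_;
    open( my $fh, '+>', undef )
      or die "Cannot create anonymous temp file: $!";
    print {$fh} $data;
    seek( $fh, 0, 0 ) or die "seek failed: $!";
    my $copy = do { local $/; <$fh> };
    close $fh;
    return $copy;
}

# Scalar-context tempfile() returns only the handle; File::Temp picks the
# name and removes the file where the platform allows it.
sub scratch_handle_in {
    my ($dir) = @_;
    my $fh = tempfile( DIR => $dir );
    return $fh;
}
```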

Please don't re-invent solutions for uniqueness - you'll have a painful experience, open security holes, rediscover portability issues - and consume time that can be applied to more productive uses.

-- TimotheLitt - 20 Apr 2011

This advice is good; however, there are applications where there is no way to pass through a file handle. I don't see that we have any choice but to pass through a filename. For example, when capturing the STDERR / STDOUT from a Sandbox script, the file name is opened in a different thread. It's also not good to leave the file handle open with multiple writers, so we have to close it. If you have suggestions for a portable solution here, they would be appreciated.

    use File::Temp qw( tempfile );

    # Note:  Use of the file handle $fh returned here would be safer than
    # using the file name. But it is less portable, so filename will have to do.
    my ( $fh, $stderrCache ) = tempfile(
        "STDERR.$$.XXXXXXXXXX",
        DIR    => "$Foswiki::cfg{WorkingDir}/tmp",
        UNLINK => 0
    );
    close $fh;

This is the use case that triggered this work. Also to another comment above, /tmp is a good location, except on Windows, and possibly some other platforms. There have been cases where the temporary files all end up in the C:/ root location, or when that location is not writable, Foswiki crashes.
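
For reference, the whole capture cycle around such a named file might look like the sketch below. It is not the Foswiki::Sandbox implementation: capture_stderr is a hypothetical helper, and it assumes the child can simply inherit a redirected STDERR, which may not hold for every platform Sandbox has to support.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: run @cmd, capture what it writes to STDERR in a named
# temp file under $dir, and return the captured text.
sub capture_stderr {
    my ( $dir, @cmd ) = @_;

    # File::Temp creates the file safely; only the name is kept.
    my ( $fh, $file ) =
      tempfile( "STDERR.$$.XXXXXXXXXX", DIR => $dir, UNLINK => 0 );
    close $fh;

    # Point STDERR at the file for the duration of the child process.
    open( my $saved, '>&', \*STDERR ) or die "Cannot save STDERR: $!";
    open( STDERR, '>', $file ) or die "Cannot redirect STDERR: $!";
    system(@cmd);
    open( STDERR, '>&', $saved ) or die "Cannot restore STDERR: $!";
    close $saved;

    # Read the capture back and remove the file ourselves (UNLINK => 0).
    open( my $in, '<', $file ) or die "Cannot read $file: $!";
    my $text = do { local $/; <$in> } // '';
    close $in;
    unlink $file;
    return $text;
}
```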

-- GeorgeClark - 20 Apr 2011

I realize that this is already accepted, but I've learned a bit more along the way. It won't be as far reaching.
  • File::Spec should work fine provided that the $ENV variables for the temporary directory are untainted. The primary effect of the change will be to make sure that the environment variables are set correctly and untainted.
  • the working/requestTmp directory will be suggested, but not used by default, except on Windows.
  • I won't bother renaming working/tmp to working/sessionTmp. Not sure of the value.

-- GeorgeClark - 24 Oct 2012

I'm glad that this is slated for a correction, as I decided not to chance an upgrade until this was fixed. I think having a separate temporary file area for each install should fix it, and I think reducing the scope of the change is a smart move. I wonder if it would also be prudent to check whether the proposed filename already exists before attempting to use it, rather than enduring a hard-stop failure.

I was trying to determine if this was already corrected in 1.1.3. Why do I see Tasks.Item10408 as closed in http://foswiki.org/Tasks/TasksByRelease?release=1.1.3 ??

-- RaymondLutz - 26 Oct 2012

The bug was "fixed" or at least minimized in 1.1.3 by adding the pid to the temporary file name.

-- GeorgeClark - 27 Oct 2012
Topic revision: r17 - 05 Jul 2015, GeorgeClark