Using git to manage your Foswiki installation

Problem

This tip is probably only useful to very few Foswiki administrators; it was developed in an environment where we wanted to track Foswiki's code, data, pub, htpasswd and LocalSite.cfg so that the production environment could be duplicated more easily onto the staging and development servers.

It was also a moderately large wiki, with ~210,000 topics and almost 100GB of attachments. The approach will mostly appeal to those who want to manage their Foswiki installation with git. Reasons for doing this might include:
  • You have a testing server which mirrors your production Foswiki instance. You would rather use git to keep the test/dev servers synchronised with prod (including testing bulk topic editing/deployment before pushing to prod).
  • You are already using git elsewhere, and would like to incorporate git for disaster recovery, development, tracking config & data changes.

Context

The script below makes some assumptions:
  • Directories are setgid (not 'sticky'; the script uses chmod g+s) and are only owner/group read+writable, i.e. mode 2770
  • Files are owner/group read+writable, i.e. 0660
  • Directories & files are owned www-data:fwadmins - that is, owned by the webserver user, with a group dedicated to users who are expected to be able to manipulate Foswiki files directly
  • Foswiki is configured to use data & pub directories located outside of the Foswiki installation
  • Foswiki is configured to use a .htpasswd file located outside of the Foswiki installation, in a dedicated git repository
  • Foswiki's configuration file, LocalSite.cfg, is located outside of the Foswiki installation, in a dedicated git repository (a symbolic link into the Foswiki installation's lib/ directory is used)
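
The 2770/0660 modes assumed above can be checked before the cron script is trusted with a storage tree. The following is a minimal sketch, not part of the original tip: audit_perms is a hypothetical helper name, and it only reports paths whose mode is missing one of the expected bits (a file with extra bits, such as 0664, is not flagged).

```shell
#!/bin/bash
# Hypothetical helper: list anything under a storage root that is missing
# the permission bits assumed in the Context section. .git directories are skipped.
audit_perms() {
    local root="$1"
    # Directories missing setgid or any owner/group rwx bit (expected 2770)
    find "$root" -name .git -prune -o -type d ! -perm -2770 -print
    # Files missing any owner/group read or write bit (expected 0660)
    find "$root" -name .git -prune -o -type f ! -perm -0660 -print
}

# Example call (path as used elsewhere on this page):
#   audit_perms /path/to/foswiki/storage
```

No output means nothing under the given root is missing the assumed bits; any path printed would be corrected by the "Fix perms" step at the top of the script in the Solution section.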

Solution

Notes:
  • Due to the number of topics managed by our installation, we split data/ and pub/ up so that each root web has its own git repository; data/ and pub/ are also set up as parent repositories, which don't track any files but are merely used to track the child repos for convenience (this allows us to do git submodule foreach <do stuff>).
  • The script below automatically detects any newly created root webs, initialises git there and adds this as a new repo to the parent supermodule.
  • pub/ git repositories are configured not to do any compression or deltas. On our virtual machines, this just added an impossible amount of CPU overhead when repos exceeded the size of server RAM, and with most attachments already being compressed .png, .jpg, .pdf files etc., the compression was probably not saving much disk space anyway.

#!/bin/bash
# Fix perms
nice chown -R www-data:fwadmins /path/to/foswiki/storage
nice chmod -R ug+rwX /path/to/foswiki/storage
nice find /path/to/foswiki/storage -type d -exec chmod g+s {} \;
# Essentials
sudo -u www-data nice perl -I /path/to/foswiki/core/bin /path/to/foswiki/core/tools/mailnotify -user AdminUser
sudo -u www-data nice perl -I /path/to/foswiki/core/bin /path/to/foswiki/core/tools/tick_foswiki.pl
# Deal with any newly created root webs in data/ (set them up as git submodules)
sudo -u www-data nice bash -c 'find /path/to/foswiki/storage/data -maxdepth 1 -mindepth 1 -type d -not -name .git |
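while IFS= read -r dir; do
    if [ ! -d "$dir/.git" ]; then
        #echo "Doing $dir as $USER"
        # Skip this web if we cannot enter it, rather than running git init elsewhere
        cd "$dir" || continue
        git init
        git config core.filemode false
        cd ..
        git submodule add -q "./$(basename "$dir")"
    fi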
done'
# Deal with any newly created root webs in pub/ (set them up as git submodules)
sudo -u www-data nice bash -c 'find /path/to/foswiki/storage/pub -maxdepth 1 -mindepth 1 -type d -not -name .git -not -name images |
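while IFS= read -r dir; do
    if [ ! -d "$dir/.git" ]; then
        #echo "Doing $dir as $USER"
        # Skip this web if we cannot enter it, rather than running git init elsewhere
        cd "$dir" || continue
        git init
        git config core.filemode false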
        # These settings assume a pub/RootWeb directory containing mostly binary
        # files. Without these settings, cloning, pulling, gc and repacking repos
        # approaching the size of the free RAM on the server is fantastically slow.
        #
        # These settings severely limit delta compression, so if you have many
        # unique revs, this can take up a lot of disk. If you tweak these settings,
        # run "git gc" and/or "git repack" on the affected repo(s)
        git config pack.depth 1
        # Compressing already compressed attachments (.pdf, .png, .jpg, etc)
        # can bog down the CPU of the server unnecessarily; especially if the
        # repo does not fit in available system RAM, it really thrashes the disk.
        # TODO: demonstrate a nice way to schedule compression/re-packing
        # (with deltas) in a separate cron job, e.g. over the weekend.
        git config core.compression 0
        git config core.loosecompression 0
        git config pack.compression 0
        # Ignore DirectedGraphPlugin & ImagePlugin temporary files
        printf "%s\n" "igp_*" "DirectedGraphPlugin_*" >> .gitignore
        cd ..
        git submodule add -q "./$(basename "$dir")"
    fi
done'
# Add *.txt and *.txt,v in separate commits. We don't want .lease/.changes or
# other cruft in here, otherwise it's difficult for Foswikis running a clone of
# data/ to stay merged without conflicts in .changes/.lease, etc.
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/data && git submodule foreach "find . -name .git -prune -o -type d -exec bash -c \"cd {} && git add *.txt && git commit -q -m cron-update:txt || : && git add *.txt,v && git commit -q -m cron-update:txt,v || : \" \; && git commit -q -a -m cron-update:other || :"'
# Update the supermodule to point at the latest commits in the submodules.
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/data && git commit -q -a -m "cron-update"'
# Commit everything in pub/.
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/pub && git submodule foreach "git add . && git commit -q -a -m cron-update" || :'
# Update the supermodule to point at the latest commits in the submodules.
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/pub && git commit -q -a -m "cron-update"'
# Commit htpasswd file
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/htpasswd && git commit -q -a -m "cron-update"'
# Commit LocalSite config
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/LocalSite && git commit -q -a -m "cron-update"'
# Update supermodule in case we've upgraded an extension/core
sudo -u www-data nice bash -c 'cd /path/to/foswiki && git commit -q -a -m "cron-update"'
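
The TODO in the script above asks for a way to schedule re-packing with deltas in a separate cron job. One possible shape for that job is sketched below; it is an assumption rather than something we have run in production. repack_with_deltas is a hypothetical helper, and the default depth and window values are illustrative only. It overrides the per-repo compression settings for a single run with git -c, so the stored configuration (and therefore the nightly cron behaviour) stays as it is.

```shell
#!/bin/bash
# Hypothetical helper for a separate (e.g. weekend) cron job: repack one
# pub/ child repo with deltas and zlib compression, for this run only.
repack_with_deltas() {
    local repo="$1"
    local depth="${2:-50}"    # illustrative default, tune for your data
    local window="${3:-10}"   # illustrative default, tune for your data
    ( cd "$repo" &&
      # -1 selects the zlib default level; -f recomputes deltas from scratch
      nice git -c core.compression=-1 -c pack.compression=-1 \
          repack -a -d -f -q --depth="$depth" --window="$window" )
}

# Example loop over the pub/ child repos (path as used elsewhere on this page):
#   for d in /path/to/foswiki/storage/pub/*/; do
#       [ -d "$d/.git" ] && repack_with_deltas "$d"
#   done
```

Expect this to be as CPU- and disk-intensive as the notes above describe for large repos, which is the reason for keeping it out of the regular cron run.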

We can now clone this by doing something like:
git clone server.org:/path/to/foswiki/storage/data
cd data
git submodule update --init
cd ..
git clone server.org:/path/to/foswiki/storage/pub
cd pub
git submodule update --init
cd ..
git clone server.org:/path/to/foswiki/storage/LocalSite
git clone server.org:/path/to/foswiki/storage/htpasswd

We could actually create a single super-parent repo to hold all of these; then we'd need only one clone command, followed by git submodule update --init --recursive. Anyway, we use something like this to keep clones up to date:

#!/bin/sh
bash -c 'cd /path/to/foswiki/storage/data && git pull origin && git submodule update --init'
bash -c 'cd /path/to/foswiki/storage/pub && git pull origin && git submodule update --init'
bash -c 'cd /path/to/foswiki/storage/htpasswd && git pull origin'
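
The update script above carries on silently if one of the pulls fails. A slightly more defensive variant is sketched here, as an assumption rather than what we actually run: update_clone is a hypothetical helper, and --ff-only is used so that a clone which has diverged from origin fails loudly instead of creating a merge commit.

```shell
#!/bin/bash
# Hypothetical helper: fast-forward one clone and its submodules,
# reporting (and returning non-zero on) any failure.
update_clone() {
    local dir="$1"
    ( cd "$dir" &&
      git pull -q --ff-only origin &&
      git submodule --quiet update --init ) ||
        { echo "update failed: $dir" >&2; return 1; }
}

# Example usage (paths as used elsewhere on this page):
#   for d in data pub htpasswd; do
#       update_clone "/path/to/foswiki/storage/$d"
#   done
```

For repositories without submodules, such as htpasswd, the submodule step does nothing, so the same helper can be used for all of the clones.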

Known Uses

Known Limitations

See Also

BestPracticeTipsForm

Category: Installation and Upgrading
Related Topics: GitAndPseudoInstall
Topic revision: r2 - 16 Apr 2012, PaulHarvey