Life with MongoDBPlugin

Notes for living with MongoDBPlugin

Building the CPAN driver without cpan

Installing with the cpan tool is convenient, but we usually try to use OS packages instead wherever possible.

Debian: using cpan2deb from dh-make-perl (the version shipped with Ubuntu 10.04 LTS won't work; 0.68 from natty seems to work ok):
sudo apt-get install libany-moose-perl libmoose-perl libclass-method-modifiers-perl libdata-types-perl libdatetime-perl libfile-slurp-perl libjson-perl libjson-xs-perl libjson-any-perl libtest-exception-perl libtry-tiny-perl libboolean-perl
cpan2deb MongoDB
sudo dpkg -i libmongodb*
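
To sanity-check that the packaged driver actually loads, you can print its version from the command line (this just relies on the usual $MongoDB::VERSION package variable that CPAN modules expose):
perl -MMongoDB -e 'print "$MongoDB::VERSION\n"'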

Working in mongo

Mongo shell (with auth)

mongo mongod.example.org:27017
use admin
db.auth('<user>', '<pass>')
# Show dbs, just for fun
show dbs
# TODO: Work with Sven to make this easier. Step 1: 'use' any old web (they're all md5sum'd instead of human-readable names), e.g. the sandbox web is web_2652eec977dcb2a5aea85f5bec235b05
use web_2652eec977dcb2a5aea85f5bec235b05
# Get a database name
db.eval('foswiki_getDatabaseName("TaxonProfile")')
# Use it
use web_aa908ce370b040470c582f47536b09a8
# Confirm it's a real web
show collections
# Do things, e.g. db.dropDatabase(), db.getIndexes(), etc.
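
The same admin session can be scripted non-interactively with --eval if you'd rather not type into the shell (a sketch only; substitute your own host and credentials - listDatabases is the command behind 'show dbs'):
mongo --quiet mongod.example.org:27017/admin --eval "db.auth('<user>', '<pass>'); printjson(db.runCommand({listDatabases: 1}))"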

Alternatively, the web hash can be looked up without Sven's js function, using mongodb's native find() method:
> use webs
switched to db webs
> show collections
map
system.indexes
> db.map.find({_id: 'Marine/SeaSlugs/Messages'})
{ "hash" : "web_60f1085a95ad14944fee4a10fc8a1a34", "_id" : "Marine/SeaSlugs/Messages" }
> use web_60f1085a95ad14944fee4a10fc8a1a34
switched to db web_60f1085a95ad14944fee4a10fc8a1a34
> 
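
The map lookup and the database switch can also be combined in one step with the shell's getSisterDB() helper, saving you from re-typing the hash (same Marine/SeaSlugs/Messages example as above):
> var hash = db.getSisterDB('webs').map.findOne({_id: 'Marine/SeaSlugs/Messages'}).hash
> var web = db.getSisterDB(hash)
> web.getCollectionNames()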

Dumping a topic

Example: dump the topic M10233a from the Marine/SeaSlugs/Images web
> use webs
switched to db webs
> db.map.find({_id: 'Marine/SeaSlugs/Images'})
{ "hash" : "web_57fc23b0041256f8771dd4880a6ad90d", "_id" : "Marine/SeaSlugs/Images" }
> use web_57fc23b0041256f8771dd4880a6ad90d
switched to db web_57fc23b0041256f8771dd4880a6ad90d
> db.current.find({_topic : 'M10233a'})
{ "_id" : ObjectId("4e4a462e15a05f18538d8b35"), "_web" : "Marine/SeaSlugs/Images", ... }
// or pretty-print the JSON:
> db.current.find({_topic : 'M10233a'}).forEach(printjson)
{
   "_id" : ObjectId("4e4a462e15a05f18538d8b35"),
   "_web" : "Marine/SeaSlugs/Images",
...
}

"Restoring" a topic

From the JSON in mongo - e.g. you accidentally deleted a .txt file and want to regenerate it from the cached version stored in mongo.

Use a find expression to locate the topic via its _topic name property, and emit the _raw_text property:
ALERT! Wrong:
> db.current.findOne({_topic : 'M10233a'})['_raw_text']

The above statement finds just the first document whose _topic is M10233a - if you are holding more than one version, there will be multiple documents that match. Here's a snippet that gives all versions:
IDEA! We're using forEach() because, unlike findOne(), find() does not actually return document objects, but merely a cursor that iterates over them.
db.current.find({_topic : 'M10233a'}).forEach(function (document) {print(document['_raw_text'])})

This snippet gives only the latest version:
db.current.find({_topic : 'M10233a', _latestIsLoaded : 1}).forEach(function (document) {print(document['_raw_text'])})

Then you can copy-paste the _raw_text verbatim into a .txt topic file.
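
If copy-paste is awkward (long topics, funny characters), the same dump can be done non-interactively - a rough sketch, assuming the auth-against-admin setup above and that only the latest version carries _latestIsLoaded; substitute your own host, credentials and web hash:
mongo --quiet mongod.example.org:27017/admin --eval "db.auth('<user>', '<pass>'); print(db.getSisterDB('web_57fc23b0041256f8771dd4880a6ad90d').current.findOne({_topic: 'M10233a', _latestIsLoaded: 1})._raw_text)" > M10233a.txt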

ALERT! NB: mongo stores everything in utf-8. Your operating system/applications (the terminal, the text editor that you paste into) may or may not be using utf-8 as well, so you may have to convert the text into Foswiki's {Site}{CharSet}. The GNU recode tool can easily convert the encoding (see its man page).
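
For example, if your {Site}{CharSet} is iso-8859-1 (just an assumption here - check LocalSite.cfg), recode converts the dumped file in place:
recode utf-8..iso-8859-1 M10233a.txt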

Importing into Foswiki

We have a 'mongoload' script, which lives in /usr/local/bin and looks like:
#!/bin/sh
cd /path/to/foswiki/core/bin
time sudo -u www-data ./rest /MongoDBPlugin/update -updateweb "$@"

And then we do one of the following (a worked example follows the list):
  • mongoload <webname>
  • mongoload <webname> -recurse on - include subwebs
  • mongoload <webname> -recurse on -revision off - don't import all versions (just current topic versions) - quicker
  • mongoload <webname> -recurse on -revision off -fork on - fork a new import process for each subweb (uses less memory - REQUIRED when importing 'all' webs)
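
For example, a typical re-import of the Marine/SeaSlugs web and its subwebs, skipping old revisions (just the patterns above put together):
mongoload Marine/SeaSlugs -recurse on -revision off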