MAUSDocumentCacheConfiguration » History » Version 13

Rogers, Chris, 15 May 2012 09:28

h1. How to configure MongoDB as a document cache

{{>toc}}

MAUS can use the MongoDB document-oriented database (http://www.mongodb.org/) to cache transformed spills until they are ready to be merged. A MongoDB server holds zero or more databases; each database holds one or more collections, and each collection holds zero or more documents. MongoDB is schema-free: the documents in a collection can all share one structure or can differ in structure.
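
As a rough illustration of this hierarchy, the sketch below models a server as nested Python dictionaries and lists. It is plain Python rather than pymongo code. The names @mausdb@ and @spills@ are the MAUS defaults given in the connection settings on this page; the document fields (@spill_id@, @digits@, @errors@) and the @field_names@ helper are made up for illustration.

```python
# Plain-Python model of the MongoDB hierarchy:
# server -> database -> collection -> documents.
# Illustration only; this does not talk to a MongoDB server.
server = {
    "mausdb": {                      # a database
        "spills": [                  # a collection: a list of documents
            {"spill_id": 1, "digits": [3, 5, 8]},                # hypothetical fields
            {"spill_id": 2, "errors": {"mapper": "bad input"}},  # different structure
        ],
    },
}


def field_names(collection):
    """Return the sorted field names used by each document in turn."""
    return [sorted(doc.keys()) for doc in collection]


# Schema-free: documents in one collection need not share a structure.
print(field_names(server["mausdb"]["spills"]))
```

The two documents live in the same collection even though only @spill_id@ is common to both.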

h2. Set up MongoDB

MongoDB can be installed using @yum@ as follows.

 * Log in as a super-user by using @sudo su -@ or @su@.
 * Edit @/etc/yum.repos.d/10gen.repo@ and add the lines
<pre>
[10gen]
name=10gen Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/i686
gpgcheck=0
</pre>
 * Run
<pre>
$ yum install mongo-10gen
 ...
 mongo-10gen         i686         2.0.1-mongodb_1           10gen          28 M
 ...
$ yum install mongo-10gen-server
 ...
 mongo-10gen-server       i686       2.0.1-mongodb_1          10gen       5.4 M
...
</pre>
 * Start the server
<pre>
$ /sbin/service mongod start
Starting mongod: forked process: 4357
                                                           [  OK  ]
all output going to: /var/log/mongo/mongod.log
$ /sbin/service mongod status
mongod (pid 4357) is running...
</pre>
As an alternative to @/sbin/service mongod@ you can use @/etc/init.d/mongod@.

By default MongoDB listens on @localhost@, port @27017@.
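
One quick way to confirm that something is listening on that port is to attempt a TCP connection. The sketch below uses only the Python standard library. The @port_is_open@ helper is a hypothetical name, and a successful connection shows only that the port accepts connections, not that the listener is actually @mongod@.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
    sock.close()
    return True


if __name__ == "__main__":
    print("port 27017 reachable:", port_is_open())
```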

h2. Set up pymongo

pymongo (http://api.mongodb.org/python/current/) provides a Python API to MongoDB. pymongo is automatically downloaded and installed when you build MAUS.
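
To check that pymongo is visible to the Python interpreter you are using, something like the following sketch may help. It relies only on the standard library; @module_version@ is a hypothetical helper, and the lookup of a @version@ or @__version__@ attribute is an assumption about how the installed module reports its version.

```python
import importlib
import importlib.util


def module_version(name):
    """Return a version string, "unknown", or None if the module is not installed."""
    if importlib.util.find_spec(name) is None:
        return None
    module = importlib.import_module(name)
    # Assumption: the module exposes "version" or "__version__"; otherwise
    # report "unknown" rather than failing.
    return str(getattr(module, "version", getattr(module, "__version__", "unknown")))


if __name__ == "__main__":
    print("pymongo version:", module_version("pymongo"))
```

A result of @None@ suggests that pymongo is not on the Python path of the current environment.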

h2. Set up MongoDB connection

By default MAUS is set up to use a MongoDB database running locally.

If you need to change this, or make other configuration changes, then the supported configuration parameters are as follows:

 * Document store class name. This mandatory parameter specifies the MAUS Python module that handles interaction with MongoDB. The parameter and value must be:
<pre>
doc_store_class="MongoDBDocumentStore.MongoDBDocumentStore"
</pre>
 * MongoDB host. This optional parameter specifies the MongoDB host. If omitted then the default of @localhost@ is used. To override this value do:
<pre>
mongodb_host="maus.org.uk"
</pre>
 * MongoDB port. This optional parameter specifies the MongoDB port. If omitted then the default of @27017@ is used. To override this value do:
<pre>
mongodb_port=12345
</pre>
 * MongoDB database name. This optional parameter specifies the database within MongoDB to use. If omitted then the default of @mausdb@ is used. To override this value do:
<pre>
mongodb_database_name="someotherdbname"
</pre>
 ** Note that if the database is not present in MongoDB it will be created automatically.
 * MongoDB collection name. This optional parameter specifies the collection within the MongoDB database to use. If omitted then the default of @spills@ is used. To override this value do:
<pre>
mongodb_collection_name="someothercollectionname"
</pre>
 ** Note that if the collection is not present in MongoDB it will be created automatically.
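
The sketch below gathers these parameters in one place and shows how the documented defaults fill in anything that is not overridden. The override values are the examples from the list above. The @DEFAULTS@ table and the @effective_settings@ helper are hypothetical and are not part of MAUS; they only illustrate the fallback behaviour described here.

```python
# Example values, taken from the parameter list above.
doc_store_class = "MongoDBDocumentStore.MongoDBDocumentStore"
mongodb_host = "maus.org.uk"
mongodb_port = 12345

# Defaults documented above for the optional parameters.
DEFAULTS = {
    "mongodb_host": "localhost",
    "mongodb_port": 27017,
    "mongodb_database_name": "mausdb",
    "mongodb_collection_name": "spills",
}


def effective_settings(overrides):
    """Hypothetical helper: merge overrides onto the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError("unsupported parameter(s): %s" % sorted(unknown))
    settings = dict(DEFAULTS)
    settings.update(overrides)
    return settings


settings = effective_settings({"mongodb_host": mongodb_host,
                               "mongodb_port": mongodb_port})
print(settings["mongodb_host"], settings["mongodb_port"],
      settings["mongodb_database_name"], settings["mongodb_collection_name"])
```

Here the host and port are overridden while the database and collection names fall back to @mausdb@ and @spills@.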

h2. Run a quick test

Run the MAUS MongoDB integration tests:
<pre>
$ python tests/integration/test_docstore/test_MongoDBDocumentStore.py
............
----------------------------------------------------------------------
Ran 12 tests in 76.781s

OK
</pre>

h2. Clear Mongo Database Cache

MongoDB data can become corrupted, for example if the machine on which it is running crashes (power outage, etc.). All existing data can be wiped from the database as follows:
<pre>
$ /sbin/service mongod stop
$ rm /var/lib/mongo/*
$ /sbin/service mongod start
$ /sbin/service mongod status
</pre>

Because MongoDB is used only as a transient database, this is probably a safe operation.
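
Before wiping the data it can be reassuring to see what the @rm@ pattern would match. The sketch below only lists paths and deletes nothing. The @files_to_remove@ helper is a hypothetical name, and the default directory is the one used in the commands above.

```python
import glob
import os


def files_to_remove(data_dir="/var/lib/mongo"):
    """Return the sorted paths that the shell pattern data_dir/* would match."""
    # Like the shell, glob's * does not match names that start with a dot.
    return sorted(glob.glob(os.path.join(data_dir, "*")))


if __name__ == "__main__":
    for path in files_to_remove():
        print(path)
```

Reading the directory may require the same super-user privileges as the @rm@ command itself.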