How to set up Celery and RabbitMQ¶
MAUS can be used with the Celery asynchronous distributed task queue (http://celeryproject.org/) to allow transform (map) steps to be executed on multiple processors in parallel.
Celery uses RabbitMQ (http://www.rabbitmq.com/) as a broker to dispatch jobs to processing nodes.
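The parallel map pattern Celery provides can be sketched, independently of MAUS, with Python's standard multiprocessing module. The transform function and spill values here are hypothetical placeholders, not MAUS code:

```python
from multiprocessing import Pool

def transform(spill):
    # Hypothetical transform (map) step: in MAUS this would be a
    # reconstruction step applied to one spill of detector data.
    return spill * 2

if __name__ == "__main__":
    # Each spill is transformed independently, so the work can be
    # spread across worker processes; Celery applies the same idea
    # across worker nodes via the RabbitMQ broker.
    with Pool(2) as pool:
        results = pool.map(transform, [1, 2, 3])
    print(results)  # [2, 4, 6]
```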
Set up Celery¶
Celery is a Python tool which is automatically downloaded and installed when you build MAUS.
Set up RabbitMQ¶
To install on Scientific Linux or RedHat you should:
- Log in as a super-user by using
$ sudo su -
or prefix each of the commands below with sudo.
- Install the RabbitMQ server package:
$ yum install rabbitmq-server
Installing:
 rabbitmq-server  noarch  2.2.0-1.el5    epel     890 k
Installing for dependencies:
 erlang           i386    R12B-5.10.el5  epel      39 M
 unixODBC         i386    2.2.11-7.1     sl-base  832 k
- Check that /usr/sbin is in the $PATH:
$ echo $PATH
...
- If not, then add it:
$ export PATH=$PATH:/usr/sbin
- Start the RabbitMQ server:
$ /sbin/service rabbitmq-server start
- Create a MAUS username and password pair e.g.
$ rabbitmqctl add_user maus suam
Creating user "maus" ...
...done.
- Create a MAUS virtual host:
$ rabbitmqctl add_vhost maushost
Creating vhost "maushost" ...
...done.
- Set the permissions for the user on this host:
$ rabbitmqctl set_permissions -p maushost maus ".*" ".*" ".*"
Setting permissions for user "maus" in vhost "maushost" ...
...done.
- Check it is running OK:
$ /sbin/service rabbitmq-server status
Status of all running nodes...
Node 'rabbit@maus' with Pid 1377: running
done.
By default RabbitMQ uses port 5672. If you want worker nodes outside your firewall to use the RabbitMQ broker then you will need to open this port.
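To verify that the port is reachable from a prospective worker node, a plain TCP connection check is enough. This is a generic sketch, not part of MAUS; the hostname passed in is a placeholder you should replace with your broker host:

```python
import socket

def broker_port_open(host, port=5672, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Replace "localhost" with the host on which RabbitMQ was deployed.
    print(broker_port_open("localhost"))
```

If this returns False from a worker node but True on the broker host itself, the firewall is the likely culprit.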
For more information, see the RabbitMQ documentation (http://www.rabbitmq.com/).
Configure nodes as Celery workers¶
Ensure the MAUS software is deployed on each node you want to use as a worker, and that on each node you have run:
$ source env.sh
Within the MAUS software directory, edit
src/common_py/mauscelery/celeryconfig.py and change
BROKER_HOST = "localhost"
to the full hostname of the host on which RabbitMQ was deployed e.g.
BROKER_HOST = "maus.epcc.ed.ac.uk"
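For reference, the broker settings matching the user, password and virtual host created above would look something like the fragment below. These are the pre-Celery-3 setting names used by celeryd; check the names against your installed Celery version before relying on them:

```python
# Broker settings for celeryconfig.py, matching the RabbitMQ setup above.
BROKER_HOST = "maus.epcc.ed.ac.uk"  # host running rabbitmq-server
BROKER_PORT = 5672                  # RabbitMQ default port
BROKER_USER = "maus"
BROKER_PASSWORD = "suam"
BROKER_VHOST = "maushost"
```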
Run a quick test¶
Start the Celery workers on each node:
$ celeryd -l INFO --purge
Wait for the Celery workers on each node to start. This may take a minute or two.
- You should get output ending like
[2012-04-27 11:51:38,391: WARNING/MainProcess] celery@miceonrec01a has started.
- If you get output like
[2012-04-27 11:43:59,611: ERROR/MainProcess] Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 4 seconds...
[2012-04-27 11:44:03,611: ERROR/MainProcess] Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 6 seconds...
it means the worker cannot reach the broker: check that the RabbitMQ server is running and that BROKER_HOST in celeryconfig.py names the correct host.
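The increasing "Trying again in N seconds" intervals in the log come from the worker's reconnection back-off. The exact policy is internal to Celery; the schedule visible in the excerpt (4 s, then 6 s) is consistent with a linear back-off, which can be sketched as:

```python
def retry_delays(first=4.0, step=2.0, limit=5):
    # Linearly increasing back-off: 4 s, 6 s, 8 s, ...
    # matching the intervals seen in the log excerpt above.
    return [first + i * step for i in range(limit)]

print(retry_delays())  # [4.0, 6.0, 8.0, 10.0, 12.0]
```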
Now, on any node with MAUS deployed, run
$ ./bin/examples/simple_histogram_example.py -type_of_dataflow=multi_process \
    -doc_store_class="docstore.InMemoryDocumentStore.InMemoryDocumentStore"
The client should show information on spills being passed to Celery and the results returned.
You can also run the MAUS Celery integration tests:
$ python tests/integration/test_distributed_processing/test_celery.py
A lot of messages will be printed. However, the run should end with:
----------------------------------------------------------------------
Ran 11 tests in 76.748s

OK
See the MAUS pages on,