Rogers, Chris, 06 June 2013 07:54


MLCR Deployment

There are three installations of MAUS in the MICE Local Control Room:

  • The current version .maus_release (bound to lp:maus)
  • The previous version .maus_release_old (bound to lp:maus)
  • A development copy .maus_control-room (bound to lp:~maus-mlcr/maus/control-room/)

The current and previous versions are updated by leapfrogging. At the moment only the control-room version is used for online reconstruction, because the control-room changes were never merged into the trunk. Note that the versions are in hidden folders (prefixed by '.'). To see them do

ls -a
There is a soft link that points to the "default" version (for use by shifters). This should usually point at the current version .maus_release.
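
The check above can also be scripted. The following is a minimal Python sketch, not part of MAUS, that lists the hidden installs in a directory and reports where every soft link in that directory points. The function name list_installs is made up for illustration, and the page does not give the name of the "default" link, so the sketch simply reports all links it finds.

```python
import os


def list_installs(base_dir):
    """Return (hidden_dirs, links) found directly inside base_dir.

    hidden_dirs: sorted names of real directories starting with '.maus'
    links:       dict mapping each symlink name to the target stored in it
    (list_installs is a hypothetical helper, not a MAUS utility.)
    """
    hidden_dirs = []
    links = {}
    # os.listdir includes dot-files, which a plain `ls` would hide.
    for name in sorted(os.listdir(base_dir)):
        path = os.path.join(base_dir, name)
        if os.path.islink(path):
            # os.readlink returns the target exactly as stored in the link.
            links[name] = os.readlink(path)
        elif name.startswith(".maus") and os.path.isdir(path):
            hidden_dirs.append(name)
    return hidden_dirs, links
```

Calling list_installs on the directory that holds the installs should return the three hidden folders listed above, and the entry for the shifters' link should normally read .maus_release.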

New version of main MAUS release

To update to a new version:

  1. Get permission from MOM
  2. First move the current MAUS install onto the fallback release
    1. Do mv .maus_release_old .maus_release_old_
    2. Do mv .maus_release .maus_release_old
    3. Reconfigure the old install: ./configure; source env.sh
    4. Check it is okay: bash tests/unit_tests.bash; bash tests/application_tests.bash
  3. Next get the current MAUS release copy (or a version as specified by MAUS experts)
    1. Do bzr checkout lp:maus .maus_release or bzr checkout lp:maus -rMAUS-vx.y.z .maus_release
    2. Build the code: cd .maus_release; ./install_build_test.bash
    3. Run the integration tests: bash tests/application_tests.bash
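
The two mv commands in step 2 are the "leapfrog". As an illustration only, here is a small Python sketch of that rotation. The function name leapfrog and the refusal to overwrite an already parked .maus_release_old_ are assumptions made for this sketch; the procedure above does not say what to do with a parked copy from an earlier update.

```python
import os


def leapfrog(base_dir):
    """Rotate the installs as in steps 2.1 and 2.2 of the procedure.

    .maus_release_old -> .maus_release_old_   (parked, not deleted)
    .maus_release     -> .maus_release_old    (becomes the fallback)
    Returns the path where the fresh checkout should then be placed.
    (Hypothetical helper, not part of MAUS.)
    """
    current = os.path.join(base_dir, ".maus_release")
    fallback = os.path.join(base_dir, ".maus_release_old")
    parked = fallback + "_"

    if not os.path.isdir(current):
        raise RuntimeError("no current install at " + current)
    if os.path.exists(parked):
        # Assumption: never clobber a copy parked by an earlier update.
        raise RuntimeError("remove or rename " + parked + " first")

    if os.path.isdir(fallback):
        os.rename(fallback, parked)
    os.rename(current, fallback)
    return current
```

The existence check matters with plain mv as well: if .maus_release_old_ already exists as a directory, mv moves the old install inside it rather than renaming it.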

New version of control room branch

Then update the control-room branch:

  • Check that the release was merged into the control-room branch
  • Do
    bzr update
  • Check the README file to confirm that this is indeed the new version of the code
  • Commit any changes. If appropriate, merge them back into lp:maus/merge

Finally

MAUS web application

Set up proxy:

export https_proxy=http://wwwcache.rl.ac.uk:8080/
export http_proxy=http://wwwcache.rl.ac.uk:8080/

Check out code:

bzr branch -v http://code.launchpad.net/maus-apps/ maus-apps

The current version is in MAUS/maus-apps.

A caution on Apache 2.0 and paths

Rogers: this section may be out of date

Currently Apache 2.0 exposes the MAUS online reconstruction web front-end. It uses the versions of MAUS and the MAUS web front-end in:

MAUS.new/maus-control-room
MAUS.new/maus-apps

If you move these to another directory, you must also update the absolute paths in /usr/local/apache2/bin/envvars.

Starting Manchego - MAUS Online DQ

Open four terminal windows:

xterm&
xterm&
xterm&
xterm&

DO THESE COMMANDS IN ORDER

Terminal 1 - web server terminal

If you want to use the Django lightweight web server then:

  • Configure the web server:
    cd /home/mice/MAUS.new/maus-control-room
    source env.sh
    cd /home/mice/MAUS.new/maus-apps
    ./configure --with-maus
    
  • Start up the web server:
    source env.sh
    python src/mausweb/manage.py runserver localhost:9000
    

If you want to use the Apache web server, then:

  • Configure the web server,
    cd /home/mice/MAUS.new/maus-apps
    ./configure
    rm -rf media/thumbs/*
    rm -rf media/raw/*
    
  • As super user, restart Apache,
    /usr/local/apache2/bin/apachectl restart
    
  • Check the logs,
    more /usr/local/apache2/logs/error_log 
    
  • You should see something like,
    [Mon Mar 12 12:41:16 2012] [notice] Apache/2.2.22 (Unix) mod_wsgi/3.3 
     Python/2.7.2 configured -- resuming normal operations
    
  • If you get a warning like,
    httpd: Syntax error on line 55 of /usr/local/apache2/conf/httpd.conf: 
    Cannot load /usr/local/apache2/modules/mod_wsgi.so into server: libpython2.7.so.1.0: 
    cannot open shared object file: No such file or directory
    
    • Then unset MAUS_ROOT_DIR and try again:
      unset MAUS_ROOT_DIR
      /usr/local/apache2/bin/apachectl restart
      

Go to web site

http://localhost:9000/maus/ if using Django web server.

http://localhost:80/maus/ if using Apache web server.

You should see a MAUS page listing no histograms.

Run a quick web front-end test

Copy some sample image files into the web front-end media directory:

cp images/* media/raw

Refresh the web page.

Type sample into the search form.

A new page should appear with two histograms.

Now, delete the images and the thumbnails that would have been auto-generated:

rm -rf media/thumbs/*
rm -rf media/raw/*

Terminal 2 - Celery worker terminal

Start up a Celery worker that will use up to 8 cores:

cd /home/mice/MAUS.new/maus-control-room
source env.sh
celeryd -c 8 -l INFO --purge

Wait for the Celery worker to start. This may take a minute or two.

When it is initialised it will display a start up message that will look similar to this:

The 'CELERY_AMQP_TASK_RESULT_EXPIRES' setting is scheduled for deprecation in
version 2.5 and removal in version v3.0.     CELERY_TASK_RESULT_EXPIRES
  warnings.warn(w)
[2012-03-12 12:19:31,252: WARNING/MainProcess] discard: Erased 0 message from the queue.
[2012-03-12 12:19:31,253: WARNING/MainProcess] -------------- celery@miceonrec01a v2.5.1
---- **** -----
--- * ***  * -- [Configuration]
-- * - **** ---   . broker:      amqp://maus@localhost:5672/maushost
- ** ----------   . loader:      celery.loaders.default.Loader
- ** ----------   . logfile:     [stderr]@INFO
- ** ----------   . concurrency: 8
- ** ----------   . events:      OFF
- *** --- * ---   . beat:        OFF
-- ******* ----
--- ***** ----- [Queues]
 --------------   . celery:      exchange:celery (direct) binding:celery
[Tasks]
  . mauscelery.maustasks.MausGenericTransformTask
[2012-03-12 12:19:31,269: INFO/MainProcess] MAUS version: MAUS release version 0.1.4
[2012-03-12 12:19:31,286: INFO/PoolWorker-1] child process calling self.run()
[2012-03-12 12:19:31,289: INFO/PoolWorker-1] Setting MAUS ErrorHandler to raise 
exceptions
[2012-03-12 12:19:31,307: INFO/PoolWorker-2] child process calling self.run()
[2012-03-12 12:19:31,311: INFO/PoolWorker-2] Setting MAUS ErrorHandler to raise 
exceptions
[2012-03-12 12:19:31,320: INFO/PoolWorker-3] child process calling self.run()
[2012-03-12 12:19:31,324: INFO/PoolWorker-3] Setting MAUS ErrorHandler to raise 
exceptions
[2012-03-12 12:19:31,325: INFO/PoolWorker-4] child process calling self.run()
[2012-03-12 12:19:31,329: INFO/PoolWorker-4] Setting MAUS ErrorHandler to raise 
exceptions
[2012-03-12 12:19:31,340: INFO/PoolWorker-5] child process calling self.run()
[2012-03-12 12:19:31,343: INFO/PoolWorker-5] Setting MAUS ErrorHandler to raise 
exceptions
[2012-03-12 12:19:31,356: INFO/PoolWorker-6] child process calling self.run()
[2012-03-12 12:19:31,359: INFO/PoolWorker-6] Setting MAUS ErrorHandler to raise 
exceptions
[2012-03-12 12:19:31,372: INFO/PoolWorker-7] child process calling self.run()
[2012-03-12 12:19:31,375: INFO/PoolWorker-7] Setting MAUS ErrorHandler to raise 
exceptions
[2012-03-12 12:19:31,389: INFO/PoolWorker-8] child process calling self.run()
[2012-03-12 12:19:31,392: INFO/PoolWorker-8] Setting MAUS ErrorHandler to raise 
exceptions
[2012-03-12 12:19:31,390: WARNING/MainProcess] celery@miceonrec01a has started.

Note that one Celery pool worker will be created for each of the 8 cores.

You can ignore any warnings like:

*** Break *** write on a pipe with no one to read it

Run a quick Celery and MongoDB test

In terminal 3, test that Celery and the document store, MongoDB, are available. First check that the Celery worker has spawned 8 sub-processes:

$ ps -a 
  PID TTY          TIME CMD 
...
22903 pts/1    00:00:01 celeryd 
22910 pts/1    00:00:00 celeryd
22911 pts/1    00:00:00 celeryd 
22912 pts/1    00:00:00 celeryd 
22913 pts/1    00:00:00 celeryd 
22914 pts/1    00:00:00 celeryd 
22915 pts/1    00:00:00 celeryd 
22916 pts/1    00:00:00 celeryd 
22917 pts/1    00:00:00 celeryd 
...

There should be 9 entries. One will be the parent Celery process, and there should be 8 sub-processes.
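
Counting the entries by eye is easy to get wrong, so here is a minimal Python sketch that does the count from the text of ps -a. The function names are made up for this page, and the sketch assumes the four-column layout (PID, TTY, TIME, CMD) shown above.

```python
def count_celeryd(ps_output):
    """Count lines of `ps -a` output whose command column is celeryd."""
    count = 0
    for line in ps_output.splitlines():
        fields = line.split()
        # Assumed ps -a columns: PID TTY TIME CMD
        if len(fields) >= 4 and fields[3] == "celeryd":
            count += 1
    return count


def workers_ok(ps_output, concurrency=8):
    """True when there is one parent plus `concurrency` sub-processes."""
    return count_celeryd(ps_output) == concurrency + 1
```

The text passed in could come from subprocess.check_output(["ps", "-a"]), decoded to a string; with the worker started as celeryd -c 8, workers_ok should return True.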

Now run a simple spill processing example,

 
$ ./bin/examples/simple_histogram_example.py -type_of_dataflow=multi_process  

Four spills should be processed, after which the program will just sit there. At that point you can press CTRL-C.

Check that there are histograms and JSON documents with metadata:
$ ls
...
sample-imagetdcadc000001.eps 
sample-imagetdcadc000001.json 
sample-imagetdcadc000002.eps 
sample-imagetdcadc000002.json 
sample-imagetdcadc000003.eps 
sample-imagetdcadc000003.json 
sample-imagetdcadc000004.eps 
sample-imagetdcadc000004.json 
...

Check the database contains the associated documents

 
$ ./bin/utilities/summarise_mongodb.py --database ALL 
Database: mausdb 
  Collection: spills : 4 documents (5776 bytes 5 Kb 0 Mb) 
Database: local 
  No collections 

If this is the case then all is well!
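
If you want to check the document count automatically, the summary text can be parsed. This is a sketch only: parse_summary is a made-up name, and the line format is inferred from the single sample output above, so treat the pattern as an assumption rather than a documented format of summarise_mongodb.py.

```python
import re

# Assumed line shape: "  Collection: spills : 4 documents (...)"
COLLECTION_RE = re.compile(r"^\s*Collection:\s*(\S+)\s*:\s*(\d+)\s+documents")


def parse_summary(text):
    """Map database name -> {collection name: document count}."""
    summary = {}
    current_db = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Database:"):
            current_db = stripped[len("Database:"):].strip()
            summary[current_db] = {}
            continue
        match = COLLECTION_RE.match(line)
        if match and current_db is not None:
            summary[current_db][match.group(1)] = int(match.group(2))
    return summary
```

After the quick test above, parse_summary(output)["mausdb"]["spills"] should be 4.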

Terminal 3 - input-transform terminal

Now, start up the MAUS input-transform process:

If you haven't already, run this:

export DATE_DB_MYSQL_DB=DATE_CONFIG
export DATE_DB_MYSQL_USER=daq
export DATE_DB_MYSQL_PWD=daq
export DATE_DB_MYSQL_HOST=miceacq07
export DATE_SITE=/dateSite
export DATE_HOSTNAME=`hostname`

Now, start up the input-transform process which continuously reads the data, transforms it and stores it in a MongoDB database:

source /home/mice/MAUS.new/maus-control-room/env.sh
$MAUS_ROOT_DIR/bin/user/reconstruct_daq.py -type_of_dataflow=multi_process_input_transform

Terminal 4 - merge-output terminal

Start up the MAUS merge-output process, which continuously reads the transformed data from MongoDB, merges it to update histograms and outputs these histograms:

source /home/mice/MAUS.new/maus-control-room/env.sh
source /home/mice/MAUS.new/maus-apps/env.sh
$MAUS_ROOT_DIR/bin/user/reconstruct_daq.py -type_of_dataflow=multi_process_merge_output

What you should see

These are examples of the sort of output you can expect to see during online operation, if all is running well.

For errors and possible recovery options, see Distributed spill transformation troubleshooting and recovery.

Web page

Once operation begins, the MAUS web page will display a list of available images. If you type a keyword e.g. tof into the search box then a new page will appear with the images. This page will automatically refresh as the images are updated.

Terminal 2 - Celery worker terminal

The Celery worker terminal displays information every time a spill is received by the Celery worker and the worker starts to execute it, e.g.:

[2012-03-12 14:41:34,497: INFO/MainProcess] Got task from broker: 
mauscelery.maustasks.MausGenericTransformTask[92dd453d-549e-4ed8-9d3e-f9e820056e2c]
[2012-03-12 14:41:34,533: INFO/PoolWorker-4] None[92dd453d-549e-4ed8-9d3e-f9e820056e2c]:
Task invoked by maus.epcc.ed.ac.uk-23685

The Celery worker displays the host that submitted the task and the process ID.

When the transform workers have completed and the Celery worker is ready to return the result spill, this will be logged e.g.:

[2012-03-12 14:41:34,548: INFO/MainProcess] 
Task mauscelery.maustasks.MausGenericTransformTask[92dd453d-549e-4ed8-9d3e-f9e820056e2c] 
succeeded in 0.0160460472107s: '{"daq_data":null,"daq_event_type":"end_of_bur...
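
To get a feel for how long transforms take, the "succeeded in" messages can be pulled out of a saved copy of the worker output. The following Python sketch assumes the message wording shown above; task_durations is a name invented here, not a Celery or MAUS function.

```python
import re

# Assumed message shape: "Task <name>[<task id>] succeeded in <seconds>s"
SUCCESS_RE = re.compile(r"Task\s+(\S+?)\[([0-9a-f-]+)\]\s+succeeded in ([0-9.]+)s")


def task_durations(log_text):
    """Return a list of (task name, task id, seconds) for succeeded tasks."""
    return [
        (name, task_id, float(seconds))
        for name, task_id, seconds in SUCCESS_RE.findall(log_text)
    ]
```

The whitespace in the pattern is matched loosely, so it does not matter whether the message is on one line or wrapped as in the example above.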

Terminal 3 - input-transform terminal

After MAUS starts up:

Welcome to MAUS:
        Process ID (PID): 23685
        Program Arguments: ['./bin/user/reconstruct_daq.py', '-type_of_dataflow=multi_process_input_transform']
        Version: MAUS release version 0.1.4

The MAUS framework will check for active Celery workers e.g.:

INITIATING EXECUTION
Purging document store
Checking for active Celery nodes...
Number of active Celery nodes: 1

If there are no Celery workers available the MAUS framework will exit.

If there are Celery workers then the input worker will be configured.

It will then enter a loop which operates as follows:

  • Read a spill from the input worker.
  • Check for a new run.
  • If a new run is detected then,
    • Wait for any transforms currently being executed by Celery to complete.
    • Death the transformers.
    • Birth the transformers.
  • Submit a spill to a Celery worker for execution of the transform (map) workers.
  • Check the status of the Celery worker.
  • Get the result spill from the Celery worker when it is available.
  • Put the result spill into the MongoDB database.
  • Check spills remaining from the input worker.

For example:

INPUT: read next spill
Spills input: 1 Processed: 0 Failed 0
New run detected...waiting for current processing to complete
---------- RUN 3386 ----------
Configuring Celery nodes and birthing transforms...
Celery nodes configured!
TRANSFORM: processing spills
Task ID: ddbf57a0-17a3-4abd-8b10-d340fbfd31b9
1 spills left in buffer
 Celery task ddbf57a0-17a3-4abd-8b10-d340fbfd31b9 SUCCESS 
   SAVING to collection spills (with ID 1)
Spills input: 1 Processed: 1 Failed 0
INPUT: read next spill
Spills input: 2 Processed: 1 Failed 0
Task ID: 95b0e098-952d-4da4-8b9c-39dd2c2a514a
1 spills left in buffer
INPUT: read next spill
Spills input: 3 Processed: 1 Failed 0
Task ID: 426ffdfa-11e7-43e8-a36a-ad07e5aebe3a
1 spills left in buffer
INPUT: read next spill
Spills input: 4 Processed: 1 Failed 0
Task ID: 0d83b05d-00f6-4269-ae1f-c2e592d8d01e

The input buffer always has 1 spill until the final spill is processed.

The MAUS framework maintains a count of the spills it has read from its input, those successfully transformed and those for which the transform failed. The count of processed spills is used as an index for the results in MongoDB, but note that this is not the same as the "spill_num" which occurs within a spill.
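
The loop and its three counters can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MAUS implementation: the function name, the spill dictionaries with a "run" key, the transform callable standing in for the Celery task, and the plain dict standing in for MongoDB are all made up here. It is also synchronous, whereas the log above shows the real framework reading ahead while Celery tasks are still running.

```python
def input_transform_loop(spills, transform, store, rebirth):
    """Simplified, synchronous sketch of the input-transform loop.

    spills:    iterable of dicts, each with a "run" key (assumed shape)
    transform: callable standing in for the Celery task; may raise
    store:     dict standing in for the MongoDB collection
    rebirth:   callable invoked with the run number on each new run,
               standing in for "death, then birth, the transformers"
    Returns (spills_input, processed, failed).
    """
    spills_input = processed = failed = 0
    current_run = None
    for spill in spills:
        spills_input += 1
        if spill["run"] != current_run:
            # The real framework first waits for outstanding Celery tasks.
            current_run = spill["run"]
            rebirth(current_run)
        try:
            result = transform(spill)
        except Exception:
            failed += 1
            continue
        processed += 1
        # The processed count, not the spill's own "spill_num", is the index.
        store[processed] = result
    return spills_input, processed, failed
```

Note how a failed spill leaves no gap in the stored indices, which is why the index in MongoDB can drift away from "spill_num".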

When all the spills from the input have been processed, a message is printed:

--------------------
Requesting Celery nodes death transforms...
Celery node transforms deathed!
TRANSFORM: transform tasks completed
INPUT: Death
--------------------
DONE

Terminal 4 - merge-output terminal

After MAUS starts up:

Welcome to MAUS:
        Process ID (PID): 23700
        Program Arguments: ['./bin/user/reconstruct_daq.py', '-type_of_dataflow=multi_process_merge_output']
        Version: MAUS release version 0.1.4
INITIATING EXECUTION
-------- MERGE OUTPUT --------

It will then enter a loop which interleaves:

  • Read spills added to MongoDB since the last read.
  • For each spill,
    • Check for a new run.
    • If a new run is detected then,
      • Death the mergers and outputters.
      • Birth the mergers and outputters.
    • Pass the spill to the merger.
    • Pass the result to the outputter.

For example:

Read spill 1 (dated 2012-03-12 14:50:28.628000)
Change of run detected
---------- START RUN 0 ----------
BIRTH merger ReducePyTOFPlot.ReducePyTOFPlot
BIRTH outputer OutputPyImage.OutputPyImage
Executing Merge for spill 1
Executing Output for spill 1
Spills processed: 1
Read spill 2 (dated 2012-03-12 14:50:28.657000)
Executing Merge for spill 2
Executing Output for spill 2
Spills processed: 2
Read spill 3 (dated 2012-03-12 14:50:28.686000)
Executing Merge for spill 3
Executing Output for spill 3
Spills processed: 3
Read spill 4 (dated 2012-03-12 14:50:28.711000)
Executing Merge for spill 4
Executing Output for spill 4
Spills processed: 4

The spill number shown is the spill ID as described above.

At present this loop does not terminate. So, if you are sure that all the spills have been processed, press CTRL-C. The mergers and outputters will be deathed before the program exits.
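
As a rough picture of this behaviour, here is a Python sketch of a merge-output loop that runs until CTRL-C and always deaths its workers on the way out. It is not the MAUS code: the function name, the read_new_spills callable and the birth/process/death interface of the merger and outputter are hypothetical, chosen only to mirror the steps listed above.

```python
def merge_output_loop(read_new_spills, merger, outputter):
    """Sketch of the merge-output loop; stops on KeyboardInterrupt.

    read_new_spills:  callable returning spills added since the last read
    merger/outputter: objects with birth(), process(x) and death() methods
                      (assumed interface, for illustration only)
    Returns the number of spills processed.
    """
    current_run = None
    born = False
    processed = 0
    try:
        while True:
            for spill in read_new_spills():
                if spill["run"] != current_run:
                    if born:
                        merger.death()
                        outputter.death()
                    merger.birth()
                    outputter.birth()
                    born = True
                    current_run = spill["run"]
                # Merge the spill, then hand the merged result to the output.
                outputter.process(merger.process(spill))
                processed += 1
    except KeyboardInterrupt:
        # CTRL-C is the expected way out of this loop.
        pass
    finally:
        if born:
            merger.death()
            outputter.death()
    return processed
```

Putting the death calls in the finally clause is what guarantees they run on CTRL-C, matching the behaviour described above.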

Terminal 1 - web server terminal (if using Django web server)

After the Django web server starts up:

Validating models...
0 errors found
Django version 1.3.1, using settings 'mausweb.settings'
Development server is running at http://localhost:9000/
Quit the server with CONTROL-C.

The server then logs each HTTP request it receives, e.g.:


[12/Mar/2012 14:55:21] "GET /maus/ HTTP/1.0" 200 5335
[12/Mar/2012 14:55:41] "GET /maus/imagetof_hit_y.eps/png HTTP/1.0" 200 10647
...

Do not worry if the image names or paths are different.

Occasionally you may see a stack trace (see issue #811) - this usually means that the web server has been interrupted mid-request.
