Revision 4 (Rogers, Chris, 06 June 2013 07:54) → Revision 5/42 (Rogers, Chris, 06 June 2013 07:54)

h1. MLCR Deployment 

 There are three installations of MAUS in the MICE Local Control Room: 

 * The current version @.maus_release@ (bound to lp:maus) 
 * The previous version @.maus_release_old@ (bound to lp:maus) 
 * A development copy @.maus_control-room@ (bound to lp:~maus-mlcr/maus/control-room/) 

 The current and previous versions are updated by leapfrogging. At the moment only the control-room version is used for online reconstruction; the control-room code was never merged into the trunk. Note that the versions are in hidden folders (prefixed by '.'). To see them do <pre>ls -a</pre> There is a soft link that points to the "default" version (for use by shifters). This should usually point at the current version @.maus_release@. 
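The leapfrog and soft-link layout can be sketched in a throwaway directory. The directory names mirror the real ones, but everything below runs in a scratch area created by @mktemp@, not the actual MLCR home:

```shell
set -e
cd "$(mktemp -d)"                        # scratch area, not the real MLCR home
mkdir .maus_release .maus_release_old
ln -s .maus_release maus                 # "default" soft link used by shifters
mv .maus_release_old .maus_release_old_  # park the oldest copy
mv .maus_release .maus_release_old       # current becomes the fallback
mkdir .maus_release                      # stands in for a fresh bzr checkout
ls -a | grep maus                        # hidden copies only appear with -a
readlink maus                            # the link still resolves to .maus_release
```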

 h2. New version of main MAUS release 

 To update to a new version: 

 # Get permission from MOM 
 # First move the current MAUS install onto the fall back release 
 ## Do @mv .maus_release_old .maus_release_old_@ 
 ## Do @mv .maus_release .maus_release_old@ 
 ## Reconfigure the old install @cd .maus_release_old; ./configure; source env.sh@ 
 ## Check it is okay - do @bash tests/unit_tests.bash; bash tests/application_tests.bash@ 
 # Next get the current MAUS release copy (or a version as specified by MAUS experts) 
 ## Do @bzr checkout lp:maus .maus_release@ or @bzr checkout lp:maus -rMAUS-vx.y.z .maus_release@ 
 ## Build the code @cd .maus_release; ./install_build_test.bash@ 
 ## Run the integration tests @bash tests/application_tests.bash@ 


 h2. New version of control room branch 

 To update the control-room branch: 

 * Check that the release was merged into the control-room branch 
 * Do @bzr update@ 
 * Check the README file to confirm that this is indeed the new version of the code 
 * Commit any changes. If appropriate, merge them back into lp:maus/merge 

 Finally: 

 * Check that the updated installation functions correctly by following the shifter instructions: http://micewww.pp.rl.ac.uk/documents/32 

 h3. MAUS web application 

 Set up proxy: 
 <pre> 
 export https_proxy=http://wwwcache.rl.ac.uk:8080/ 
 export http_proxy=wwwcache.rl.ac.uk:8080 
 </pre> 

 Check out code: 
 <pre> 
 bzr branch -v http://code.launchpad.net/maus-apps/ maus-apps 
 </pre> 

 The current version is in @MAUS/maus-apps@. 

 h3. A caution on Apache 2.0 and paths 

 _Rogers: this section may be out of date_ 

 Currently Apache 2.0 exposes the MAUS online reconstruction web front-end. It uses the versions of MAUS and the MAUS web front-end in: 
 <pre> 
 MAUS.new/maus-control-room 
 MAUS.new/maus-apps 
 </pre> 
 If you move these to another directory, then you must also update the absolute paths in @/usr/local/apache2/bin/envvars@. 

 h1. Starting Manchego - MAUS Online DQ 

 Open up four terminal windows: 
 <pre> 
 xterm& 
 xterm& 
 xterm& 
 xterm& 
 </pre> 

 DO THESE COMMANDS IN ORDER 

 h2. Terminal 1 - web server terminal 

 If you want to use the Django lightweight web server then: 

 * Configure the web server: 
 <pre> 
 cd /home/mice/MAUS.new/maus-control-room 
 source env.sh 
 cd /home/mice/MAUS.new/maus-apps 
 ./configure --with-maus 
 </pre> 
 * Start up the web server: 
 <pre> 
 source env.sh 
 python src/mausweb/manage.py runserver localhost:9000 
 </pre> 

 If you want to use the Apache web server, then: 

 * Configure the web server, 
 <pre> 
 cd /home/mice/MAUS.new/maus-apps 
 ./configure 
 rm -rf media/thumbs/* 
 rm -rf media/raw/* 
 </pre> 
 * As super user, restart Apache, 
 <pre> 
 /usr/local/apache2/bin/apachectl restart 
 </pre> 
 * Check the logs, 
 <pre> 
 more /usr/local/apache2/logs/error_log  
 </pre> 
 * You should see something like, 
 <pre> 
 [Mon Mar 12 12:41:16 2012] [notice] Apache/2.2.22 (Unix) mod_wsgi/3.3  
  Python/2.7.2 configured -- resuming normal operations 
 </pre> 
 * If you get a warning like, 
 <pre> 
 httpd: Syntax error on line 55 of /usr/local/apache2/conf/httpd.conf:  
 Cannot load /usr/local/apache2/modules/mod_wsgi.so into server: libpython2.7.so.1.0:  
 cannot open shared object file: No such file or directory 
 </pre> 
 ** Then unset @MAUS_ROOT_DIR@ and try again: 
 <pre> 
 unset MAUS_ROOT_DIR 
 /usr/local/apache2/bin/apachectl restart 
 </pre> 

 h2. Go to web site 

 http://localhost:9000/maus/ if using Django web server. 

 http://localhost:80/maus/ if using Apache web server. 

 You should see a MAUS page listing no histograms. 

 h2. Run a quick web front-end test 

 Copy some sample image files into the web front-end media directory: 
 <pre> 
 cp images/* media/raw 
 </pre> 
 Refresh the web page. 

 Type @sample@ into the search form. 

 A new page should appear with two histograms. 

 Now, delete the images and the thumbnails that would have been auto-generated: 
 <pre> 
 rm -rf media/thumbs/* 
 rm -rf media/raw/* 
 </pre> 

 h2. Terminal 2 - Celery worker terminal 

 Start up a Celery worker that will use up to 8 cores: 

 <pre> 
 cd /home/mice/MAUS.new/maus-control-room 
 source env.sh 
 celeryd -c 8 -l INFO --purge 
 </pre> 

 Wait for the Celery worker to start. This may take a minute or two.  

 When it has initialised it will display a start-up message similar to this: 
 <pre> 
 The 'CELERY_AMQP_TASK_RESULT_EXPIRES' setting is scheduled for deprecation in 
 version 2.5 and removal in version v3.0.       CELERY_TASK_RESULT_EXPIRES 
   warnings.warn(w) 
 [2012-03-12 12:19:31,252: WARNING/MainProcess] discard: Erased 0 message from the queue. 
 [2012-03-12 12:19:31,253: WARNING/MainProcess] -------------- celery@miceonrec01a v2.5.1 
 ---- **** ----- 
 --- * ***    * -- [Configuration] 
 -- * - **** ---     . broker:        amqp://maus@localhost:5672/maushost 
 - ** ----------     . loader:        celery.loaders.default.Loader 
 - ** ----------     . logfile:       [stderr]@INFO 
 - ** ----------     . concurrency: 8 
 - ** ----------     . events:        OFF 
 - *** --- * ---     . beat:          OFF 
 -- ******* ---- 
 --- ***** ----- [Queues] 
  --------------     . celery:        exchange:celery (direct) binding:celery 
 [Tasks] 
   . mauscelery.maustasks.MausGenericTransformTask 
 [2012-03-12 12:19:31,269: INFO/MainProcess] MAUS version: MAUS release version 0.1.4 
 [2012-03-12 12:19:31,286: INFO/PoolWorker-1] child process calling self.run() 
 [2012-03-12 12:19:31,289: INFO/PoolWorker-1] Setting MAUS ErrorHandler to raise  
 exceptions 
 [2012-03-12 12:19:31,307: INFO/PoolWorker-2] child process calling self.run() 
 [2012-03-12 12:19:31,311: INFO/PoolWorker-2] Setting MAUS ErrorHandler to raise  
 exceptions 
 [2012-03-12 12:19:31,320: INFO/PoolWorker-3] child process calling self.run() 
 [2012-03-12 12:19:31,324: INFO/PoolWorker-3] Setting MAUS ErrorHandler to raise  
 exceptions 
 [2012-03-12 12:19:31,325: INFO/PoolWorker-4] child process calling self.run() 
 [2012-03-12 12:19:31,329: INFO/PoolWorker-4] Setting MAUS ErrorHandler to raise  
 exceptions 
 [2012-03-12 12:19:31,340: INFO/PoolWorker-5] child process calling self.run() 
 [2012-03-12 12:19:31,343: INFO/PoolWorker-5] Setting MAUS ErrorHandler to raise  
 exceptions 
 [2012-03-12 12:19:31,356: INFO/PoolWorker-6] child process calling self.run() 
 [2012-03-12 12:19:31,359: INFO/PoolWorker-6] Setting MAUS ErrorHandler to raise  
 exceptions 
 [2012-03-12 12:19:31,372: INFO/PoolWorker-7] child process calling self.run() 
 [2012-03-12 12:19:31,375: INFO/PoolWorker-7] Setting MAUS ErrorHandler to raise  
 exceptions 
 [2012-03-12 12:19:31,389: INFO/PoolWorker-8] child process calling self.run() 
 [2012-03-12 12:19:31,392: INFO/PoolWorker-8] Setting MAUS ErrorHandler to raise  
 exceptions 
 [2012-03-12 12:19:31,390: WARNING/MainProcess] celery@miceonrec01a has started. 
 </pre> 
 Note that one Celery pool worker will be created for each of the 8 cores. 

 You can ignore any warnings like: 
 <pre> 
 *** Break *** write on a pipe with no one to read it 
 </pre> 

 h2. Run a quick Celery and MongoDB test 

 In terminal 3, test that Celery and the document store, MongoDB, are available. First check that the Celery worker has spawned 8 sub-processes: 

 <pre> 
 $ ps -a  
   PID TTY            TIME CMD  
 ... 
 22903 pts/1      00:00:01 celeryd  
 22910 pts/1      00:00:00 celeryd 
 22911 pts/1      00:00:00 celeryd  
 22912 pts/1      00:00:00 celeryd  
 22913 pts/1      00:00:00 celeryd  
 22914 pts/1      00:00:00 celeryd  
 22915 pts/1      00:00:00 celeryd  
 22916 pts/1      00:00:00 celeryd  
 22917 pts/1      00:00:00 celeryd  
 ... 
 </pre> 
 There should be 9 entries. One will be the parent Celery process, and there should be 8 sub-processes. 

 Now run a simple spill processing example, 
 <pre>  
 $ ./bin/examples/simple_histogram_example.py -type_of_dataflow=multi_process   
 </pre> 
 Four spills should be processed and then the program should just sit there, at which point you can press CTRL-C. 
 Check that there are histograms and JSON documents with meta-data: 
 <pre> 
 $ ls 
 ... 
 sample-imagetdcadc000001.eps  
 sample-imagetdcadc000001.json  
 sample-imagetdcadc000002.eps  
 sample-imagetdcadc000002.json  
 sample-imagetdcadc000003.eps  
 sample-imagetdcadc000003.json  
 sample-imagetdcadc000004.eps  
 sample-imagetdcadc000004.json  
 ... 
 </pre> 

 Check the database contains the associated documents  
 <pre>  
 $ ./bin/utilities/summarise_mongodb.py --database ALL  
 Database: mausdb  
   Collection: spills : 4 documents (5776 bytes 5 Kb 0 Mb)  
 Database: local  
   No collections  
 </pre> 
 If this is the case then all is well! 

 h2. Terminal 3 - input-transform terminal 

 Now, start up the MAUS input-transform process. 

 First, if you haven't already, set up the DATE environment: 

 <pre> 
 export DATE_DB_MYSQL_DB=DATE_CONFIG 
 export DATE_DB_MYSQL_USER=daq 
 export DATE_DB_MYSQL_PWD=daq 
 export DATE_DB_MYSQL_HOST=miceacq07 
 export DATE_SITE=/dateSite 
 export DATE_HOSTNAME=`hostname` 
 </pre> 

 Now, start up the input-transform process which continuously reads the data, transforms it and stores it in a MongoDB database: 

 <pre> 
 source /home/mice/MAUS.new/maus-control-room/env.sh 
 $MAUS_ROOT_DIR/bin/user/reconstruct_daq.py -type_of_dataflow=multi_process_input_transform 
 </pre> 

 h2. Terminal 4 - merge-output terminal 

 Start up the MAUS merge-output process, which continuously reads the transformed data from MongoDB, merges it to update histograms and outputs these histograms: 

 <pre> 
 source /home/mice/MAUS.new/maus-control-room/env.sh 
 source /home/mice/MAUS.new/maus-apps/env.sh 
 $MAUS_ROOT_DIR/bin/user/reconstruct_daq.py -type_of_dataflow=multi_process_merge_output 
 </pre> 

 h2. What you should see 

 These are examples of the sort of output you can expect to see during online operation, *if all is running well*. 

 For errors and possible recovery options, see [[MAUSCeleryRabbitMQRecovery|Distributed spill transformation troubleshooting and recovery]]. 

 h3. Web page 

 Once operation begins, the MAUS web page will display a list of available images. If you type a keyword e.g. @tof@ into the search box then a new page will appear with the images. This page will automatically refresh as the images are updated. 

 h3. Terminal 2 - Celery worker terminal 

 The Celery worker terminal displays information every time a spill is received by the Celery worker and the worker starts to execute it, e.g.: 

 <pre> 
 [2012-03-12 14:41:34,497: INFO/MainProcess] Got task from broker:  
 mauscelery.maustasks.MausGenericTransformTask[92dd453d-549e-4ed8-9d3e-f9e820056e2c] 
 [2012-03-12 14:41:34,533: INFO/PoolWorker-4] None[92dd453d-549e-4ed8-9d3e-f9e820056e2c]: 
 Task invoked by maus.epcc.ed.ac.uk-23685 
 </pre> 

 The Celery worker displays the host that submitted the task and the process ID. 

 When the transform workers have completed and the Celery worker is ready to return the result spill, this will be logged e.g.: 

 <pre> 
 [2012-03-12 14:41:34,548: INFO/MainProcess]  
 Task mauscelery.maustasks.MausGenericTransformTask[92dd453d-549e-4ed8-9d3e-f9e820056e2c]  
 succeeded in 0.0160460472107s: '{"daq_data":null,"daq_event_type":"end_of_bur... 
 </pre> 

 h3. Terminal 3 - input-transform terminal 

 After MAUS starts up: 

 <pre> 
 Welcome to MAUS: 
         Process ID (PID): 23685 
         Program Arguments: ['./bin/user/reconstruct_daq.py', '-type_of_dataflow=multi_process_input_transform'] 
         Version: MAUS release version 0.1.4 
 </pre> 

 The MAUS framework will check for active Celery workers e.g.: 

 <pre> 
 INITIATING EXECUTION 
 Purging document store 
 Checking for active Celery nodes... 
 Number of active Celery nodes: 1 
 </pre> 

 If there are no Celery workers available the MAUS framework will exit. 

 If there are Celery workers then the input worker will be configured. 

 It will then enter a loop which operates as follows: 

 * Read a spill from the input worker. 
 * Check for a new run. 
 * If a new run is detected then, 
 ** Wait for any transforms currently being executed by Celery to complete. 
 ** Death the transformers. 
 ** Birth the transformers. 
 * Submit a spill to a Celery worker for execution of the transform (map) workers. 
 * Check the status of the Celery worker. 
 * Get the result spill from the Celery worker when it is available. 
 * Put the result spill into the MongoDB database. 
 * Check spills remaining from the input worker. 
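The loop above can be sketched in Python. All the helpers below are stand-ins for illustration only (the real logic lives in @reconstruct_daq.py@ and the MAUS framework), but the control flow matches the bullet list:

```python
def input_transform_loop(spills, doc_store):
    """Sketch of the input-transform loop; every helper is a stand-in."""
    current_run = None
    processed = 0
    for spill in spills:                      # read a spill from the input
        run = spill["run_number"]
        if run != current_run:                # new run detected
            wait_for_outstanding_tasks()      # let in-flight transforms finish
            death_transforms()                # death, then birth, the transforms
            birth_transforms(run)
            current_run = run
        result = submit_to_celery(spill)      # submit and collect the result
        doc_store.append(result)              # put the result spill into MongoDB
        processed += 1
    return processed

# Trivial stand-ins so the sketch runs end to end:
def wait_for_outstanding_tasks(): pass
def death_transforms(): pass
def birth_transforms(run): pass
def submit_to_celery(spill): return dict(spill, transformed=True)

spills = [{"run_number": 3386, "spill_num": i} for i in range(4)]
store = []
print(input_transform_loop(spills, store))  # → 4
```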

 For example: 

 <pre> 
 INPUT: read next spill 
 Spills input: 1 Processed: 0 Failed 0 
 New run detected...waiting for current processing to complete 
 ---------- RUN 3386 ---------- 
 Configuring Celery nodes and birthing transforms... 
 Celery nodes configured! 
 TRANSFORM: processing spills 
 Task ID: ddbf57a0-17a3-4abd-8b10-d340fbfd31b9 
 1 spills left in buffer 
  Celery task ddbf57a0-17a3-4abd-8b10-d340fbfd31b9 SUCCESS  
    SAVING to collection spills (with ID 1) 
 Spills input: 1 Processed: 1 Failed 0 
 INPUT: read next spill 
 Spills input: 2 Processed: 1 Failed 0 
 Task ID: 95b0e098-952d-4da4-8b9c-39dd2c2a514a 
 1 spills left in buffer 
 INPUT: read next spill 
 Spills input: 3 Processed: 1 Failed 0 
 Task ID: 426ffdfa-11e7-43e8-a36a-ad07e5aebe3a 
 1 spills left in buffer 
 INPUT: read next spill 
 Spills input: 4 Processed: 1 Failed 0 
 Task ID: 0d83b05d-00f6-4269-ae1f-c2e592d8d01e 
 </pre> 

 The input buffer always has 1 spill until the final spill is processed. 

 The MAUS framework maintains a count of the spills it has read from its input, those successfully transformed, and those for which the transform failed. The count of processed spills is used as an index for the results in MongoDB, but note that this is not the same as the "spill_num" that occurs within a spill. 

 When all the spills from the input have been processed, a message is printed: 
 <pre> 
 -------------------- 
 Requesting Celery nodes death transforms... 
 Celery node transforms deathed! 
 TRANSFORM: transform tasks completed 
 INPUT: Death 
 -------------------- 
 DONE 
 </pre> 

 h3. Terminal 4 - merge-output terminal 

 After MAUS starts up: 

 <pre> 
 Welcome to MAUS: 
         Process ID (PID): 23700 
         Program Arguments: ['./bin/user/reconstruct_daq.py', '-type_of_dataflow=multi_process_merge_output'] 
         Version: MAUS release version 0.1.4 
 INITIATING EXECUTION 
 -------- MERGE OUTPUT -------- 
 </pre> 

 It will then enter a loop which interleaves: 

 * Read spills added to MongoDB since the last read. 
 * For each spill, 
 ** Check for a new run. 
 ** If a new run is detected then, 
 *** Death the mergers and outputters. 
 *** Birth the mergers and outputters. 
 ** Pass the spill to the merger. 
 ** Pass the result to the outputter. 
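The interleaved loop can likewise be sketched in Python, with a stand-in merger and outputter exposing a birth/process/death lifecycle (the class and method names are illustrative, not the real @ReducePyTOFPlot@ or @OutputPyImage@ API):

```python
class StubWorker:
    """Stand-in merger/outputter with a birth/death lifecycle."""
    def __init__(self):
        self.results = []
    def birth(self, run): pass
    def death(self): pass
    def process(self, spill):                 # merger step
        return spill
    def save(self, spill):                    # outputter step
        self.results.append(spill)

def merge_output_loop(new_spills, merger, outputter):
    """Sketch of one pass over spills newly added to the document store."""
    current_run = None
    processed = 0
    for spill in new_spills:                  # spills added since the last read
        run = spill["run_number"]
        if run != current_run:                # new run: death, then birth
            merger.death(); outputter.death()
            merger.birth(run); outputter.birth(run)
            current_run = run
        merged = merger.process(spill)        # pass the spill to the merger
        outputter.save(merged)                # pass the result to the outputter
        processed += 1
    return processed

merger, outputter = StubWorker(), StubWorker()
spills = [{"run_number": 0, "spill_num": i} for i in range(1, 5)]
print(merge_output_loop(spills, merger, outputter))  # → 4
```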

 For example: 

 <pre> 
 Read spill 1 (dated 2012-03-12 14:50:28.628000) 
 Change of run detected 
 ---------- START RUN 0 ---------- 
 BIRTH merger ReducePyTOFPlot.ReducePyTOFPlot 
 BIRTH outputer OutputPyImage.OutputPyImage 
 Executing Merge for spill 1 
 Executing Output for spill 1 
 Spills processed: 1 
 Read spill 2 (dated 2012-03-12 14:50:28.657000) 
 Executing Merge for spill 2 
 Executing Output for spill 2 
 Spills processed: 2 
 Read spill 3 (dated 2012-03-12 14:50:28.686000) 
 Executing Merge for spill 3 
 Executing Output for spill 3 
 Spills processed: 3 
 Read spill 4 (dated 2012-03-12 14:50:28.711000) 
 Executing Merge for spill 4 
 Executing Output for spill 4 
 Spills processed: 4 
 </pre> 

 The spill number shown is the spill ID as described above. 

 At present this loop does not terminate. So, if you are sure that all the spills have been processed, press CTRL-C. The mergers and outputters will be deathed before the program exits. 

 h3. Terminal 1 - web server terminal (if using Django web server) 

 After the Django web server starts up: 

 <pre> 
 Validating models... 
 0 errors found 
 Django version 1.3.1, using settings 'mausweb.settings' 
 Development server is running at http://localhost:9000/ 
 Quit the server with CONTROL-C. 
 </pre> 

 The terminal will display a list of HTTP requests that have been received, e.g.: 
 <pre> 

 [12/Mar/2012 14:55:21] "GET /maus/ HTTP/1.0" 200 5335 
 [12/Mar/2012 14:55:41] "GET /maus/imagetof_hit_y.eps/png HTTP/1.0" 200 10647 
 ... 
 </pre> 

 Do not worry if the image names or paths are different. 

 Occasionally you may see a stack trace (see issue #811); this usually means that the web server has been interrupted mid-request.