2015-07-14-MEMO+Computing
MAUS
- Speedup implementation is nearly complete
  - No more string conversions between the Input-Map-Output stages
  - Reconstruction side is done, including the offline reducers
  - MC side is nearly done (less critical for data-taking)
  - Tests show that reconstruction speed is now comparable to the data-taking rate
- Some string conversions remain in the current online-reconstruction framework because of Celery and MongoDB
  - We now have a plan to change the online-reconstruction framework to pass data to reducers via ROOT sockets and drop the Celery-based multiprocessing
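As a rough illustration of handing data to a reducer as raw bytes rather than as strings, the sketch below sends one length-prefixed binary frame over a local socket pair. It uses Python's standard `socket` module as a stand-in, not the ROOT socket classes the plan refers to; the function names (`send_frame`, `recv_frame`, `roundtrip`) and the 4-byte length prefix are assumptions made for this example and are not part of MAUS.

```python
import socket
import struct


def send_frame(sock, payload):
    """Send one frame: a 4-byte big-endian length followed by the raw bytes."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock):
    """Receive one frame written by send_frame() and return its payload."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)


def roundtrip(payload):
    """Demo helper (hypothetical): push a small payload through a socket pair."""
    sender, receiver = socket.socketpair()
    try:
        send_frame(sender, payload)
        return recv_frame(receiver)
    finally:
        sender.close()
        receiver.close()
```

The payload stays binary end to end, so no encode/decode step is needed on either side. `roundtrip` is only suitable for small payloads, since both ends run in the same thread.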
Offline processing
- Encountered two issues with offline processing
  - At the current MAUS processing speed, reconstruction of some high-rate runs took longer than 24 hours, at which point the job was automatically killed on the RAL queue (due to proxy expiration)
    - Janusz has been working with the RAL grid people and has a solution to renew proxies so that jobs can continue beyond 24 hours
    - This will not be an issue once the sped-up MAUS implementation is released
  - To avoid the 24-hour limit on the RAL queue, we submitted jobs to the Imperial Tier-2 queue, but the jobs eventually died there because of a memory leak
    - Chris Rogers has fixed some of the leaks; Adam Dobbs is investigating the remaining ones
- With the MAUS speedup, reconstruction can keep up with data-taking, so we should reconstruct data "live" regardless of the Grid
  - Implementation still to be fully worked out: which machine, whether it stays in the MLCR, etc.
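The 24-hour problem comes down to simple arithmetic: wall time is the number of spills divided by the processing rate. The sketch below shows the check; the function names and the run size and rates used in the usage note are made-up illustrative values, not measured MAUS figures.

```python
PROXY_LIMIT_HOURS = 24.0  # job is killed when the grid proxy expires


def reco_hours(n_spills, spills_per_second):
    """Estimated wall time, in hours, to reconstruct n_spills at a given rate."""
    return n_spills / spills_per_second / 3600.0


def exceeds_proxy_limit(n_spills, spills_per_second, limit_hours=PROXY_LIMIT_HOURS):
    """True if the job would still be running when the proxy expires."""
    return reco_hours(n_spills, spills_per_second) > limit_hours
```

For a hypothetical run of 360000 spills, a rate of 2.5 spills per second gives 40 hours and would hit the limit, while 10 spills per second gives 10 hours and would not.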
Online
- Important remaining item: DAQ feedback to EPICS to alert shifters to data corruption
  - The unpacker catches errors
  - Ed Overton has some improvements to the tracker unpacking to catch corrupt data from the tracker readout
  - These errors need to be communicated to EPICS/Run Control so that an alarm can be raised
  - Rhys Gardener has spoken with Pierrick and is working on providing the necessary input to Run Control (RC)
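One possible shape for the feedback path is a small monitor that counts unpacking failures over a sliding window and calls an alarm hook when too many occur. This is only a sketch under stated assumptions: the class name `CorruptionMonitor`, the window size, the threshold, and the `raise_alarm` callback are all hypothetical, and the actual write to EPICS/Run Control is deliberately left to whatever the callback does.

```python
from collections import deque


class CorruptionMonitor:
    """Track recent unpacking results and fire a callback on too many failures."""

    def __init__(self, raise_alarm, window=100, max_bad=5):
        self.raise_alarm = raise_alarm      # hypothetical hook, e.g. sets an alarm PV
        self.recent = deque(maxlen=window)  # True marks a corrupt spill
        self.max_bad = max_bad
        self.alarmed = False

    def record(self, unpack_ok):
        """Record one spill; alarm once when failures in the window exceed max_bad."""
        self.recent.append(not unpack_ok)
        bad = sum(self.recent)
        if bad > self.max_bad:
            if not self.alarmed:
                self.alarmed = True
                self.raise_alarm(bad, len(self.recent))
        else:
            # Failure count has dropped back; allow a future alarm.
            self.alarmed = False
```

The `alarmed` flag keeps the callback from firing on every spill while the fault persists, so shifters would see a single alarm per episode.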
Infrastructure
- Nagios monitoring of the file compactor and data mover chain has been implemented
- A Nagios mirroring capability has been set up and tested (so that the status page is visible outside micenet)
  - Needs approval to make sure it does not break any RAL computing guidelines
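For reference, a Nagios check reports its state through a status code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus a one-line message. The sketch below shows what a backlog check for a data mover directory might look like; it is not the check that was deployed, and the function names, the thresholds, and the idea of judging the chain by the age of the oldest waiting file are assumptions made for illustration.

```python
import os
import time

# Standard Nagios plugin status codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_age(oldest_age_s, warn_s, crit_s):
    """Map the age of the oldest waiting file to a Nagios status and message."""
    if oldest_age_s >= crit_s:
        return CRITICAL, "CRITICAL - oldest file is %d s old" % oldest_age_s
    if oldest_age_s >= warn_s:
        return WARNING, "WARNING - oldest file is %d s old" % oldest_age_s
    return OK, "OK - oldest file is %d s old" % oldest_age_s


def check_directory(directory, warn_s=1800, crit_s=7200):
    """Check a (hypothetical) data mover directory for files waiting too long."""
    try:
        mtimes = [os.path.getmtime(os.path.join(directory, name))
                  for name in os.listdir(directory)]
    except OSError as err:
        return UNKNOWN, "UNKNOWN - cannot read %s: %s" % (directory, err)
    if not mtimes:
        return OK, "OK - no files waiting"
    return classify_age(time.time() - min(mtimes), warn_s, crit_s)
```

A wrapper script would print the message and exit with the returned code, which is how Nagios picks up the state.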
Updated by Rajaram, Durga over 7 years ago · 1 revisions