Online reconstruction fails
Cross posted to elog...
- Reported error was that MAUS was making histograms but failing to produce any hits.
- Assumed some problem with the data structure, so merged the trunk into the online code. Still no luck.
- Made a test_analyze_data_online at tests/integration/test_analyze_online/; hacked the inputter at bin/online/analyze_data_online_input_transform.py so that it uses the offline inputter, and bin/analyze_data_online.py so that it specifies a file number and location. Downloaded data 04168.tar and attempted to reconstruct it. No luck, and no daq_data tree was created at all.
- Tried a different data file, 04010.tar. Still no luck. Prodded InputCppDAQData a bit - everything looks to be working okay, but the relevant line didn't seem to fill the "daq_data" branch.
- Maybe there is something funny about single-station tests. Tried the 03513 and then 03512 data files from the March run. 03512 produced a daq_data branch, but still no histograms - the TOF reducer complains "No recon_events branch" or similar.
- Added some debugging output to MapCppTOFDigits and ran the input/transform/OutputCppRoot chain in single-threaded mode. MapCppTOFDigits claims that it is indeed producing data...
- That's as far as I got - I put the code back into multithreaded mode with InputCppOnlineDAQ.
- So somehow it looks like it is producing reconstruction data but this is getting eaten somewhere...
Sorry, that's as far as I got. It looks like we will have to do some more debugging next week...
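The inputter swap described in the notes above (forcing the online analysis script to read a numbered data file from disk instead of the live DAQ) can be sketched roughly like this. This is an illustrative toy only; all class and function names here are hypothetical stand-ins, not the real MAUS API.

```python
# Toy sketch of the "use the offline inputter" hack described above.
# Names are hypothetical; the real MAUS scripts and inputters differ.

class OnlineDAQInput:
    """Stand-in for the online DAQ inputter (reads from the DAQ network)."""
    def source(self):
        return "daq://miceraid"

class OfflineFileInput:
    """Stand-in for the offline inputter (reads a numbered tarball)."""
    def __init__(self, run_number, data_dir):
        self.run_number = run_number
        self.data_dir = data_dir

    def source(self):
        # Run numbers are zero-padded to five digits, e.g. 04168.tar
        return "%s/%05d.tar" % (self.data_dir, self.run_number)

def make_input(offline, run_number=4168, data_dir="/data"):
    """Pick the inputter the way the hacked script does."""
    if offline:
        return OfflineFileInput(run_number, data_dir)
    return OnlineDAQInput()

print(make_input(True).source())   # /data/04168.tar
print(make_input(False).source())  # daq://miceraid
```

The point of the hack is that everything downstream of the inputter (transforms, reducers, outputs) is untouched, so a failure seen this way should isolate the problem to the reconstruction chain rather than the DAQ connection.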
Updated by Taylor, Ian almost 9 years ago
During today's shift, I tracked down part of the error. The software was attempting to read from DAQ_hostname = 'miceraid1a', which should now be 'miceraid5'. This allowed me to load data during runs, but errors were still produced. I am looking into this, but am hindered by my lack of familiarity with the code.
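For the record, the hostname fix above amounts to a one-line configuration edit. MAUS configuration values ("datacards") are plain Python assignments; a minimal sketch (the exact file and surrounding contents may differ):

```python
# Minimal sketch of the configuration fix described above.
# MAUS datacards are plain Python assignments.

# Old value, the retired DAQ host:
# DAQ_hostname = 'miceraid1a'

# New value, the current DAQ host:
DAQ_hostname = 'miceraid5'
```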
Updated by Rogers, Chris almost 9 years ago
- Category set to Online reconstruction
- Assignee set to Richards, Alexander
- Target version set to Future MAUS release
Alex, I think you should coordinate this - online is now your beast. If there is a problem in one of the reconstruction routines, then please bump it up to the relevant dev (ask if you aren't sure).
Online not working is a serious problem, and I think we need to understand what went wrong and produce a post-mortem-type document: understand how the system failed both at a code level and at a management level (i.e. what systems we need to put in place to make sure this doesn't happen again).
Updated by Rajaram, Durga almost 9 years ago
trunk rev# 829, using analyze_data_online_input_transform.py [ from onrec01:MAUS/maus/bin/online/ ]
- changed Offline=True in the script
- ran on 03513.000
- no data in the output root tree
I got the following errors while running
Traceback (most recent call last):
  File "/home/durga/mtest/merge/src/common_py/ErrorHandler.py", line 159, in HandleCppException
    raise(CppError(error_message))
ErrorHandler.CppError: In branch recon_events In branch part_event_number Missing required branch part_event_number converting json->cpp at PointerItem::SetCppChild
- with output=OutputPyJSON(): the errors go away; the output json file seems fine [ has space points ]
- with output=OutputPyImage(), reducer=ReducePyTOFPlot(): no errors & tof histograms are filled
- with MapPyReconSetup() added before the other mappers & output=OutputCppRoot(): no errors & the output root tree is filled
modified script attached (btw, bin/analyze_data_offline has MapPyReconSetup() by default)
Updated by Richards, Alexander almost 9 years ago
This sounds like you have found the problem. As I am not very familiar with the data structure, I can only give my two cents from the perspective of how the OutputCppRoot output generally works. The error you posted looks to me as though, during the conversion from Json to Cpp, one of the processors decided that the Json document should contain the element `part_event_number', and it either couldn't be found or wasn't being set up in the cpp data structure.
I assume, therefore, that adding MapPyReconSetup() created the correct entry in the Json, hence the problem disappeared.
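The failure mode described here can be illustrated with a toy model (this is not the real MAUS converter; all names below are stand-ins): a json-to-cpp converter that requires `part_event_number' in each recon event raises exactly when no upstream setup step has filled it in, and a "recon setup" pass analogous to MapPyReconSetup() makes the error go away.

```python
# Toy model of the json->cpp conversion failure discussed above.
# Not the real MAUS code: just a converter that insists each recon
# event carries part_event_number, mirroring the reported CppError.

class CppError(Exception):
    pass

def convert_json_to_cpp(spill):
    """Fail if any recon event lacks the required branch."""
    for event in spill.get("recon_events", []):
        if "part_event_number" not in event:
            raise CppError("In branch recon_events "
                           "In branch part_event_number "
                           "Missing required branch part_event_number")
    return spill  # stand-in for the converted C++ structure

def recon_setup(spill):
    """Analogue of MapPyReconSetup(): pre-fill the required branches."""
    for i, event in enumerate(spill.get("recon_events", [])):
        event.setdefault("part_event_number", i)
    return spill

spill = {"recon_events": [{"tof_event": {}}]}
try:
    convert_json_to_cpp(spill)           # raises CppError
except CppError as err:
    print("conversion failed:", err)

convert_json_to_cpp(recon_setup(spill))  # succeeds after setup
```

This also matches the observation that OutputPyJSON() showed no errors: a pure-json output path never runs the json-to-cpp conversion, so the missing branch is only caught when OutputCppRoot() is in the chain.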