Bug #1352

End-of-run dataflow issue

Added by Rajaram, Durga about 10 years ago. Updated about 10 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: RealData
Target version:
Start date: 05 October 2013
Due date:
% Done: 100%
Estimated time:
Workflow: New Issue

Description

maus-input-transform.log (from Run 5261, Sat Oct 5) indicates an issue with the dataflow at end-of-run, possibly due to DAQ crashes (https://micewww.pp.rl.ac.uk/elog/MICE+Log/2450) or abnormal termination:

Traceback (most recent call last):
  File "/home/mice/MAUS/.maus_release/bin/online/analyze_data_online_input_transform.py", line 65, in <module>
    run()
  File "/home/mice/MAUS/.maus_release/bin/online/analyze_data_online_input_transform.py", line 62, in run
    MAUS.Go(my_input, my_map, reducer, output_worker, data_cards) 
  File "/home/mice/MAUS/.maus_release/src/common_py/Go.py", line 131, in __init__
    self.get_job_footer())
  File "/home/mice/MAUS/.maus_release/src/common_py/framework/input_transform.py", line 285, in execute
    map_buffer = DataflowUtilities.buffer_input(emitter, 1)
  File "/home/mice/MAUS/.maus_release/src/common_py/framework/utilities.py", line 67, in buffer_input
    value = next(input_emitter)
  File "/home/mice/MAUS/.maus_release/build/InputCppDAQOnlineData.py", line 86, in emitter
    while (self.readNextEvent()):
  File "/home/mice/MAUS/.maus_release/build/InputCppDAQOnlineData.py", line 108, in readNextEvent
    def readNextEvent(self): return _InputCppDAQOnlineData.InputCppDAQOnlineData_readNextEvent(self)
KeyboardInterrupt
WARNING - Attempt to delete the physical volume store while geometry closed !
WARNING - Attempt to delete the logical volume store while geometry closed !
WARNING - Attempt to delete the solid store while geometry closed !
WARNING - Attempt to delete the region store while geometry closed !

As a result, the tofcalib reducer does not properly write out the reduced ROOT file (the reducer writes its ROOT file at end-of-run).
From reconstruct_daq_tofcalib_reducer.log:

---------- END RUN 5261 ----------
Ending job
Clearing Globals
Traceback (most recent call last):
  File "/home/mice/MAUS/.maus_release/bin/online/reconstruct_daq_tofcalib_reducer.py", line 71, in <module>
    run()
  File "/home/mice/MAUS/.maus_release/bin/online/reconstruct_daq_tofcalib_reducer.py", line 68, in run
    MAUS.Go(my_input, my_map, reducer, output_worker, data_cards) 
  File "/home/mice/MAUS/.maus_release/src/common_py/Go.py", line 131, in __init__
    self.get_job_footer())
  File "/home/mice/MAUS/.maus_release/src/common_py/framework/merge_output.py", line 281, in execute
    run_again = will_run_until_ctrl_c and self._execute_inner_loop()
  File "/home/mice/MAUS/.maus_release/src/common_py/framework/merge_output.py", line 327, in _execute_inner_loop
    self.process_event(spill)
  File "/home/mice/MAUS/.maus_release/src/common_py/framework/merge_output.py", line 238, in process_event
    raise RuntimeError("Failed to execute Output")
RuntimeError: Failed to execute Output

Related issues

Related to MAUS - Bug #1351: Data structure issue in V1724 (Closed; Rogers, Chris; 04 October 2013)

#1

Updated by Rajaram, Durga about 10 years ago

This looks like it's coming from InputCppDAQOnlineData

#2

Updated by Rogers, Chris about 10 years ago

  • Target version set to Future MAUS release

Are you sure that maus-input-transform triggered the death? A KeyboardInterrupt usually comes from receiving a SIGINT from the bin/analyze_data_online.py application launcher.

In bin/analyze_data_online.py main(), we call poll_processes (line 377) and keep polling unless there is an exception. If an exception occurs, we go into the finally: block, where we call cleanup(PROCESSES). cleanup(PROCESSES) sends SIGINT (ctrl-c/KeyboardInterrupt) to the remaining processes. I note the following error in the tofcalib reducer log:

Traceback (most recent call last):
  File "/home/mice/MAUS/.maus_release/src/common_py/ErrorHandler.py", line 162, in HandleCppException
    raise CppError(error_message)
ErrorHandler.CppError: In branch daq_data
Failed to recognise all json properties emr  at ObjectProcessor<ObjectType>::JsonToCpp
Traceback (most recent call last):
  File "/home/mice/MAUS/.maus_release/src/common_py/framework/merge_output.py", line 281, in execute
    run_again = will_run_until_ctrl_c and self._execute_inner_loop()
  File "/home/mice/MAUS/.maus_release/src/common_py/framework/merge_output.py", line 327, in _execute_inner_loop
    self.process_event(spill)
  File "/home/mice/MAUS/.maus_release/src/common_py/framework/merge_output.py", line 238, in process_event
    raise RuntimeError("Failed to execute Output")
RuntimeError: Failed to execute Output

There is indeed no emr branch in the DAQDataProcessor:

DAQDataProcessor::DAQDataProcessor()
    : _V830_proc(), _trigger_request_proc(new TriggerRequestProcessor),
      _tof1_proc(new TOFDaqProcessor), _ckov_proc(new CkovDaqProcessor),
      _tof2_proc(new TOFDaqProcessor), _unknown_proc(new UnknownProcessor),
      _kl_proc(new KLDaqProcessor), _tag_proc(new TagProcessor),
      _tof0_proc(new TOFDaqProcessor), _trigger_proc(new TriggerProcessor) {
    RegisterValueBranch
          ("V830", &_V830_proc, &DAQData::GetV830,
          &DAQData::SetV830, false);
    RegisterValueBranch
          ("trigger_request", &_trigger_request_proc, &DAQData::GetTriggerRequestArray,
          &DAQData::SetTriggerRequestArray, false);
    RegisterValueBranch
          ("tof1", &_tof1_proc, &DAQData::GetTOF1DaqArray,
          &DAQData::SetTOF1DaqArray, false);
    RegisterValueBranch
          ("ckov", &_ckov_proc, &DAQData::GetCkovArray,
          &DAQData::SetCkovArray, false);
    RegisterValueBranch
          ("tof2", &_tof2_proc, &DAQData::GetTOF2DaqArray,
          &DAQData::SetTOF2DaqArray, false);
    RegisterValueBranch
          ("unknown", &_unknown_proc, &DAQData::GetUnknownArray,
          &DAQData::SetUnknownArray, false);
    RegisterValueBranch
          ("kl", &_kl_proc, &DAQData::GetKLArray,
          &DAQData::SetKLArray, false);
    RegisterValueBranch
          ("tag", &_tag_proc, &DAQData::GetTagArray,
          &DAQData::SetTagArray, false);
    RegisterValueBranch
          ("tof0", &_tof0_proc, &DAQData::GetTOF0DaqArray,
          &DAQData::SetTOF0DaqArray, false);
    RegisterValueBranch
          ("trigger", &_trigger_proc, &DAQData::GetTriggerArray,
          &DAQData::SetTriggerArray, false);
    RegisterIgnoredBranch("single_station", false);
}

So Yordan needs to tell us what data structure he wants so we can fix this.

The issue is that the data structure is strict - so just adding random bits and pieces will break the code. If we don't do a strict data structure, then nobody has a clue what data looks like and we end up in a mess, so we do have to be strict.
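
To make the launcher behaviour described at the top of this comment concrete, here is a minimal sketch of the poll-then-cleanup pattern. It is not the code in bin/analyze_data_online.py: the names poll_processes, cleanup and PROCESSES are borrowed from the description above, but their bodies, the grace timeout and the kill fallback are assumptions made for illustration. It assumes Python 3 on a POSIX system.

```python
import signal
import subprocess
import time

# Illustrative stand-in for the launcher's process list (assumed structure).
PROCESSES = []


def poll_processes(processes):
    """Poll children until all have exited; raise if any exits with an error."""
    while True:
        still_running = False
        for proc in processes:
            code = proc.poll()
            if code is None:
                still_running = True
            elif code != 0:
                raise RuntimeError(f"process {proc.pid} exited with code {code}")
        if not still_running:
            return
        time.sleep(0.1)


def cleanup(processes, grace=5.0):
    """Send SIGINT to any child still running; kill it if it does not exit."""
    for proc in processes:
        if proc.poll() is None:
            # The child sees this as ctrl-c, i.e. a KeyboardInterrupt in Python.
            proc.send_signal(signal.SIGINT)
    for proc in processes:
        try:
            proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


def main(commands):
    """Start every command, poll, and always clean up on the way out."""
    try:
        for command in commands:
            PROCESSES.append(subprocess.Popen(command))
        poll_processes(PROCESSES)
    finally:
        # Reached on normal exit and on any exception raised while polling.
        cleanup(PROCESSES)
```

With this structure, a failure in one child (here, the reducer) leads to a KeyboardInterrupt traceback in every other child, which is why the KeyboardInterrupt in maus-input-transform.log would be a symptom rather than the cause.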

#3

Updated by Rogers, Chris about 10 years ago

This is fixed on onrec01 - I added an "emr" branch as an Ignored branch in daq data. I will do a push...
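
For anyone following along, the strict check and the ignored-branch fix can be sketched in a few lines of Python. This is only an analogy for the C++ processor behaviour quoted in the previous comment: StrictProcessor and its methods are hypothetical names, not MAUS API, and the real RegisterValueBranch/RegisterIgnoredBranch calls take further arguments that are not modelled here.

```python
class StrictProcessor:
    """Toy converter that rejects any JSON property it was not told about."""

    def __init__(self):
        self._branches = {}    # branch name -> conversion function
        self._ignored = set()  # branch names accepted but not converted

    def register_value_branch(self, name, convert):
        self._branches[name] = convert

    def register_ignored_branch(self, name):
        self._ignored.add(name)

    def json_to_object(self, json_dict):
        # Strictness: every key must be either registered or explicitly ignored.
        unknown = sorted(set(json_dict) - set(self._branches) - self._ignored)
        if unknown:
            raise ValueError(
                "Failed to recognise all json properties " + " ".join(unknown))
        return {name: convert(json_dict[name])
                for name, convert in self._branches.items()
                if name in json_dict}


processor = StrictProcessor()
processor.register_value_branch("tof1", list)
processor.register_ignored_branch("single_station")

try:
    processor.json_to_object({"tof1": [], "emr": {}})
except ValueError as error:
    print(error)  # the unregistered "emr" key is rejected

# Registering "emr" as ignored lets the same input through.
processor.register_ignored_branch("emr")
print(processor.json_to_object({"tof1": [], "emr": {}}))
```

Ignoring the branch keeps the converter strict about everything else; the "emr" data is simply dropped until a proper data structure is defined for it.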

#4

Updated by Rogers, Chris about 10 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Pushed to merge branch as r994

#5

Updated by Rajaram, Durga about 10 years ago

  • Target version changed from Future MAUS release to MAUS-v0.7.2
