Mappers taking too long?
I had a prod from Yordan to investigate the overhead in writing mappers. The specific question was: what is the overhead in doing the TOF reconstruction as 3 maps instead of 1? So I did a little investigation. I attach a test script that generates a big input data set and then pushes it through a number of mappers; I hope it is reasonably clear.
Output is also attached. The thing to see is that it took 75.94 s to do MapPyDoNothing once, and 219 s to do MapPyDoNothing 10 times. For comparison I also calculated the time to do (json -> string -> json) * number_of_maps directly. It took 15 s and 710 s respectively.
- To do it properly I should repeat the study a thousand times, but I don't quite have the time.
- The data set was comparable to a MICE run. Each spill was 1000 Monte Carlo primaries (~170 bytes each, so ~170 kB in total). I did 1000 spills. (A MICE run is presumably 1000 spills with 100 triggers, but each trigger is probably a lot bigger. I assumed a factor 10, which might still be a bit small.)
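For anyone who wants to repeat the direct (json -> string -> json) comparison without the attached script, here is a minimal sketch of the idea. It is not the attached script: the helper names (`make_spill`, `round_trip`, `time_round_trips`) and the fields in each primary are made up for illustration, and the spill size only roughly mimics the 1000-primary spills described above.

```python
import json
import time


def make_spill(n_primaries=1000):
    """Build a stand-in spill: a dict holding a list of small primary records."""
    # Hypothetical field names; the real Monte Carlo primary layout may differ.
    primary = {
        "position": {"x": 0.0, "y": 0.0, "z": 0.0},
        "momentum": {"x": 0.0, "y": 0.0, "z": 1.0},
        "particle_id": -13,
        "energy": 226.0,
        "time": 0.0,
        "random_seed": 0,
    }
    return {"mc": [dict(primary, random_seed=i) for i in range(n_primaries)]}


def round_trip(spill_string, n_maps):
    """Do (string -> json -> string) n_maps times, like a chain of do-nothing maps."""
    for _ in range(n_maps):
        spill_string = json.dumps(json.loads(spill_string))
    return spill_string


def time_round_trips(n_spills, n_maps, n_primaries=1000):
    """Return the wall-clock seconds spent round-tripping n_spills spills."""
    spill_string = json.dumps(make_spill(n_primaries))
    start = time.perf_counter()
    for _ in range(n_spills):
        round_trip(spill_string, n_maps)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Small spill count so the sketch runs quickly; scale up for a real test.
    for n_maps in (1, 10):
        print(n_maps, "maps:", round(time_round_trips(20, n_maps), 2), "s")
```

Absolute numbers from this will depend on the machine and the json library, so only the ratio between the 1-map and 10-map cases is worth reading much into.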
So in short, the overhead for splitting TOF code into 3 maps is probably going to be 5 minutes over the run - which shouldn't break us.
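One way to arrive at a figure of about this size from the MapPyDoNothing timings quoted above is sketched below. It assumes the cost grows linearly with the number of maps and with the data size, and it takes the factor 10 from the data set notes as the ratio between real and test data; both are assumptions, not measurements.

```python
# MapPyDoNothing timings quoted above, in seconds, for the 1000 test spills.
t_one_map = 75.94
t_ten_maps = 219.0

# Assumption: each additional map costs the same amount of time.
per_extra_map = (t_ten_maps - t_one_map) / 9

# Assumption: real data is ~10x bigger than the test data and the
# time scales linearly with data size.
size_factor = 10
extra_maps = 3 - 1  # 3 maps instead of 1

overhead_s = per_extra_map * size_factor * extra_maps
print(round(per_extra_map, 1), "s per extra map on the test data")   # 15.9
print(round(overhead_s / 60, 1), "minutes extra over the run")       # 5.3
```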
Is this good enough, Yordan? Do you want further investigation?