Feature #1638

MAUS speedup implementation

Added by Rajaram, Durga over 6 years ago. Updated over 6 years ago.

Status: Open
Priority: Normal
Assignee: -
Category: -
Target version:
Start date: 09 March 2015
Due date:
% Done: 0%
Estimated time:
Workflow: New Issue

Description

We would like to speed up processing with MAUS.
Yordan suggests a multithreaded C++11 implementation.

#1

Updated by Rajaram, Durga over 6 years ago


Notes from March 9 meeting (YK, CR, AD, DR)


  • Outline of Yordan's implementation
    • parallelize with C++11 multithreading
    • parallelization is at mapper level (unlike the current MAUS Celery framework, which parallelizes at spill level)
      • Input -> FIFO buffer -> workers (mappers)
  • Requirements:
    • a gcc version that supports C++11
      • the default gcc on SL6.4 is 4.4.6, which does not support C++11
      • need gcc >= 4.7 (see the GNU GCC page)
    • get rid of JSON mappers
      • this was something we were planning to do anyway
      • want to still support JSON as an output format
    • Isolate processing from the datastructure library
      • YK: some of the tracker data structures do processing
      • Needs cleanup
    • Build separate libraries for:
      • Datastructure
      • Mappers
      • cppUtils
    • Get rid of static calls, singletons at processing time:
      • statics at the input level (pre-threading) are OK, e.g. Config cards (?)
      • Need code development/conversion to const in Globals, FieldMaps, MiceModules (used by digitizers), Squeak/Squeal, ErrorHandler
        • any other places ?
    • Questions:
      • how do we handle MapCppSimulation, since Geant4 4.9.6 is not multithreaded? (4.9.10 is, according to YK)
      • Do we want to do multiprocessing instead of multithreading?
      • Can we get a C++11-compatible gcc compiler installed on the GRID?
        • an alternative which was suggested by YK is to have this "fast processing" plugin used only for onrec/laptop jobs, and continue to use MAUS-as-is on the GRID. But I think that makes maintenance a potential problem. Do we really want our online reconstruction and batch reconstruction using different "frameworks"? Note: the analysis group expects to analyze data that comes out of the GRID.
    • ACTIONS
      • get gcc >= 4.7 installed on onrec03
      • Test Yordan's implementation with available CppMappers on onrec03
      • Convert JSON mappers to MAUS::Data
      • Move processing elements out of the Datastructure
      • Remove static calls and singletons from Globals
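
To make the "Input -> FIFO buffer -> workers" outline concrete, here is a minimal C++11 sketch of one possible reading of it. It is not Yordan's implementation: `Event`, `Config`, `FifoBuffer`, `map_event` and `run_workers` are hypothetical names invented for illustration, and none of them are MAUS classes. The configuration is filled before any thread starts and is only read through a const reference afterwards, which is the pattern the "statics pre-threading are OK, convert to const" requirement seems to be aiming at.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not MAUS types.
struct Event { int id; int value; };
struct Config { int scale; };  // filled before threading, read-only afterwards

// Thread-safe FIFO buffer: the input side pushes, the workers pop.
class FifoBuffer {
 public:
  void push(const Event& e) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(e);
    }
    ready_.notify_one();
  }

  // Tell the workers that no more input will arrive.
  void close() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      closed_ = true;
    }
    ready_.notify_all();
  }

  // Blocks until an event is available. Returns false once the buffer
  // has been closed and fully drained.
  bool pop(Event& out) {
    std::unique_lock<std::mutex> lock(mutex_);
    ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
    if (queue_.empty()) return false;
    out = queue_.front();
    queue_.pop();
    return true;
  }

 private:
  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<Event> queue_;
  bool closed_ = false;
};

// Hypothetical mapper: no statics, configuration passed in as const.
int map_event(const Event& e, const Config& cfg) { return e.value * cfg.scale; }

// Push n_events through n_workers threads; results are indexed by event id.
std::vector<int> run_workers(int n_events, int n_workers, const Config& cfg) {
  assert(n_events >= 0 && n_workers > 0);
  FifoBuffer buffer;
  std::vector<int> results(n_events, 0);
  std::vector<std::thread> workers;
  for (int w = 0; w < n_workers; ++w) {
    workers.emplace_back([&buffer, &results, &cfg] {
      Event e{0, 0};
      // Each event id is popped by exactly one worker, so every slot of
      // `results` has a single writer and needs no extra locking.
      while (buffer.pop(e)) results[e.id] = map_event(e, cfg);
    });
  }
  for (int i = 0; i < n_events; ++i) buffer.push(Event{i, i});  // input stage
  buffer.close();
  for (auto& t : workers) t.join();
  return results;
}
```

Calling `run_workers(100, 4, Config{3})` would process 100 dummy events on 4 threads. With gcc this needs at least `-std=c++11 -pthread`, which is why the default gcc 4.4.6 on SL6.4 is not sufficient.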
#3

Updated by Dobbs, Adam over 6 years ago

Broadly I think this sounds like a good plan. A few questions / comments:

  • C++11 multithreading - great idea
    • ... at mapper level - why not at spill level as at the moment? Will just fanning out each mapper to a different thread buy us that much, given the overhead of fanning out and collecting back in? Don't some mappers need to be consecutive too e.g. MapCppSimulation before subsequent reconstruction? Our data structure is already explicitly set up for parallel processing at the spill level.
  • GCC 4.7 - we could just bundle it with MAUS? I believe we used to do it this way in the past
  • Some of the tracker data structures do processing - Really? I deliberately tried to keep the data structure and reconstruction routines separate. Which classes specifically?
  • Separate libraries - this is a good idea anyway, our current monolithic library is rather ugly
  • Upgrading GEANT to 4.9.10 - fine by me
  • Multiprocessing instead of multithreading - could someone explain the difference?
  • How far has this implementation been taken in Yordan's branch?
  • If we decide to proceed we will need to consider how to divide up the labour
#4

Updated by Nebrensky, Henry over 6 years ago

Re. C++11 multithreading and Grid:

  • We need to be wary of just throwing multithreaded jobs at the Grid - they may get killed off by sites
  • It is possible to specify how many threads we want at submission time - sufficient CPU cores are then guaranteed at run time (else you may find that all threads run on the same core!) - a minor (hopefully) change in the Grid submission stuff
  • The Grid accounting knows about multithreaded jobs :(
  • With our present model, an 8-thread MICE job would then hold 8 cores completely unused whilst downloading and unpacking the data tarball. This will not make us popular. From memory our existing job efficiencies are already down in the 10-20% range... (CPU time/wall time)
  • Do we get useful speedup at (say) 4 threads? I've no idea what the typical cores/node is across the Grid any more, but demanding too many cores may constrain available resources
  • What is the effect on the total memory usage of the job?

If MICE wants to follow this up then we should include it in the upcoming Grid workshop.
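
On the point about requesting cores at submission time: a minimal sketch, under stated assumptions, of how a job could cap its worker count at what it was actually granted. `choose_worker_count` and both of its parameters are hypothetical; how the allocated core count reaches the job (environment variable, job option, etc.) depends on the Grid submission setup and is not specified here. Note that `std::thread::hardware_concurrency()` reports the machine's cores, not the job's allocation, so it is used only as a fallback.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Pick a worker count that never exceeds the cores granted to the job.
//   requested : number of threads asked for (hypothetical job option)
//   allocated : cores granted by the batch system; 0 means "unknown"
unsigned choose_worker_count(unsigned requested, unsigned allocated) {
  assert(requested > 0);
  if (allocated == 0) {
    // Fallback: ask the standard library. This may also return 0 when
    // the value cannot be determined.
    allocated = std::thread::hardware_concurrency();
  }
  if (allocated == 0) return 1;  // nothing known: stay single-threaded
  return std::min(requested, allocated);
}
```

For example, `choose_worker_count(8, 4)` gives 4, so an 8-thread request on a 4-core allocation would not oversubscribe the slot.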

#5

Updated by Rajaram, Durga over 6 years ago

Dobbs, Adam wrote:

Broadly I think this sounds like a good plan. A few questions / comments:

  • C++11 multithreading - great idea
    • ... at mapper level - why not at spill level as at the moment? Will just fanning out each mapper to a different thread buy us that much, given the overhead of fanning out and collecting back in? Don't some mappers need to be consecutive too e.g. MapCppSimulation before subsequent reconstruction? Our data structure is already explicitly set up for parallel processing at the spill level.

Yordan should comment further, but I think his implementation is much simpler if done at the mapper level so different threads/workers just run different mappers. The order can be controlled.

  • GCC 4.7 - we could just bundle it with MAUS? I believe we used to do it this way in the past

I guess that's an option. Will that be more reliable than saying the MAUS installation requires gcc >= 4.7, either natively or through a yum install? Providing it as a third party will probably still require checking that this gcc version can be installed on the OS, etc. Will bundling it make the third party install too bulky?

  • Upgrading GEANT to 4.9.10 - fine by me

I'm not sure a GEANT upgrade is necessary if we clean up the static calls.
Plus, I think a new GEANT version requires checking and validating that the physics processes are OK.
Chris?

  • Multiprocessing instead of multithreading - could someone explain the difference?

Multithreading just runs different threads under the same OS-level-process sharing the context, whereas in multiprocessing you're running multiple OS-level processes each with its own memory footprint and requiring that the multiple (ROOT) outputs be merged.
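
A small illustration of that difference, as a sketch only (POSIX `fork` is assumed, so this is for Linux/SL6-like systems; the function names are made up for the example). A thread modifies the parent's variable directly, while a forked child process only modifies its own copy, so anything it produces has to be handed back explicitly, e.g. via output files that are merged afterwards.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <thread>
#include <unistd.h>

// Multithreading: the thread shares the caller's memory, so the
// increment is visible after join().
int increment_in_thread() {
  int counter = 0;
  std::thread t([&counter] { counter += 1; });
  t.join();
  return counter;  // 1
}

// Multiprocessing: the child gets its own copy of `counter`; the
// parent's copy is untouched. Returns -1 if fork() fails.
int increment_in_child_process() {
  int counter = 0;
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    counter += 1;  // changes only the child's copy
    _exit(0);      // leave the child without running parent cleanup
  }
  int status = 0;
  waitpid(pid, &status, 0);
  return counter;  // 0 in the parent
}
```

The separate address space is what gives each process its own memory footprint, and it is also why the per-process (ROOT) outputs would need merging.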

  • How far has this implementation been taken in Yordan's branch?

He showed a working version on his laptop, with just the MapCppEMR mappers. But we need to test on SL6 (I guess onrec03) with the new gcc.

#6

Updated by Franchini, Paolo over 6 years ago

Hi,

I installed gcc 4.8 on miceonrec03:

[root@miceonrec03 ~]# /opt/rh/devtoolset-2/root/usr/bin/gcc --version
gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
[root@miceonrec03 ~]# /opt/rh/devtoolset-2/root/usr/bin/g++ --version
g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)

The 4.4 versions are still in /usr/bin/.
