Feature #1432
Geometry simulation speed up attempt
Added by Bayes, Ryan over 9 years ago. Updated over 8 years ago.
100%
Description
Loading the simulation of the MICE channel currently takes on the order of 20 minutes using the CDB geometry. This is felt to be due to inefficient handling of the tessellated solids by the MICE modules/G4 interface, combined with the number of tessellated solid elements that make up the geometry extracted from the CAD. Two tests have been done in this regard. The first checked the load time of the tessellated solids through a judicious use of output statements and found that the slowdown is due to the placement of the tessellated solids by GEANT. The second used a stand-alone simulation to check the load time independent of other factors in the MAUS framework; anecdotally (I don't know if I actually recorded the load time anywhere) this takes a fraction of the time.
I plan to test whether the load time may be reduced by using GDML for the definition of the geometry explicitly rather than extracting it from the MICE modules. This will take the form of a new class of the same form as the DetectorConstruction class under the common_cpp/Simulation directory. It looks to me that most of the bureaucracy will be maintained, but the ResetGeometry method will be replaced with a method that appeals to the GDML parser to define the geometry. The majority of the work will be extracting the sensitive detector definitions and visualization attributes from the GDML auxiliary maps in the detector files (the detector files on CDB were written to fulfil this requirement). I believe (again based purely on anecdote) that there are optimizations in the GDML parser that will provide some increase in loading speed for the simulation. Should this not be the case the other avenue is to test a ROOT based geometry definition.
Updated by Bayes, Ryan over 9 years ago
The additional complication is that the MAUS Sensitive Detector definitions will need to be redefined to extract information from the GDML files. This will make the job a little more complicated than described above.
For the purpose of testing the loading speed it is enough to define the geometry without sensitive detectors. If this approach has an appreciable effect, then the change to the sensitive detector definitions is justified.
Updated by Bayes, Ryan over 9 years ago
The approach described above does not work because there is no evidence that the geometry actually loads into geant4. The simulation and processing of spills takes no time at all and the visualization shows no beam line elements. I am continuing to investigate the explicit use of GDML in the MAUS simulation through a more direct emulation of the examples given in "${GEANT4}/examples/extended/persistency/gdml", which means loading the GDML world directly into the initialization of the DetectorConstruction. This will mean a loss of flexibility in the physical geometry as it will no longer be set with the "SetMiceModules" subroutines. The fields will still be controlled from the MICE modules, however.
I ran the simulation with 10 spills to provide a characteristic loading time of the geometry. In posix format the time to run the current MICE module conversion of the Step IV CAD geometry on my desktop system is
real 4896.36 s (81.606 minutes)
user 2683.60 s (44.73 minutes)
sys 195.31 s (3.26 minutes)
This is fine for long simulations, but it is a pain for debugging purposes.
The loading time is proportional to the number of tessellated solid objects placed, and the number of facets of the placed objects. The Step I geometry takes much less time to load than the Step IV geometry --- in the same time format as above the simulation time for 10 spills on the same system is:
real 1026.35 s (17.101 minutes)
user 998.51 s (16.64 minutes)
sys 2.03 s
The difference is the composition of the solenoids and focus coil as well as the diffuser.
Updated by Bayes, Ryan over 9 years ago
Two additional approaches to the long geometry loading time issue have been implemented and tested.
The first is to produce a geometry that includes only the active detector volumes and magnetic fields but does not include any of the CAD generated beam line volumes. Such a geometry is useful for debugging purposes, but not for any physics analyses. This reduces the loading time to less than that of the legacy geometry. The debugging geometry of this type will be made available on CDB as an idealized geometry.
The second is to upgrade the version of GEANT4 to the latest release, 4.10.00 from 4.9.6. This was done relatively quickly, requiring updates to the build scripts, and additional use declarations for CLHEP units. I generated 10 spills on my desktop Linux system with 1 particle per spill after loading a preliminary StepIV geometry to produce the following simulation time (again in posix format):
real 1696.65 s (28 min 16 s)
user 1680.95 s (28 min)
sys 1.23 s
Because the beam line is not exactly the same as the previous tests, I re-ran the simulation with 4.9.6 using the same beam and geometry. The time to compare to the previous simulation is;
real 1722.38 s (28 min 42 s)
user 1680.20 s (28 min)
sys 3.30 s
The difference in time between the two different simulations is negligible. The user time here is significantly less than that of the StepIV simulation described in the previous post, possibly due to a combination of advances in the geometry description itself --- this test used geometry id 23 as opposed to geometry id 16 for the old test; some bugs in the CAD description have been identified and removed --- and a change in the simulation conditions for this test including spill size (now 1) and starting position (now just before TOF1). In spite of the upgrade not making a significant change to the load time, g4.10.00 may be important for other reasons (e.g. physics lists or multi-threading). I have committed my changes to lp:~ryan-bayes/maus/mausg4_10.
Updated by Bayes, Ryan over 9 years ago
A first run at loading the GDML files directly into the MAUS simulation is now complete enough to make a statement about the potential improvement in the loading time. I have introduced the G4GDMLparser into the MAUSGeant4Manager and used it to set the world volume of the geometry with the Step_IV.gdml file. The sensitive detectors are set using the GDML auxiliary elements. A mice module is written internally to communicate this information to the pre-existing class member functions used to build the detector. I am still using the MICE modules to define the magnetic fields.
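As an illustration of the auxiliary-element approach, here is a minimal Python sketch of pulling sensitive detector tags out of a GDML file. GDML volumes may carry `<auxiliary auxtype="..." auxvalue="..."/>` children; the volume names, the "SensDet" and "Invisible" tag names, and their values below are assumptions made for the example and are not taken from the MICE detector files. The actual implementation uses the G4GDMLparser in C++, not this code.

```python
import xml.etree.ElementTree as ET

# Hypothetical GDML fragment: names and auxiliary values are illustrative only.
GDML_SNIPPET = """
<gdml>
  <structure>
    <volume name="TOF1_slab">
      <materialref ref="POLYSTYRENE"/>
      <solidref ref="slab_box"/>
      <auxiliary auxtype="SensDet" auxvalue="TOF"/>
      <auxiliary auxtype="Invisible" auxvalue="0"/>
    </volume>
    <volume name="beam_pipe">
      <materialref ref="STEEL"/>
      <solidref ref="pipe_tube"/>
    </volume>
  </structure>
</gdml>
"""


def auxiliary_map(gdml_text):
    """Return {volume name: {auxtype: auxvalue}} for volumes that carry tags."""
    root = ET.fromstring(gdml_text)
    result = {}
    for volume in root.iter("volume"):
        tags = {a.get("auxtype"): a.get("auxvalue")
                for a in volume.findall("auxiliary")}
        if tags:
            result[volume.get("name")] = tags
    return result


aux = auxiliary_map(GDML_SNIPPET)
# Keep only the volumes flagged as sensitive detectors.
sensitive = {name: tags["SensDet"] for name, tags in aux.items()
             if "SensDet" in tags}
print(sensitive)
```

Volumes without auxiliary tags (the beam pipe here) are simply skipped, which mirrors the idea that only the detector files need to carry this information.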
I used the same configuration as the above test of geant4.10.00 to benchmark the simulation. The simulation took the following time (in posix format) to generate 10 spills on my desktop linux system;
real 7min 51.915s
user 4min 59.612s
sys 0min 1.637s
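To put these numbers side by side with the earlier benchmark, the following small Python sketch converts both time formats to seconds and takes the ratio. It assumes that the relevant baseline is the geant4.10.00 run quoted above (real 1696.65 s, user 1680.95 s); the helper name `posix_to_seconds` is made up for the example.

```python
import re


def posix_to_seconds(text):
    """Convert strings such as '7min 51.915s' or '1696.65 s' to seconds."""
    match = re.fullmatch(r"\s*(?:(\d+)\s*min)?\s*([\d.]+)\s*s?\s*", text)
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    minutes = int(match.group(1) or 0)
    return 60.0 * minutes + float(match.group(2))


# Baseline: MICE module geometry with geant4.10.00 (assumed comparison point).
mice_modules = {"real": "1696.65 s", "user": "1680.95 s"}
# Direct GDML loading, as quoted in this update.
gdml_parser = {"real": "7min 51.915s", "user": "4min 59.612s"}

speedup = {
    key: posix_to_seconds(mice_modules[key]) / posix_to_seconds(gdml_parser[key])
    for key in ("real", "user")
}
for key, value in speedup.items():
    print(f"{key}: {value:.2f}x faster")
```

Under that assumption the GDML run is roughly 3.6 times faster in wall-clock time and roughly 5.6 times faster in user time.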
Assuming all other things are equal, I think this reduction in run time means that direct GDML loading will be the way forward. There are some major flies in the ointment, however:
1. Only the TOFs work in this mode without complaint. This is because they do not demand any internal structure definition. Tests with the Trackers indicate a problem with the SciFiPlane definition that I have not been able to work out yet. I have not investigated the issues with the other detector definitions.
2. When I ran the simulation I received an error claiming that there is a "Track stuck" for what I gather is every charged particle in the simulation. I assume that it is a step definition problem that I missed which is in the Detector Geometry code that I am now not implementing because I have not used the MICE modules to define the geometry.
That being said, the simulation did run to completion with all of the detectors except the TOFs commented out of the source geometry. I was able to view the WRL output from one simulation ... the image is attached to this issue.
A lesser problem (more due to laziness than anything) is that I need a new configuration variable to dictate the GDML geometry in parallel with the simulation_geometry_file which I want to maintain for the field information. So far I have used a hard coded path to define the GDML file but that is outliving its usefulness now that this approach to the geometry is starting to look like a viable option. I have committed the changes that I have so far to lp:~ryan-bayes/maus/mausg4_10 should anyone be interested.
Updated by Bayes, Ryan over 9 years ago
I have found some solutions to the problems in the GDMLparser implementation. The active detectors can be included in the GDML files after a few minor reformulations. I now have GDML files for the TOF, EMR, KL, and Tracker that can be used with the GDMLparser. I have uploaded these files to CDB as the code to use them properly is not available past my branch available at lp:~ryan-bayes/maus/mausg4_10 and I am not sure how these files will behave with respect to the GDML to MICE Module conversion. So far I am sure that these files will produce TOF digits; I have not yet fully tested the KL or the EMR, and I have not yet observed tracks from the Tracker simulation generated with these files (a serious debug of the GDML files is required). After implementing these files the "Track stuck" error seems to have disappeared completely.
I have settled on a scheme to specify the GDML files for the parser. The file name is given in a stripped down version of the ParentGeometry file. This file still contains the beam line and cooling channel information. I still have to find a way to introduce the diffuser information into the GDML files, but this should be a simple matter of adding the data recorded in the MAUS_Information files to the base GDML files.
Updated by Bayes, Ryan over 9 years ago
Managed to reduce the loading time of the geometry to 350 s user time or 56 s real time using the GDML parser. The difference is likely due to time spent in reading in the files to the parser. The loading time for the legacy geometry is approximately 2 minutes 50 s, or 170 s for both the user time and the real time. Thus the use of the GDML parser with the CAD geometry meets the load time goals of twice the legacy geometry.
There is some problem with the tracking, however. Tracking for particles in the geometry as it exists in the CDB (id35) loaded using the GDML parser stops at the diffuser. This is because of a series of overlaps in the objects created by the CAD extraction. The parser tries to minimize the overlap, which seems to result in a subset of the objects becoming displaced from their assigned positions. I think this is to minimize the number of overlapping planes. I am not sure why this feature (bug) is there but it can be alleviated by removing one of the overlapping objects --- not a great solution for large objects, but for small objects like screws etc. it is reasonable. This problem is instigated by rendering different materials in the geometry separately, resulting in facets of nested surfaces not matching up properly. If the FastRAD analysis could be run in one go while keeping the material information intact, this would not be a problem. As it is, the step files do not communicate the materials to FastRAD, necessitating the current approach of generating subgeometries by material, which allows the overlap problem. This is not a problem for the generation of the geometry using the MICE modules --- it does not seem to mind the overlapping volumes and probably does not detect them as such because of the nature of tessellated solids.
I have worked through creating a version of the geometry that has removed these overlap volumes and I have added it to lp:~ryan-bayes/maus/mausg4_10 under files/geometry/default/Step_IV_id35/gdml. I have not added it to CDB because it is only pertinent for the GDML parser based geometry.
Updated by Bayes, Ryan almost 9 years ago
The geometry overlap problem has been solved using a new definition of the CAD geometry. This new CAD introduced some new problems.
1. FastRad generates a single file for each material. Not to get too far into the details, but this is a problem for the MICE module file translation. I have not yet addressed this problem, but it is something that will affect how we proceed with the geometry definition. The change may also motivate us (me) to write a CAD translation program into MAUS to prevent changes like this appearing in the future.
2. All of the output geometry files occupy the entire hall volume, which is not a format that is viable for reading into MAUS via a GDMLParser invocation. I have since written a Python script that renders these files into a set of geometry files by position that can be read into the GDMLParser. The resulting files load with (almost) no problems.
Unrelated to the definition of the CAD, there is now a problem with the reconstruction of hits in the TOFs and the Trackers. No hits appear to be generated in the simulation files. I still don't know why this may be... the sensitive detectors have been created in the simulation. I think that there may be a disconnect between the reconstruction algorithms of the detectors and the definition of the geometry configuration. I think that the reconstruction all refers to the MICE modules (what else?) so I am exploring ways to make sure that this information is available. If this is the problem, then the choices are either writing the GDML information to MiceModules on the fly (which is mostly done, but I have doubts regarding its persistency) or getting the reconstruction to read from the GDML (which is probably more work).
In spite of this problem, the running time under this scheme is consistent with the legacy running time (so far), and tests of energy loss and rates using virtual planes produce reasonable results.
Updated by Bayes, Ryan almost 9 years ago
I have figured out why I have not been getting hits in the SciFi planes. The reason is that the methods that I have been using to define sensitive detectors only affect the indicated logical volume and not any of the daughter logical volumes. This was not a problem for the MICE Module implementation because in most cases daughter volumes of sensitive detectors are defined after the detector has been made sensitive, so the daughter volumes inherit that quality. My solution is to recursively define the daughter volumes of a sensitive detector to be sensitive themselves. By contrast, the TOFs were producing hits in my original (non-recursive, top level only) implementation.
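The difference between the top-level-only and recursive assignments can be sketched with a toy volume tree in Python. This is not the Geant4 or MAUS API: the `LogicalVolume` class, the "SciFiSD" label, and the plane/layer/fibre hierarchy are all hypothetical stand-ins used only to show the traversal.

```python
class LogicalVolume:
    """Toy stand-in for a Geant4 logical volume (not the Geant4 API)."""

    def __init__(self, name, daughters=()):
        self.name = name
        self.daughters = list(daughters)
        self.sensitive_detector = None


def set_sensitive_top_level(volume, detector):
    """Original behaviour: only the indicated volume becomes sensitive."""
    volume.sensitive_detector = detector


def set_sensitive_recursive(volume, detector):
    """Fix: the volume and every daughter below it become sensitive."""
    volume.sensitive_detector = detector
    for daughter in volume.daughters:
        set_sensitive_recursive(daughter, detector)


def sensitive_names(volume):
    """Collect the names of all sensitive volumes in the tree."""
    names = [volume.name] if volume.sensitive_detector else []
    for daughter in volume.daughters:
        names.extend(sensitive_names(daughter))
    return names


def build_plane():
    # Hypothetical hierarchy: a plane holding one layer of three fibres.
    fibres = [LogicalVolume(f"fibre_{i}") for i in range(3)]
    layer = LogicalVolume("doublet_layer", fibres)
    return LogicalVolume("SciFiPlane", [layer])


top_only = build_plane()
set_sensitive_top_level(top_only, "SciFiSD")
recursive = build_plane()
set_sensitive_recursive(recursive, "SciFiSD")

print(sensitive_names(top_only))
print(sensitive_names(recursive))
```

With the top-level call only the plane itself is flagged, so energy deposited in the fibres would produce no hits; the recursive call flags every volume in the tree. A detector without internal structure, like the TOF slabs described earlier, behaves the same under either call.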
I am still looking for bugs incurred by switching to the GDML implementation. So far I have found (and corrected) a bug in the file translation that did not include the EMR in the ParentGeometryFile (which is a problem for running the geometry in debug mode, and may be a problem for reconstruction though I haven't checked yet), some problems with the KL implementation such that it did not recognize the KL Fibres (which is now fixed), and a problem with the logic for setting the diffuser (which I really needed to double check anyway to make sure that Pierrick's numbering of the diffuser planes matches mine). More involved debugging will be required to figure out why the Trackers are producing hits and digits, but no tracks. The likely culprit is a plane rotation that is required by the reconstruction that I had to change for the GDML files. I will investigate further.
Updated by Bayes, Ryan over 8 years ago
- Status changed from Open to Closed
- % Done changed from 0 to 100
The GDML parser implementation is now in the merge branch. I added a short test that extracts a working Step IV geometry from a set of files in the "tests" directory and places it in the "tmp" directory and runs a simple test of the simulation with the new geometry. New geometries will be produced with a similar procedure.