Feature #1376

JSON Conversion Overhead

Added by Dobbs, Adam almost 10 years ago. Updated over 8 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Python API
Target version:
Start date: 20 November 2013
Due date:
% Done: 100%
Estimated time: 87.00 h
Workflow: New Issue

Description

This is currently being worked on by Ian, but I thought it would be good to record it here anyway. I have been using callgrind to profile MAUS, running MC with only the tracker detectors in use, to see which parts of the code cause the bottlenecks. So far, the clear answer seems to be GEANT4 and the JSON conversions. The attached is a callgrind output file taken over a 2-spill run, with 100 events per spill (associated datacard also attached). KCacheGrind is a good way to view the output. I think this shows that the JSON conversion between mappers should be removed before we need to do large-scale data reconstruction or online running for Step IV.
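
To illustrate the kind of overhead meant here, the following is a minimal Python sketch, not MAUS code. It compares a chain of mappers that serialises to a JSON string between every step with a chain that passes the same in-memory object along. The spill layout, the `mapper` function and the number of mappers are all made up for illustration; real numbers should come from the callgrind files attached to this issue.

```python
import json
import timeit


def make_spill(n_events=100):
    """Build a toy spill; the field names are invented for this sketch."""
    return {"spill_number": 1,
            "events": [{"id": i, "hits": [float(j) for j in range(50)]}
                       for i in range(n_events)]}


def mapper(spill):
    """Trivial stand-in for a map module: record the number of events."""
    spill["n_events"] = len(spill["events"])
    return spill


def run_with_json(spill, n_mappers=5):
    """Serialise to a string after every mapper and parse it before the next."""
    text = json.dumps(spill)
    for _ in range(n_mappers):
        text = json.dumps(mapper(json.loads(text)))
    return json.loads(text)


def run_in_memory(spill, n_mappers=5):
    """Pass the same object from mapper to mapper with no conversion."""
    for _ in range(n_mappers):
        spill = mapper(spill)
    return spill


if __name__ == "__main__":
    t_json = timeit.timeit(lambda: run_with_json(make_spill()), number=20)
    t_mem = timeit.timeit(lambda: run_in_memory(make_spill()), number=20)
    print("with JSON round trips: %.3f s, in memory: %.3f s" % (t_json, t_mem))
```

Both chains produce the same result; only the time spent converting differs.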


Files

callgrind.out.20683 (10.7 MB) Dobbs, Adam, 20 November 2013 12:20
datacard_mc_helical (9.1 KB) Dobbs, Adam, 20 November 2013 12:20
callgrind.out.16026 (11 MB) Dobbs, Adam, 21 November 2013 23:14
datacard_mc_helical (9.18 KB) Dobbs, Adam, 21 November 2013 23:14
itaylor-map.tar.gz (17.5 MB) Taylor, Ian, 17 April 2014 13:58
valgrind.log (1.29 MB) Rogers, Chris, 02 May 2014 09:33
valgrind_reduced.log (7.78 KB) Rogers, Chris, 02 May 2014 09:33
valgrind.log (1.2 MB) Rogers, Chris, 29 May 2014 12:32

Related issues

Related to MAUS - Bug #1466: Memory problem in datastructure? (Closed, Rogers, Chris, 16 May 2014)
Related to MAUS - Bug #1483: Celery fails to import MapCpp modules (Closed, Rogers, Chris, 05 June 2014)

#1

Updated by Dobbs, Adam almost 10 years ago

Apologies, see the attached files instead (including tracker recon turned on). Same points hold as before.

#2

Updated by Rogers, Chris over 9 years ago

  • Assignee changed from Rajaram, Durga to Rogers, Chris

Ian, feel free to comment obviously.

I have unilaterally decided to take over this issue, as Ian has not made much progress for a while and I need it for my online event GUI. So, looking at lp:~i-taylor/maus/map-base

There is
  • WrapMapBase: Python wrapper code that wraps the C++ modules, provides Python interfaces and untangles any data type conflicts at the PyObject* level. WrapMapBase makes a C-style class (i.e. malloc functions etc.) rather than using the Python API to generate a Python class
  • MapCppTemplate.py: takes the C-style class and wraps it in a Python class at the Python level
  • Some build script work that does a find and replace in MapCppTemplate.py to build MapCpp<MyModule>.py
  • MapCppExample...: boilerplate C++ modules (one takes a Json representation, one takes a ROOT representation)
  • MapCppExample...Module: a bit more Python wrapper stuff
  • data_to_string: Python/C API interface to DataConverters
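
The find-and-replace step in the list above could look roughly like the sketch below. This is only an illustration of the technique: the template text, the `process` method and the `build_wrapper_source` helper are hypothetical and are not taken from the real MapCppTemplate.py or the MAUS build scripts.

```python
# Hypothetical template text; the real MapCppTemplate.py is not reproduced here.
TEMPLATE = '''\
class MapCppTemplate(object):
    """Python-level wrapper around the C-style class for MapCppTemplate."""

    def __init__(self, c_module):
        self._c_module = c_module  # object exposing the C-style functions

    def process(self, data):
        return self._c_module.process(data)
'''


def build_wrapper_source(module_name, template=TEMPLATE,
                         placeholder="MapCppTemplate"):
    """Return wrapper source with every placeholder replaced by module_name."""
    if not module_name.isidentifier():
        raise ValueError("not a valid class name: %r" % module_name)
    return template.replace(placeholder, module_name)


# Generate the source for one module; a build script would write it to a file.
source = build_wrapper_source("MapCppExampleMAUSDataInput")
```

A plain string replace is the simplest option; it relies on the placeholder not appearing anywhere in the template by accident.
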
So my jobs are:
  • Maps - a little tidying around the edges here:
    • MapCppExample...Module code should be pulled into WrapMapBase - there is not much module-specific code here (Done, 4 hours)
    • I would probably like to call the Python/C API directly to generate the classes, rather than making a Python wrapper for a C-style class. It just seems a little neater. This should also go in WrapMapBase. 8 hours

At this point I would like to do a merge with the trunk. This solves the main issues with reconstruction processing times that developers are screaming about.

  • Extend the mapper example to include python wrappers for Input, Reduce, Output; make sure that it is compatible with the framework code
  • Input:-
    • Need to be able to make a yield call in C; currently I don't know how to do that
    • The critical piece of code is:
                      value = next(input_emitter)
                      my_buffer.append(value.encode('ascii'))
      
      We need to do the cpp->json conversion here, presumably using data_to_string.
    • Guess Input is 32 hours work
  • Reduce:-
    • This is just a case of reproducing the map stuff. It should be a pretty much direct copy. 16 hours work
  • Output:-
    • Again, reproducing the map stuff; slightly different API, so a bit more fiddling. 24 hours work
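
For the Input item above, the pattern around the critical two lines can be sketched in pure Python as follows. This is an assumption-laden illustration only: `emitter` is a toy generator, `fill_buffer` is a hypothetical helper, and `data_to_string` here is just `json.dumps` standing in for the real Python/C API converter, whose signature is not shown in this thread.

```python
import json


def data_to_string(event):
    """Stand-in for the data_to_string converter; here simply json.dumps."""
    return json.dumps(event)


def emitter(n_spills):
    """Toy inputter: yields one in-memory event per call to next()."""
    for spill_number in range(n_spills):
        # Key names are illustrative only.
        yield {"maus_event_type": "Spill", "spill_number": spill_number}


def fill_buffer(input_emitter, max_items):
    """Pull up to max_items events, converting each to an ASCII byte string."""
    my_buffer = []
    for _ in range(max_items):
        try:
            value = next(input_emitter)
        except StopIteration:
            break  # the inputter has run out of data
        # The conversion from in-memory data to a string happens here.
        my_buffer.append(data_to_string(value).encode('ascii'))
    return my_buffer


buffered = fill_buffer(emitter(3), max_items=10)
```

On the C side, the equivalent of `yield` would have to be provided some other way, for example by an object implementing the iterator protocol; that part is not covered by this sketch.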

Note that I do have a network socket and document DB that can handle json or ROOT data types; but I am dealing with that in the online issue tracker #1312 and it will follow on from this work.

Total estimate: 84 hours, at 30% - so estimated completion date is June time.

#3

Updated by Rogers, Chris over 9 years ago

For information, the change summary between trunk and taylor branch is:

N  src/common_cpp/API/Functions.hh                           
+N  src/common_cpp/API/WrapMapBase.hh
+N  src/map/MapCppExampleJSONValueInput/
+N  src/map/MapCppExampleJSONValueInput/MapCppExampleJSONValueInput.cc
+N  src/map/MapCppExampleJSONValueInput/MapCppExampleJSONValueInput.hh
+N  src/map/MapCppExampleJSONValueInput/MapCppExampleJSONValueInputModule.cc
+N  src/map/MapCppExampleJSONValueInput/sconscript
+N  src/map/MapCppExampleMAUSDataInput/
+N  src/map/MapCppExampleMAUSDataInput/MapCppExampleMAUSDataInput.cc
+N  src/map/MapCppExampleMAUSDataInput/MapCppExampleMAUSDataInput.hh
+N  src/map/MapCppExampleMAUSDataInput/MapCppExampleMAUSDataInputModule.cc
+N  src/map/MapCppExampleMAUSDataInput/sconscript
+N  src/map/MapCppGlobalRecon/
+N  src/map/MapCppGlobalRecon/MapCppGlobalRecon.cc.OTHER
+N  src/map/MapCppGlobalRecon/MapCppGlobalRecon.hh.OTHER
+N  src/map/MapCppGlobalRecon/MapCppGlobalReconModule.cc
+N  src/map/Templates/
+N  src/map/Templates/MapCppTemplate.py
+N  src/py_cpp/PyDataToString.cc
+N  src/py_cpp/PyDataToString.hh
 M  src/common_cpp/API/MapBase-inl.hh
 M  src/common_cpp/API/MapBase.hh
 M  src/common_cpp/Converter/ConverterFactory-inl.hh
 M  src/common_cpp/DataStructure/Spill.hh
 M  src/common_py/maus_build_tools/module_builder.py
 M  src/map/MapCppPrint/MapCppPrint.cc
 M  src/map/MapPyGroup/MapPyGroup.py
 M  tests/cpp_unit/API/MapBaseTest.cc
 M  tests/style/cpplint_exceptions.py
Text conflict in src/common_cpp/API/MapBase-inl.hh
Conflict adding files to src/map/MapCppGlobalRecon.  Created directory.
Conflict because src/map/MapCppGlobalRecon is not versioned, but has versioned children.  Versioned directory.
Contents conflict in src/map/MapCppGlobalRecon/MapCppGlobalRecon.cc
Contents conflict in src/map/MapCppGlobalRecon/MapCppGlobalRecon.hh
Text conflict in tests/style/cpplint_exceptions.py
6 conflicts encountered.                                    

I have a rogers dev branch bzr+ssh://bazaar.launchpad.net/~chris-rogers/maus/api/

#4

Updated by Taylor, Ian over 9 years ago

As I mentioned to Chris the other day, the branch in the repository is not my local working branch; there are some major changes between the two. I have attached a tar file, which contains the current state of the art (I've stripped out some unnecessary directories, to make the file size more manageable).

Having said that, it sounds like Chris's work plan is very reasonable, and that he probably has a better handle on some aspects of the problem than I do. As such, maybe the best idea is to work with the version and plan he already has, and just dip into the latest version if he finds something in the committed code that is clearly broken.

#5

Updated by Rogers, Chris over 9 years ago

I made this into a bzr branch so I could do a diff - bzr+ssh://bazaar.launchpad.net/~chris-rogers/maus/taylor_api_update/ but it would be easier if Ian could just commit his code! I don't care if it doesn't compile, I just want to see it in an unmangled format...

#6

Updated by Taylor, Ian over 9 years ago

Fine, it ain't pretty, but it is in the same location as the old stuff. Anything that has been removed is probably in the dont_compile directory, and it is there because it won't compile until it has been updated... There are a couple of files that have been genuinely removed, mostly because I put the code elsewhere.

#7

Updated by Rogers, Chris over 9 years ago

bzr diff piped through grep -n +++ gives the following list of changes:

49:+++ dont_compile/APIExceptionsTest.cc    2014-04-17 15:40:12 +0000
117:+++ dont_compile/InputBaseTest.cc    2014-04-17 15:40:12 +0000
288:+++ dont_compile/InputCppDAQData/build/InputCppDAQData.py    2014-04-17 15:40:12 +0000
390:+++ dont_compile/InputCppDAQData/build/InputCppDAQData_wrap.cc    2014-04-17 15:40:12 +0000
4622:+++ dont_compile/InputCppDAQOfflineData/InputCppDAQOfflineData.cc    2014-04-17 15:40:12 +0000
4744:+++ dont_compile/InputCppDAQOfflineData/InputCppDAQOfflineData.hh    2014-04-17 15:40:12 +0000
4840:+++ dont_compile/InputCppDAQOfflineData/InputCppDAQOfflineData.i    2014-04-17 15:40:12 +0000
4865:+++ dont_compile/InputCppDAQOfflineData/build/InputCppDAQOfflineData.py    2014-04-17 15:40:12 +0000
4986:+++ dont_compile/InputCppDAQOfflineData/build/InputCppDAQOfflineData_wrap.cc    2014-04-17 15:40:12 +0000
9435:+++ dont_compile/InputCppDAQOfflineData/sconscript    2014-04-17 15:40:12 +0000
9455:+++ dont_compile/InputCppDAQOfflineData/test_InputCppDAQOfflineData.py    2014-04-17 15:40:12 +0000
9557:+++ dont_compile/InputCppDAQOnlineData/InputCppDAQOnlineData.cc    2014-04-17 15:40:12 +0000
9645:+++ dont_compile/InputCppDAQOnlineData/InputCppDAQOnlineData.hh    2014-04-17 15:40:12 +0000
9725:+++ dont_compile/InputCppDAQOnlineData/InputCppDAQOnlineData.i    2014-04-17 15:40:12 +0000
9750:+++ dont_compile/InputCppDAQOnlineData/sconscript    2014-04-17 15:40:12 +0000
9788:+++ dont_compile/InputCppDAQOnlineData/test_InputCppDAQOnlineData_old    2014-04-17 15:40:12 +0000
9885:+++ dont_compile/MAUSExceptionTest.cc    2014-04-17 15:40:12 +0000
9939:+++ dont_compile/MapBaseTest.cc    2014-04-17 15:40:12 +0000
10154:+++ dont_compile/MapCppExampleJSONValueInput/build/MapCppExampleJSONValueInput.py    2014-04-17 15:40:12 +0000
10225:+++ dont_compile/MapCppGlobalRecon/MapCppGlobalReconModule.cc.OTHER    2014-04-17 15:40:12 +0000
10305:+++ dont_compile/MapCppGlobalRecon/build/MapCppGlobalRecon.py    2014-04-17 15:40:12 +0000
10375:+++ dont_compile/MapCppGlobalRecon/sconscript    2014-04-17 15:40:12 +0000
10390:+++ dont_compile/MapCppGlobalReconPrint/build/MapCppGlobalReconPrint.py    2014-04-17 15:40:12 +0000
10460:+++ dont_compile/MapCppGlobalReconPrint/build/MapCppGlobalReconPrint_wrap.cc    2014-04-17 15:40:12 +0000
14519:+++ dont_compile/MapCppGlobalReconPrint/sconscript    2014-04-17 15:40:12 +0000
14534:+++ dont_compile/MapCppKLCellHits/build/MapCppKLCellHits.py    2014-04-17 15:40:12 +0000
14629:+++ dont_compile/MapCppKLCellHits/build/MapCppKLCellHits_wrap.cc    2014-04-17 15:40:12 +0000
18690:+++ dont_compile/MapCppKLDigits/build/MapCppKLDigits.py    2014-04-17 15:40:12 +0000
18785:+++ dont_compile/MapCppKLDigits/build/MapCppKLDigits_wrap.cc    2014-04-17 15:40:12 +0000
22846:+++ dont_compile/MapCppPrint/build/MapCppPrint.py    2014-04-17 15:40:12 +0000
22941:+++ dont_compile/MapCppPrint/build/MapCppPrint_wrap.cc    2014-04-17 15:40:12 +0000
27001:+++ dont_compile/MapCppSimulation/build/MapCppSimulation.py    2014-04-17 15:40:12 +0000
27096:+++ dont_compile/MapCppSimulation/build/MapCppSimulation_wrap.cc    2014-04-17 15:40:12 +0000
31157:+++ dont_compile/MapCppTOFDigits/build/MapCppTOFDigits.py    2014-04-17 15:40:12 +0000
31252:+++ dont_compile/MapCppTOFDigits/build/MapCppTOFDigits_wrap.cc    2014-04-17 15:40:12 +0000
35313:+++ dont_compile/MapCppTOFMCDigitizer/build/MapCppTOFMCDigitizer.py    2014-04-17 15:40:12 +0000
35414:+++ dont_compile/MapCppTOFMCDigitizer/build/MapCppTOFMCDigitizer_wrap.cc    2014-04-17 15:40:12 +0000
39886:+++ dont_compile/MapCppTOFSlabHits/build/MapCppTOFSlabHits.py    2014-04-17 15:40:12 +0000
39981:+++ dont_compile/MapCppTOFSlabHits/build/MapCppTOFSlabHits_wrap.cc    2014-04-17 15:40:12 +0000
44042:+++ dont_compile/MapCppTOFSpacePoints/build/MapCppTOFSpacePoints.py    2014-04-17 15:40:12 +0000
44137:+++ dont_compile/MapCppTOFSpacePoints/build/MapCppTOFSpacePoints_wrap.cc    2014-04-17 15:40:12 +0000
48198:+++ dont_compile/MapCppTrackerDigits/build/MapCppTrackerDigits.py    2014-04-17 15:40:12 +0000
48295:+++ dont_compile/MapCppTrackerDigits/build/MapCppTrackerDigits_wrap.cc    2014-04-17 15:40:12 +0000
52425:+++ dont_compile/MapCppTrackerMCDigitization/build/MapCppTrackerMCDigitization.py    2014-04-17 15:40:12 +0000
52527:+++ dont_compile/MapCppTrackerMCDigitization/build/MapCppTrackerMCDigitization_wrap.cc    2014-04-17 15:40:12 +0000
57028:+++ dont_compile/MapCppTrackerMCNoise/build/MapCppTrackerMCNoise.py    2014-04-17 15:40:12 +0000
57126:+++ dont_compile/MapCppTrackerMCNoise/build/MapCppTrackerMCNoise_wrap.cc    2014-04-17 15:40:12 +0000
61293:+++ dont_compile/MapCppTrackerRecon/build/MapCppTrackerRecon.py    2014-04-17 15:40:12 +0000
61395:+++ dont_compile/MapCppTrackerRecon/build/MapCppTrackerRecon_wrap.cc    2014-04-17 15:40:12 +0000
65849:+++ dont_compile/ModuleBaseTest.cc    2014-04-17 15:40:12 +0000
66064:+++ dont_compile/OutputBaseTest.cc    2014-04-17 15:40:12 +0000
66243:+++ dont_compile/ReduceBaseTest.cc    2014-04-17 15:40:12 +0000
66432:+++ dont_compile/ReduceCppPatternRecognition/ReduceCppPatternRecognition.cc    2014-04-17 15:40:12 +0000
66544:+++ dont_compile/ReduceCppPatternRecognition/ReduceCppPatternRecognition.hh    2014-04-17 15:40:12 +0000
66627:+++ dont_compile/ReduceCppPatternRecognition/ReduceCppPatternRecognition.i    2014-04-17 15:40:12 +0000
66642:+++ dont_compile/ReduceCppPatternRecognition/build/ReduceCppPatternRecognition.py    2014-04-17 15:40:12 +0000
66739:+++ dont_compile/ReduceCppPatternRecognition/build/ReduceCppPatternRecognition_wrap.cc    2014-04-17 15:40:12 +0000
70860:+++ dont_compile/ReduceCppPatternRecognition/h_spills.json    2014-04-17 15:40:12 +0000
70875:+++ dont_compile/ReduceCppPatternRecognition/s_spills.json    2014-04-17 15:40:12 +0000
70890:+++ dont_compile/ReduceCppPatternRecognition/sconscript    2014-04-17 15:40:12 +0000
70903:+++ dont_compile/ReduceCppPatternRecognition/test_ReduceCppPatternRecognition.py    2014-04-17 15:40:12 +0000
71012:+++ dont_compile/ReduceCppTofCalib/ReduceCppTofCalib.cc    2014-04-17 15:40:12 +0000
71262:+++ dont_compile/ReduceCppTofCalib/ReduceCppTofCalib.hh    2014-04-17 15:40:12 +0000
71393:+++ dont_compile/ReduceCppTofCalib/ReduceCppTofCalib.i    2014-04-17 15:40:12 +0000
71408:+++ dont_compile/ReduceCppTofCalib/build/ReduceCppTofCalib.py    2014-04-17 15:40:12 +0000
71504:+++ dont_compile/ReduceCppTofCalib/build/ReduceCppTofCalib_wrap.cc    2014-04-17 15:40:12 +0000
75591:+++ dont_compile/ReduceCppTofCalib/noDataTest.txt    2014-04-17 15:40:12 +0000
75602:+++ dont_compile/ReduceCppTofCalib/processTest.txt    2014-04-17 15:40:12 +0000
75608:+++ dont_compile/ReduceCppTofCalib/sconscript    2014-04-17 15:40:12 +0000
75621:+++ dont_compile/ReduceCppTofCalib/test_ReduceCppTofCalib.py    2014-04-17 15:40:12 +0000
75705:+++ dont_compile/example_load_json_file.json    2014-04-17 15:40:12 +0000
75737:+++ dont_compile_tests/MapCppSimulationTest.cc    2014-04-17 15:40:12 +0000
75771:+++ itaylor-files/datacard_sim    2014-04-17 15:40:12 +0000
75867:+++ itaylor-files/iantest.py    2014-04-17 15:40:12 +0000
75941:+++ itaylor-files/iantest_map.py    2014-04-17 15:40:12 +0000
76018:+++ itaylor-files/iantest_mem.py    2014-04-17 15:40:12 +0000
76073:+++ itaylor-files/maus_simulation_output.json    2014-04-17 15:40:12 +0000
76084:+++ itaylor-files/os    2014-04-17 15:40:12 +0000
118240:+++ itaylor-files/test_maps.py    2014-04-17 15:40:12 +0000
118306:+++ itaylor-files/xaa.json    2014-04-17 15:40:12 +0000
118312:+++ itaylor-files/xab.json    2014-04-17 15:40:12 +0000
118318:+++ itaylor-files/xac.json    2014-04-17 15:40:12 +0000
118324:+++ itaylor-files/xad.json    2014-04-17 15:40:12 +0000
118330:+++ itaylor-files/xae.json    2014-04-17 15:40:12 +0000
118336:+++ itaylor-files/xaf.json    2014-04-17 15:40:12 +0000
118342:+++ src/common_cpp/API/APIExceptions.hh    2014-04-17 15:40:12 +0000
118427:+++ src/common_cpp/API/Functions.hh    2014-04-17 15:40:12 +0000
118563:+++ src/common_cpp/API/IInput.hh    2014-04-17 15:40:12 +0000
118586:+++ src/common_cpp/API/IMap.hh    2014-04-17 15:40:12 +0000
118622:+++ src/common_cpp/API/IModule.hh    2014-04-17 15:40:12 +0000
118639:+++ src/common_cpp/API/InputBase-inl.hh    2014-04-17 15:40:12 +0000
118695:+++ src/common_cpp/API/InputBase.hh    2014-04-17 15:40:12 +0000
118727:+++ src/common_cpp/API/MapBase-inl.hh    2014-04-17 15:40:12 +0000
118817:+++ src/common_cpp/API/MapBase.hh    2014-04-17 15:40:12 +0000
118893:+++ src/common_cpp/API/ModuleBase-inl.hh    2014-04-17 15:40:12 +0000
118971:+++ src/common_cpp/API/ModuleBase.hh    2014-04-17 15:40:12 +0000
119077:+++ src/common_cpp/API/OutputBase-inl.hh    2014-04-17 15:40:12 +0000
119126:+++ src/common_cpp/API/OutputBase.hh    2014-04-17 15:40:12 +0000
119148:+++ src/common_cpp/API/WrapInputBase.hh    2014-04-17 15:40:12 +0000
119408:+++ src/common_cpp/API/WrapMapBase.hh    2014-04-17 15:40:12 +0000
119619:+++ src/common_cpp/Converter/ConverterExceptions.hh    2014-04-17 15:40:12 +0000
119636:+++ src/common_cpp/Converter/DataConverters/JsonCppJobFooterConverter.cc    2014-04-17 15:40:12 +0000
119649:+++ src/common_cpp/Converter/DataConverters/JsonCppJobHeaderConverter.cc    2014-04-17 15:40:12 +0000
119662:+++ src/common_cpp/Converter/DataConverters/JsonCppRunFooterConverter.cc    2014-04-17 15:40:12 +0000
119675:+++ src/common_cpp/Converter/DataConverters/JsonCppRunHeaderConverter.cc    2014-04-17 15:40:12 +0000
119688:+++ src/common_cpp/Converter/DataConverters/JsonCppSpillConverter.cc    2014-04-17 15:40:12 +0000
119704:+++ src/common_cpp/DataStructure/Data.cc    2014-04-17 15:40:12 +0000
119721:+++ src/common_cpp/DataStructure/Data.hh    2014-04-17 15:40:12 +0000
119766:+++ src/common_cpp/DataStructure/JobFooterData.cc    2014-04-17 15:40:12 +0000
119784:+++ src/common_cpp/DataStructure/JobFooterData.hh    2014-04-17 15:40:12 +0000
119831:+++ src/common_cpp/DataStructure/JobHeaderData.cc    2014-04-17 15:40:12 +0000
119849:+++ src/common_cpp/DataStructure/JobHeaderData.hh    2014-04-17 15:40:12 +0000
119896:+++ src/common_cpp/DataStructure/LinkDef.hh    2014-04-17 15:40:12 +0000
119944:+++ src/common_cpp/DataStructure/MAUSEvent.hh    2014-04-17 15:40:12 +0000
119993:+++ src/common_cpp/DataStructure/RunFooterData.cc    2014-04-17 15:40:12 +0000
120011:+++ src/common_cpp/DataStructure/RunFooterData.hh    2014-04-17 15:40:12 +0000
120058:+++ src/common_cpp/DataStructure/RunHeaderData.cc    2014-04-17 15:40:12 +0000
120076:+++ src/common_cpp/DataStructure/RunHeaderData.hh    2014-04-17 15:40:12 +0000
120120:+++ src/common_cpp/Utils/InterrogateJsonEvent.cc    2014-04-17 15:40:12 +0000
120239:+++ src/common_cpp/Utils/InterrogateJsonEvent.hh    2014-04-17 15:40:12 +0000
120287:+++ src/common_cpp/Utils/InterrogateMAUSEvent.cc    2014-04-17 15:40:12 +0000
120405:+++ src/common_cpp/Utils/InterrogateMAUSEvent.hh    2014-04-17 15:40:12 +0000
120454:+++ src/common_cpp/Utils/JsonWrapper.cc    2014-04-17 15:40:12 +0000
120471:+++ src/common_cpp/Utils/PyObjectWrapper-inl.hh    2014-04-17 15:40:12 +0000
120626:+++ src/common_cpp/Utils/PyObjectWrapper.hh    2014-04-17 15:40:12 +0000
120707:+++ src/input/InputCppDAQOfflineData/sconscript    1970-01-01 00:00:00 +0000
120731:+++ src/input/InputCppDAQOnlineData/sconscript    1970-01-01 00:00:00 +0000
120769:+++ src/input/InputCppDAQOnlineData/test_InputCppDAQOnlineData_old    1970-01-01 00:00:00 +0000
120866:+++ src/input/InputCppRoot/InputCppRoot.i    1970-01-01 00:00:00 +0000
120931:+++ src/input/InputCppRoot/InputCppRootModule.cc    2014-04-17 15:40:12 +0000
121006:+++ src/map/MapCppExampleMAUSDataInput/MapCppExampleMAUSDataInput.cc    2014-04-17 15:40:12 +0000
121070:+++ src/map/MapCppExampleMAUSDataInput/MapCppExampleMAUSDataInput.hh    2014-04-17 15:40:12 +0000
121083:+++ src/map/MapCppExampleMAUSDataInput/MapCppExampleMAUSDataInputModule.cc.OTHER    2014-04-17 15:40:12 +0000
121162:+++ src/output/OutputCppRoot/OutputCppRoot.cc    2014-04-17 15:40:12 +0000
121557:+++ src/output/OutputCppRoot/OutputCppRoot.hh    2014-04-17 15:40:12 +0000
121592:+++ src/py_cpp/InterrogatePyCapsule.cc    2014-04-17 15:40:12 +0000
121815:+++ src/py_cpp/InterrogatePyCapsule.hh    2014-04-17 15:40:12 +0000
121900:+++ src/py_cpp/PyDataConverters.cc    2014-04-17 15:40:12 +0000
122150:+++ src/py_cpp/PyDataConverters.hh    2014-04-17 15:40:12 +0000
122240:+++ src/reduce/ReduceCppPatternRecognition/ReduceCppPatternRecognition.i    1970-01-01 00:00:00 +0000
122254:+++ src/reduce/ReduceCppPatternRecognition/s_spills.json    1970-01-01 00:00:00 +0000
122269:+++ src/reduce/ReduceCppPatternRecognition/sconscript    1970-01-01 00:00:00 +0000
122282:+++ src/reduce/ReduceCppPatternRecognition/test_ReduceCppPatternRecognition.py    1970-01-01 00:00:00 +0000
122391:+++ src/reduce/ReduceCppTofCalib/ReduceCppTofCalib.hh    1970-01-01 00:00:00 +0000
122522:+++ src/reduce/ReduceCppTofCalib/ReduceCppTofCalib.i    1970-01-01 00:00:00 +0000
122536:+++ src/reduce/ReduceCppTofCalib/noDataTest.txt    1970-01-01 00:00:00 +0000
122547:+++ src/reduce/ReduceCppTofCalib/processTest.txt    1970-01-01 00:00:00 +0000
122553:+++ src/reduce/ReduceCppTofCalib/sconscript    1970-01-01 00:00:00 +0000
122566:+++ src/reduce/ReduceCppTofCalib/test_ReduceCppTofCalib.py    1970-01-01 00:00:00 +0000
122652:+++ tests/cpp_unit/API/MAUSExceptionTest.cc    1970-01-01 00:00:00 +0000
122710:+++ tests/cpp_unit/API/example_load_json_file.json    1970-01-01 00:00:00 +0000
122741:+++ tests/cpp_unit/Map/MapCppSimulationTest.cc    1970-01-01 00:00:00 +0000
122774:+++ tests/cpp_unit/Utils/InterrogateJsonEventTest.cc    2014-04-17 15:40:12 +0000
122887:+++ tests/cpp_unit/Utils/InterrogateMAUSEventTest.cc    2014-04-17 15:40:12 +0000
123111:+++ tests/style/cpplint_exceptions.py    2014-04-17 15:40:12 +0000

Some of it is clearly junk (e.g. swig build output etc). I'll try to extract the good stuff.

#8

Updated by Taylor, Ian over 9 years ago

Ignore everything in dont_compile for now.

Stuff in itaylor-files might be useful for testing conversion between types, but not much more.

py_cpp probably has some good stuff, e.g. for figuring out what is in a PyCapsule.

And then everything in common_cpp is the meat of it. No promises on quality...

#9

Updated by Rogers, Chris over 9 years ago

My understanding is this boils down to

  • src/py_cpp/InterrogatePyCapsule - type-agnostic interfaces to data types for parsing e.g. event type, spill number
  • src/common_cpp/API/Functions.hh src/py_cpp/PyDataConverters and src/common_cpp/Utils/PyObjectWrapper - python libraries for converting between data types on python side
  • src/common_cpp/API/WrapInputBase python wrapper for C Inputter
  • src/common_cpp/API/WrapMapBase python wrapper for C Mapper
  • Tests for above.

Plus some massaging of the other stuff to get it all to work. Probably I would lean on PyROOT and do "InterrogatePyCapsule" on the python side rather than writing a C library, but probably more versatile to do in C... anyhow this all looks very useful.

Note that I could find nowhere that did an implicit conversion from THIS representation to THAT representation. So I am implementing it myself.

#10

Updated by Rogers, Chris over 9 years ago

Migrating across the PyDataConverters - the MAUSEvent<TemplateClass> has been changed to a MAUSEvent; this makes the PyDataConverter easier to code (we only need to code it once) but means changing everything else and makes type safety weaker (we pass around void* instead of TemplateClass*). I don't see a compelling reason not to add a bit of extra code in the PyDataConverter to make sure we get the type right here... this would reduce the impact elsewhere...

#11

Updated by Rogers, Chris over 9 years ago

  • Estimated time set to 11.00 h

So I spent today fiddling with the concept of using PyROOT instead of PyCapsule. The nice thing about working with PyROOT is that it leaves the functions accessible on the Python side, making e.g. InterrogatePyCapsule straightforward to do in Python. So I committed a couple of functions to replace conversion from json to spill/etc. In the end this pertains to the Inputters and Reducers, so probably I am getting ahead of myself - but I wanted to check I could get it all to work before committing to a data format (even though it should be more or less behind the scenes).

I still need to do the WrapMapBase Python class stuff, and now I need to fiddle a bit to get it to work with PyROOT; I hope I am not going backwards.

#12

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 11.00 h to 14.00 h

Spent this morning rehashing the converter code to automatically choose the input type based on the input data; and choose the output type based on user request. Still a work in progress...

#13

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 14.00 h to 21.00 h

Working through the conversion code still. Implemented conversions for

  • string -> string
  • string -> json
  • string -> MAUS::Data
  • string -> PyObject* (py json)
  • json -> string
  • json -> json
  • json -> MAUS::Data
  • json -> PyObject* (py json)
  • MAUS::Data -> string
  • MAUS::Data -> json
  • MAUS::Data -> MAUS::Data
  • MAUS::Data -> PyObject* (py json)
  • PyObject* -> string
  • PyObject* -> json
  • PyObject* -> MAUS::Data
  • PyObject* -> PyObject* (py json)

Logic is only non-trivial for MAUS <-> json (which is already implemented) and PyDict* <-> json (which is 15 lines of code to call the usual json.loads/json.dumps but from C++ side). I have almost finished testing this; working through the error conditions now. The PyObject* wrapping of all of this stuff was done yesterday (but not tested). Once I have finished testing these C++ conversions I will go on to test the PyObjectWrapper, then pull it all into PyWrapMapBase and test that.
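
The structure of the conversion list above can be sketched as a lookup table keyed on (input kind, output kind). The sketch below is a Python illustration only, not the C++ ConverterFactory: it models just the string and Python dict representations, using `json.loads`/`json.dumps` as mentioned above, and the `CONVERTERS` table and `convert` function are hypothetical names.

```python
import json

# Each entry maps (input kind, output kind) to a function. Only "string" and
# "pydict" are modelled; the C++ json and MAUS::Data kinds are left out.
CONVERTERS = {
    ("string", "string"): lambda data: data,
    ("string", "pydict"): json.loads,
    ("pydict", "string"): json.dumps,
    ("pydict", "pydict"): lambda data: data,
}


def convert(data, input_kind, output_kind):
    """Convert data between representations; the caller names both kinds."""
    try:
        converter = CONVERTERS[(input_kind, output_kind)]
    except KeyError:
        raise TypeError("no conversion from %s to %s"
                        % (input_kind, output_kind))
    return converter(data)
```

Adding a further representation means adding one row and one column of entries, which is why the full list above has sixteen entries for four representations.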

#14

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 21.00 h to 24.00 h

I have the explicit conversion code all working now. I am now testing the implicit conversion code. Explicit means that we need to know the input data type, so we do ConverterFactory.convert<std::string, MAUS::Data>() to e.g. convert from a string to a MAUS::Data. Implicit means that we only need to know the output data type (and we let python wrap the rest), so we do e.g. PyObjectWrapper::unwrap_pyobject<MAUS::Data>(a_py_object) to convert from any data format to a MAUS::Data format.
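
The implicit case can be illustrated in Python as a function that is told only the output type and works out the input type by inspection. This is a sketch of the idea, not of PyObjectWrapper::unwrap_pyobject itself: the `unwrap` function is hypothetical and handles only `str` and `dict`.

```python
import json


def unwrap(data, output_type):
    """Convert data to output_type (str or dict), detecting the input type."""
    if output_type not in (str, dict):
        raise TypeError("unsupported output type: %r" % (output_type,))
    if isinstance(data, output_type):
        return data                      # already in the requested form
    if isinstance(data, str):            # str -> dict
        parsed = json.loads(data)
        if not isinstance(parsed, dict):
            raise TypeError("string does not hold a JSON object")
        return parsed
    if isinstance(data, dict):           # dict -> str
        return json.dumps(data)
    raise TypeError("unsupported input type: %s" % type(data).__name__)
```

The caller writes `unwrap(anything, dict)` and never names the input type, in the same way that the C++ call names only the output type as its template argument.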

Next step is to add the PyObjectWrapper::unwrap_pyobject calls to the C++ map API, then fiddle with the python side and interfaces thereunto...

#15

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 24.00 h to 30.00 h

Now have the implicit type conversion finished and tested and the MapBase finished and tested. I changed the MapBase API to take only one template type. Previously, for example, process looked like

OUTPUT* process<INPUT, OUTPUT>(INPUT* input)

Now it looks like

void process<TYPE>(TYPE* input)

where any transformations are done in-place (well at least we don't have to allocate any new memory here...)
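
The difference between the two signatures can be mimicked in Python as follows. This is an analogy only: the class names and the `processed` key are invented, and Python dicts stand in for the C++ data types.

```python
class OldStyleMap(object):
    """Returns a new object, like OUTPUT* process<INPUT, OUTPUT>(INPUT*)."""

    def process(self, spill):
        result = dict(spill)          # allocate a (shallow) copy for the output
        result["processed"] = True
        return result


class InPlaceMap(object):
    """Modifies its argument, like void process<TYPE>(TYPE*)."""

    def process(self, spill):
        spill["processed"] = True     # no new container is allocated
```

With the in-place form the caller keeps ownership of the one object throughout, so there is no question of who deletes the output.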

Going on to look at making a Python wrapper for the MapBase API (to avoid SWIG). I already started this right at the beginning, so maybe not too big a job.

#16

Updated by Rogers, Chris over 9 years ago

  • Category changed from Code Management to Python API
  • Estimated time changed from 30.00 h to 40.00 h

I now have the top level API done and tested for maps only. Things to do before I can merge the map API into the trunk:

  • minor modifications within framework itself (i.e. make sure that everything gets handed off to e.g. celery as a string)
  • style tests
  • (maybe, if time) modify existing maps to use the API
  • run valgrind, check memory usage
  • edit maus user guide

Then on to inputters, reducers, outputters...

#17

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 40.00 h to 43.00 h

Unit tests and style tests are now all passing and the mapper API has gone onto the test server. I am also running a valgrind job against the C++ unit tests...

#18

Updated by Rogers, Chris over 9 years ago

I added some valgrind files (valgrind.log, valgrind_reduced.log). These are from MapCppExampleMAUSDataInput. I believe this is a reasonably good cross-check of the system. I already fixed a memory leak in Spill.DAQ stuff. I can also see memory leaks in:

  • Spill.Recon.SciFiEvent.<stuff>
  • Spill.MC.Track
  • JsonCppSpillConverter

Let me dig a little...

#19

Updated by Rogers, Chris over 9 years ago

So on the JsonCppSpillConverter - I did some detailed study. It looks like PyObjectWrapper::wrap

TPython::ObjectProxy_FromVoidPtr(void_data, "MAUS::Data", true);

is wrapping the cpp_data and storing a shallow pointer. PyObjectWrapper::parse_root_object_proxy
void * vptr = static_cast<void*>(TPyReturn(py_cpp));

appears also to be a shallow copy, so I believe that the memory never gets deep copied. Now, when I do Py_DECREF(py_cpp), I never see the memory cleaned up. So it may be either that Python is doing its usual asynchronous memory management and never gets to calling delete before the program ends, or that PyROOT does not correctly call delete on wrapped PyObjects. Digging into the ROOT code, bindings/pyroot/src/PyRootType::meta_dealloc( PyRootClass* pytype ), it looks like PyROOT does call the destructor correctly.

So in summary, I believe this is not a memory leak. Gulp.
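
As background on when Python releases objects, the sketch below shows the behaviour for plain CPython objects only; it says nothing about PyROOT proxies or about what valgrind reports. In CPython an object is finalised as soon as its last reference goes away, whereas objects in a reference cycle wait for the cycle collector. The `Payload` class and the `freed` list are made up for the demonstration.

```python
import gc
import weakref


class Payload(object):
    """Plain Python object standing in for wrapped data."""


freed = []  # labels are appended here when an object is finalised

# Case 1: a single reference. Dropping it finalises the object at once.
obj = Payload()
weakref.finalize(obj, freed.append, "plain")
del obj
freed_after_del = list(freed)

# Case 2: a reference cycle. Automatic collection is switched off so that
# the outcome does not depend on when the collector happens to run.
gc.disable()
a, b = Payload(), Payload()
a.other, b.other = b, a       # each object keeps the other alive
weakref.finalize(a, freed.append, "cycle")
del a, b
freed_before_collect = list(freed)
gc.collect()                  # the cycle collector reclaims the pair
freed_after_collect = list(freed)
gc.enable()
```

So a delayed clean-up in pure Python would normally point to a reference cycle or to a reference still held somewhere, rather than to the interpreter deferring the delete.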

#20

Updated by Rogers, Chris over 9 years ago

Committed as r711 for maps only.

#21

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 45.00 h to 50.00 h

I now have working examples with MapCppSimulation (a Json type) and MapCppGlobalPID (a MAUS type). I did a little bit of fiddling with the error handling to make it work a little more elegantly. I am still working through the MapCppGlobalPID tests, some of which are failing for unrelated reasons (nothing serious, so I am fixing anything I find as I go along).

#22

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 50.00 h to 56.00 h

I made it through Global, KL, EMR, TOF; in the middle of Tracker now which is the last group of mappers. The main effort is in enforcing "constness" during the spill processing, which was not previously enforced for most of the mappers. Will need to go back and add docstrings, check comments are still correct. Hopefully that won't take too long.

#23

Updated by Rogers, Chris over 9 years ago

Two issues in the tracker data structure

  1. The Tracker branch of DAQ is not linked in the JsonCppProcessor. Adam Dobbs is looking into it.
  2. Some of the tracker data was registered in Json as a PointerArray - when really it should have been a ReferenceArray. I fixed this one myself - diff below:
cr67@ctr ~/MAUS/maus_rogers_1376 $ bzr diff src/common_cpp/JsonCppProcessors/
=== modified file 'src/common_cpp/JsonCppProcessors/SciFiHelicalPRTrackProcessor.cc'
--- src/common_cpp/JsonCppProcessors/SciFiHelicalPRTrackProcessor.cc    2013-09-26 17:22:51 +0000
+++ src/common_cpp/JsonCppProcessors/SciFiHelicalPRTrackProcessor.cc    2014-05-16 12:01:12 +0000
@@ -19,8 +19,7 @@
 namespace MAUS {

 SciFiHelicalPRTrackProcessor::SciFiHelicalPRTrackProcessor()
-                   : _sf_spoint_array_proc(new SciFiSpacePointProcessor),
-                     _double_array_proc(new DoubleProcessor) {
+                   : _double_array_proc(new DoubleProcessor) {

     RegisterValueBranch("tracker", &_int_proc,
                         &SciFiHelicalPRTrack::get_tracker,

=== modified file 'src/common_cpp/JsonCppProcessors/SciFiHelicalPRTrackProcessor.hh'
--- src/common_cpp/JsonCppProcessors/SciFiHelicalPRTrackProcessor.hh    2013-04-12 21:45:50 +0000
+++ src/common_cpp/JsonCppProcessors/SciFiHelicalPRTrackProcessor.hh    2014-05-16 12:01:00 +0000
@@ -36,7 +36,7 @@
  private:
     IntProcessor _int_proc;
     DoubleProcessor _double_proc;
-    PointerArrayProcessor<SciFiSpacePoint> _sf_spoint_array_proc;
+    ReferenceArrayProcessor<SciFiSpacePoint> _sf_spoint_array_proc;
     ValueArrayProcessor<double> _double_array_proc;
 };
 } // ~namespace MAUS

=== modified file 'src/common_cpp/JsonCppProcessors/SciFiSpacePointProcessor.cc'
--- src/common_cpp/JsonCppProcessors/SciFiSpacePointProcessor.cc    2014-01-10 15:46:44 +0000
+++ src/common_cpp/JsonCppProcessors/SciFiSpacePointProcessor.cc    2014-05-16 11:55:36 +0000
@@ -18,8 +18,7 @@

 namespace MAUS {

-SciFiSpacePointProcessor::SciFiSpacePointProcessor()
-                         : _sf_cluster_array_proc(new SciFiClusterProcessor) {
+SciFiSpacePointProcessor::SciFiSpacePointProcessor() {

     RegisterValueBranch("used", &_bool_proc,
                         &SciFiSpacePoint::is_used,

=== modified file 'src/common_cpp/JsonCppProcessors/SciFiSpacePointProcessor.hh'
--- src/common_cpp/JsonCppProcessors/SciFiSpacePointProcessor.hh    2012-08-10 18:08:05 +0000
+++ src/common_cpp/JsonCppProcessors/SciFiSpacePointProcessor.hh    2014-05-16 11:55:01 +0000
@@ -39,7 +39,7 @@
     DoubleProcessor _double_proc;
     StringProcessor _string_proc;
     ThreeVectorProcessor _three_vec_proc;
-    PointerArrayProcessor<SciFiCluster> _sf_cluster_array_proc;
+    ReferenceArrayProcessor<SciFiCluster> _sf_cluster_array_proc;
 };
 } // ~namespace MAUS

=== modified file 'src/common_cpp/JsonCppProcessors/SciFiStraightPRTrackProcessor.cc'
--- src/common_cpp/JsonCppProcessors/SciFiStraightPRTrackProcessor.cc    2012-08-14 05:01:56 +0000
+++ src/common_cpp/JsonCppProcessors/SciFiStraightPRTrackProcessor.cc    2014-05-16 12:01:36 +0000
@@ -18,8 +18,7 @@

 namespace MAUS {

-SciFiStraightPRTrackProcessor::SciFiStraightPRTrackProcessor()
-                         : _sf_spoint_array_proc(new SciFiSpacePointProcessor) {
+SciFiStraightPRTrackProcessor::SciFiStraightPRTrackProcessor() {

     RegisterValueBranch("tracker", &_int_proc,
                         &SciFiStraightPRTrack::get_tracker,

=== modified file 'src/common_cpp/JsonCppProcessors/SciFiStraightPRTrackProcessor.hh'
--- src/common_cpp/JsonCppProcessors/SciFiStraightPRTrackProcessor.hh    2012-12-11 17:30:31 +0000
+++ src/common_cpp/JsonCppProcessors/SciFiStraightPRTrackProcessor.hh    2014-05-16 12:01:29 +0000
@@ -36,7 +36,7 @@
  private:
     IntProcessor _int_proc;
     DoubleProcessor _double_proc;
-    PointerArrayProcessor<SciFiSpacePoint> _sf_spoint_array_proc;
+    ReferenceArrayProcessor<SciFiSpacePoint> _sf_spoint_array_proc;
 };
 } // ~namespace MAUS
#24

Updated by Dobbs, Adam over 9 years ago

Thanks Chris. Why does this make a difference, though? The data was always there when I looked in the output ROOT files. I don't really understand the processor framework that well...

#25

Updated by Dobbs, Adam over 9 years ago

Also, when I modified the DAQ data processor to account for the tracker raw data, I noticed that most of the unit tests associated with the processor are commented out (only the EMR left in). Is there a reason for this?

#26

Updated by Rogers, Chris over 9 years ago

  • Data structure/serialisation - I will explain next time you are at RAL if you want. The point is that I don't want to write the clusters out twice.
  • DAQ data processors - No reason that I am aware of that they should be commented.
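
To illustrate the "don't write the clusters out twice" point, here is a minimal Python sketch of the owner/reference idea. It is not the MAUS processor code: the `Cluster` class, the `"$ref"` key and the `#clusters/N` path format are assumptions made up for the example. The owning array writes each cluster in full once, and the space points write only a path back to it.

```python
import json


class Cluster(object):
    """Hypothetical stand-in for a tracker cluster."""
    def __init__(self, channel):
        self.channel = channel


def write_event(clusters, spacepoints):
    """Serialise clusters once; space points store only paths to them.

    `spacepoints` is a list of lists of Cluster objects, all of which
    must also appear in `clusters` (the owning array).
    """
    # Map each cluster's identity to the path where it is written in full.
    path_of = dict((id(c), "#clusters/%d" % i) for i, c in enumerate(clusters))
    return {
        "clusters": [{"channel": c.channel} for c in clusters],
        "spacepoints": [
            {"clusters": [{"$ref": path_of[id(c)]} for c in sp]}
            for sp in spacepoints
        ],
    }


def read_event(doc):
    """Rebuild the objects so that space points share the owned clusters."""
    clusters = [Cluster(item["channel"]) for item in doc["clusters"]]
    spacepoints = []
    for sp in doc["spacepoints"]:
        # Resolve each stored path back to an index in the owning array.
        indices = [int(ref["$ref"].rsplit("/", 1)[1]) for ref in sp["clusters"]]
        spacepoints.append([clusters[i] for i in indices])
    return clusters, spacepoints


clusters = [Cluster(10), Cluster(11), Cluster(12)]
spacepoints = [[clusters[0], clusters[1]], [clusters[1], clusters[2]]]
text = json.dumps(write_event(clusters, spacepoints))
new_clusters, new_spacepoints = read_event(json.loads(text))
```

With a pointer-style array on the space points, the shared cluster would be written in full under each space point and read back as two separate objects. With the reference style it is written once and both space points end up sharing the same object after reading.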
#27

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 56.00 h to 59.00 h

Gah, I didn't have time to do as much as I wanted because I was helping with EPICS build tools. So I finished all the Maps, and I am now tidying some edge stuff. Everything compiles, I fixed a bug where the docstrings weren't coming through, and I did some fiddling in the error handler; some unit tests are still failing, however. Maybe I will finish Maps over the weekend, there should not be much left...

#28

Updated by Rogers, Chris over 9 years ago

I spent a few hours figuring out a problem in the C++ data structure. Looks like GlobalEvent does not produce an internally consistent state on deep copy - TRefs are shallow copied, leaving TRef pointers to the old data structure. This is memory unsafe (likely ends up in a segmentation fault). Note that it is a requirement of ROOT serialisation that we can do a deep copy. The fiddle was that I only found this out at the Cpp->Json stage, and I assumed there was a problem with the Cpp->Json conversion of TRefs. In the process, I realised that the TRef conversion stuff is unnecessary and only makes life more complicated - oh well, I will take it out because I like a simple life (the functionality was already implemented by PointerAsReference stuff)...
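
As a rough Python analogue of the copy problem (the real issue is with TRefs in the C++ data structure; the `Event` and `Track` classes here are hypothetical), the sketch below shows a copy that duplicates the owned objects but shallow-copies the reference, next to one that re-points the reference at the new copies. In Python the stale reference merely keeps the old object alive; in C++ the old target may already have been deleted, which is where the memory-safety problem comes from.

```python
import copy


class Track(object):
    def __init__(self, name):
        self.name = name


class Event(object):
    """Hypothetical event: owns tracks and holds a reference to one of them."""
    def __init__(self):
        self.tracks = [Track("a"), Track("b")]
        self.primary = self.tracks[0]   # reference into self.tracks


def broken_copy(event):
    """Copy the owned tracks but shallow-copy the reference."""
    new = Event.__new__(Event)
    new.tracks = [Track(t.name) for t in event.tracks]
    new.primary = event.primary         # still points into the OLD event
    return new


def consistent_copy(event):
    """Copy the owned tracks, then re-point the reference at the new copies."""
    new = Event.__new__(Event)
    new.tracks = [Track(t.name) for t in event.tracks]
    old_to_new = dict((id(o), n) for o, n in zip(event.tracks, new.tracks))
    new.primary = old_to_new[id(event.primary)]
    return new


original = Event()
bad = broken_copy(original)
good = consistent_copy(original)
# The standard library deep copy does the same remapping via its memo dict.
deep = copy.deepcopy(original)
```

The check for an internally consistent copy is that every reference in the new object resolves to something the new object itself owns, which holds for `good` and `deep` but not for `bad`.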

I guess I had better fix the global recon data structure then!

In the good news, I am finding some crap in the data structure and fixing it.

#29

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 59.00 h to 67.00 h

Looks like this bug has infected the tracker data structure also. I have fixed it - so now all unit tests are passing. It meant interfering with the copy constructor in SciFiEvent. Nb the clusters weren't copied at all here... oops...

I got distracted by trying to bring some type checking into the TRef stuff, but in the end abandoned this. Looks like the ROOT way is to convert everything to a TObject* and then abandon hope. Could at least do a dynamic_cast to get some type checking, but this is too hard to fiddle with now and I want to get another merge cycle done on this issue.

#30

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 67.00 h to 70.00 h

This is turning into a pig - I am hung up on fixing excrement in the DataStructure...

#31

Updated by Rogers, Chris over 9 years ago

Valgrind came back. Happily there wasn't much crud coming from MAUS. Unhappily there were a whole load of JsonCpp errors (this was leading to a segmentation fault). Log attached. I guess the JsonCpp library is a bit busted...

#32

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 70.00 h to 75.00 h

I spent a lot of time crying over the datastructure stuff. ROOT is a pain. I think the segmentation fault is coming because ROOT can't handle pointer-as-reference elegantly. In GlobalRecon this is handled using TRefArray. In Tracker, it is not handled and this causes MAUS to fail. I cowardly ducked the issue by setting the datastructure to ignore the tracker pointer-as-references... and leave it for tracker group to resolve...

So I am passing all but two unit tests, style tests, now having another go at integration tests.

http://en.wikipedia.org/wiki/1376

#33

Updated by Rogers, Chris over 9 years ago

So jobs left to do on this:

PyWrapInputBase:-
  • Need to be able to make a yield call in C, currently I don't know how to do that
  • The critical piece of code is:
    value = next(input_emitter)
    my_buffer.append(value.encode('ascii'))
    
  • Guess Input is 16 hours work
PyWrapReduceBase:-
  • This is just a case of reproducing the map stuff. It should be a pretty much direct copy. 16 hours work
PyWrapOutputBase:-
  • Again, reproducing the map stuff, slightly different API so a bit more fiddling. 16 hours work
Framework:-
  • Can we pass binary-serialised data between celery nodes? If not no problem - 8 hours work

Let's see how it goes...
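
For reference, the "critical piece of code" in the PyWrapInputBase item can be written out as a small pure-Python driver loop. This is only a sketch of the calling pattern, not the wrapper itself: `input_emitter_example` and `drain` are hypothetical names, and the JSON-like strings are placeholders. On the C side, `PyIter_Next` from the Python C API plays the role of `next()` here.

```python
def input_emitter_example(n_spills):
    """Hypothetical emitter: yields one text document per spill."""
    for i in range(n_spills):
        yield u'{"spill_number": %d}' % i


def drain(input_emitter, max_items=None):
    """Pull values from the emitter until it is exhausted (or max_items)."""
    my_buffer = []
    while max_items is None or len(my_buffer) < max_items:
        try:
            value = next(input_emitter)
        except StopIteration:
            # The generator has finished; there are no more spills.
            break
        my_buffer.append(value.encode('ascii'))
    return my_buffer


spills = drain(input_emitter_example(3))
```

The loop never needs to `yield` itself; it only consumes the generator, which may be an easier shape to reproduce from C than implementing a generator there.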

#34

Updated by Rogers, Chris over 9 years ago

Add in another job - migrate existing inputters/outputters across - 16 hours

Reducers will migrate as part of #1312.

#35

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 75.00 h to 81.00 h

Okay, so I have written "all the code" needed for Inputters and pulled the existing Inputters into the API. It looks okay: InputCppRoot is good, but in InputCppDAQData I have a smelly segmentation fault. There is an inheritance tree there; perhaps this is causing issues. It didn't take as long as I thought it might.

#36

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 81.00 h to 84.00 h

Looks like everything is working and has gone into test. I added PyWrapInputBase and PyWrapInputBaseEmitter. They should be pretty well tested by the existing InputCpp unit tests, I hope, so I didn't add any more (is this okay?). Let's see how the test job goes.

#37

Updated by Rogers, Chris over 9 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Inputters have now gone into the trunk.

Outputters - well Outputters are unique in that they need to handle multiple data types - we want to write to disk the Spill/JobHeader/RunHeader/JobFooter/RunFooter, which is generated by the API. Given that there is only one C++ outputter, OutputCppRoot (I will probably add OutputCppSocket as part of #1312) I am not so bothered.
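
To make the "multiple data types" point concrete, here is a minimal Python sketch of an outputter that chooses what to do from the type of the event it is handed. Everything in it is hypothetical (the empty event classes, `OutputSketch`, the branch names); it is not the OutputCppRoot API, just the shape of the dispatch problem.

```python
class JobHeader(object):
    pass


class RunHeader(object):
    pass


class Spill(object):
    pass


class OutputSketch(object):
    """Hypothetical outputter that accepts several event types."""
    def __init__(self):
        self.written = []
        # One destination per supported type; a real outputter would
        # hold a writer per type rather than a branch name.
        self._branch_for = {
            JobHeader: "job_header",
            RunHeader: "run_header",
            Spill: "spill",
        }

    def save(self, event):
        """Record the event under the branch registered for its type."""
        try:
            branch = self._branch_for[type(event)]
        except KeyError:
            raise TypeError("cannot write %s" % type(event).__name__)
        self.written.append(branch)
        return True


out = OutputSketch()
for event in (JobHeader(), RunHeader(), Spill(), Spill()):
    out.save(event)
```

A mapper only ever sees one type (the spill), which is why the map wrapper does not carry over directly.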

I think the best bet is to declare this issue closed and look at Reducers and Outputters as part of #1312... there are API changes in there anyway (we move to a C++ Image data structure).

#38

Updated by Rogers, Chris over 9 years ago

  • Estimated time changed from 84.00 h to 87.00 h
#39

Updated by Dobbs, Adam over 8 years ago

Update. Following a meeting of Durga, Chris and Adam on 4th March 15, we have the following jobs left to do:

  • Write Reduce and Output API - Rogers
  • Correct data mangling in Reducers to use new API - Rogers
  • Refactor CKOV mapper to cpp using MAUS::Data* - Rajaram
  • Refactor KL, TOF mappers to use MAUS::Data* - Rajaram
  • Refactor MapPyRecon - Dobbs
  • Refactor InputCppDAQData to use MAUS::Data* - Dobbs, Rajaram
  • Refactor framework/single_thread.py, framework/multithread.py, framework/utilities.py to remove json - Rajaram
  • Refactor MapPyBeamMaker - Rogers

All finish dates presently set at 1 June 15.

#40

Updated by Dobbs, Adam over 8 years ago

I have had a go at refactoring InputCppDAQData to remove JSON in favour of MAUS::Data*. I think it works, however I haven't come up with a way of updating test_InputCppDAQOfflineData.py yet to test it. The code compiles. analyze_data_offline.py now breaks when single_thread.py calls utilities.py and finds a Data* object instead of a json object. The code is available on launchpad at:

lp:~phuccj/maus/speedup

Durga, it might make sense for you to start from here when you come to look at utilities.py and single_thread.py.
