How to write histogram reducers¶
Reducers can be written to create histograms which are then updated when successive spills are received by the reducer. MAUS ships with reducers which create histograms using matplotlib and PyROOT. These reducers are:
ReducePyHistogramTDCADCCounts
- This takes in a spill, extracts the TDC and ADC counts, updates a matplotlib histogram and outputs an image of this embedded in a JSON document.
- Source file: src/reduce/ReducePyHistogramTDCADCCounts/ReducePyHistogramTDCADCCounts.py
ReducePyTOFPlot
- This takes in a spill, extracts slab hits and space points information, updates a collection of PyROOT histograms and outputs images of these embedded in JSON documents. The histograms are output for every N spills successfully handled (N is configurable by the user).
- Source file: src/reduce/ReducePyTOFPlot/ReducePyTOFPlot.py
Histogram reducers do not (and should not) save their histogram images to files. Instead, they create JSON documents containing the image data:
{"image": {"keywords": [...list of image keywords...], "description":"A textual description of the image", "tag": "TAG", "image_type": "EXTENSION", "data": "...base 64 encoded image..."}}
where:
- TAG is a simple name or tag that can be used to create an image file name.
- EXTENSION is the desired file extension, usually just the image type, e.g. eps or png.
For example,
{"image": {"keywords":["TDC", "ADC", "counts"], "description":"Total TDC and ADC counts to spill 2", "tag": "tdcadc", "image_type": "eps", "data": "...base 64 encoded image..."}}
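As a rough illustration of the document layout above, the following sketch builds such a document from raw image bytes using only the Python standard library. The helper name `build_image_doc` and the stand-in EPS bytes are hypothetical and are not part of MAUS; the super-classes described below provide their own functions for this.

```python
import base64
import json


def build_image_doc(keywords, description, tag, image_type, image_bytes):
    """Return an image document in the layout described above."""
    return {
        "image": {
            "keywords": keywords,
            "description": description,
            "tag": tag,
            "image_type": image_type,
            # Base 64 encode the raw bytes, then decode to text for JSON.
            "data": base64.b64encode(image_bytes).decode("ascii"),
        }
    }


# Stand-in bytes; a real reducer would supply rendered histogram data.
fake_eps = b"%!PS-Adobe-3.0 EPSF-3.0\n"
doc = build_image_doc(["TDC", "ADC", "counts"],
                      "Total TDC and ADC counts to spill 2",
                      "tdcadc", "eps", fake_eps)
doc_str = json.dumps(doc)
```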
MAUS has an output worker, OutputPyImage, described below, which can save image files when given these documents.
MAUS also provides super-classes for histogram reducers which provide functions to build these documents for you.
Before you write your own histogram reducer, or customise the examples listed above, it is useful to understand these super-classes. They handle certain actions for you, provide utility functions you can use, and are sub-classed by both of the examples above.
Histogram reducer super-classes¶
There are two histogram reducer super-classes, one for matplotlib and one for PyROOT:
ReducePyMatplotlibHistogram
- Source file: src/reduce/ReducePyMatplotlibHistogram/ReducePyMatplotlibHistogram.py
ReducePyROOTHistogram
- Source file: src/reduce/ReducePyROOTHistogram/ReducePyROOTHistogram.py
You do not need to look at the source code for these, but an understanding of what they offer and what they do will help you write your own reducers.
These each take care of handling the following operations for you.
Initialisation - the __init__ function¶
- Spill count - count of spills read to date, initially 0.
- Image type - initially eps (Encapsulated PostScript).
- Auto-numbering - should image names be auto-numbered using the spill count, initially False.
- ROOT batch mode (ReducePyROOTHistogram only) - should PyROOT be run in interactive mode, initially 0 (False).
- Supported image types (ReducePyROOTHistogram only) - a list of image types supported by PyROOT (currently ["ps", "eps", "gif", "jpg", "jpeg", "pdf", "svg", "png"]).
Birth and configuration - the birth function¶
- Reading and validation of configuration parameters:
  - histogram_auto_number, which determines if image names are auto-numbered using the spill count.
  - histogram_image_type, which specifies the data format of histogram images output.
    - If omitted then the default of eps is used.
    - For ReducePyROOTHistogram, if a value is provided by the user then it is validated against the supported image types (currently ["ps", "eps", "gif", "jpg", "jpeg", "pdf", "svg", "png"]).
    - For ReducePyMatplotlibHistogram, if a value is provided by the user then it is validated using a matplotlib FigureCanvas to see if that file type is supported by matplotlib (currently [svg, ps, emf, rgba, raw, svgz, pdf, eps, png]).
  - root_batch_mode (ReducePyROOTHistogram only), which determines if PyROOT should be run in interactive mode.
- Sub-class-specific configuration, via invocation of _configure_at_birth, see below.
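The reading and validation step can be pictured with a small sketch. It assumes the configuration has already been parsed into a Python dict and that an unsupported type is reported with a ValueError; both are assumptions, as are the names `read_histogram_config` and `SUPPORTED_ROOT_TYPES`, which do not come from MAUS.

```python
# Image types listed above as supported by PyROOT.
SUPPORTED_ROOT_TYPES = ["ps", "eps", "gif", "jpg", "jpeg", "pdf", "svg", "png"]


def read_histogram_config(config_doc, supported_types=SUPPORTED_ROOT_TYPES):
    """Return (auto_number, image_type) read from a configuration dict."""
    # Fall back to the documented defaults when a parameter is omitted.
    auto_number = bool(config_doc.get("histogram_auto_number", False))
    image_type = config_doc.get("histogram_image_type", "eps")
    if image_type not in supported_types:
        raise ValueError("Unsupported histogram image type: %s" % image_type)
    return auto_number, image_type
```

For the matplotlib case the list of supported types would instead be obtained from the FigureCanvas, as described above.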
Processing of spills - the process function¶
- Converting a spill from a string to a JSON document.
- Sub-class-specific processing, via invocation of _update_histograms, see below.
- Converting a list of output JSON documents to a string.
- Handling of errors occurring in _update_histograms.
Death and clean-up - the death function¶
- Cleaning up.
- Sub-class-specific clean-up, via invocation of _cleanup_at_death, see below.
- For ReducePyROOTHistogram, cleaning up of any zombie PyROOT objects.
What the spill count counts¶
Both histogram reducer super-classes keep count of the number of spills received by the reducer in an attribute, self.spill_count. This is not the same as the number of spills used to update the histograms, since some spills received may have errors or be missing information required to update the histogram.
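The difference between spills received and spills used can be seen in a stand-in reducer. This is only a sketch and not the MAUS super-class: the class name, the `spills_used` attribute, the "errors" layout and the newline-joined output format are all assumptions made for illustration.

```python
import json


class CountingReducerSketch(object):
    """Hypothetical stand-in, not the MAUS super-class."""

    def __init__(self):
        self.spill_count = 0   # spills received
        self.spills_used = 0   # spills that actually updated the histogram

    def _update_histograms(self, spill):
        if "digits" not in spill:
            raise KeyError("digits field is not in spill")
        self.spills_used += 1
        return [spill]

    def process(self, spill_string):
        spill = json.loads(spill_string)
        self.spill_count += 1  # counted whether or not the update succeeds
        try:
            result = self._update_histograms(spill)
        except (KeyError, ValueError) as error:
            # Assumed error layout; the real super-class decides the format.
            spill["errors"] = {"bad_spill": str(error)}
            result = [spill]
        # Assumed output format: one JSON document per line.
        return "\n".join(json.dumps(doc) for doc in result)


reducer = CountingReducerSketch()
reducer.process('{"digits": []}')   # usable spill
reducer.process('{}')               # received, but missing "digits"
```

After these two calls the reducer has received two spills but used only one.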
Other useful functions¶
Both ReducePyROOTHistogram and ReducePyMatplotlibHistogram provide other functions which you may find useful.
ReducePyROOTHistogram provides:
- get_image_doc(self, keywords, description, tag, canvas), which can be used to create JSON documents with image data in a form suitable for OutputPyImage. It:
  - Prints the contents of the given PyROOT canvas in the form of data in the current image type and saves this into a temporary file.
  - Reloads this temporary file.
  - Creates a JSON image document with the image data base 64 encoded, the given keywords (a list of strings), the given description string (a simple textual description of the image content) and the given image tag.
  - If auto-numbering of images has been enabled then the current spill number is added to the tag, zero-padded to make a 6 digit number (e.g. 000123).
  - Returns the JSON document.
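The auto-numbering step might look something like the sketch below. The text above only says that the spill number is added to the tag, so appending it directly as a suffix is an assumption, and `numbered_tag` is a hypothetical helper rather than a MAUS function.

```python
def numbered_tag(tag, spill_count, auto_number):
    """Return the tag, with the spill count appended if auto-numbering is on."""
    if auto_number:
        # %06d zero-pads the spill count to six digits.
        return "%s%06d" % (tag, spill_count)
    return tag
```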
ReducePyMatplotlibHistogram provides:
- _get_image_doc(self, keywords, description, tag, canvas), which can be used to create JSON documents with image data in a form suitable for OutputPyImage. It:
  - Prints the contents of the matplotlib FigureCanvas in the form of data in the current image type and saves this into a string buffer.
  - Creates a JSON image document with the image data base 64 encoded, the given keywords (a list of strings), the given description string (a simple textual description of the image content) and the given image tag.
  - If auto-numbering of images has been enabled then the current spill number is added to the tag, zero-padded to make a 6 digit number (e.g. 000123).
  - Returns the JSON document.
- _create_histogram(self), which creates and returns a matplotlib FigureCanvas object, with figure size 6x6, axes and a grid.
- _rescale_axes(self, histogram, xmin, xmax, ymin, ymax, xfudge = 0.5, yfudge = 0.5), which rescales the X and Y axes of a histogram in a FigureCanvas to ensure that the given X and Y ranges are visible.
  - The fudge factors can be provided to avoid the matplotlib warning "Attempting to set identical bottom==top", which arises if the axes are set to be exactly the maximum of the data.
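To see why a fudge factor helps, consider a histogram whose data are all zero, so that the minimum and maximum coincide. The sketch below pads a range on both sides; whether _rescale_axes pads both sides or only the upper limit is not stated above, so this is an assumption, and `padded_range` is a hypothetical helper.

```python
def padded_range(low, high, fudge=0.5):
    """Widen [low, high] by fudge on each side so the limits never coincide."""
    return (low - fudge, high + fudge)


# All-zero data: without padding both axis limits would be 0.
flat_limits = padded_range(0, 0)
```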
Sub-classing histogram reducer super-classes - what you need to implement¶
Your reducer sub-class needs to provide three functions, and can optionally provide a fourth for clean-up.
Initialisation - __init__(self)¶
- Your class constructor.
- This should first invoke the super-class constructor to do super-class-specific initialisation e.g. ReducePyROOTHistogram.__init__(self) or ReducePyMatplotlibHistogram.__init__(self)
- Then it should perform initialisation of attributes specific to your class. For example:
  - ReducePyTOFPlot initialises the refresh rate (the number of spills to process before outputting a histogram).
  - ReducePyHistogramTDCADCCounts initialises the TDC and ADC counts.
Birth and configuration - _configure_at_birth(self, config_doc)¶
- Called by birth, this function takes a JSON configuration document.
- It should extract any additional sub-class-specific configuration from this. For example:
  - ReducePyTOFPlot checks for a refresh_rate configuration parameter.
  - ReducePyHistogramTDCADCCounts initialises the TDC and ADC counts.
- It should create the histogram plot objects. For example:
  - ReducePyTOFPlot creates ROOT.TH1F and ROOT.TCanvas objects.
  - ReducePyHistogramTDCADCCounts creates a matplotlib FigureCanvas.
- If configuration and creation is successful it should return True.
- Any errors should be raised as exceptions e.g. if a mandatory configuration parameter is missing then ValueError could be raised.
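A minimal sketch of such a configuration hook is shown below. It does not inherit from a MAUS super-class so that it can run on its own; the class name, the default refresh rate of 5 and the dict used in place of a real plot object are all illustrative assumptions.

```python
class RefreshRateConfigSketch(object):
    """Hypothetical sub-class fragment; only the configuration hook is shown."""

    def __init__(self):
        self.refresh_rate = 5   # illustrative default, not taken from MAUS
        self.histogram = None

    def _configure_at_birth(self, config_doc):
        # Optional parameter: keep the default when it is absent.
        if "refresh_rate" in config_doc:
            rate = int(config_doc["refresh_rate"])
            if rate < 1:
                raise ValueError("refresh_rate must be a positive integer")
            self.refresh_rate = rate
        # A real reducer would create its TCanvas or FigureCanvas here;
        # a plain dict of bin counts stands in for the plot object.
        self.histogram = {}
        return True
```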
Processing of spills - _update_histograms(self, spill)¶
- Called by process, this function should extract information from the spill and update the histograms.
- It should check that the spill has the information needed.
  - If not it can either ignore the spill or raise an error. The super-class will manage the insertion of the error into the spill. For example:
    - ReducePyTOFPlot does:
          if not self.get_slab_hits(spill):
              raise ValueError("slab_hits not in spill")
    - ReducePyHistogramTDCADCCounts does:
          if "digits" not in spill:
              raise KeyError("digits field is not in spill")
- The function can then update the histograms.
- The function must return a list of one or more spills. This can be one of:
  - [{}] - a list with an empty spill. You may want to return this when handling end_of_run spills, see below.
  - [spill] - a list with the input spill. You may want to do this if you only output histograms after every N spills have been read, so when the spill count isn't divisible by N you can just return the input spill. This is done by ReducePyTOFPlot:
        # Refresh canvases at requested frequency.
        if self.spill_count % self.refresh_rate == 0:
            self.update_histos()
            return self.get_histogram_images()
        else:
            return [spill]
  - [image,...] - a list of one or more JSON image documents. How you build this is up to you but you can use the super-class utility functions. For example:
    - ReducePyTOFPlot calls the following, where self.canvas_nsp contains a PyROOT ROOT.TCanvas:
          image_list = []
          ...
          doc = ReducePyROOTHistogram.get_image_doc( \
              self, keywords, description, tag, self.canvas_nsp)
          image_list.append(doc)
    - ReducePyHistogramTDCADCCounts does the following, where self._tdcadchistogram contains a matplotlib FigureCanvas:
          image_doc = ReducePyMatplotlibHistogram._get_image_doc( \
              self, self._keywords, self._description, self._tag, \
              self._tdcadchistogram)
          return [image_doc]
- The function must also handle end_of_run spills.
  - At the end of a run, reducers receive an end_of_run spill. This is a spill with a daq_event_type field with value end_of_run. This is so that, in cases where a reducer only takes action every N spills, it can take any final actions (e.g. output the final histograms).
  - You need to detect and handle this spill. If your reducer outputs the histogram for every spill then it can just return an empty spill e.g. ReducePyHistogramTDCADCCounts does this:
        def _update_histograms(self, spill):
            ...
            if (spill.has_key("daq_event_type") and
                spill["daq_event_type"] == "end_of_run"):
                return [{}]
            ...
  - If however it only outputs histograms for every N spills then this is where the final histograms should be output e.g. ReducePyTOFPlot does this:
        def _update_histograms(self, spill):
            ...
            if (spill.has_key("daq_event_type") and
                spill["daq_event_type"] == "end_of_run"):
                if (not self.run_ended):
                    self.update_histos()
                    self.run_ended = True
                    return self.get_histogram_images()
                else:
                    return [{}]
            ...
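Putting these rules together, a reducer that outputs images every N spills might be sketched as follows. This is a stand-alone illustration and not MAUS code: the class, the top-level "slab_hits" key, the `_get_images` helper and the manual increment of spill_count (which the super-class would normally maintain) are all assumptions.

```python
class EveryNSpillsSketch(object):
    """Hypothetical reducer fragment that outputs images every N spills."""

    def __init__(self, refresh_rate):
        self.refresh_rate = refresh_rate
        self.spill_count = 0      # maintained by the super-class in MAUS
        self.total_hits = 0
        self.run_ended = False

    def _get_images(self):
        # Stand-in for building real image documents.
        return [{"image": {"tag": "hits",
                           "description": "Total hits: %d" % self.total_hits}}]

    def _update_histograms(self, spill):
        if spill.get("daq_event_type") == "end_of_run":
            if not self.run_ended:
                self.run_ended = True
                return self._get_images()   # final histograms
            return [{}]
        if "slab_hits" not in spill:
            raise ValueError("slab_hits not in spill")
        self.total_hits += len(spill["slab_hits"])
        if self.spill_count % self.refresh_rate == 0:
            return self._get_images()
        return [spill]


reducer = EveryNSpillsSketch(refresh_rate=2)
outputs = []
for spill in [{"slab_hits": [1]}, {"slab_hits": [1, 2]},
              {"slab_hits": [3]}, {"daq_event_type": "end_of_run"}]:
    reducer.spill_count += 1   # the super-class would do this
    outputs.append(reducer._update_histograms(spill))
```

Here the first and third spills are passed through unchanged, the second produces images, and the end_of_run spill produces the final images exactly once.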
Death and clean-up - _cleanup_at_death(self)¶
If your reducer needs to do its own clean-up then it can also implement this function.
- Called by death, this does any sub-class-specific clean-up.
- This should first invoke the super-class function to do super-class-specific clean-up e.g. ReducePyROOTHistogram._cleanup_at_death(self) or ReducePyMatplotlibHistogram._cleanup_at_death(self)
- If clean-up is successful it should return True.
- If there is no sub-class-specific clean-up required then you don't need to provide this function.
Remember, do NOT save image files¶
Reducers should not save files; that is the responsibility of output workers. Histogram reducers should output a JSON document with the base 64 encoded image data in the form described above.
These can be saved using the OutputPyImage worker.
How to save the images to files - OutputPyImage¶
As histogram reducers output JSON documents with histogram image data and are not meant to save histograms themselves, how then do you save the images?
This is the role of the OutputPyImage output worker. OutputPyImage takes in JSON documents of form:
{"image": {"keywords": [...list of image keywords...], "description":"...a description of the image...", "tag": "TAG", "image_type": "EXTENSION", "data": "...base 64 encoded image..."}}
It decodes the base 64 encoded image data and saves it in a file DIRECTORY/PREFIXTAG.EXTENSION where:
- DIRECTORY is a directory specified in an image_directory configuration parameter ("data card") provided when OutputPyImage is first created.
  - If this directory does not exist then it will be created.
  - If no such configuration parameter is given then the current directory is used.
- PREFIX is a file prefix specified in an image_file_prefix configuration parameter provided when OutputPyImage is first created.
  - If no such configuration parameter is given then the default of image is used.
- TAG is the value of the tag field in the JSON document.
- EXTENSION is the value of the image_type field in the JSON document.
In addition, a file DIRECTORY/PREFIXTAG.json will also be saved with the image meta-data. This will be the image JSON document but without the data field i.e.:
{"image": {"keywords": [...list of image keywords...], "description":"...a description of the image...", "tag": "TAG", "image_type": "EXTENSION"}}
So, for example, if OutputPyImage is configured with parameters:
image_directory="/home/user/plots"
image_file_prefix="histogram_"
and receives a JSON document of form:
{"image": {"keywords":["TDC", "ADC", "counts"], "description":"Total TDC and ADC counts to spill 2", "tag": "tdcadc", "image_type": "eps", "data": "...base 64 encoded image..."}}
then it will save the image data into a file /home/user/plots/histogram_tdcadc.eps and the JSON file, /home/user/plots/histogram_tdcadc.json, will contain:
{"image": {"keywords":["TDC", "ADC", "counts"], "description":"Total TDC and ADC counts to spill 2", "tag": "tdcadc", "image_type": "eps"}}
OutputPyImage does not validate that the image data is consistent with the image type - it just decodes the base 64 encoded image and saves it.
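The saving behaviour described in this section can be approximated with the standard library alone. The function below is a sketch of that behaviour and not the OutputPyImage implementation; `save_image_doc` and its arguments are hypothetical names, with defaults chosen to mirror the documented ones.

```python
import base64
import json
import os


def save_image_doc(doc, directory=".", prefix="image"):
    """Write the image and its meta-data; return both file paths."""
    image = doc["image"]
    # Create the output directory if it does not exist yet.
    if not os.path.exists(directory):
        os.makedirs(directory)
    base = os.path.join(directory, prefix + image["tag"])
    image_path = "%s.%s" % (base, image["image_type"])
    with open(image_path, "wb") as image_file:
        image_file.write(base64.b64decode(image["data"]))
    # The meta-data file is the same document without the "data" field.
    meta = {"image": dict((key, value) for key, value in image.items()
                          if key != "data")}
    json_path = base + ".json"
    with open(json_path, "w") as json_file:
        json.dump(meta, json_file)
    return image_path, json_path
```

Like OutputPyImage, this sketch makes no attempt to check that the decoded bytes match the stated image type.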
Why is the image data base 64 encoded in the JSON document?¶
JSON documents are converted to and from strings as they pass through MAUS workers. This can cause exceptions to be thrown if passing raw image data for certain formats (e.g. PNG). Base 64 encoding the image data prevents such exceptions arising, and allows any sort of image data to be passed around MAUS in a JSON document.
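The problem can be reproduced outside MAUS with the first eight bytes of a PNG file. This sketch uses only the standard library; the exact exception depends on the Python version, as noted in the comments.

```python
import base64
import json

# First bytes of every PNG file; 0x89 is not valid text on its own.
png_header = b"\x89PNG\r\n\x1a\n"

try:
    json.dumps({"data": png_header})
    raw_ok = True
except (TypeError, UnicodeDecodeError):
    # Python 3 rejects bytes outright (TypeError); Python 2 fails while
    # decoding the non-ASCII byte (UnicodeDecodeError).
    raw_ok = False

# Base 64 text survives the conversion to a JSON string and back.
encoded = base64.b64encode(png_header).decode("ascii")
round_trip = json.loads(json.dumps({"data": encoded}))
recovered = base64.b64decode(round_trip["data"])
```

The raw bytes cannot be serialised, while the base 64 encoded form round-trips to the identical bytes.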
Updated by Jackson, Mike about 11 years ago · 32 revisions