Bug #1817

Alignment corrections in CDB

Added by Rajaram, Durga almost 6 years ago. Updated over 5 years ago.

Status: Open
Priority: Normal
Category: Database
Start date: 01 February 2016
Due date:
% Done: 0%
Estimated time:

Description

We want to store alignment and analysis-based corrections to the surveyed geometry in a CDB table.

This makes it easier to keep track of the alignment stuff and allows one to, if needed, apply the corrections for chosen run ranges without having to release a new geometry when the alignment gets updated.

But this would mean that the geometry_download script will have to
  • pull down the corrections from the CDB (if corrections exist for a run)
  • update the positions/rotations in the GDML and ParentGeometry.dat files

As far as the end-user goes, there should be no difference in how the geometry is used.

Propose the following schema:


ID (int)
Module_Name (string)
dx (float)
dx_err (float)
dy (float)
dy_err (float)
dz (float)
dz_err (float)
dx_rot (float)
dx_rot_err (float)
dy_rot (float)
dy_rot_err (float)
dz_rot (float)
dz_rot_err (float)
valid_from (timestamp)

I am imagining the Module_Name would be e.g. TOF1/Tracker0/Tracker1/EMR/ etc

For the API,

get_corrections_for_run(run, module_name) 
get_corrections_for_date(date, module_name)

set_corrections(module, list_of_corrections)

#1

Updated by Rajaram, Durga almost 6 years ago

From email exchange: Ryan said he would like to think about it a little more.

The API specification should come from Ryan since the geometry service will have to access the corrections and update the downloaded geometry files.

#2

Updated by Bayes, Ryan almost 6 years ago

Regarding the schema all of the required information is present. The error is useful from a physics standpoint but I am not sure of the relevance for the implementation in the geometry (unless it is intended to set some kind of priority). I will say more words about the "ID" below because I am not sure what is meant.

We must decide what pieces of information will be stored here; will it be corrections to the positions or will it be the corrected positions. From the standpoint of transparency to the user it would be more useful to store the corrections so that the original (surveyed) values can be determined from the set of geometry files output by the geometry download. To make this work, however, the corrections must be made specific to a reference geometry; otherwise the corrections are meaningless. The ID for the reference geometry should be given explicitly, unless that is what "ID" in the proposed specification means; it is not clear to me whether that is the geometry ID or a self-referential ID. This may be considered to be taken into account by the time stamp, but consider the case where a new geometry is introduced, say after the solenoids are re-positioned, but the calibrations are not similarly updated. If it is meant to be the reference geometry ID I apologize for the previous sentences.

In terms of how the download would work, I imagine that the proposed API would produce an XML object that can be written to the Maus_Information file with the cooling channel and beam line information. The geometry ID must be matched to the ID reference in the alignment download (can we call it that? It sounds more specific than "correction"). That XML object can then be used to correct the positions of the detector objects in both the GDML (Step_IV.gdml at this time) and the ParentGeometryFile.dat output. To make this work the Module_Name should be matched to the field names in the MICE_Information/Detector_Information node contained in the Maus_Information.gdml file. The fields are named "TOF0", "TOF1", "TOF2", "KL", "Ckov1", "Ckov2", "Tracker0", "Tracker1", and "EMR". The name of the physical volume stored in these objects provides the reference to correct the GDML files. The pseudo code would then look like

- download alignments (GDMLtoCDB.py)
- update Maus_Information file with alignments (GDMLFormatter)
- match entries with detector objects.
- update detector positions in GDML based on alignments.
- write mice modules as normal. 
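
The "update positions" step above could be sketched roughly as follows. This is only an illustration, not the GDMLFormatter code: it assumes each detector is placed by a <physvol> element carrying a name attribute and a child <position> with x, y, z attributes, and that the corrections are in the same units as the GDML positions. The helper apply_position_corrections and the toy document are hypothetical; rotations would be handled the same way on the <rotation> element.

```python
import xml.etree.ElementTree as ET


def apply_position_corrections(gdml_root, corrections):
    """Add (dx, dy, dz) to the <position> of each matching <physvol>.

    corrections maps a physical-volume name to a dict with keys
    'dx', 'dy', 'dz' (assumed to share the units of the GDML position).
    Returns the list of volume names that were actually updated.
    """
    updated = []
    for physvol in gdml_root.iter("physvol"):
        corr = corrections.get(physvol.get("name"))
        position = physvol.find("position")
        if corr is None or position is None:
            continue  # no correction stored, or nothing to correct
        for axis in ("x", "y", "z"):
            old = float(position.get(axis, "0"))
            position.set(axis, repr(old + corr["d" + axis]))
        updated.append(physvol.get("name"))
    return updated


# Toy document standing in for Step_IV.gdml (structure is an assumption).
GDML = """
<gdml><structure><volume name="World">
  <physvol name="TOF1"><position x="0.0" y="0.0" z="100.0"/></physvol>
  <physvol name="EMR"><position x="1.0" y="2.0" z="300.0"/></physvol>
</volume></structure></gdml>
"""
root = ET.fromstring(GDML)
updated = apply_position_corrections(
    root, {"TOF1": {"dx": 0.5, "dy": -0.25, "dz": 2.0}}
)
```

Volumes without a stored correction are left untouched, so the surveyed values remain recoverable by subtracting the downloaded corrections.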

I would propose that the corrections (alignments) to the magnetic fields be added to this list incidentally on the grounds that these are also analysis based.

I am not sure that downloading the corrections by specific module is sufficient for record keeping purposes. I think it is much more useful to download all the applicable corrections (alignments) at once. This is both for the purpose of record keeping and as a procedural point... right now the geometry download is partitioned from the formatting. Applying the corrections is a function of the geometry formatting so it does not make sense to make requests to the CDB for every detector node independently. The API should be expanded to consider this use case;

get_corrections_for_run_xml(run)
get_corrections_for_date_xml(date)

get_corrections_for_run(run, module_name) 
get_corrections_for_date(date, module_name)

set_corrections(module, list_of_corrections)
#3

Updated by Rajaram, Durga almost 6 years ago

Bayes, Ryan wrote:

Regarding the schema all of the required information is present. The error is useful from a physics standpoint but I am not sure of the relevance for the implementation in the geometry (unless it is intended to set some kind of priority). I will say more words about the "ID" below because I am not sure what is meant.

Yes, the error is not necessary for the geometry implementation. But since it comes out "naturally" from the alignment analysis, I think it's better to store it even if it is not used. It is possible that some analysis might want to know what the errors on the alignment corrections are.

We must decide what pieces of information will be stored here; will it be corrections to the positions or will it be the corrected positions. From the standpoint of transparency to the user it would be more useful to store the corrections so that the original (surveyed) values can be determined from the set of geometry files output by the geometry download.

Yes -- it will/should be corrections to the positions/rotations, and not corrected positions/rotations.

To make this work, however, the corrections must be made specific to a reference geometry; otherwise the corrections are meaningless. The ID for the reference geometry should be given explicitly, unless that is what "ID" in the proposed specification means; it is not clear to me whether that is the geometry ID or a self-referential ID. This may be considered to be taken into account by the time stamp, but consider the case where a new geometry is introduced, say after the solenoids are re-positioned, but the calibrations are not similarly updated. If it is meant to be the reference geometry ID I apologize for the previous sentences.

Yes -- it is meant to be the geometry ID that was used to generate the alignment numbers. Sorry this wasn't clear.

In terms of how the download would work, I imagine that the proposed API would produce an XML object that can be written to the Maus_Information file with the cooling channel and beam line information. The geometry ID must be matched to the ID reference in the alignment download (can we call it that? It sounds more specific than "correction"). That XML object can then be used to correct the positions of the detector objects in both the GDML (Step_IV.gdml at this time) and the ParentGeometryFile.dat output. To make this work the Module_Name should be matched to the field names in the MICE_Information/Detector_Information node contained in the Maus_Information.gdml file. The fields are named "TOF0", "TOF1", "TOF2", "KL", "Ckov1", "Ckov2", "Tracker0", "Tracker1", and "EMR". The name of the physical volume stored in these objects provides the reference to correct the GDML files. The pseudo code would then look like

That sounds fine to me. I'm guessing Janusz will have some kind of validator on the server side to make sure that the module name in the setter matches one of the allowed names. Since we're designing this from scratch it would be worthwhile making sure the list of allowed module names includes possibilities from the Cooling Demo stage as well. I'm thinking RF won't need any geometry-related corrections, but if there are things which should go in, we should try to list them now.

I would propose that the corrections (alignments) to the magnetic fields be added to this list incidentally on the grounds that these are also analysis based.

That's a good point. But I think that should be handled in a different table since at this point it's not clear to me the kinds of things that will come out of the field alignment (corrections to coil currents from ripples? temperature? hall probes?). The schema for that would probably look very different and I'd rather keep it separate from the detector alignment table.

I am not sure that downloading the corrections by specific module is sufficient for record keeping purposes. I think it is much more useful to download all the applicable corrections (alignments) at once. This is both for the purpose of record keeping and as a procedural point... right now the geometry download is partitioned from the formatting. Applying the corrections is a function of the geometry formatting so it does not make sense to make requests to the CDB for every detector node independently. The API should be expanded to consider this use case;

You are right. That makes sense. Your proposed API sounds good.

#4

Updated by Rajaram, Durga almost 6 years ago

So, for Janusz, based on Ryan's comments, this is the final schema:

ID (autoincrement int ?)
Geometry ID (int)
Module_Name (string)
dx (float)
dx_err (float)
dy (float)
dy_err (float)
dz (float)
dz_err (float)
dx_rot (float)
dx_rot_err (float)
dy_rot (float)
dy_rot_err (float)
dz_rot (float)
dz_rot_err (float)
valid_from (timestamp)

For the API:

get_corrections_for_run_xml(run) // returns xml with ALL modules for given run
get_corrections_for_date_xml(date) // returns xml with ALL modules for given date

get_corrections_for_run(run)  // returns python dict
get_corrections_for_date(date) // returns python dict

set_corrections(correction_list_of_dictionaries, geometry_id, valid_from_time)

Ryan - are you OK with this API?

Can you comment on whether an XML response like this will be OK for you?
e.g.

<GeometryID=NNNN  ID = />
    <ModuleName="TOF0"  dx=012.345 dy=.... dz=... dx_rot=... dy_rot=... dz_rot=..../>
    <ModuleName="TRACKER0"  dx=012.345 dy=.... dz=... dx_rot=... dy_rot=... dz_rot=..../>

Also,
  • For the setter:
    • Should all module names be required in the setting?
      • i.e. should user 0-fill modules for which corrections are not available, or should it be left to the DB to 0-fill? Any preference?
    • Are you case-sensitive when it comes to module names?
      • for e.g. to tie them into the GDML when you get a response back
#5

Updated by Rajaram, Durga almost 6 years ago

Upon further discussion with Janusz, a few issues we need to think about:

  • I think we need to allow the table to support alignment corrections at a run-to-run level
    • This means that in the table
      run number
      is a more logical and appropriate key than
      valid_from_time
    • This also means that the table needs to have a
      creation_time
      in order to be able to support updating numbers (from refinement in analysis, bug fixes, improved reco etc) for the same run
    • So, the setter call would be something like
      set_corrections(correction_list_of_dictionaries, run_number, geometry_id)
      • Q: should the run_number be constrained to be >= the geometry_id valid_from_time? (and <= geometry_id valid_until_time)
        • I think so, since as Ryan pointed out, we need to be able to tie the corrections to the geometry they're correcting?
      • Q: do we want to support a client call like get_corrections_for_geometry(geometry_id)?
        • This will be very tricky if there are corrections corresponding to different run-ranges within the same geometry period
      
      Thoughts?
      
      Inputs from geometry and analysis will be appreciated.
#6

Updated by Bayes, Ryan almost 6 years ago

I think the XML response looks good, although I expect there to be a <Corrections> ... </Corrections> group (or something like it) containing the results. Given the above API definitions I can start to write the downloaders. Once the final version comes out I can make corrections as necessary.

I believe that the xml parser algorithms are case sensitive so the case does matter.

Regarding the look-up key, I think that using run number rather than a "valid_from_time" is a matter of generalization and taste. In principle the run_number changes monotonically (but not uniformly) in time so there should be no difference. The download algorithms can (and should) be set up to reference keys using either field although for the purpose of reconstruction there will obviously be a preference for run number. Either way there should be a creation_time (which I thought was a given for some reason).

In the model of run to run corrections a client call like "get_corrections_for_geometry" could work if it is understood to return a range of results indexed by a valid from time. To make it work for a range of runs I imagine that you would need a start run and an end run, I think.

#7

Updated by Bayes, Ryan almost 6 years ago

A possibly small correction.

I was writing the parsing algorithms and it struck me that the XML output that we agreed on,
<GeometryID=NNNN  ID = />
    <ModuleName="TOF0"  dx=012.345 dy=.... dz=... dx_rot=... dy_rot=... dz_rot=..../>
    <ModuleName="TRACKER0"  dx=012.345 dy=.... dz=... dx_rot=... dy_rot=... dz_rot=..../>

should be

<GeometryID  value="NNN" />
    <ModuleName name="TOF0"  dx=012.345 dy=.... dz=... dx_rot=... dy_rot=... dz_rot=..../>
    <ModuleName name="TRACKER0"  dx=012.345 dy=.... dz=... dx_rot=... dy_rot=... dz_rot=..../>

I hope that is a simple thing to fix.

#8

Updated by Martyniak, Janusz almost 6 years ago

Ryan,

Your XML document proposal does not take into account (it does not carry) a run number. In fact it does not have to, as long as the setter call includes it (see update #5). It looks like we need creation time to 'order' the updates, so the last is valid and the run number to place the correction within the validity of a geometry ID. Now a question arises: while it gives us more flexibility, because it allows multiple run-by-run corrections to be stored for a given geometry ID, it might make setting a bit complex for a user/expert. Say you have a geometry which spans 100 runs and you want to apply 50 corrections for this run period (2 runs per correction on average in this case); would you need:

<GeometryID  value="NNN" />
   <Run  value=5075 />
     <ModuleName ... />

a document in this format? This would work for an XML type of setter (which takes the document above), but a higher-level setter would either need to take a run range somehow or take a more complex multilevel Python dict than just a list of dictionaries (modules). Do we need a non-XML setter at all?
And obviously if you have 100 runs and only one correction for all of them you would have to include the same content 100 times. Should we have RunMin and RunMax in the XML doc instead?
cheers JM

#9

Updated by Rajaram, Durga almost 6 years ago

Martyniak, Janusz wrote:

Ryan,

Your XML document proposal does not take into account (it does not carry) a run number. In fact it does not have to, as long as the setter call includes it (see update #5).

Ok, I agree, you will need the run number range in the setter if the user wants to specify a run-range for the corrections -- a "run start" and "run end". "Run number"s are more natural for a user than "valid_from_time"s.

However, this should be optional.
i.e.
  • If the user does not specify the run start, the server will insert a RunStart equal to the earliest run for this geometry ID.
  • Similarly, if user does not specify run end, then the server will set the corrections valid until the last run valid for this geometry period.
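
A minimal sketch of the defaulting and range check described in the two bullets above. The function name resolve_run_range and its arguments are hypothetical; the real check would live on the server and look up the first and last run of the geometry itself.

```python
def resolve_run_range(geometry_first_run, geometry_last_run,
                      run_start=None, run_end=None):
    """Fill in missing run limits and check them against the geometry.

    geometry_first_run / geometry_last_run bound the validity period of
    the geometry ID (assumed to be known to the caller).
    """
    start = geometry_first_run if run_start is None else run_start
    end = geometry_last_run if run_end is None else run_end
    if not (geometry_first_run <= start <= end <= geometry_last_run):
        raise ValueError(
            "run range %s-%s is outside geometry validity %s-%s"
            % (start, end, geometry_first_run, geometry_last_run)
        )
    return start, end
```

For example, resolve_run_range(7300, 7600, run_start=7321) keeps the user's start and fills in 7600 as the end.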

It looks like we need creation time to 'order' the updates, so the last is valid and the run number to place the correction within the validity of a geometry ID.

Fine

Now a question arises: while it gives us more flexibility, because it allows multiple run-by-run corrections to be stored for a given geometry ID, it might make setting a bit complex for a user/expert. Say you have a geometry which spans 100 runs and you want to apply 50 corrections for this run period (2 runs per correction on average in this case); would you need:
....
a document in this format? This would work for an XML type of setter (which takes the document above), but a higher-level setter would either need to take a run range somehow or take a more complex multilevel Python dict than just a list of dictionaries (modules).

Support for run-by-run corrections will be addressed by the run-start, run-end feature described above.

Do we need a non-XML setter at all?

I would say yes. Most users in MICE are familiar with python so it is simpler asking them to provide a python dict containing the corrections, rather than an xml doc. The getter of course will return an XML doc (see below) which the geometry manager can process.

The server will validate the data contained in the dictionary.

So, to summarize:
  • the setter will be like so
    • set_corrections(python_dict_of_corrections, geometryID, run-start(optional), run-end(optional))
      • server will validate that the run-start and run-end (if supplied) are within the geometry validity period
      • server will validate the data contained in the dictionary
      • a creation time will be associated with each set of corrections uploaded
  • the getter will provide APIs like so:
    •   get_corrections_for_run_xml(run) // returns xml with ALL modules for given run
        get_corrections_for_date_xml(date) // returns xml with ALL modules for given date
      
  • the XML response will look like so
    • <GeometryID  value="NNN" />
          <RunStart  value=7321 />
          <RunEnd   value=7543 />
          <ModuleName name="TOF0"  dx=012.345 dy=.... dz=... dx_rot=... dy_rot=... dz_rot=..../>
          <ModuleName name="TRACKER0"  dx=012.345 dy=.... dz=... dx_rot=... dy_rot=... dz_rot=..../>
      

Ryan -- can you OK or edit this, so Janusz can proceed with it?

#10

Updated by Bayes, Ryan almost 6 years ago

This looks very good. Janusz should proceed.

#11

Updated by Rajaram, Durga over 5 years ago

Recap of discussion at CM44
  • Get rid of runmax
  • Allow for run-start
  • So the table design will be
    ID (autoincrement int) -- automatic
    Geometry ID (int) -- user/setter is required to supply this
    Run start (int) -- optional
    Creation time (timestamp) -- automatically inserted by server
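
With only a run start and a creation time in the table, a lookup for a given run has to pick one set of corrections among several candidates. One possible rule is sketched below; it is an assumption, not something agreed in this thread: take the entries for the geometry whose run start is at or before the requested run, prefer the highest run start, and break ties with the most recent creation time. CorrectionSet and select_correction_set are hypothetical names, and run_start is assumed to be already filled in by the server.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CorrectionSet:
    id: int
    geometry_id: int
    run_start: int
    creation_time: datetime


def select_correction_set(rows, geometry_id, run):
    """Return the correction set that applies to `run`, or None."""
    candidates = [
        row for row in rows
        if row.geometry_id == geometry_id and row.run_start <= run
    ]
    if not candidates:
        return None
    # Latest applicable run start wins; newest upload breaks ties.
    return max(candidates, key=lambda row: (row.run_start, row.creation_time))


rows = [
    CorrectionSet(1, 12, 7300, datetime(2016, 2, 1)),
    CorrectionSet(2, 12, 7400, datetime(2016, 2, 10)),
    CorrectionSet(3, 12, 7400, datetime(2016, 3, 5)),  # re-upload for 7400
    CorrectionSet(4, 13, 7700, datetime(2016, 4, 1)),
]
```

With these toy rows, run 7350 resolves to entry 1 and run 7450 resolves to entry 3, the later upload for the same run start.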
    
#12

Updated by Martyniak, Janusz over 5 years ago

I have installed the first implementation of the Geometry Corrections related code on preprod.
  • the setting can be done in 2 ways:
    def set_corrections(modules, geometry_id, comment='')
    def set_corrections_xml(corr_xml) 

modules - a list of dictionaries, one dict per module containing all
the key-value pairs:

name (string)
dx (float)
dx_err (float)
dy (float)
dy_err (float)
dz (float)
dz_err (float)
dx_rot (float)
dx_rot_err (float)
dy_rot (float)
dy_rot_err (float)
dz_rot (float)
dz_rot_err (float),

like so:

modules=[{'name':'TOF0','dx':12.1,'dx_err':0.01, 'dy':22.20,'dy_err':0.02, 'dz':32.30, 'dz_err':0.029,.....}, {......}]
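
Before calling the setter it may be worth checking the list locally. The sketch below is a hypothetical client-side helper (validate_modules is not part of the CDB client); it only checks that each dict has a name and the twelve numeric keys listed above, and does not check the name against any list of allowed modules.

```python
from numbers import Real

# dx, dx_err, dy, dy_err, ... dz_rot, dz_rot_err
FLOAT_KEYS = tuple(
    axis + suffix
    for axis in ("dx", "dy", "dz", "dx_rot", "dy_rot", "dz_rot")
    for suffix in ("", "_err")
)


def validate_modules(modules):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for index, module in enumerate(modules):
        name = module.get("name")
        if not isinstance(name, str) or not name:
            problems.append("module %d: missing or empty 'name'" % index)
        for key in FLOAT_KEYS:
            value = module.get(key)
            # bool is a subclass of int, so reject it explicitly
            if isinstance(value, bool) or not isinstance(value, Real):
                problems.append(
                    "module %d: '%s' missing or not a number" % (index, key)
                )
    return problems
```

An empty return value means every module dict carries all thirteen entries.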

In case of the second call, the XML document has a format:

"<GeometryID  value='12'>
  <ModuleName name='TOF0' dx='12.1' dxerr='0.01' dy='22.2' dyerr='0.02' dz='32.3' dzerr='0.03' dxrot='0.11' dxroterr='0.011' dyrot='0.12' dyroterr='0.021' dzrot='0.13' dzroterr='0.031'/>
  <ModuleName name='TRACKER0' dx='12.4' dxerr='0.001' dyerr='0.002' dzerr='0.003' dy='22.5' dz='32.6' dxrot='0.14' dyrot='0.15' dzrot='0.14' dxroterr='0.0011' dyroterr='0.0021' dzroterr='0.0031' />
</GeometryID>" 

This is also a format of a document returned by one of 2 getters:

def get_correction_for_geometry_id(gid)
def get_corrections_for_run_xml(run)
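
On the receiving side, a document in the format shown above can be read with the standard library. This is a rough sketch only: parse_corrections_xml is a hypothetical helper, and the sample string is a shortened copy of the example above rather than a real server response. Attribute names are kept exactly as they appear in the document (dxerr, dxrot, and so on).

```python
import xml.etree.ElementTree as ET


def parse_corrections_xml(text):
    """Return (geometry_id, modules) from a <GeometryID> document.

    modules is a list of dicts, one per <ModuleName> element, with the
    'name' kept as a string and every other attribute converted to float.
    """
    root = ET.fromstring(text)
    if root.tag != "GeometryID":
        raise ValueError("unexpected root element: %s" % root.tag)
    geometry_id = int(root.get("value"))
    modules = []
    for element in root.findall("ModuleName"):
        module = {"name": element.get("name")}
        for key, value in element.attrib.items():
            if key != "name":
                module[key] = float(value)
        modules.append(module)
    return geometry_id, modules


SAMPLE = """<GeometryID value='12'>
  <ModuleName name='TOF0' dx='12.1' dxerr='0.01' dy='22.2' dyerr='0.02'/>
  <ModuleName name='TRACKER0' dx='12.4' dxerr='0.001' dy='22.5' dyerr='0.002'/>
</GeometryID>"""
geometry_id, modules = parse_corrections_xml(SAMPLE)
```

Since XML parsers are case sensitive, the module names come back exactly as stored (here 'TRACKER0', not 'Tracker0'), so any matching against GDML field names has to account for that.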

The Python code is on my launchpad tree as usual: bazaar.launchpad.net/~janusz-martyniak/mcdb/mice.cdb.client.api-python/

I'm still working on some bits of the server and the client, so some changes might be uploaded to launchpad in near future.

#13

Updated by Martyniak, Janusz over 5 years ago

I have changed the name of a getter by id to read:

def get_correction_for_geometry_id_xml(gid)

Also added some unit tests. Rev 53.
best, JM

#14

Updated by Martyniak, Janusz over 5 years ago

Hi,
I modified the API to accept a run number to be provided together with the geometry id. For a setter it acts as a minimum run number as outlined in #9.
The run number is optional for the setter:

def set_corrections(modules, geometry_id, comment='', runmin=0)

If an xml counterpart of the setter is used, the runmin attribute may be provided with the XML message (see #13).
The geometry_id and runmin are checked for compatibility and an exception is thrown if runmin does not belong to geometry_id validity period.

The getters are as follows:

 
def get_correction_for_geometry_id_xml(gid, run)
def get_corrections_for_run_xml(run)

The run number parameter in the first call is obligatory, and the run existence is not checked in this case (this could be modified). If gid and run are not compatible an exception is thrown.

The second method checks for run existence as part of matching the run to the gid it corresponds to.

Installed on preprod, with fake corrections for gid=12. You have to update your Python client from launchpad.
