Feature #1091

Implementation of pointer conversion from C++ to Json

Added by Rogers, Chris about 9 years ago. Updated almost 9 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: common_cpp
Target version:
Start date: 06 August 2012
Due date:
% Done: 100%
Estimated time:
Workflow: New Issue

Description

Here I mean cross linking between items in the tree. So for example you have

---SciFiEvent---
 |           |
Digits   Clusters

but you want to record which digits make up each cluster. The correct way to do that is to have a pointer from the Cluster to the Digit. But you can't do

---SciFiEvent---
      |
    Cluster
      |
    Digit

because before the Cluster is reconstructed you still need to attach the Digit to the SciFiEvent. You can't have e.g.

After digitisation:

---SciFiEvent---
      |
    Digit

After Cluster finding

---SciFiEvent---
      |
    Cluster
      |
    Digit

because that breaks the data structure. So what you really need is cross links

---SciFiEvent---
 |           |
Digits -> Clusters

where the -> is a pointer. The proper way to do this is to add a new converter in JsonCppConverters/Primitives.hh that handles a pointer type.

#1

Updated by Rogers, Chris almost 9 years ago

From the draft JSON Reference specification:

http://tools.ietf.org/html/draft-pbryan-zyp-json-ref-02#section-3

3. Syntax
   A JSON Reference is a JSON object, which contains a member named
   "$ref", which has a JSON string value.  Example:

   { "$ref": "http://example.com/example.json#/foo/bar" }

   If a JSON value does not have these characteristics, then it SHOULD
   NOT be interpreted as a JSON Reference.

   The "$ref" string value contains a URI [RFC3986], which identifies
   the location of the JSON value being referenced.  It is an error
   condition if the string value does not conform to URI syntax rules.
   Any members other than "$ref" in a JSON Reference object SHALL be
   ignored.

4. Resolution

   Resolution of a JSON Reference object SHOULD yield the referenced
   JSON value.  Implementations MAY choose to replace the reference with
   the referenced value.

   If the URI contained in the JSON Reference value is a relative URI,
   then the base URI resolution MUST be calculated according to
   [RFC3986], section 5.2.  Resolution is performed relative to the
   referring document.

   If a URI contains a fragment identifier, then the fragment should be
   resolved per the fragment resolution mechansim of the referrant
   document.  If the representation of the referrant document is JSON,
   then the fragment identifier SHOULD be interpreted as a
   [JSON-Pointer].
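For a reference that stays inside the same document, the "$ref" value reduces to a fragment such as "#/foo/bar". The following is a minimal sketch of splitting such a fragment into JSON Pointer tokens. The function name SplitInternalRef is hypothetical and is not MAUS code; percent-decoding of the URI fragment is deliberately left out, and only the JSON Pointer escapes "~0" (for "~") and "~1" (for "/") are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of MAUS): split an internal reference such as
// "#/foo/bar" into the JSON Pointer tokens {"foo", "bar"}.
std::vector<std::string> SplitInternalRef(const std::string& ref) {
    if (ref.empty() || ref[0] != '#') {
        throw std::invalid_argument("only internal references are supported");
    }
    std::vector<std::string> tokens;
    std::size_t pos = 1;  // skip the leading '#'
    if (pos == ref.size()) {
        return tokens;  // "#" alone refers to the whole document
    }
    if (ref[pos] != '/') {
        throw std::invalid_argument("malformed JSON Pointer");
    }
    while (pos < ref.size()) {
        assert(ref[pos] == '/');  // every token starts at a separator
        std::size_t next = ref.find('/', pos + 1);
        if (next == std::string::npos) {
            next = ref.size();
        }
        const std::string raw = ref.substr(pos + 1, next - pos - 1);
        // Undo the JSON Pointer escapes: "~1" means '/', "~0" means '~'.
        std::string token;
        for (std::size_t i = 0; i < raw.size(); ++i) {
            if (raw[i] == '~' && i + 1 < raw.size() && raw[i + 1] == '1') {
                token += '/';
                ++i;
            } else if (raw[i] == '~' && i + 1 < raw.size() && raw[i + 1] == '0') {
                token += '~';
                ++i;
            } else {
                token += raw[i];
            }
        }
        tokens.push_back(token);
        pos = next;
    }
    return tokens;
}
```

Each token is then applied in turn to the JSON tree, starting from the root of the referring document.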

Implementation

Propose that we use this, but only allow internal references (we aren't a web app after all). Propose we add a function "RegisterPointerReference" to ObjectProcessor:

    template <class ChildType>
    void RegisterPointerReference(std::string branch_name,
                    ProcessorBase<ChildType>* child_processor,
                    ChildType* (ObjectType::*GetMethod)() const,
                    void (ObjectType::*SetMethod)(ChildType* value),
                    bool is_required);
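To illustrate what the GetMethod/SetMethod arguments give the processor, here is a small self-contained sketch. Digit, Cluster and PointerAccessors are hypothetical stand-ins, not the MAUS classes; the only thing taken from the proposal is the shape of the two member-function pointers.

```cpp
#include <cassert>

// Hypothetical data classes standing in for the real MAUS ones.
struct Digit {
    int channel;
};

class Cluster {
  public:
    Cluster() : _digit(nullptr) {}
    Digit* GetDigit() const { return _digit; }
    void SetDigit(Digit* digit) { _digit = digit; }

  private:
    Digit* _digit;  // cross link; the Digit is owned elsewhere in the tree
};

// Holds the same two member-function pointers that the proposed
// RegisterPointerReference takes, and shows how they would be invoked.
template <class ObjectType, class ChildType>
struct PointerAccessors {
    ChildType* (ObjectType::*GetMethod)() const;
    void (ObjectType::*SetMethod)(ChildType* value);

    ChildType* Get(const ObjectType& object) const {
        return (object.*GetMethod)();
    }
    void Set(ObjectType& object, ChildType* value) const {
        (object.*SetMethod)(value);
    }
};

// Usage sketch: read and write the cross link through the accessors only.
inline void DemoAccessors() {
    Digit digit = {7};
    Cluster cluster;
    PointerAccessors<Cluster, Digit> accessors = {&Cluster::GetDigit,
                                                  &Cluster::SetDigit};
    accessors.Set(cluster, &digit);
    assert(accessors.Get(cluster) == &digit);
}
```

The point of passing the accessors rather than the member itself is that the processor never needs to know how the parent class stores the pointer.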

In order to fill data, we have to work in two phases, followed by a completeness check, e.g. for CppToJson:

  1. Fill the data tree recursively
    • Where a PointerReference is encountered, the JSON memory address of the parent and the C++ address of the reference target are stored in a container; the child pointer is filled with null (None)
    • Where a PointerValue is encountered, the JSON address of the child and the C++ address of the child are stored in a container.
    • When child objects are evaluated, the parent adds any references to its own reference container.
  2. The void ObjectProcessor::EvaluateReferences() function attempts to replace all the null (None/NULL) values for PointerReferences with the appropriate data for all child branches
  3. The bool ObjectProcessor::AreReferencesFilled() function returns true if all references are filled, else false
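The phases above can be sketched roughly as follows for the C++ -> JSON direction. This is an illustration under assumptions, not the proposed implementation: the RefManager class and its AddValue/AddReference/Resolved methods are hypothetical, and JSON locations are modelled as path strings rather than JSON memory addresses to keep the example self-contained.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical bookkeeping class; JSON locations are plain path strings here.
class RefManager {
  public:
    // Phase 1: a PointerValue was written at json_path; remember which C++
    // object it came from.
    void AddValue(const void* cpp_address, const std::string& json_path) {
        _values[cpp_address] = json_path;
    }

    // Phase 1: a PointerReference was met; its JSON slot stays null for now.
    void AddReference(const std::string& slot_path, const void* target) {
        _pending.push_back(Pending{slot_path, target});
    }

    // Phase 2: fill every slot whose target has been seen; keep the rest.
    void EvaluateReferences() {
        std::vector<Pending> unresolved;
        for (const Pending& ref : _pending) {
            std::map<const void*, std::string>::const_iterator it =
                _values.find(ref.target);
            if (it == _values.end()) {
                unresolved.push_back(ref);
            } else {
                _resolved[ref.slot_path] = "#" + it->second;
            }
        }
        _pending.swap(unresolved);
    }

    // Check: true only if no reference is still waiting for its target.
    bool AreReferencesFilled() const { return _pending.empty(); }

    // Slot path -> "$ref" string for every reference resolved so far.
    const std::map<std::string, std::string>& Resolved() const {
        return _resolved;
    }

  private:
    struct Pending {
        std::string slot_path;
        const void* target;
    };
    std::map<const void*, std::string> _values;
    std::vector<Pending> _pending;
    std::map<std::string, std::string> _resolved;
};
```

Because resolution is deferred to EvaluateReferences, it does not matter whether a reference or its target is encountered first while walking the tree.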

This works in both C++ -> JSON and JSON -> C++ conversion. Need to think about what the container should be (std::map isn't good enough I think, may need custom container). Should we always attempt to EvaluateReferences or only do so at the end/after user request?

Test Cases

  • Parent has two child references with different branch names but reference same object?
  • Cross references - branch_1["a"] references branch_2 and branch_2["b"] references branch_1. Should be okay.
  • Circular references - branch_1 references branch_2 and branch_2 references branch_1. Should be okay, but EvaluateReferences will clearly fail as the data was never stored.
  • Trying to evaluate result of a failed evaluation - i.e. reference target is None/NULL. Should throw an exception.
  • Edit to add: strings vs arrays - object reference with a "0" vs array reference to entry 0 - should handle correctly.
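The last case (a path token "0" meaning an object key in one place and an array index in another) could be exercised with something like the sketch below. Node and Step are hypothetical and only model enough of a JSON tree to show the distinction; they are not jsoncpp or MAUS types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, minimal JSON-like node: a leaf value, an object or an array.
struct Node {
    std::string value;                                     // leaf payload
    std::map<std::string, std::shared_ptr<Node> > members;  // object members
    std::vector<std::shared_ptr<Node> > elements;           // array elements
    bool is_array = false;
};

// Follow one path token from node. The same token "0" is an index when the
// node is an array and a key when it is an object. Returns nullptr if the
// token does not resolve.
const Node* Step(const Node& node, const std::string& token) {
    if (node.is_array) {
        // Array indices must consist of decimal digits only.
        if (token.empty() ||
            token.find_first_not_of("0123456789") != std::string::npos) {
            return nullptr;
        }
        const std::size_t index = std::stoul(token);
        return index < node.elements.size() ? node.elements[index].get()
                                            : nullptr;
    }
    std::map<std::string, std::shared_ptr<Node> >::const_iterator it =
        node.members.find(token);
    return it == node.members.end() ? nullptr : it->second.get();
}

// Usage sketch: the token "0" resolves differently in an object and an array.
inline void DemoZeroToken() {
    Node object;
    object.members["0"] = std::make_shared<Node>();
    object.members["0"]->value = "object member";
    Node array;
    array.is_array = true;
    array.elements.push_back(std::make_shared<Node>());
    array.elements[0]->value = "array element";
    assert(Step(object, "0")->value == "object member");
    assert(Step(array, "0")->value == "array element");
}
```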

Other options that I won't implement

The main issue with implementation here is that
  1. on conversion from C++ to JSON the JSON object may not have been created
  2. on conversion from JSON to C++ the C++ object may not have been created

A resolution would be to force the user to declare the JSON location of the object. Then at conversion from C++ to JSON the JSON object can be stored directly. At conversion from JSON to C++ we have to force the evaluation of the JSON tree. This implementation was rejected because if we have crossed references (references from branch 1 to branch 2; and references from branch 2 to branch 1) we may need to simultaneously evaluate two sections of the tree, at which point things get convoluted. It just feels difficult.

I also considered using the RegisterPointerBranch function rather than adding an identical one. In this case, MAUS determines at runtime which pointer stores the actual data and which pointer stores the reference. I decided not to do this because I thought the user would want to explicitly decide which pointer has the data and which is merely the reference. I really didn't want some weird situation where some data in e.g. an array is stored as reference and some data is stored as real data.

#2

Updated by Rogers, Chris almost 9 years ago

By the way, Adam - you requested the feature, so please feel free to comment on whether this meets your needs or whether I am barking up the wrong tree...

#3

Updated by Dobbs, Adam almost 9 years ago

Hi Chris. The technical details of how it is implemented are going over my head at the moment. The main thing I wanted to do was just as you described in the issue header: being able to refer from a higher level object (such as a track) to a lower level object (such as a spacepoint) without breaking the data structure or introducing needless replication of data.

A use case would be if, say, someone wanted to look at the proportion of triplet type spacepoints that make up tracks, compared to the triplet proportion of all spacepoints. How it is implemented I don't really mind so long as the functionality is there, particularly in the ROOT output.

It is also worth noting that the restriction on nested levels in ROOT, referred to in issue #1137, may well limit the usefulness I had foreseen for this feature. That said, I still believe it needs doing.

Ad

#4

Updated by Rogers, Chris almost 9 years ago

Alternatively this could be a workaround for #1137... that is to say, you have

Spill -> ReconEvent -> Digits
                    -> Clusters
                    -> SpacePoints
                    -> Tracks

These are all plottable as they are at level 3. But then to relate the Clusters to Digits you use the References as outlined above.

Spill -> ReconEvent -> Digits
                    -> Clusters -R> Digits
                    -> SpacePoints -R> Clusters
                    -> Tracks -R> SpacePoints

which is useful in more detailed analysis (where you need a script anyhow)
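As a rough picture of what that flat layout with references might look like on the C++ side, and of the triplet-fraction use case mentioned above, here is a sketch. All struct and function names are hypothetical, and it assumes a triplet spacepoint is one built from three clusters; the real MAUS classes may record the type differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical data classes. The event owns every object in flat lists;
// cross links are plain non-owning pointers.
struct Digit { int channel; };
struct Cluster { std::vector<Digit*> digits; };
struct SpacePoint { std::vector<Cluster*> clusters; };
struct Track { std::vector<SpacePoint*> spacepoints; };

struct ReconEvent {
    std::vector<Digit> digits;
    std::vector<Cluster> clusters;
    std::vector<SpacePoint> spacepoints;
    std::vector<Track> tracks;
};

// Assumption for this sketch: a triplet spacepoint is built from 3 clusters.
bool IsTriplet(const SpacePoint& spacepoint) {
    return spacepoint.clusters.size() == 3;
}

// Fraction of spacepoints referenced by tracks that are triplets.
double TripletFractionInTracks(const ReconEvent& event) {
    std::size_t total = 0;
    std::size_t triplets = 0;
    for (const Track& track : event.tracks) {
        for (const SpacePoint* spacepoint : track.spacepoints) {
            assert(spacepoint != nullptr);
            ++total;
            if (IsTriplet(*spacepoint)) ++triplets;
        }
    }
    return total == 0 ? 0.0 : static_cast<double>(triplets) / total;
}

// Builds one track made of one triplet and one doublet spacepoint. The owning
// vectors are sized first so the addresses taken afterwards stay valid.
void FillExampleEvent(ReconEvent* event) {
    event->clusters.resize(5);
    event->spacepoints.resize(2);
    event->tracks.resize(1);
    SpacePoint& triplet = event->spacepoints[0];
    SpacePoint& doublet = event->spacepoints[1];
    for (int i = 0; i < 3; ++i) triplet.clusters.push_back(&event->clusters[i]);
    for (int i = 3; i < 5; ++i) doublet.clusters.push_back(&event->clusters[i]);
    event->tracks[0].spacepoints.push_back(&triplet);
    event->tracks[0].spacepoints.push_back(&doublet);
}
```

Nothing is duplicated: the track only points at spacepoints that already live at the ReconEvent level.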

#5

Updated by Rogers, Chris almost 9 years ago

Note also YAML reference handling:

http://en.wikipedia.org/wiki/YAML

We won't do it this way... but interesting to note.

#6

Updated by Rogers, Chris almost 9 years ago

Committed first pass at successful C++ to JSON conversion (in rogers/maus/devel_2) for JSON objects only. List of things still to do:

  • Conversion json -> C++
  • Implement ArrayPointerRef or equivalent for arrays of cross references
  • Get new path info into all json types (at the moment only objects and primitives)
  • Implementation at top level i.e. SpillConverter
  • For JsonToCpp, append path information prior to conversion
  • Strip path information out after conversion
  • Handle required_branch
  • Error handling - multiple pointers to same data; unallocated reference
  • Test funny topology of references
  • What happens if there is an unrelated exception during data processing? Need to make sure that RefManager is destroyed (or at least cleared)
  • Many cosmetic improvements
  • Documentation
  • Worry: the clear/delete phase will get forgotten on Manager classes leading to errors
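One possible way to address the two cleanup worries in the list above (an unrelated exception, or a forgotten clear/delete) is a scope guard whose destructor always clears the reference state. This is only a sketch: RefState and RefStateGuard are hypothetical names, and RefState is a trivial stand-in for whatever the manager classes actually hold.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the reference data held during a conversion.
class RefState {
  public:
    RefState() : _entries(0) {}
    void Add() { ++_entries; }
    void Clear() { _entries = 0; }
    int Size() const { return _entries; }

  private:
    int _entries;
};

// Clears the state when it goes out of scope, whether the conversion
// finished normally or an exception is propagating.
class RefStateGuard {
  public:
    explicit RefStateGuard(RefState* state) : _state(state) {
        assert(state != nullptr);
    }
    ~RefStateGuard() { _state->Clear(); }
    RefStateGuard(const RefStateGuard&) = delete;
    RefStateGuard& operator=(const RefStateGuard&) = delete;

  private:
    RefState* _state;
};

// Usage sketch: returns true if the simulated conversion threw. In both
// cases the guard has cleared the state by the time the function returns.
bool ConvertAndMaybeFail(RefState* state, bool fail) {
    try {
        RefStateGuard guard(state);
        state->Add();
        if (fail) {
            throw std::runtime_error("unrelated failure during processing");
        }
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```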

#7

Updated by Rogers, Chris almost 9 years ago

Criticism of implementation (could be TODO)

  • There is a lot of static/singleton type stuff. This originates from the fact I store a static std::map<pointer_in, pointer_out> on the Resolvers. One could instead make a thin wrapper around std::map with the same inheritance structure as for the resolvers to remove staticness. That would make the whole algorithm more robust, and is more-or-less necessary if we need to parse multiple data trees simultaneously.
    • Indeed, this would also have some improvement in execution speed as currently we call ClearData on all of the Resolvers, to clear static data which has probably been cleared elsewhere.
  • There is some cut n paste between JsonToCpp and CppToJson side in ReferenceResolvers, with a few subtle differences that make it difficult to really copy. Probably this could be cleaned up with an appropriate abstraction, at the cost of some obscurity perhaps.
  • It may be better to have a (templated) function call like AddReference(...) for each Reference type on the RefManager. This would hide some obscurity/implementation details from poor Johnny user. OTOH this is already hidden at ObjectProcessor level - most users shouldn't have to dig even as far as the RefManager.
  • ReferenceResolver should have its own subdirectory from JsonCppProcessors/Common; and split the files; and move ObjectProcessors etc up to Common; and move tests to appropriate subdirectories... I will raise as a new issue though I think...
  • Would like to speed things up - main slow down is probably from putting the path into JSON, which could be optimised.
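For the first criticism in the list above, a non-static thin wrapper around std::map might look something like the sketch below. PointerMap is a hypothetical name, and the inheritance structure mentioned above is not reproduced; the sketch only shows that per-instance storage lets two data trees be handled independently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical per-conversion replacement for a static
// std::map<pointer_in, pointer_out>: each instance owns its own data.
template <class PointerIn, class PointerOut>
class PointerMap {
  public:
    // Returns false if pointer_in was already registered, which would
    // indicate multiple values claiming the same address.
    bool Insert(PointerIn pointer_in, PointerOut pointer_out) {
        return _map.insert(std::make_pair(pointer_in, pointer_out)).second;
    }

    // Copies the mapped value into *pointer_out; returns false if absent.
    bool Lookup(PointerIn pointer_in, PointerOut* pointer_out) const {
        assert(pointer_out != nullptr);
        typename std::map<PointerIn, PointerOut>::const_iterator it =
            _map.find(pointer_in);
        if (it == _map.end()) return false;
        *pointer_out = it->second;
        return true;
    }

    void Clear() { _map.clear(); }
    std::size_t Size() const { return _map.size(); }

  private:
    std::map<PointerIn, PointerOut> _map;
};
```

With this shape, clearing one tree's map cannot affect another tree that is being parsed at the same time, and there is no global state to forget to clear.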

#8

Updated by Rogers, Chris almost 9 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

#9

Updated by Rogers, Chris almost 9 years ago

Merged in r837

#10

Updated by Rogers, Chris almost 9 years ago

  • Target version changed from Future MAUS release to MAUS-v0.4.1
