This repository has been archived by the owner on Jun 3, 2023. It is now read-only.

An update to the JSON distribution schema #6

Open
wants to merge 4 commits into
base: main

Conversation

BobbyRBruce
Member

This distribution reformatting has the following benefits:

  • The min and max of a given bin are clearly stated.
  • Permits statistics with bins of different sizes, or sparse bins.
  • The 'numBins', 'binSize', 'min', and 'max' stats are no longer
    explicitly (and redundantly) stated. They can be derived easily if needed.
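
As a rough illustration of the last point, the dropped summary stats can be recomputed from the per-bin list. This is only a sketch in Python: it assumes each bin is an object with `min`, `max`, and `count` keys (the layout used in the examples in this thread), and the function name `summarize_bins` is hypothetical, not part of the schema or of any existing tooling.

```python
def summarize_bins(bins):
    """Recompute the summary stats that the new format no longer stores.

    `bins` is a list of dicts with "min", "max", and "count" keys
    (an assumed layout for this sketch).
    """
    if not bins:
        return {"numBins": 0, "min": None, "max": None, "binSize": None}

    # Collect the distinct bin widths; exact comparison is fine for
    # integer edges but may need a tolerance for floating-point edges.
    widths = {b["max"] - b["min"] for b in bins}
    return {
        "numBins": len(bins),
        "min": min(b["min"] for b in bins),
        "max": max(b["max"] for b in bins),
        # A single binSize only exists when every bin has the same width.
        "binSize": widths.pop() if len(widths) == 1 else None,
    }


bins = [
    {"min": 0, "max": 1, "count": 7},
    {"min": 1, "max": 2, "count": 2},
    {"min": 2, "max": 3, "count": 4},
]
summary = summarize_bins(bins)
```

With bins of different sizes, `binSize` comes back as `None`, which is one reason a single stored `binSize` does not fit the more flexible format.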

Contributor

@powerjg left a comment

I like this schema! I think it gives the best flexibility.

However, I think it's a good idea to also allow a more compact format with just numbers as values. Basically, a distribution could take either of two forms:

"hist" : {
  "value": [
    {"min": 0, "max": 1, "count": 7},
    {"min": 1, "max": 2, "count": 2},
    {"min": 2, "max": 3, "count": 4}
  ]
}

OR

"hist" : {
  "value": [7,2,4],
  "binSize": 1
}
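
To make the relationship between the two forms concrete, here is a minimal Python sketch that expands the compact form into the detailed one. It assumes fixed-width bins whose first bin starts at zero (as the later commit message in this PR describes); the helper name `expand_compact` and its `start` parameter are hypothetical and only serve the illustration.

```python
def expand_compact(counts, bin_size, start=0):
    """Turn a compact list of counts into explicit min/max/count bins.

    Assumes every bin has width `bin_size` and the first bin begins
    at `start` (zero by default).
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    return [
        {
            "min": start + i * bin_size,
            "max": start + (i + 1) * bin_size,
            "count": count,
        }
        for i, count in enumerate(counts)
    ]


detailed = expand_compact([7, 2, 4], bin_size=1)
```

Going the other way is only possible when all bins share one width and are contiguous, so the detailed form is the more general of the two.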

I see how my previous implementation of the schema didn't quite work the way I expected. Do you think it's possible to have a secondary more compact representation? I think I could figure out the right json-schema syntax to get it to work if we can agree on a compact representation.

I have one comment below... I think that "value" should be "count" in the bins.

Thoughts?

simstats.schema.json (outdated review thread, resolved)
BobbyRBruce and others added 3 commits January 18, 2021 11:35
* Added new properties of a Distribution
* Allowed the value of a distribution to be an array of numbers. This
allows for a simple distribution with a fixed bin size, starting from
zero.
@BobbyRBruce
Member Author

Hey @powerjg, thanks for the comments. I've changed value to count, and altered the schema so the "value" of a distribution may be an array of numbers. However, I don't know how to specify via the schema that when an array of numbers is given, the binSize must be specified. Likewise, I don't know how to state that binSize should not be specified when using the more detailed "value" (where the min and max for each bin are specified).
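
One possible way to express that constraint is JSON Schema's `oneOf`, combined with `required` and `not`. The sketch below writes such a fragment as a Python dict and restates the same rule in plain Python for quick checks. It is an assumption-laden illustration, not the contents of simstats.schema.json: the names `DISTRIBUTION_SKETCH` and `follows_rule` are hypothetical, and only the property names `value`, `binSize`, `min`, `max`, and `count` come from this thread.

```python
# Hypothetical fragment for a Distribution definition (JSON Schema draft-07),
# written as a Python dict. Exactly one of the two branches must match.
DISTRIBUTION_SKETCH = {
    "type": "object",
    "required": ["value"],
    "oneOf": [
        {   # Compact form: numbers only, so binSize is mandatory.
            "properties": {
                "value": {"type": "array", "items": {"type": "number"}},
                "binSize": {"type": "number", "exclusiveMinimum": 0},
            },
            "required": ["binSize"],
        },
        {   # Detailed form: explicit bins, so binSize is forbidden.
            "properties": {
                "value": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "required": ["min", "max", "count"],
                    },
                },
            },
            "not": {"required": ["binSize"]},
        },
    ],
}


def _is_number(x):
    # bool is a subclass of int in Python, but it is not a JSON number.
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def follows_rule(dist):
    """Plain-Python restatement of the oneOf rule sketched above."""
    if not isinstance(dist, dict) or not isinstance(dist.get("value"), list):
        return False
    value = dist["value"]
    if "binSize" in dist:
        # Compact branch: positive numeric binSize and numeric counts.
        size = dist["binSize"]
        return _is_number(size) and size > 0 and all(_is_number(v) for v in value)
    # Detailed branch: every bin states its own min, max, and count.
    return all(
        isinstance(v, dict) and {"min", "max", "count"} <= v.keys()
        for v in value
    )
```

If the third-party `jsonschema` package is available, the dict could be checked with `jsonschema.Draft7Validator(DISTRIBUTION_SKETCH).is_valid(instance)`; the plain-Python function is there only so the rule can be exercised without that dependency.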

I've also added more fields to the Distribution (overflow, underflow, sum, etc.).

2 participants