Support multiple climatological normal periods #231

Draft
wants to merge 7 commits into
base: master
from

Conversation

rod-glover
Contributor

@rod-glover rod-glover commented Jan 16, 2025

Resolves #229

NOTE: This branch extends i-226-hxtk-obs_raw but will be rebased when #227 is merged.

TODO:

@rod-glover rod-glover marked this pull request as draft January 16, 2025 20:19
@rod-glover
Contributor Author

There are still some outstanding questions to be answered regarding this design. These are copied directly from the Confluence document Data model(s) for multiple climatological normal periods.

Climatological Station

  1. A Climatological Station, unlike a Station, does not sensibly have a history. Therefore this entity combines select attributes of the existing Station and History tables.
  2. What does the term "base station" mean as used in Multiple climatological normal periods on the PCDS - Conceptual work plan?
  3. Multiple climatological normal periods on the PCDS - Conceptual work plan includes an attribute "Composite (binary): Indicates if station is a long-record or composite station"
    1. What is a long-record station? (And what if any is its relationship to the term "base station"?)
    2. Can the "long-record-ness" of a Climatological Station be deduced from the History records associated to it?
    3. Are there any other types of station we might want to consider? Is there any reason not to replace composite with a more open-ended attribute named "type" or the like?
  4. How should we represent position? With an attribute of type geometry, or with latitude and longitude attributes, or both (which has caused us problems in the existing History table)?
  5. The relation histories will be implemented by a cross-table. That table could, if desired, include info about how each History participates in the Climatological Station. That can be added later if desired.
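
The cross-table mentioned in item 5 could look roughly like the sketch below. It uses SQLite through Python's standard library only so that it runs anywhere; the table names, column names, and the optional `role` column are hypothetical and are not taken from this branch.

```python
import sqlite3

# Hypothetical schema: names below are illustrative, not the PR's actual tables.
SCHEMA = """
CREATE TABLE history (
    history_id INTEGER PRIMARY KEY,
    station_name TEXT NOT NULL
);
CREATE TABLE climo_station (
    climo_station_id INTEGER PRIMARY KEY,
    type TEXT NOT NULL
);
-- Cross-table implementing the many-to-many "histories" relation.
-- The nullable "role" column is a placeholder for info about how each
-- History participates in the Climatological Station (could be added later).
CREATE TABLE climo_station_x_history (
    climo_station_id INTEGER NOT NULL REFERENCES climo_station(climo_station_id),
    history_id INTEGER NOT NULL REFERENCES history(history_id),
    role TEXT,
    PRIMARY KEY (climo_station_id, history_id)
);
"""


def build_demo():
    """Create an in-memory database with one climo station linked to two histories."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO history VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")]
    )
    conn.execute("INSERT INTO climo_station VALUES (10, 'composite')")
    conn.executemany(
        "INSERT INTO climo_station_x_history (climo_station_id, history_id) "
        "VALUES (?, ?)",
        [(10, 1), (10, 3)],
    )
    return conn


def histories_for(conn, climo_station_id):
    """Return the ids of the History rows associated with a climo station."""
    rows = conn.execute(
        "SELECT h.history_id FROM history h "
        "JOIN climo_station_x_history x ON x.history_id = h.history_id "
        "WHERE x.climo_station_id = ? ORDER BY h.history_id",
        (climo_station_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Keeping the participation info on the cross-table row, rather than on either entity, means it can be introduced later without touching the History or Climatological Station tables.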

Climatological Variable

  1. We've followed CF Metadata Standards, 7.4 Climatological Statistics, in using climatology_bounds and value_time to indicate the climatological parameters of the statistic. Is there any reason not to do this?
  2. Given that, duration can be computed from climatology_bounds, so it is redundant and can be removed (normalization). It might nevertheless be convenient to keep, at the risk of becoming inconsistent with climatology_bounds. Proposal: omit it.
  3. In the existing Variable table, the precision attribute is a numeric value that is abused: e.g., 3 digits before the decimal and 2 digits after is represented as the real number 3.2. Let's not do that here. How should we do it? I propose either a string (with format restrictions) or a two-element vector of integers.
  4. What, if anything, is the use of net_var_name in this context? Does it need to be of type citext?
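
To make the two options for precision in item 3 concrete, here is a minimal sketch. It assumes a string format of `"<digits before>.<digits after>"`; the function names and the exact format restriction are hypothetical. It also illustrates one way the real-number encoding loses information: as floats, 3.10 and 3.1 are the same value.

```python
import re

# Assumed format restriction: two non-negative integers separated by a dot.
_PRECISION_RE = re.compile(r"([0-9]+)\.([0-9]+)")


def parse_precision(text):
    """Convert a restricted string such as "3.2" to a (before, after) pair."""
    match = _PRECISION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid precision string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def format_precision(pair):
    """Convert a (before, after) pair back to the restricted string form."""
    before, after = pair
    return f"{before}.{after}"


# The real-number encoding cannot tell 10 digits after the decimal from 1:
assert float("3.10") == float("3.1")
# The pair (or restricted string) representation keeps them distinct:
assert parse_precision("3.10") != parse_precision("3.1")
```

Either representation round-trips cleanly; the two-element vector avoids parsing at read time, while the string is easier to eyeball in query results.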

Climatological Value

  1. Multiple climatological normal periods on the PCDS - Conceptual work plan includes an attribute "Years per month: years of data going into each month".
    1. So the number of years going into a month (or other annual subdivision) varies by month? This is a novel datum; I'd like to understand it better.
    2. We should generalize this to arbitrary periods, so "Years per annual subdivision" or other generic term, suitably turned into an identifier, say num_contributing_years.
  2. See remark re. CF Metadata Standards, value_time above.
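
One possible reading of "years per month" is sketched below, purely as an assumption to help frame the question: the number of distinct years within the normal period that contribute usable data to each calendar month. Under that reading the count varies by month whenever a record has gaps. The function name reuses the proposed identifier num_contributing_years; the input shape is hypothetical.

```python
from collections import defaultdict


def num_contributing_years(observed_year_months, start_year, end_year):
    """Count distinct contributing years per calendar month.

    observed_year_months: iterable of (year, month) pairs for which usable
    data exists (hypothetical input shape).
    start_year, end_year: inclusive bounds of the normal period.
    Returns a dict mapping each month 1..12 to its count of distinct years.
    """
    years_by_month = defaultdict(set)
    for year, month in observed_year_months:
        if start_year <= year <= end_year:
            years_by_month[month].add(year)
    return {month: len(years_by_month.get(month, ())) for month in range(1, 13)}
```

Generalizing to arbitrary annual subdivisions would mean replacing the month number with whatever key identifies the subdivision (season, or the whole year).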

@rod-glover rod-glover force-pushed the i-229-multi-climo-normals branch from 2101ba8 to f9ac3e3 January 22, 2025 22:02
@rod-glover
Contributor Author

More questions:

  1. Many columns in the corresponding observation tables are (a) fixed-length strings and (b) nullable. In the new tables they are neither.
    • Fixed-length strings aren't really fixed-length: PG stores them all as variable-length values. So there is no particular reason to specify lengths, except perhaps to prevent overly long strings from being entered erroneously. Questionably, some columns, such as comments, have fixed lengths, while others that might be expected to be short and possibly fixed-length, such as country, do not.
    • Why do the original tables allow null values in so many places? E.g., all columns in History are nullable. Nulls in most of these columns cannot be correct.
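
As a rough illustration of the two points above, the sketch below declares columns non-nullable and caps a string's length only as a guard against erroneous input. SQLite (via Python's standard library) stands in for PG here so the example is runnable; the table name, column names, and the limit of 10 characters are all hypothetical.

```python
import sqlite3


def make_table(conn):
    """Create a demo table with non-nullable columns and one length guard."""
    conn.execute(
        """
        CREATE TABLE climo_station_demo (
            id INTEGER PRIMARY KEY,
            -- Length cap exists only to catch erroneous long input.
            country TEXT NOT NULL CHECK (length(country) <= 10),
            -- Free text: no length limit, but still not nullable.
            comments TEXT NOT NULL DEFAULT ''
        )
        """
    )


def try_insert(conn, country, comments=""):
    """Attempt an insert; return False if a constraint rejects the row."""
    try:
        conn.execute(
            "INSERT INTO climo_station_demo (country, comments) VALUES (?, ?)",
            (country, comments),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this arrangement a null or an implausibly long country value is rejected at insert time instead of surfacing later as bad data.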

Development

Successfully merging this pull request may close these issues.

Support multiple climatological normal periods
1 participant