-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix discrepancies with legacy portal in station / history counts #83
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few minor questions and comments. Looking great!
const formatDate = d => d ? d.toISOString().substr(0,10) : 'unknown'; | ||
|
||
const lexCompare = (a, b) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for sorting in the metadata table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is for sorting lists of things, e.g., lists of names, obs freqs. These arise because a station can have many histories, hence many names, obs freqs, etc. (These lists are reduced to their unique elements.) lexCompare
implements standard lexicographic ordering for lists.
I plan to document all this in the code / README before I merge.
{ | ||
minWidth: 80, | ||
maxWidth: 100, | ||
id: 'Uniq Obs Freqs', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exactly does the "Uniq" in this label refer to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stations can have multiple histories, hence multiple freqs; these are reduced to their unique elements (see also comment above).
// If we don't do this, and we use strict date matching, then a station with | ||
// several histories fully covering an interval will not be selected, even | ||
// though it should be. The question of what "adjacent" means is a bit tricky | ||
// ... would depend in part on history.freq to distinguish too-large gaps, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't care about gaps. The users of the data necessarily has to assume that there will be gaps... that's just the nature of weather data collection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's interesting. First, let's eliminate what I'm calling "strict" date matching, which is the condition
min_obs_time < uiStartDate < uiEndDate < max_obs_time
, plus allowance for nils
meaning complete containment in the observation interval. This is currently not used.
Currently, in this app, legacy PDP matching is used, which is the looser condition
min_obs_time < uiEndDate && uiStartDate < max_obs_time
, plus allowance for nils
meaning overlap with the observation interval
Currently, for both legacy PDP and this app, date matching is done separately for each history in a station. If this isn't right, it can be adjusted in this app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant about gaps is this: Consider a station with multiple, say 2, histories spanning dates A
to B
(hx1) and C
to D
(hx2).
Suppose the user specifies start and end dates s
, e
such that A < s < B < C < e < D
. The legacy matching rule will operate as follows:
- hx1:
A < e && s < B === true
: match - hx2
C < e && s < D === true
: match
That's as desired.
But when we have, say, A < B < s < e < C < D
then:
- hx1:
A < e && s < B === false
: no match - hx2
C < e && s < D === false
: no match
When start or end date fall into the gap between the histories' intervals, the matching fails. Even one of them in the gap means that one of the two histories will not match.
If we decide that the two histories really form one contiguous period, then this matching rule is incorrect. It will bite us if there are large gaps (C - B
) into which a start or end date could easily fall. In that case, we'd want a test that used only A
to D
as the interval, and matched both histories on that basis. That's not hard, but it's different than we have now.
Also, it gets more complicated if we decide that a "large" gap (C - B
) means something different than a small gap.
Also, maybe there's a complication with multiple histories that overlap: A < C < B < D
or the like.
Questions:
- Does the gap problem matter?
- Legacy PDP's effective definition of "station" is "history", and effectively ignores
meta_station
records. SDP treats history records linked by a common station as related.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One further note: Because they are drawn from station_obs_stats_mv
, which is updated directly from observations, min_obs_time
and max_obs_time
are never null (oops: except if there are no observations), and therefore the checking for nulls in the matching rules can (maybe) be simplified accordingly. This is true both for legacy PDP and for SDP.
But they are not like edate
, in which null means "ongoing". Ongoing or not, these values are non-null except when there are no observations at all for that history.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved this discussion to a separate issue.
I still have some multiple history-per-station updates to push. Just a couple of oversights. I'm also looking at why the app is so sluggish. My timing results suggest that it is rendering, not internal computation (e.g., filtering stations) that is so slow. Got a couple of ideas of how to improve it. |
The sluggishness is definitively in rendering the station markers. If If If With full marker content (tooltip + popup), the delay is about 6 s. |
Further investigation shows that |
1643ace
to
579adc9
Compare
I addressed your questions above. |
If you think that it's in good shape, I trust you. A cursory review of the range filtering shows it to be working like I expect it to. This is probably a separate issue, but I noticed that the delete feature on the map doesn't work? One can hit the garbage can, get a cursor that says "Click on a feature to remove." But clicking has no effect. "Clear All" works OK. |
I'll put in an issue for the delete problem. Not sure why -- it works in CE, but not here. However, we are using different versions of React Leaflet and Leaflet in this app. |
Resolves #81
Resolves #43
This PR depends on pacificclimate/station-data-portal-backend#25, which adds reliable
min_obs_time
andmax_obs_time
attributes to station history responses. This enables us to compare times on the same basis as the legacy portal.This PR includes the following changes:
with
statement!Notes: