SCT with fg
We validate observations based on spatial properties and first-guess values by applying a Spatial Consistency Test (SCT). The SCT compares each observed value with an expected value derived from neighboring observations and first-guess values. If an observed value deviates significantly from its expected value, the observation is flagged as suspect. The significance of a deviation is measured relative to the estimated observation error variance. This process is similar to a hypothesis test in which we decide whether an observation is good or suspect. The SCT score summarizes the comparison: if the score exceeds a specified threshold, the observation is considered suspect or bad (the two terms are used interchangeably).
All spatial analysis calculations are based on Optimal Interpolation (OI), as described in:
Lussana, C., Uboldi, F., and Salvati, M.R. (2010), A spatial consistency test for surface observations from mesoscale meteorological networks, Q.J.R. Meteorol. Soc., 136: 1075-1088. https://doi.org/10.1002/qj.622
Definitions
Pseudo-Algorithm
Function Signature
Diagnostic File
- Good Observation: An observation that accurately represents the actual atmospheric state, with reasonable accuracy and precision compared to its nearest neighbors.
- Suspect or Bad Observation: An observation that does not accurately represent the atmospheric state or has significantly different accuracy or precision than its neighbors.
- Centroid Observation: The center point of two concentric circles, the outer circle and the inner circle, with radii outer_radius and inner_radius, respectively.
- Outer Circle: The area used to select observations for assessing the quality of one or more observations simultaneously. It may include observations that help evaluate others but are not themselves assessed for quality.
- Inner Circle: The area that allows multiple observations to be flagged at the same time. Checking more observations simultaneously speeds up the quality control process but increases the risk of misclassifying good observations as suspect.
- SCT-score: Represents the "likelihood" that an observation is suspect. A higher SCT-score increases the chance of an observation being flagged as suspect, depending on the user-defined threshold.
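The relationship between the centroid, the inner circle, and the outer circle can be sketched as a distance query. This is a minimal illustration, not titanlib code: the `Station` type and the `haversine_m` and `within_radius` helpers are hypothetical names introduced here, and a spherical Earth with great-circle distances is assumed.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch (not part of titanlib).
struct Station {
    double lat_deg;
    double lon_deg;
};

// Great-circle distance in metres (haversine formula, spherical Earth assumed).
double haversine_m(const Station& a, const Station& b) {
    const double kEarthRadiusM = 6371000.0;
    const double kDegToRad = 3.14159265358979323846 / 180.0;
    const double dlat = (b.lat_deg - a.lat_deg) * kDegToRad;
    const double dlon = (b.lon_deg - a.lon_deg) * kDegToRad;
    const double s = std::sin(dlat / 2) * std::sin(dlat / 2) +
                     std::cos(a.lat_deg * kDegToRad) * std::cos(b.lat_deg * kDegToRad) *
                         std::sin(dlon / 2) * std::sin(dlon / 2);
    return 2.0 * kEarthRadiusM * std::asin(std::sqrt(s));
}

// Indices of all stations within radius_m of the centroid (centroid included).
std::vector<std::size_t> within_radius(const std::vector<Station>& stations,
                                       std::size_t centroid, double radius_m) {
    std::vector<std::size_t> selected;
    for (std::size_t i = 0; i < stations.size(); ++i) {
        if (haversine_m(stations[centroid], stations[i]) <= radius_m) {
            selected.push_back(i);
        }
    }
    return selected;
}
```

Calling `within_radius` once with the outer radius and once with the inner radius gives the observations used for the analysis and the subset that may be flagged, respectively.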
SCT iteration
- Gradually tighten thresholds with each iteration, making it harder to flag observations as suspect.
- Detection Loop: Identify suspect observations.
- Cluster Preservation Loop: Save data that blends well with neighbors.
- Stray Data Redemption Loop: Restore good observations that were flagged only because they sit among suspect neighbors.
- Flag Assignment Step: Assign a final flag to each observation based on detection, cluster preservation, and redemption results.
- Exit Condition: Terminate the SCT iteration if no suspect observations are found or the maximum number of iterations is reached.
The final flag is assigned by the Stray Data Redemption Loop, which relies on the flags from the previous two loops. Subsequent SCT iterations do not change the suspect flags from earlier iterations, but they may flag previously good observations as suspect.
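The exit condition and the rule that earlier suspect flags are never cleared can be sketched as follows. This is only an outline under stated assumptions: `run_sct_iterations` is a hypothetical name, and the `one_pass` callback stands in for the three loops plus the Flag Assignment Step, which are not implemented here.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical driver for the SCT iterations (not titanlib code).
// one_pass receives the current flags and returns the flags it proposes.
// Returns the number of iterations actually performed.
int run_sct_iterations(
    std::vector<int>& flags, int num_iterations,
    const std::function<std::vector<int>(const std::vector<int>&)>& one_pass) {
    int performed = 0;
    for (int it = 0; it < num_iterations; ++it) {
        const std::vector<int> proposed = one_pass(flags);
        int newly_flagged = 0;
        for (std::size_t i = 0; i < flags.size(); ++i) {
            // Only good -> suspect transitions are accepted; suspect flags persist.
            if (flags[i] == 0 && proposed[i] == 1) {
                flags[i] = 1;
                ++newly_flagged;
            }
        }
        ++performed;
        if (newly_flagged == 0) break;  // exit: no suspect observations found
    }
    return performed;
}
```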
Detection Loop:
- Loop over all observations:
- Check if the current observation qualifies as a centroid observation.
- If yes, gather all neighbors within the outer circle that were not flagged as suspect in previous SCT iterations.
- If the observation is isolated, exit without flagging it.
- For all selected observations, perform analysis and leave-one-out analysis.
- Estimate observation error variance within the outer circle.
- Compute the SCT score for all selected observations within the inner circle based on analysis, leave-one-out analysis, and observation error variance.
- Flag as suspect any observations (in the inner circle) with an SCT score exceeding the specified threshold.
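The scoring and flagging steps can be illustrated with a short sketch. The score form used here, the product of the analysis residual and the leave-one-out residual divided by the observation error variance, is an assumption consistent with the cited OI framework rather than a statement of what titanlib computes. The choice of threshold by the sign of the leave-one-out residual (`pos` for observations above the expected value, `neg` for those below) is also an assumption, and `sct_score` and `is_suspect` are hypothetical helper names.

```cpp
// Assumed score form: (yo - ya) * (yo - yav) / sig2, where yo is the observed
// value, ya the analysis, yav the leave-one-out analysis, and sig2 the
// estimated observation error variance.
double sct_score(double yo, double ya, double yav, double sig2) {
    return (yo - ya) * (yo - yav) / sig2;
}

// Assumption: pos applies when the observation lies above the leave-one-out
// analysis, neg when it lies below.
bool is_suspect(double yo, double ya, double yav, double sig2,
                double pos, double neg) {
    const double score = sct_score(yo, ya, yav, sig2);
    const double threshold = (yo - yav >= 0.0) ? pos : neg;
    return score > threshold;
}
```

Separate positive and negative thresholds let the test be stricter in one direction, for example when unrealistically high values are more likely than unrealistically low ones.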
Cluster Preservation Loop:
- Loop over all observations flagged by the Detection Loop:
- Treat each flagged observation as a centroid observation.
- Retrieve all neighbors close to the centroid (same neighbors as in the Detection Loop).
- For each selected observation, perform both analysis and leave-one-out analysis. These are based on an iterative Optimal Interpolation (OI) scheme with four iterations. In the first iteration, the background is the first-guess values. For subsequent iterations, the background is the leave-one-out analysis. The final analysis and leave-one-out analysis should be closer to observed values, especially when clusters of observations reconstruct similar atmospheric patterns.
- Estimate the observation error variance within the outer circle.
- Compute the SCT score for the centroid observation based on the analysis, leave-one-out analysis, and observation error variance.
- Flag the centroid observation as suspect if the SCT score exceeds the threshold.
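For reference, a single OI pass at the observation locations can be written as below. This is a sketch under assumptions: the correlation function is taken to be Gaussian with horizontal and vertical decorrelation lengths $D_h$ and $D_z$, and the symbols $\mathbf{S}$, $\mathbf{W}$, $d_{ij}$, and $\Delta z_{ij}$ are introduced here for illustration. The leave-one-out identity in the last line holds for a single linear OI pass; the iterative scheme described above repeats the update with the leave-one-out analysis as the new background.

```latex
\begin{align}
S_{ij} &= \exp\!\left[-\frac{1}{2}\left(\frac{d_{ij}}{D_h}\right)^{2}
          - \frac{1}{2}\left(\frac{\Delta z_{ij}}{D_z}\right)^{2}\right] \\
\mathbf{W} &= \mathbf{S}\left(\mathbf{S} + \varepsilon^{2}\mathbf{I}\right)^{-1} \\
\mathbf{y}^{a} &= \mathbf{y}^{b} + \mathbf{W}\left(\mathbf{y}^{o} - \mathbf{y}^{b}\right) \\
y^{av}_{i} &= y^{o}_{i} - \frac{y^{o}_{i} - y^{a}_{i}}{1 - W_{ii}}
\end{align}
```

Here $\mathbf{y}^{o}$, $\mathbf{y}^{b}$, $\mathbf{y}^{a}$, and $\mathbf{y}^{av}$ correspond to the observed, background, analysis, and leave-one-out analysis values, and $\varepsilon^{2}$ is the observation-to-background error variance ratio.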
Stray Data Redemption Loop:
- Loop over all observations flagged by the Cluster Preservation Loop:
- Treat each flagged observation as a centroid observation.
- Retrieve all neighbors close to the centroid, keeping only non-flagged neighbors.
- If the observation is isolated, flag it as suspect and exit.
- For all selected observations, perform analysis and leave-one-out analysis.
- Estimate observation error variance within the outer circle.
- Compute the SCT score for the centroid observation based on the analysis, leave-one-out analysis, and observation error variance.
- Flag the centroid observation as suspect if the SCT score exceeds the threshold.
ivec titanlib::sct_with_fg(const Points& points,
                           const vec& values,
                           const vec& background_values,
                           float values_min,
                           float values_max,
                           int num_min,
                           int num_max,
                           float inner_radius,
                           float outer_radius,
                           int num_iterations,
                           float min_horizontal_scale,
                           float max_horizontal_scale,
                           float vertical_scale,
                           const vec& pos,
                           const vec& neg,
                           const vec& eps2,
                           const vec& min_obs_var,
                           bool diagnostics,
                           const std::string& filename_diagnostics,
                           vec& sct_scores,
                           const ivec& obs_to_check)
Description:
- points: Longitude, latitude, and elevation of observation locations
- values: Observed values
- background_values: First-guess values at observation locations
- values_min: Minimum acceptable observed value (set equal to values_max to ignore)
- values_max: Maximum acceptable observed value (set equal to values_min to ignore)
- num_min: Minimum required observations within the outer radius (must be > 1)
- num_max: Maximum observations used for the test (must be > num_min)
- inner_radius: Radius for flagging [m]
- outer_radius: Radius for computing OI [m]
- num_iterations: Maximum iterations (stops if no new flags are set)
- min_horizontal_scale: Minimum horizontal decorrelation length [m]
- max_horizontal_scale: Maximum horizontal decorrelation length [m]
- vertical_scale: Vertical decorrelation length [m]
- pos: Allowed positive deviation
- neg: Allowed negative deviation
- eps2: Observation-to-background error variance ratio (e.g., 0.5 means observations are trusted twice as much as the background)
- min_obs_var: Minimum observation error variance (reflects estimated representativeness error or expected observation uncertainty at min_horizontal_scale)
- diagnostics: Whether to write diagnostics to a file (true or false)
- filename_diagnostics: Diagnostics filename
- obs_to_check: Observations to be checked (1 = check, 0 = ignore)
Returns:
- Flags indicating suspect observations (1 = suspect, 0 = good)
- sct_scores: SCT (gross error) score per observation, returned through the output argument (higher values indicate a greater likelihood of a measurement error or a large representativeness error)
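The documented constraints on num_min and num_max can be checked before calling the function. The helper below is a hypothetical pre-flight check, not part of titanlib. The length checks rest on the assumption that values and background_values hold one entry per point, which the parameter descriptions suggest but do not state outright.

```cpp
#include <cstddef>
#include <string>

// Hypothetical pre-flight check (not part of titanlib).
// Returns an empty string when the arguments look consistent,
// otherwise a short description of the first problem found.
std::string check_sct_args(std::size_t n_points, std::size_t n_values,
                           std::size_t n_background, int num_min, int num_max) {
    // Constraints stated in the parameter descriptions.
    if (num_min <= 1) return "num_min must be > 1";
    if (num_max <= num_min) return "num_max must be > num_min";
    // Assumption: one observed value and one first-guess value per point.
    if (n_values != n_points) return "values must match points in length";
    if (n_background != n_points) {
        return "background_values must match points in length";
    }
    return "";
}
```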
Header:
it;loop;curr;i;index;lon;lat;z;yo;yb;ya;yav;dh;sig2;flags_d;scores_d;flags_c;scores_c;saved_c;flags_r;scores_r;saved_r;flags;sct_scores;
Description:
- it: SCT iteration.
- loop: Loop index (1: Detection Loop, 2: Cluster Preservation Loop, 3: Stray Data Redemption Loop, 4: Flag Assignment Step).
- curr: Index of the centroid observation.
- i: Index of an observation within the outer circle.
- index: Index of an observation in the outer circle, referencing the full observation vector.
- lon: Longitude.
- lat: Latitude.
- z: Elevation (m a.m.s.l.).
- yo: Observed value.
- yb: Background value.
- ya: Analysis.
- yav: Leave-one-out analysis.
- dh: Horizontal decorrelation length for the Gaussian correlation function used in OI (m).
- sig2: Observation error variance within the outer circle.
- flags_d: Flag assigned by the Detection Loop.
- scores_d: SCT score from the Detection Loop.
- flags_c: Flag assigned by the Cluster Preservation Loop.
- scores_c: SCT score from the Cluster Preservation Loop.
- saved_c: Indicates whether the Cluster Preservation Loop saved this observation.
- flags_r: Flag assigned by the Stray Data Redemption Loop.
- scores_r: SCT score from the Stray Data Redemption Loop.
- saved_r: Indicates whether the Stray Data Redemption Loop saved this observation.
- flags: Final assigned flag.
- sct_scores: SCT score, either from the latest SCT iteration or from when the observation was flagged as suspect.
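Because the header is a semicolon-separated list of column names, a diagnostics file can be read with a small splitter. The sketch below assumes that data rows use the same layout as the header, including the trailing semicolon; `split_fields` and `parse_row` are hypothetical helper names.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Splits a semicolon-separated line into fields.
// A trailing semicolon does not produce an extra empty field.
std::vector<std::string> split_fields(const std::string& line) {
    std::vector<std::string> fields;
    std::stringstream stream(line);
    std::string field;
    while (std::getline(stream, field, ';')) {
        fields.push_back(field);
    }
    return fields;
}

// Pairs each column name with the field at the same position in a data row.
// Fields beyond the number of column names are ignored.
std::map<std::string, std::string> parse_row(
    const std::vector<std::string>& names, const std::string& line) {
    std::map<std::string, std::string> row;
    const std::vector<std::string> fields = split_fields(line);
    for (std::size_t k = 0; k < names.size() && k < fields.size(); ++k) {
        row[names[k]] = fields[k];
    }
    return row;
}
```

Passing the header line to `split_fields` yields the column names, which can then be reused with `parse_row` for every subsequent line. Values are kept as strings so the caller can decide how to convert each column.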
Copyright © 2019-2023 Norwegian Meteorological Institute