Add checks on CatalogueExport file #236

Open
MaximMoinat opened this issue Apr 25, 2022 · 3 comments

Comments

@MaximMoinat

We are seeing some outliers in the DatabaseCatalogue caused by data quality issues, for example a negative Cumulative Observation Time (Observation Period). These outliers make the data visualisations hard to interpret.

One way to improve this would be to run checks on the file when it is imported for the database dashboard. If, for example, negative times are found, the user should get a message back. Some checks would only trigger a warning, while others would result in the upload being rejected.

Note: data quality checks are also being done by the DQD. We should not try to redesign that, but only focus on data issues that would give problems with the Dashboard visualisations.
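
To make the warning versus rejection idea concrete, here is a minimal sketch in Python. It is not the NetworkDashboard implementation: the `count_value` column name, the `Severity` levels, and the specific rules (negative value is an error, zero is a warning) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    WARNING = "warning"  # upload continues, user is informed
    ERROR = "error"      # upload is rejected


@dataclass
class Finding:
    severity: Severity
    row_number: int
    message: str


def check_rows(rows):
    """Check an iterable of row dicts and return a list of findings.

    Assumes each row has a 'count_value' field (hypothetical column name).
    """
    findings = []
    for number, row in enumerate(rows, start=1):
        raw = row.get("count_value", "")
        try:
            value = float(raw)
        except (TypeError, ValueError):
            findings.append(
                Finding(Severity.ERROR, number, f"count_value is not numeric: {raw!r}")
            )
            continue
        if value < 0:
            # Illustrative rule: negative values would distort the charts.
            findings.append(
                Finding(Severity.ERROR, number, f"negative count_value: {value}")
            )
        elif value == 0:
            # Illustrative rule: zero is suspicious but not fatal.
            findings.append(
                Finding(Severity.WARNING, number, "count_value is zero")
            )
    return findings


def upload_accepted(findings):
    """An upload is accepted only when no finding has ERROR severity."""
    return not any(f.severity is Severity.ERROR for f in findings)
```

The rows could come from `csv.DictReader` over the uploaded file. The findings list can be shown to the user in full, so warnings are still reported when the upload is accepted.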

@aspedrosa
Contributor

In this tool, we assume that all data coming from CatalogueExport is correct. If it is not, shouldn't this be corrected at the data generation level, in CatalogueExport itself?

@MaximMoinat
Author

We could indeed do a clean-up in the R script of the CatalogueExport. However, at that point we do not know what kinds of visualisations are made or which outliers would create issues. So ideally, each (new) visualisation would declare the ranges in which it expects its data to fall. These expectations would then have to be implemented on the NetworkDashboard side.
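
As a rough sketch of what per-visualisation expectations could look like, each chart could register the range it expects, and the upload step could compare incoming rows against that registry. The chart names, the `analysis_id` values, and the `analysis_id` and `count_value` column names below are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Expectation:
    chart: str                       # hypothetical chart name
    analysis_id: int                 # hypothetical id of the rows the chart reads
    minimum: Optional[float] = None  # None means no lower bound
    maximum: Optional[float] = None  # None means no upper bound

    def violated_by(self, value: float) -> bool:
        if self.minimum is not None and value < self.minimum:
            return True
        if self.maximum is not None and value > self.maximum:
            return True
        return False


# Illustrative registry; real charts and ids would be defined by the dashboard.
EXPECTATIONS = [
    Expectation("cumulative_observation_time", analysis_id=1, minimum=0),
    Expectation("example_percentage_chart", analysis_id=2, minimum=0, maximum=100),
]


def violations(rows, expectations=EXPECTATIONS):
    """Return one message per row value that falls outside a chart's range."""
    by_id = {}
    for expectation in expectations:
        by_id.setdefault(expectation.analysis_id, []).append(expectation)

    messages = []
    for row in rows:
        value = float(row["count_value"])
        # Rows with an analysis_id that no chart uses are ignored.
        for expectation in by_id.get(int(row["analysis_id"]), []):
            if expectation.violated_by(value):
                messages.append(
                    f"{expectation.chart}: value {value} is outside the expected range"
                )
    return messages
```

Keeping the ranges next to the chart definitions means a new visualisation brings its own checks with it, without touching the CatalogueExport R script.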

@MaximMoinat
Author

Your solution as proposed in issue #232 would also work here.

Then we can move these data checks to the CatalogueExport and provide warnings when the data is generated. Still, it would be really nice if the NetworkDashboard also gave a warning or error when unexpected data is uploaded.
