-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
content-sqlite: verify database integrity during module load #4340
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Problem: if content-sqlite encounters a fatal error, it may not unregister itself from the content cache in the broker before exiting the module thread. Fix the control flow so the backing store is always unregistered on failure.
Problem: if database is corrupt, content-sqlite may load successfully then fail a load/store request, resulting in secondary failures such as a KVS module load failure, and content-flush hang in rc3. Add the sqlite "quick_check" pragma which detects a corrupt database during module initialization. Fixes flux-framework#4338
Thanks, I tacked on a bit of testing to try to improve the numbers. |
Problem: there is no coverage for content-sqlite's handling of a corrupt database file. Attempt to create a couple of forms of database corruption and ensure that the content-sqlite module fails to load.
Would it be worthwhile to copy the |
Maybe, although I haven't seen that database get corrupted on my test system. Be aware the overhead (when the pragma is evaluated during initialization) is proprtional to the number of rows in the database. |
ahhh, which there would no cleanup in the job-archive dbs long term. I guess its ok to leave this out for the time being in |
Codecov Report
@@ Coverage Diff @@
## master #4340 +/- ##
==========================================
+ Coverage 83.62% 83.65% +0.03%
==========================================
Files 389 389
Lines 65548 65550 +2
==========================================
+ Hits 54814 54836 +22
+ Misses 10734 10714 -20
|
I couldn't quite hit the code path I wanted by selectively mangling the database file, although I did cover another uncovered path. Oh well, let's call this good. Thanks for the review! Setting MWP. |
Problem: if the content-sqlite database is corrupted, this is typically discovered at runtime when the KVS attempts to load/store a blob, which causes secondary fallout. It would be better to catch that condition early.
Add the quick_check sqlite pragma. Since pragmas are executed before the module enters the reactor, if this check fails, the module load fails, and rc1 aborts before the higher level services are loaded.
This addresses #4338