Database connection error handling #268
Comments
Moved discussion here @sonallux
Anything related to connection losses on idle connections in the connection pool should be resolved with jvalue/node-dry-pg#2. When performing a query, the connection pool will automatically try to establish a new connection if there is no active connection in the pool. Therefore those errors should not concern the services.

First, we must distinguish between database schema violation errors (e.g. client errors) and network-related errors. I am going to focus only on the network-related errors now. Further, we should also distinguish between the origins of the database queries, because each of them needs different error handling. I have currently identified these three origins:

Service startup (table initialization)

Here we have two options: wait and retry, or fail fast.
The wait and retry option is convenient in development. But if the ODS should ever run in production, I would move to the fail-fast option, because the database initialization is only needed on the very first start of the database. On all other service startups, executing the database initialization is unnecessary and can break things if it is not idempotent.

When running in production, database migrations are also a scenario that will arise at some point. For me, database initialization and database migration are actually very similar and ideally should be handled similarly. But as database migrations are a complex topic of their own, they should be handled separately when needed.

REST request

For me, the only option is to return a 5XX error. This is already done, as the error of the rejected query Promise just bubbles up until it is caught by the default express error handler, which returns a 500 response (see #247).

Async message/event

I think we are currently just logging the error. This is definitely not an appropriate error handling mechanism. In those cases, I would use the feature of rejecting/nacking messages back to the message broker, so they do not get lost. Then the message can either be redelivered or put in a dead-letter queue.

Further points

Here are some further points that can influence the above decisions. Most of the points do not affect us now. But I would like to mention them here, as they become important when running the ODS in production with live traffic.
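For the async message/event case above, a rough sketch of what rejecting/nacking could look like. This assumes an AMQP-style broker; the `Channel` and `Message` types are hypothetical stand-ins loosely modeled on amqplib (`nack(message, allUpTo, requeue)`), and the "requeue once, then dead-letter" policy is only one possible choice, not something decided in this thread.

```typescript
// Hypothetical minimal types, loosely modeled on an AMQP client such as amqplib.
interface Message {
  content: Uint8Array;
  fields: { redelivered: boolean };
}

interface Channel {
  ack(message: Message): void;
  nack(message: Message, allUpTo?: boolean, requeue?: boolean): void;
}

type Outcome = 'ack' | 'requeue' | 'dead-letter';

// Assumed policy: a failed message is requeued once; if it fails again
// after redelivery, it is rejected without requeue so the broker can
// route it to a dead-letter queue (if one is configured).
function decideOutcome(failed: boolean, redelivered: boolean): Outcome {
  if (!failed) {
    return 'ack';
  }
  return redelivered ? 'dead-letter' : 'requeue';
}

// Wraps a message handler so that errors are never just logged:
// the message is either acked or handed back to the broker.
async function handleMessage(
  channel: Channel,
  message: Message,
  handler: (message: Message) => Promise<void>
): Promise<Outcome> {
  let failed = false;
  try {
    await handler(message);
  } catch (error) {
    failed = true;
  }

  const outcome = decideOutcome(failed, message.fields.redelivered);
  if (outcome === 'ack') {
    channel.ack(message);
  } else {
    // requeue = true -> redelivery; requeue = false -> drop or dead-letter
    channel.nack(message, false, outcome === 'requeue');
  }
  return outcome;
}
```

Whether a rejected message actually ends up in a dead-letter queue depends on how the queue is configured on the broker side; without that configuration a non-requeued message would simply be dropped.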
I agree with everything you say! Since we have retries on startup configured via env variables, we can set the retries to 0 on future Kubernetes deployments and let k8s restart the containers on failure. The schema initialization and migrations should be handled differently than they are right now. But that can happen later on, when we have a version deployed.
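To make the "retries = 0 means fail fast" idea concrete, here is a minimal sketch. The variable names `DB_INIT_RETRIES` and `DB_INIT_BACKOFF_MS` are made up for illustration (the actual names used by the services may differ), and `init` stands for whatever function performs the table initialization.

```typescript
interface RetryConfig {
  retries: number;   // additional attempts after the first one
  backoffMs: number; // wait time between attempts
}

// Reads the retry settings from an env-like object.
// DB_INIT_RETRIES and DB_INIT_BACKOFF_MS are hypothetical names.
function readRetryConfig(env: Record<string, string | undefined>): RetryConfig {
  const parse = (value: string | undefined, fallback: number): number => {
    const parsed = parseInt(value ?? '', 10);
    return Number.isNaN(parsed) || parsed < 0 ? fallback : parsed;
  };
  return {
    retries: parse(env['DB_INIT_RETRIES'], 0),
    backoffMs: parse(env['DB_INIT_BACKOFF_MS'], 1000),
  };
}

// Runs the initialization and retries on failure. With retries = 0 the
// first error is rethrown immediately, so the process can exit and the
// orchestrator (e.g. Kubernetes) restarts the container.
async function initWithRetry(
  init: () => Promise<void>,
  config: RetryConfig
): Promise<void> {
  for (let attempt = 0; ; attempt++) {
    try {
      await init();
      return;
    } catch (error) {
      if (attempt >= config.retries) {
        throw error;
      }
      await new Promise<void>(resolve => setTimeout(resolve, config.backoffMs));
    }
  }
}
```

In development one would set a non-zero retry count to get the wait and retry behaviour; in a Kubernetes deployment the default of 0 in this sketch gives the fail-fast behaviour.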
How will error handling work in the services if connections to the db break?
Should they be able to use a subscribeToError function to perform custom error handling?
Or do we want to handle that synchronously when requesting queries? Should we catch that specific case and throw WebExceptions 500 "Database currently not reachable" or something similar?
Originally posted by @georg-schwarz in jvalue/node-dry-pg#2 (comment)
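As an illustration of the synchronous option from the question, a small sketch of how a network-related database error could be told apart from other errors and mapped to a 500 response with the suggested message. The helper names are hypothetical. The check relies on the `code` property that Node.js socket errors and PostgreSQL errors carry (SQLSTATE class `08` is "connection exception", `57P01` is `admin_shutdown`); which codes really show up in the services is an assumption that would need to be verified.

```typescript
interface HttpError {
  status: number;
  message: string;
}

// Node.js system error codes that typically indicate a network problem.
const NODE_NETWORK_CODES = new Set([
  'ECONNREFUSED',
  'ECONNRESET',
  'ETIMEDOUT',
  'EPIPE',
  'ENOTFOUND',
]);

// Returns true if the error looks network-related rather than a
// schema violation or other client error.
function isConnectionError(error: unknown): boolean {
  if (typeof error !== 'object' || error === null) {
    return false;
  }
  const code = (error as { code?: unknown }).code;
  if (typeof code !== 'string') {
    return false;
  }
  // SQLSTATE class 08 = connection exception, 57P01 = admin_shutdown
  return NODE_NETWORK_CODES.has(code) || code.startsWith('08') || code === '57P01';
}

// Maps any error thrown by a query to the response that should be sent.
function toHttpError(error: unknown): HttpError {
  if (isConnectionError(error)) {
    return { status: 500, message: 'Database currently not reachable' };
  }
  return { status: 500, message: 'Internal Server Error' };
}
```

Something like `toHttpError` could be called from an express error-handling middleware, so that individual request handlers do not need their own try/catch around each query.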