Imagine a simple scenario where Electric is running inside a Kubernetes pod with no persistent storage. Some shapes get created, so Electric creates a publication in Postgres and starts processing transactions.
When the pod is restarted, a new file system is created for it with no traces of the previous shape storage. Electric will no longer be able to process incoming transactions from Postgres, causing the latter to build up its WAL backlog indefinitely.
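To make the backlog concrete, here is a minimal sketch of how the WAL retained by a replication slot could be measured. The helper names (`parse_lsn`, `retained_wal_bytes`, `SLOT_LAG_QUERY`) are hypothetical and not part of Electric; the SQL uses the standard `pg_replication_slots` view and `pg_wal_lsn_diff` function available in Postgres 10 and later.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a Postgres LSN such as '16/B374D848' into an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the upper 32 bits, the part after is the lower 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL that Postgres must keep because the slot has not advanced."""
    return parse_lsn(current_lsn) - parse_lsn(restart_lsn)


# The same figure can be read directly from Postgres (run with any SQL client):
SLOT_LAG_QUERY = """
SELECT slot_name,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots;
"""
```

If `retained_bytes` keeps growing while the slot's consumer is not confirming any progress, the situation described above is likely occurring.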
Electric should be able to detect this failure mode and drop transactions when there is no active shape collector process instead of keeping those transactions around by refusing to advance the replication slot.
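The proposed behaviour can be sketched as a small decision function. This is only an illustration of the idea, not Electric's implementation: `next_flushed_lsn` and its parameters are hypothetical names, and LSNs are modelled as plain integers.

```python
def next_flushed_lsn(
    flushed_lsn: int,
    tx_lsn: int,
    has_active_consumers: bool,
    tx_persisted: bool,
) -> int:
    """Return the LSN to acknowledge to Postgres after seeing a transaction.

    Assumption: acknowledging an LSN lets Postgres recycle WAL up to that point.
    """
    if not has_active_consumers:
        # No shape consumer needs this transaction, so drop it and
        # acknowledge it right away instead of holding the slot back.
        return max(flushed_lsn, tx_lsn)
    if tx_persisted:
        # The transaction was written to shape storage; safe to acknowledge.
        return max(flushed_lsn, tx_lsn)
    # A consumer exists but has not persisted the transaction yet: do not advance.
    return flushed_lsn
```

The `max` calls keep the acknowledged position from ever moving backwards, which matters if transactions are reported out of order during recovery.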
I think this is also related to #1774 - if Electric boots up and has no shapes, it should update the replication slot accordingly. I feel that there should be a mechanism/service that keeps the publication properly configured, and perhaps that should also inform how to handle "deprecated" transactions
For multi-tenancy, losing storage would mean we don't know where the databases are, so we couldn't clean up the publications. But multi-tenancy is an advanced use case so perhaps we can get away with not dealing with it.
> Electric should be able to detect this failure mode and drop transactions when there is no active shape collector process instead of keeping those transactions around by refusing to advance the replication slot.
If I read correctly, when a fresh Electric instance finds a pre-existing replication slot, it sees that the slot is inconsistent with its local metadata (which is empty) and doesn't use it. This is a conservative approach, in case the owner of the replication slot connects later.
> if Electric boots up and has no shapes, it should update the replication slot accordingly.
@msfstef, meaning just drop the replication slot and recreate it, right? Should we require this to be explicitly intentional, with something like --force-recreate-replication-slot, or be more optimistic and do it automatically, since it's important to make sure that we clean up the WAL and shapes are cheap anyway? (cc @KyleAMathews @robacourt)
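To compare the two options, here is a rough sketch of the boot-time decision. Everything in it is hypothetical: `SlotAction`, `choose_slot_action`, and the `force_recreate` and `optimistic` parameters are illustrative names, with `force_recreate` standing in for the flag suggested above.

```python
from enum import Enum


class SlotAction(Enum):
    CREATE = "create"              # no slot yet: create a fresh one
    RESUME = "resume"              # slot matches local metadata: keep streaming
    RECREATE = "recreate"          # drop the stale slot and create a new one
    LEAVE_UNUSED = "leave_unused"  # conservative: do not touch the existing slot


def choose_slot_action(
    slot_exists: bool,
    has_local_metadata: bool,
    force_recreate: bool = False,
    optimistic: bool = False,
) -> SlotAction:
    """Decide what to do with the replication slot when Electric boots."""
    if not slot_exists:
        return SlotAction.CREATE
    if has_local_metadata:
        return SlotAction.RESUME
    # A slot exists but local storage is empty, e.g. after a pod restart
    # without persistent storage.
    if force_recreate or optimistic:
        return SlotAction.RECREATE
    return SlotAction.LEAVE_UNUSED
```

With `optimistic=True` as the default, a restarted pod would release the retained WAL without operator intervention, at the cost of possibly dropping a slot whose original owner reconnects later.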