Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync enhancements #36

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

Sync enhancements #36

wants to merge 8 commits into from

Conversation

yhabteab
Copy link
Member

@yhabteab yhabteab commented Jul 4, 2023

Currently, all k8sv1.Objects are constantly bulk upserted, regardless of whether their properties or just their sub-resources have changed. For example, when a container is terminated, the pod is updated first and then all its containers (not just the one that was terminated), pod_{owner, volume,condition}, and so on. As you can see, so many useless queries are executed just because a container has changed its state. This PR addresses exactly this shortcoming of config sync.

Initial Sync

To make this work reliably, all tables have a properties checksum attribute from now on, which contains the SHA1 of all columns. In addition, each of the sync pipelines provides its own internal cache store, which is populated with the respective properties checksum when initially starting the sync. This takes place as follows:

  • Sync#warmup() retrieves all database records for the specified sync subject and their relations using DB#YieldAll() recursively. All base types (i.e. entities implementing the contracts.Resource interface) are handed over to the controller, where the fingerprint of all entities loaded from the DB is added to the sync cache store simultaneously.
    if err := s.store.Add(e.(contracts.Entity).Fingerprint()); err != nil {
    return err
    }
    // The controller doesn't need to know about the entities of a k8s sub resource.
    if resource, ok := e.(contracts.Resource); ok {
    if err := c.Announce(resource); err != nil {
    return err
    }
    }

    The Fingerprint is tiny and usually contains the PropertiesChecksum, the entity and its parent unique identifier like ID, PodId. However, when the sync pipeline is being started with the NoWarmup sync feature, the process described above is a no-op.

Upsert events

When k8s propagates an Update/Add event for an object, the objects are always bulk upserted. Here's what this process looks like.

  • It initially copies the first item from the specified stream and checks if this entity type provides relations. If so, it first traverses all these relations recursively and starts a separate goroutine and caches the streams for the started goroutine/relations type.
  • After the relations have been resolved, another goroutine is started which consumes from the specified chan and performs the following actions for each of the streamed entities:
    • If the consumed entity doesn't satisfy the contracts.Entity interface, it will just forward that entity to the next stage.
    • When the entity does satisfy the contracts.Entity, it applies the preExecution callback on this entity (which should compare the checksums), and forwards the entity to the forward chan only if the filter function returns true and initiates a database upsert stream.
    • Regardless, whether the function returns true, it will stream each of the child entity with the relation.Stream() method to the respective cached stream of the relation.

However, when the first item doesn't satisfy the database.HasRelations interface, it will just use only two stages for the streamed entities to be upserted:

  • The first stage just consumes from the source stream and applies the preExecution callback (if any) on each of the entities. This won't forward entities for which the preExec function didn't also return true as well.
  • The second stage just performs a database upsert queries for entities that were forwarded from the previous one.

Delete events

The workflow for delete events is quite similar to that of upsert events but operates only by ID and not by an entire entity object.

The `MarshalJSON()` method of all our custom types currently returns
`nil` if its value is invalid, which isn't a valid json value. In
order to fix this, the methods have to return a `null` string instead.
@cla-bot cla-bot bot added the cla/signed CLA is signed by all contributors of a PR label Jul 4, 2023
@yhabteab yhabteab force-pushed the sync-enhancements branch 5 times, most recently from a5eafe2 to 4d8db74 Compare July 7, 2023 07:50
@yhabteab yhabteab force-pushed the sync-enhancements branch 2 times, most recently from 18ea9e8 to 30975ad Compare July 7, 2023 09:38
@yhabteab yhabteab added the enhancement New feature or request label Jul 7, 2023
@yhabteab yhabteab requested a review from lippserd July 7, 2023 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla/signed CLA is signed by all contributors of a PR enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant