indexer-alt: event indices #19926

Open
wants to merge 3 commits into
base: main
Commits on Oct 19, 2024

  1. indexer-alt: committer handles empty batches, checkpoint stream

    ## Description
    
    This change handles two edge cases related to out-of-order commits that
    were uncovered through the work on event indices. In both cases the
    committer could get stuck, because none of the conditions guarding
    select arms were met, but the scenario was different in each case:
    
    - In the first case, the committer could get stuck because there were no
      more pending rows to write, but there were still pre-committed
      checkpoints to handle, and the logic to update watermarks based on the
      pre-committed checkpoints was guarded by a test that
      `pending_rows > 0`.
    - In the second case, the committer could get stuck because the pipeline
      shut down before any checkpoints came through, meaning it had no work
      to do.
    
    The fix was to allow the main `poll.tick()` arm to run if the receiving
    channel was closed, or there were pending precommits left. A short
    circuit was also added to move empty batches directly into the precommit
    list because we can treat them as already written out.
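    
    The guard change and the empty-batch short circuit can be sketched as
    follows. This is a simplified model, not the actual committer: the
    `Committer` struct, its fields, and the `old_guard`, `new_guard`,
    `receive` and `tick` names are all hypothetical, and the real code
    evaluates the guard as a precondition on a select arm rather than as a
    plain method.
    
    ```rust
    use std::collections::{BTreeMap, BTreeSet};
    
    /// Hypothetical, simplified committer state (all names are illustrative).
    struct Committer {
        /// Checkpoint -> number of rows batched but not yet written.
        pending: BTreeMap<u64, usize>,
        /// Checkpoints that are written (or were empty) but not yet
        /// reflected in the watermark.
        precommitted: BTreeSet<u64>,
        /// Set once the channel feeding the committer has closed.
        rx_closed: bool,
        /// The next checkpoint the watermark is waiting for.
        next_checkpoint: u64,
    }
    
    impl Committer {
        fn new() -> Self {
            Committer {
                pending: BTreeMap::new(),
                precommitted: BTreeSet::new(),
                rx_closed: false,
                next_checkpoint: 0,
            }
        }
    
        fn pending_rows(&self) -> usize {
            self.pending.values().sum()
        }
    
        /// Assumed shape of the guard before the fix: tick only when there
        /// are rows left to write.
        fn old_guard(&self) -> bool {
            self.pending_rows() > 0
        }
    
        /// Assumed shape of the guard after the fix: also tick to drain
        /// pre-committed checkpoints, or to notice that the channel closed.
        fn new_guard(&self) -> bool {
            self.pending_rows() > 0 || !self.precommitted.is_empty() || self.rx_closed
        }
    
        /// Accept a batch. Empty batches are treated as already written and
        /// go straight to the precommit set.
        fn receive(&mut self, checkpoint: u64, rows: usize) {
            if rows == 0 {
                self.precommitted.insert(checkpoint);
            } else {
                self.pending.insert(checkpoint, rows);
            }
        }
    
        /// One tick: "write" all pending rows, then advance the watermark
        /// over the contiguous run of pre-committed checkpoints.
        fn tick(&mut self) {
            let written: Vec<u64> = self.pending.keys().copied().collect();
            self.pending.clear();
            self.precommitted.extend(written);
            while self.precommitted.remove(&self.next_checkpoint) {
                self.next_checkpoint += 1;
            }
        }
    }
    ```
    
    In this model, a committer that has only received an empty batch for
    checkpoint 0 has `old_guard()` false but `new_guard()` true, so the
    tick still runs and the watermark moves past that checkpoint.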
    
    ## Test plan
    
    Ran the following pipeline twice: The first time, it should exit without
    writing any data (except the watermark), and the second time it should
    just exit because there is no data between its watermark and the given
    last checkpoint:
    
    ```
    sui$ cargo run -p sui-indexer-alt --                                             \
      --database-url "postgres://postgres:postgrespw@localhost:5432/sui_indexer_alt" \
      --remote-store-url https://checkpoints.mainnet.sui.io                          \
      --last-checkpoint 1000 --pipeline ev_struct_inst
    ```
    amnn committed Oct 19, 2024
    edd5480
  2. indexer-alt: event indices

    ## Description
    
    Adding pipelines to index all tables used to filter events. They differ
    from the equivalent schemas in the existing indexer in the following
    ways:
    
    - They only mention the transaction sequence number, and not the event
      sequence number. To use these tables, we first filter down to the
      transaction containing the event, and then scan the events in that
      transaction.
    - Struct instantiations are stored as a separate name field and then a
      BCS-encoded type tag. This is to reduce their footprint (package IDs
      weigh twice as much when stored as text compared to BCS), and because
      we only ever filter using an exact match, so we don't need to store
      the instantiation as text.
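    
    The footprint claim can be checked with a little arithmetic. The sketch
    below assumes a package ID is 32 raw bytes and that its text form is
    unprefixed lowercase hex; the `to_hex` and `package_id_footprint`
    helpers are illustrative only and do not use the real BCS library.
    
    ```rust
    /// Assumed size of a package ID in bytes.
    const PACKAGE_ID_BYTES: usize = 32;
    
    /// Render bytes as lowercase hex, two characters per byte, no prefix.
    fn to_hex(bytes: &[u8]) -> String {
        bytes.iter().map(|b| format!("{b:02x}")).collect()
    }
    
    /// Returns (binary size, text size) in bytes for one package ID.
    fn package_id_footprint(id: &[u8; PACKAGE_ID_BYTES]) -> (usize, usize) {
        (id.len(), to_hex(id).len())
    }
    ```
    
    Each byte becomes two hex characters, so the text form of an ID is
    twice the size of its binary form, before counting any `0x` prefix.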
    
    ## Test plan
    
    Ran the indexer locally and spot checked the events.
    amnn committed Oct 19, 2024
    bb72581
  3. indexer-alt: simplify event indexing

    ## Description
    
    Instead of having a separate table for each component of the cascading
    index, have a single table, and add multiple indices to it, one for
    each of the cascading cases.
    
    This should reduce the footprint and ingress to the DB, but mildly
    increases the risk that the DB picks a bad query plan.
    
    ## Test plan
    
    Run all the existing tests, and also run an experiment to confirm that
    the DB can successfully plan queries against this kind of schema.
    
    Our initial fear was that with multiple indices on a single table, the
    DB may pick the wrong index. There is still a chance that might happen
    if we add indices for disparate filters to the same table (i.e. we
    combine the event emit module and event struct instantiation fields
    into one table). However, we can guarantee that the reader will only
    issue one query to each of these merged tables, and that query should
    entirely overlap with one of its indices.
    amnn committed Oct 19, 2024
    e57832f