Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add optional received_p99 timestamp to commit log #295

Merged
merged 3 commits into from
Oct 18, 2023

Conversation

lynnagara
Copy link
Member

The value from the received field can be used in the future for subscription scheduling if this is provided. This is better than the orig_message_ts field as received is assigned at the very start of the pipeline when Sentry receives the event (as opposed to when Snuba gets the event). Switching to this field means any delays in ingestion will be properly accounted for when determining the window on which to schedule subscriptions.

This PR also:

  • deprecates the legacy decoder since we have fully switched over to the new format
  • switches orig_message_ts from datetime to float. Converting between the two in encode/decode is pointless, and it introduces the possibility of timezone issues. Simpler to just keep it a unix timestamp everywhere.

The value from the received field can be used in the future for
subscription scheduling if this is provided. This is better than
the `orig_message_ts` field as `received` is assigned at the very
start of the pipeline when Sentry receives the event (as opposed to
when Snuba gets the event). Switching to this field means any
delays in ingestion will be properly accounted for when determining
the window on which to schedule subscriptions.

This PR also:
- deprecates the legacy decoder since we have fully switched over to the new format
- switches orig_message_ts from datetime to float. Converting between
the two in encode/decode is pointless, and it introduces the possibility of
timezone issues. Simpler to just keep it a unix timestamp everywhere.
@lynnagara lynnagara requested a review from a team as a code owner October 16, 2023 20:16
@lynnagara lynnagara merged commit aa6bd34 into main Oct 18, 2023
8 checks passed
@lynnagara lynnagara deleted the commit-log-codec-received branch October 18, 2023 17:05
lynnagara added a commit to getsentry/snuba that referenced this pull request Oct 18, 2023
This supports the subscriptions to opt into using received_p99 for scheduling instead of the current orig_message_ts

Needs getsentry/arroyo#295
ayirr7 pushed a commit to getsentry/snuba that referenced this pull request Oct 19, 2023
This supports the subscriptions to opt into using received_p99 for scheduling instead of the current orig_message_ts

Needs getsentry/arroyo#295
lynnagara added a commit that referenced this pull request Oct 30, 2023
We experienced issues when deploying #295
to production. It was discovered that very old commit log entries are still present.
This is likely because the commit log topic has cleanup.policy=compact set, which
causes messages to never expire if they have a unique key.

The process of fixing this is underway: getsentry/snuba#4941
and getsentry/ops#8361

However we have to keep support for the legacy commit log format around for a while
until this work is complete.
lynnagara added a commit that referenced this pull request Oct 30, 2023
We experienced issues when deploying #295
to production. It was discovered that very old commit log entries are still present.
This is likely because the commit log topic has cleanup.policy=compact set, which
causes messages to never expire if they have a unique key.

The process of fixing this is underway: getsentry/snuba#4941
and getsentry/ops#8361

However we have to keep support for the legacy commit log format around for a while
until this work is complete.
ayirr7 added a commit to getsentry/snuba that referenced this pull request Nov 1, 2023
…rs (#4912)

* add migrations for tables

* update to use anyLast (this doesn't work)

* try with argMax aggregate function

* rename migrations

* renumber migrations

* renumber migrations again

* add raw_timestamp column

* fix test distributed storage set key

* add granularity-specific retention day columns to raw table

* Sum aggregate function instead of count.

* remove storage set key creations

* remove extra column and add max timestamp aggregation

* add gauges storage set key to migration group

* feat(rust): Update all dependencies in lockfile (#4892)

* feat: Write received_p99 to commit log (#4872)

This supports the subscriptions to opt into using received_p99 for scheduling instead of the current orig_message_ts

Needs getsentry/arroyo#295

* Revert "feat: Write received_p99 to commit log (#4872)"

This reverts commit c7db591.

Co-authored-by: lynnagara <[email protected]>

* initial pass, writable storage

* fix output type for gauges messages

* note on dlq

* write path working

* add entity, readable storage for querying

* switch to mat view version2

* add tests for gauges processor

* add gauges entity key

* fix readable gauge storage schema

* remove avg from gauges migration

* Revert "feat: Write received_p99 to commit log (#4872)"

This reverts commit c7db591.

Co-authored-by: lynnagara <[email protected]>

* ref(subscriptions): Move --delay-seconds from CLI arg to yaml definition (#4915)

The main motivations for this are:
1. The amount of delay depends on the synchronization timestamp used, and this is defined at the storage level in code. For example if "orig_message_ts" is used, a longer delay will be applied than if "received_p99" is used, since received will be set earlier in the pipeline.
2. The same CLI args get applied in all Sentry deployments, and this makes it easier to keep them in sync
3. Rolling out different values per storage via CLI will probably break some of our templates and require too much rework.

There are no functional changes here since we have 60 configured everywhere right now.

* feat(rust): Add strategy that does json schema validation (#4901)

* spans: add profile_id to tests  (#4827)

* add test for spans profile_id and fix bug where test was dependent on local timezone

* test: Refactor API tests to not reference sessions so it can be removed (#4920)

* fix merge conflict

* remove avgs support in dataset (storage, entity) and processor

* fix aggregate function in entity

* add some comments

---------

Co-authored-by: Lyn Nagara <[email protected]>
Co-authored-by: getsentry-bot <[email protected]>
Co-authored-by: lynnagara <[email protected]>
Co-authored-by: Dalitso Banda <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants