pipeline 🤝 replay ingestion changes #23395

pauldambra · 2024-07-02T12:11:37Z

Feature request

pipeline and replay teams are going to trade time to improve heatmap_data and $exception event ingestion

## why change heatmap data ingestion?

we get many heatmap_data items per event that carries them, so under heavy load that load is automatically multiplied, and it's hard to scale or react because the magnification happens inside main event processing
we want to make these changes without breaking main event ingestion, which also improves development speed by proxy
failure isolation: e.g. during an incident we have more, easier levers we can pull
not slowing down analytics ingestion
cost: we can optimize this independent, clearly different work on its own
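
to make the magnification concrete, here's a minimal sketch. It assumes `$heatmap_data` is a mapping of URL to a list of items with `x`, `y`, `target_fixed` and `type` fields; that shape, the example event and the load figure are illustrative assumptions, not taken from the ingestion code.

```python
from typing import Any


def count_heatmap_items(event: dict[str, Any]) -> int:
    """Count how many heatmap items a single captured event carries."""
    # Assumed shape: properties["$heatmap_data"] = {url: [item, item, ...]}
    heatmap_data = event.get("properties", {}).get("$heatmap_data") or {}
    return sum(len(items) for items in heatmap_data.values())


# Hypothetical event that carries heatmap data as a passenger.
example_event = {
    "event": "$pageview",
    "properties": {
        "$heatmap_data": {
            "https://example.com/pricing": [
                {"x": 120, "y": 340, "target_fixed": False, "type": "click"},
                {"x": 121, "y": 360, "target_fixed": False, "type": "mousemove"},
            ],
            "https://example.com/": [
                {"x": 10, "y": 20, "target_fixed": True, "type": "click"},
            ],
        }
    },
}

events_per_second = 1_000  # hypothetical load, for illustration only
items_per_second = events_per_second * count_heatmap_items(example_event)
```

with three items per event, 1,000 events per second turn into 3,000 heatmap items per second, all produced inside main event processing.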

TODO

move $heatmap_data from being a passenger on other events to on its own $$heatmap event @pauldambra
- feat: move heatmaps to their own event posthog-js#1287
update ingestion runner to make sure heatmap data keeps flowing (if necessary) @pauldambra
- feat: separate heatmap processing #23505
create kafka topic for heatmap raw topic - team pipeline
add a new plugin-server role and deployment for heatmap (running dupe of historical ingestion code, i.e. analytics without overflow) - team pipeline
update capture-rs so that $$heatmap events are written to a dedicated kafka topic (we won't be changing capture-py as it's going to die soon) @xvello
update plugin-server code to optimize the heatmap role so it only does validation and writing to the heatmap ingestion topic. It can do no other processing: no $set, no exports, etc. (heatmaps are free so we'll keep processing cheap) - team pipeline
- team look-up for token resolution & are heatmaps enabled or not
- no PG (no persons, groups, ...)
- no processEvent plugins
- event written to a dedicated table
investigate writing one message to the CH kafka topic, which is then exploded in the materialized view that ingests it, instead of sending one kafka message per heatmap data item @pauldambra ??
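
a rough sketch of the two pipeline pieces in the list above: routing `$$heatmap` events to their own topic, and a heatmap role that only resolves the team, checks heatmaps are enabled, validates, and emits rows. This is illustrative Python, not plugin-server or capture-rs code. The topic names, the `Team` class, the `teams_by_token` lookup and the payload shape are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional

HEATMAP_EVENT = "$$heatmap"
HEATMAP_TOPIC = "heatmaps_raw"   # hypothetical topic name
ANALYTICS_TOPIC = "events_main"  # hypothetical topic name


def topic_for(event_name: str) -> str:
    """Capture side: send heatmap events to their own topic."""
    return HEATMAP_TOPIC if event_name == HEATMAP_EVENT else ANALYTICS_TOPIC


@dataclass(frozen=True)
class Team:
    id: int
    heatmaps_enabled: bool


def process_heatmap_event(
    event: dict[str, Any], teams_by_token: dict[str, Team]
) -> list[dict[str, Any]]:
    """Heatmap role: team lookup, enabled check, validation, nothing else."""
    team: Optional[Team] = teams_by_token.get(event.get("token", ""))
    if team is None or not team.heatmaps_enabled:
        return []  # unknown token or heatmaps disabled: drop

    # Assumed shape: {url: [{"x": ..., "y": ..., "type": ...}, ...]}
    heatmap_data = event.get("properties", {}).get("$heatmap_data")
    if not isinstance(heatmap_data, dict):
        return []

    rows = []
    for url, items in heatmap_data.items():
        if not isinstance(items, list):
            continue
        for item in items:
            # Skip malformed items instead of failing the whole event.
            if not isinstance(item, dict) or "x" not in item or "y" not in item:
                continue
            rows.append(
                {"team_id": team.id, "url": url, "x": item["x"],
                 "y": item["y"], "type": item.get("type")}
            )
    return rows  # no persons, no groups, no processEvent plugins
```

whether the role emits one row per item as here, or one message that the materialized view explodes, is exactly the open question in the last TODO item.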

## why change $exception data ingestion?

we want to add more processing to these events, which will require changes to speed of processing, infra requirements, etc. We want to make these changes without breaking main event ingestion, which also improves development speed by proxy
failure isolation: e.g. during an incident we have more, easier levers we can pull
not slowing down analytics ingestion
cost: we can optimize this independent, clearly different work on its own

TODO

setup /i/v0/x ingestion route
make sure any $exception event can be configured to be sent to the new route @pauldambra
- feat: send errors one way posthog-js#1289
add topic, capture-rs, new plugin-server role & deployment todo's as above - team pipeline
Update plugin-server code to optimize exception role so it only does things it needs - team pipeline
- only person lookup (no writes)
- keep processEvent plugins
- keep groups resolution and PoE
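
as a sketch of how the exception plan differs from the heatmap one: a configurable route choice for `$exception` events, and the two roles' allowed steps side by side. Only `/i/v0/x` and the step lists come from this issue. The default route, the flag name and the step identifiers are hypothetical, and this is illustrative Python rather than posthog-js or plugin-server code.

```python
EXCEPTION_EVENT = "$exception"
EXCEPTION_ROUTE = "/i/v0/x"  # route named in the TODO list
DEFAULT_ROUTE = "/e/"        # hypothetical default route for other events


def route_for(event_name: str, exceptions_use_new_route: bool = True) -> str:
    """Client side: pick the ingestion route, with a config switch."""
    if event_name == EXCEPTION_EVENT and exceptions_use_new_route:
        return EXCEPTION_ROUTE
    return DEFAULT_ROUTE


# Steps each dedicated role is allowed to run, per the two TODO lists.
# Step identifiers are made up for this sketch.
ROLE_STEPS: dict[str, frozenset[str]] = {
    "heatmap": frozenset(
        {"team_lookup", "validate", "write_to_heatmap_topic"}
    ),
    "exception": frozenset(
        {"person_lookup", "process_event_plugins", "groups_resolution", "poe"}
    ),
}


def role_runs(role: str, step: str) -> bool:
    """True if the given role is allowed to run the given step."""
    return step in ROLE_STEPS.get(role, frozenset())
```

for example, `role_runs("exception", "person_writes")` is false because the exception role only looks persons up, while `role_runs("heatmap", "process_event_plugins")` is false because the heatmap role skips plugins entirely.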



pauldambra · 2024-07-02T12:12:25Z

@tiina303 dumped my thoughts here since we've probably gone beyond slack

i'd be happy to discover i'm wrong so feel free to say what tasks i'm missing / or can delete / or shouldn't be trying to do etc etc

pauldambra added the enhancement New feature or request label Jul 2, 2024

This was referenced Jul 3, 2024

Sprint - July 8 to July 19, 2024 #23325

Closed

feat: move heatmaps to their own event PostHog/posthog-js#1287

Merged

feat: send errors one way PostHog/posthog-js#1289

Merged

pauldambra mentioned this issue Jul 17, 2024

Sprint - Jul 22 to Aug 2, 2024 #23768

Closed
