Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(new transform): Add buffered gate transform #21071

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ilinas
Copy link

@ilinas ilinas commented Aug 14, 2024

See issue: #15263

A simple implementation of ring buffer / backtrace event handling that I ended up naming the gate transform. Keeps events in a buffer until a trigger is encountered and the buffer is flushed. When the buffer is full, the oldest events are being dropped, and it works pretty much like the filter transform.
The code is essentially a simple VecDeque.

Example configuration:

transforms:
  app_gate:
    type: gate
    inputs:
      - app_logs
    pass_when: '"info" == .level'
    open_when: '"error" == .level'
    auto_close: true
    tail_events: 20
    max_events: 200

Submitting as a draft for now.

@bits-bot
Copy link

bits-bot commented Aug 14, 2024

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the domain: transforms Anything related to Vector's transform components label Aug 14, 2024
Copy link
Member

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening this PR! Apologies, I somehow missed your comments on the other issue, I think this is a really interesting and useful idea.

I noticed you opened this as a draft to start. What level of feedback are you looking for currently? I'm on board with the general idea and am happy to leave more detailed feedback in line.

@ilinas
Copy link
Author

ilinas commented Aug 19, 2024

Hi @jszwedko, no worries. I am glad that you think this is useful.

I am new to both Rust and Vector code, so this implementation is based on my observations of how other similar transforms were implemented. It's highly possible that I missed something completely obvious, so some sanity check would be highly appreciated.

I know I am missing the docs, but first I wanted to make sure that the general functionality and the config parameters are sensible?

Copy link
Member

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delayed response here. I took another look over and feel like it is generally heading in the right direction. think the behavior will be a bit difficult to describe to users, but I can't immediately think of a better configuration model so I think we can rely on some examples on the documentation page to help.

Again I think this is a super nifty feature and a great fit for Vector's use-cases so I appreciate you proposing it!

If you are interested in pushing this forward, some next steps I see:

.transpose()?,
self.max_events.unwrap_or(100),
self.auto_close.unwrap_or(true),
self.tail_events.unwrap_or(10),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intuition would be that tail_events would be 0 though I can't articulate why exactly.

Comment on lines +37 to +38
/// Automatically close the gate after the buffer has been flushed.
pub auto_close: Option<bool>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we want to disallow auto_close and close_when being specified at the same time?

It also seems like auto_close might be the equivalent of close_when: "true" but that may not terribly discoverable behavior. We could treat the absence of close_when as auto-close 🤔

pub pass_when: Option<AnyCondition>,

/// A logical condition used to open the gate.
pub open_when: Option<AnyCondition>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to require this option? It's unclear to me what the behavior should be if there is no open_when 🤔


impl FunctionTransform for Gate {
fn transform(&mut self, output: &mut OutputBuffer, event: Event) {
let (pass_gate, event) = match self.pass_when.as_ref() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could return early after this block if pass_gate is true to avoid evaluating the other conditions and pushing/popping from the dequeue.

} else if open_gate {
self.current_state = GateState::Open;
self.buffer.drain(..).for_each(|evt| output.push(evt));
self.events_counter = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd maybe set this in the close_gate block and call it tail_events_counter to make it clearer.

}

#[cfg(test)]
mod test {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see some more test permutations here to cover other cases.

@ilinas
Copy link
Author

ilinas commented Sep 16, 2024

Thank you so much for your comments @jszwedko.

You are right that some of the option combinations are a bit confusing.

I tried to cover two different use cases here:

  1. Traditional backtrace: when something important happens, e.g. '"error" == .level', flush the buffer and close the gate.
  2. Manual gate: when you want to manually control the flow, e.g. open_when: '"session started" == .message' and close_when: '"session ended" == .message'.

Maybe it would be more logical to limit the transform to just the first use case? Especially because in the second use case you don't really benefit from having a buffer.

Could rename the transform to something like buffer or backtrace and just have flush_when instead of open_when, and completely ditch close_when and auto_close?

transforms:
  app_buffer:
    type: buffer
    inputs:
      - app_logs
    pass_when: '"info" == .level'
    flush_when: '"error" == .level'
    tail_events: 20
    max_events: 200

@jszwedko
Copy link
Member

Thanks for the additional thoughts! I see what you are saying. Maybe it would be easier to just focus on one use-case at a time. It might end up being a better model to have two separate transforms to support the two use-cases. Would you want to refocus this PR just on one of them and then we can have a second PR focused on the other? It sounds like you'd like to target the "backtrace" use-case first?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: transforms Anything related to Vector's transform components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants