Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add timestamps to Zeek events #331

Closed
wants to merge 3 commits into from
Closed

Conversation

J-Gras
Copy link
Contributor

@J-Gras J-Gras commented Apr 6, 2023

This adds timestamps to the Zeek event message type for zeek/zeek#2929. Currently, the changeset would be missing a protocol version bump. @Neverlord brought up that a new protocol version is a far-reaching change and suggested to introduce a new message type for timestamped events. My gut feeling is that a clear-cut by bumping the version would be the easiest as it avoids the complexity of maintaining some meaningful interoperability. However, I can see this would be a rather radical measure for a tiny feature. To decide which path to go, we are looking for some feedback (tagging @ckreibich, @rsmmr, @timwoj and @awelzel).

@J-Gras J-Gras force-pushed the topic/jgras/event-ts branch from 8745708 to 53d214b Compare April 6, 2023 21:14
@Neverlord
Copy link
Member

Neverlord commented Apr 7, 2023

Some thoughts:

  • I think bumping the network protocol version should be reserved for actual changes to Broker's data model or serialization format. Bumps also should ideally be made in a way that allows translating between versions so that we can have v1/v2 clients exchanging messages without loss of data. I think it's even debatable whether zeek.hh belongs into the Broker code base at all. It could as well be part of the Broker manager in Zeek since it's purely application-specific.
  • Just pointing this out: this change also affects the rendering of Zeek events for WebSocket clients. So either we are willing to have clients just deal with it, or we add a /v2 endpoint and either discontinue /v1 or emulate the old rendering. But of course then there's the question what to do with things that look like Zeek events that we receive from clients. Do we add a timestamp? This would also leak Zeek-specific data representations into Broker, which we tried to avoid until now.
  • The way Event::valid() is implemented ignores trailing elements. So instead of inserting the timestamp into the middle, we could put it at the end instead. That way, old versions of Broker could still read it. Events from old library version wouldn't have the timestamp. We could either deal with that by relaxing the validity check and make the timestamp optional, or we simply drop events from legacy nodes in the network. By putting the timestamp at the end, at least we don't break things "as badly" and JSON clients could arguably also easier deal with some trailing data they don't (yet) handle properly.

I didn't check the Zeek side, but unless we really need to expose the timestamp to the Python bindings, I would avoid touching the Python bindings. We already have a ticket for removing them #303.

@J-Gras
Copy link
Contributor Author

J-Gras commented Apr 8, 2023

So instead of inserting the timestamp into the middle, we could put it at the end instead.

Actually, that was my first thought. Looking at the Zeek and Python APIs, I was under the impression that the data model wouldn't allow us to append fields as we couldn't disambiguate between alternate event prototypes and the use of an additional timestamp. I think I was wrong here. At the "broker-level" this seems to be possible, as events aren't flat but nested vectors. That is, arguments are represented by a dedicated vector that can be followed by fields of other types, right? So this would be my preferred solution.

@J-Gras J-Gras force-pushed the topic/jgras/event-ts branch from 53d214b to 0cf2e12 Compare April 12, 2023 13:12
@J-Gras
Copy link
Contributor Author

J-Gras commented Apr 12, 2023

I just moved the timestamp to the end of the message :)

Just pointing this out: this change also affects the rendering of Zeek events for WebSocket clients. So either we are willing to have clients just deal with it, or we add a /v2 endpoint and either discontinue /v1 or emulate the old rendering.

No preference here. Given that websockets are a relatively new feature, I guess both options would be okay.

I didn't check the Zeek side, but unless we really need to expose the timestamp to the Python bindings, I would avoid touching the Python bindings.

We can remove that part. However, seems handy for now and probably doesn't hurt.

@J-Gras J-Gras marked this pull request as ready for review April 12, 2023 14:38
@awelzel
Copy link
Contributor

awelzel commented Apr 14, 2023

The trailing timestamp seems reasonable to me, but haven't thought too much about interoperability. Might be a bit of a question what the Zeek side does if there's an event with no timestamp (the 0.0 default), but that could fall back to current network_time on a layer above.

The one point that comes to mind is whether we should make the timestamp nanosecond uint64_t here in broker and convert to double unix seconds up in Zeek land (where that's just what's used right now). There should not be Zeek events with timestamps before 1 Jan 1970, even when reading really old pcaps and non that are 500+ years in the future.

Reference zeek/zeek#1643

@awelzel
Copy link
Contributor

awelzel commented Apr 17, 2023

One more thought: Could we go for a a more extensible header-style field instead of the atomic field that is just a timestamp? That is, append a vector<vector<count, data>> where the count identifies what's contained.

Mainly, while the current idea is to only attach network time to an event, what if in the future more information is need or different approaches want to be evaluated? That would require another broker change and another round of appending-a-field-trying-to-stay-compatible. An extensible header format avoids that at the cost of extra complexity.

So the event headers (maybe there's a better name) would just be the following currently and populated up in Zeek land: Broker wouldn't actually need to know about about TIMESTAMP_NETWORK - that's a const up in Zeek land.

[
  [TIMESTAMP_NETWORK, double|uint64_t timestamp],
]

I would hope the extra marshaling / processing overhead is negligible compared to the involved IO and that encoding overhead is acceptable.

Some examples that are not too far fetched IMO:

  • Propagating wallclock time of the event in addition to network time (latency analysis on an event level?)
  • Ever increasing event clock or epoch from the sender (or even some form of logical event clock)
  • Vector clocks of event clocks received from other nodes at the time of sending.

Trying out or adding any of these wouldn't require a broker level change and just a Zeek patch instead.

@J-Gras
Copy link
Contributor Author

J-Gras commented Apr 17, 2023

The one point that comes to mind is whether we should make the timestamp nanosecond uint64_t here in broker and convert to double unix seconds up in Zeek land (where that's just what's used right now).

I don't see a benefit in doing so as long as Zeek doesn't support nanosecond timestamps.

Could we go for a a more extensible header-style field instead of the atomic field that is just a timestamp?

I think this makes sense once there is a need for another field. For now that scenario is theoretical. To avoid the over-engineering trap, I would leave it as is for now.

@awelzel
Copy link
Contributor

awelzel commented Apr 18, 2023

I don't see a benefit in doing so as long as Zeek doesn't support nanosecond timestamps.

Hmm, rather strong opinion about the nanoseconds/timestamp: broker already defines a timestamp type that can be contained within data (it happens to be nanoseconds). It actually seems quite off to go with a double when there's a more appropriate type available. Also voids future changes in broker when/if Zeek ever changes to nanoseconds.

/// A point in time anchored at the UNIX epoch: January 1, 1970.
using timestamp = std::chrono::time_point<clock, timespan>;

I think this makes sense once there is a need for another field. For now that scenario is theoretical. To avoid the over-engineering trap, I would leave it as is for now.

For the more extensible/generic approach, I don't think it's theoretical at all: You currently have a case of adding the network timestamp to an event and you weren't able to do that without patching broker. Adding a bit more flexibility is mostly forward looking to allow experimentation/extensions without needing another patch or discussions about protocol changes. But well, that's my input / opinion. I'll leave it to @Neverlord or others.

@J-Gras J-Gras force-pushed the topic/jgras/event-ts branch from a6ee3a7 to d0d8ce0 Compare April 19, 2023 15:05
@timwoj
Copy link
Member

timwoj commented Apr 19, 2023

Hmm, rather strong opinion about the nanoseconds/timestamp: broker already defines a timestamp type that can be contained within data (it happens to be nanoseconds). It actually seems quite off to go with a double when there's a more appropriate type available. Also voids future changes in broker when/if Zeek ever changes to nanoseconds.

In light of zeek/zeek#1643 maybe happening sometime in the future, I'd rather it be sent in nanoseconds just to make life easier later.

@J-Gras
Copy link
Contributor Author

J-Gras commented Apr 19, 2023

The fact that there is a built-in broker type for timestamps somehow disappeared from my radar. I agree that it makes sense to use this. Open questions:

  • Do we want to drop this PR in favour of a generic approach to add Zeek-internal data to events?
  • How to deal with the websocket interface in this context?

@ckreibich
Copy link
Member

A couple of thoughts:

  • I like the idea of an extensible mechanism to add metadata to events. It does have a whiff of over-designing, but in this case I don't think it's hard to envision additional information we'd like to tuck on to events. To name a few: name of originating node/package/script/context, additional levels of timing for more fine-grained latency measurements, other troubleshooting info, perhaps also things like transaction IDs that currently feel a bit clunky as explicit event arguments, or protocol version info. (These bring up a question of how we might make event metadata accessible to the script layer, btw — something to think about.)
  • I likewise agree to nanoseconds at the Broker level.
  • I think it's the right call here to maintain the Broker bindings for a bit still. We love the WebSocket transport but it is still obscure — we have a few (two?) clients using it, but there's nothing we can give people who'd like to implement their own clients without learning the protocol details (no Python package that makes it dead-easy to send/receive events, say).
  • It seems quite clear to me that we're lacking a versioning concept at the Zeek layer, above Broker. We're still maintaining this as if only the Zeek cluster exists, implying that all nodes run the same version. When establishing a connection in Broker, I'd like to be able to include a key/val table that conveys such things as client name, protocol version, perhaps dialect info, and I'd like to receive one in return. Think HTTP request/response headers. A basic handshake that informs endpoints prior to the exchange of any other message types about how/whether to continue.
  • I'm not currently clear on whether the goal is for Zeek to be able to receive events over Broker that may or may not have timestamps, or whether timestamps will be a requirement. In the absence of explicit versioning I'd prefer the former, assuming the code isn't too ugly (I think it wouldn't be). That flexibility would reduce the surprise for non-Zeek endpoints, only encountering more data than before when receiving, while their sending code could remain intact. Could you clarify?

include/broker/zeek.hh Outdated Show resolved Hide resolved
@J-Gras J-Gras force-pushed the topic/jgras/event-ts branch from 7efcf82 to 4a4389e Compare May 2, 2023 10:51
@J-Gras
Copy link
Contributor Author

J-Gras commented May 2, 2023

Discussing how to implement the generic approach to add metadata to events, it turned out that getting this right probably needs to address a couple of related problems. I tried to capture these in new issues. To prevent blocking zeek/zeek#2929, it seems we should move on here and reiterate once the generic metadata mechanism landed.

I'm not currently clear on whether the goal is for Zeek to be able to receive events over Broker that may or may not have timestamps, or whether timestamps will be a requirement. In the absence of explicit versioning I'd prefer the former, assuming the code isn't too ugly (I think it wouldn't be). That flexibility would reduce the surprise for non-Zeek endpoints, only encountering more data than before when receiving, while their sending code could remain intact. Could you clarify?

Thinking a bit about the use cases I had in mind, it would indeed make sense to handle missing timestamps in Zeek instead of defaulting to 0. At the Zeek-level, we can set the timestamp to current network time for events received without a timestamp. To me, that seems to be reasonable.

I have done the necessary adaptions, added a compatibility test in Zeek (zeek/zeek@1aead4f), and rebased on current master. I guess we are ready for another review/merge.

@awelzel
Copy link
Contributor

awelzel commented May 2, 2023

Discussing how to implement the generic approach to add metadata to events, it turned out that getting this right probably needs to address a couple of related problems. I tried to capture these in new issues. To prevent blocking zeek/zeek#2929, it seems we should move on here and reiterate once the generic metadata mechanism landed.

Hmm, I was missing the vector<vector<count, data>> encoding on the Zeek side and honestly I still hold that this should exist in the first iteration. I was afraid this got lost during the discussion/meeting.

If you think it's too much I'm happy to give it a whirl.

@J-Gras
Copy link
Contributor Author

J-Gras commented May 2, 2023

If you think it's too much I'm happy to give it a whirl.

Feel free to open another PR. If you get that merged this week, I am happy to close this one and adapt #zeek/zeek/2929 to the new mechanism.

@awelzel awelzel mentioned this pull request May 2, 2023
@awelzel
Copy link
Contributor

awelzel commented May 2, 2023

If you think it's too much I'm happy to give it a whirl.

Feel free to open another PR. If you get that merged this week, I am happy to close this one and adapt #zeek/zeek/2929 to the new mechanism.

Proposal is here: #343

I didn't remove the constructor and ts accessors, so this is actually compatible at this point with the Zeek branch.

I'll try to get @Neverlord 's thoughts tomorrow.

@J-Gras
Copy link
Contributor Author

J-Gras commented May 4, 2023

I am closing this in favour of #343.

@J-Gras J-Gras closed this May 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants