Node: Add flush function #822
Hey @mikicho, this is the purpose of `await analytics.closeAndFlush()`. Currently, there is no way to do a synchronous flush.
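For reference, a minimal sketch of the `closeAndFlush()` usage described above, assuming the `@segment/analytics-node` client in an ESM module; the write key and event payload are placeholders:

```js
import { Analytics } from '@segment/analytics-node'

// Placeholder write key -- use your own source's key.
const analytics = new Analytics({ writeKey: '<YOUR_WRITE_KEY>' })

analytics.track({ userId: 'user-123', event: 'User Created' })

// Stops accepting new events, then resolves once the queue has drained.
await analytics.closeAndFlush()
```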
@silesky yeah, I don't want to close the connection, just flush all events in the queue.
I don't need it to be sync; I can await it.
@mikicho there is no persistent connection to close. The close means stopping analytics from accepting new events so the queue is able to completely drain. A scenario where you flush without close but still allow events into the pipeline isn't particularly useful. You can always reinstantiate analytics. I am not sure what your tests look like, but between our event emitters and closeAndFlush, I guarantee you have a way to test the thing you want to test.
@silesky why did you close this issue?
@mikicho I closed this issue because I am not necessarily convinced at this point that it's a valid use case -- seems quite coupled to a very uncommon testing scenario that could be set up differently. I would recommend that you set up your tests so you can re-instantiate analytics between tests.
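One way to follow that suggestion, sketched with Jest-style hooks; the write key is a placeholder, and `maxEventsInBatch: 1` is the setting mentioned later in this thread:

```js
import { Analytics } from '@segment/analytics-node'

let analytics

beforeEach(() => {
  // A fresh client per test, so every test starts with an empty queue.
  // maxEventsInBatch: 1 sends each event in its own batch.
  analytics = new Analytics({ writeKey: 'test-write-key', maxEventsInBatch: 1 })
})

afterEach(async () => {
  // Stop accepting new events and wait for anything still queued to be sent.
  await analytics.closeAndFlush()
})
```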
I can't do so because I'm writing integration tests, not unit tests, so re-instantiating the server between tests isn't practical. I can open a pull request to implement this if that's the problem.
@mikicho feel free to open a PR if you think it's a low lift, but I'm reluctant to introduce APIs just for testing, as I still think this is quite an edge case that could be solved by writing your tests differently. Instantiating a server between a few tests seems perfectly fine, but there could be other ways to do it as well where you only reset that one module: you could rewrite how you instantiate analytics so it is wrapped in an object with a reset function. This seems like overkill.
First, thank you!
Test cases or test suites? 😅
I have something like that:

So I need a way to wait for the event before asserting. As a workaround, I copied some internal logic, but I just didn't think it would be difficult to add a flush function. Do you have other ideas on how to support my case?
Try registering an event listener like this:

```js
it('should fire "User Created" event', async () => {
  // You want analytics to be instantiated with maxEventsInBatch: 1 --
  // you can do this with mocks OR with a test env variable.
  const scope = nock('https://api.segment.io').post('/v1/batch').reply(201); // https://github.com/nock/nock/
  const pDrain = new Promise(resolve => analytics.once('drained', resolve))
  await client.post('/users', { /* ... */ });
  await pDrain
  expect(scope.isDone()).toBeTruthy()
})
```
There is a valid use case for flush without close, and that's when you want to flush without close!
There are other ways to listen for when a flush occurs without an explicit flush function (e.g. the drained event emitter), as outlined in this thread. #822 (comment)
It's a case of having to initialize it at all that is the problem. We treat Segment as a service, as we do our DB, etc. For all our services we initialize them once. This is a common practice, and suggested by Amazon. It seems Segment is the only "service" that has issues keeping the event loop active, which is why we need to flush it at all. It's really weird that you think it's normal to manage our app's lifecycle around Segment, and not vice versa.
Hey @makeable I edited my post above to remove the part about lambda execution, as that's not really relevant to the discussion. Sorry, I think you responded to the old comment.
Hi silesky, yes I was replying to your original comment. I had a look at the link, and it raised an issue that I have navigating the documentation website: where does it mention that drained event? I am used to API docs actually documenting the API, but it takes a lot of guesswork to compile an interface from the list of things sprinkled around that documentation. Sure, I can read the code, but it's really frustrating when you just want to spend 5 minutes implementing analytics.track and it turns into 3 hours.
I should respond to your earlier suggestion of reading the lambda example. Unfortunately, the lambda example is a VERY naive example, and probably not very representative of how non-trivial Node apps are run on Lambda. You are awaiting in the main handler for the track callback to be fired; that is the ONLY way that example can work. The reality is that you may be using Segment much deeper in the stack, possibly in an unmanaged async call, as many libs are not written with suspension in mind and don't expect you to wait on them -- i.e. fire and forget. The issue is further exacerbated when using an async handler, as the Lambda runtime handles callbacks and async differently when it comes to suspension. It's not uncommon to wrap the handler for this reason, and in that wrapper you would handle the flushing, which is exactly what I hoped to do.

I will look at using this drained event, but I will also need to know if there are any events in the queue, otherwise I could be waiting on a drained event that never comes. Is there some way to do that?
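One generic way to avoid waiting on a drained event that never comes is to race the listener against a timeout. This is a plain-JavaScript sketch that only uses the emitter shown earlier in the thread, not any private SDK state; the timeout value is arbitrary:

```js
// Resolve when 'drained' fires, but fall back to a timeout so the caller
// is never stuck if the queue was already empty and 'drained' never fires.
function waitForDrain(analytics, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const timer = setTimeout(resolve, timeoutMs)
    analytics.once('drained', () => {
      clearTimeout(timer)
      resolve()
    })
  })
}
```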
Sorry, that sounds really frustrating. We will be focusing on improving our documentation -- I see we don't have many emitter docs besides a link; this is a new repo/interface, and sometimes we don't really notice the gaps until people tell us. We have AWS Lambda documentation in our repo (our suggestion is to re-instantiate it every time), but if that's not your preference, I can understand how you could go down a confusing rabbit hole. I will also be discussing bringing back an explicit flush method as more of a feature request -- we removed it because the main reason people were using it was for graceful shutdown, and it wasn't working as expected for that use case since events were still allowed to enter the queue. However, this thread makes it sound like there is another subset of users who use it for testing / draining, and the drained event is a little confusing (especially if it's not a documented recipe!).
Yeah, as I explained above, the example is very naive, and as for reinstantiating at every request, that would mean we are wrapping our code in segment handling, which seems to be working around segment, rather than simply instrumenting our code. I did consider instantiating for each track call, but realised I would just end up with more queues to flush :) |
@makeable typo: can't understand -> can understand 😄
Heh, I assumed as much :) Any word on how I can tell if there are queued requests? As mentioned, just waiting on a drained event won't work if there is nothing in the queue.
@makeable yep, but they are 'private' methods / fields atm -- what is your comfort level there? Had a team discussion -- we probably don't want to recommend using any private methods... we are discussing a way to expose this to you. Want to avoid a band-aid. |
@makeable Thanks for engaging with us here! Just to make sure we're all on the same page: you want to initialize the client once (like your other services), track events from deeper in the stack, and flush any queued events from a handler wrapper without closing the client. Is that right? Are you ok with calling a flush-style method from that wrapper?
For context, why do we even need flush at all? Two reasons: events are batched, so they sit in a queue until a batch fills up or a timeout fires, and plugins can be asynchronous, so an event may still be in the plugin pipeline after track is called.

Now, it would be great if we could make the HTTP request immediately if both the batch size is 1 and all plugins are synchronous, but that's a digression! As @silesky mentioned, our guidance so far has been to instantiate a new analytics client within the Lambda handler (each Lambda invocation) and call closeAndFlush at the end.

I think it makes sense to support a flush that doesn't close the client. One scenario we need to address is what to do with new events when there is a pending flush: wait to resolve the flush until those have also been sent, or hold off on sending them until the current flush has completed. In your scenario, would you expect any track calls to be made after calling flush?

Sorry for the wall of text, thanks again for sharing your use case with us!
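A minimal sketch of that per-invocation guidance, assuming an async Lambda handler; the write key, user ID, and event name are placeholders:

```js
import { Analytics } from '@segment/analytics-node'

export const handler = async (event) => {
  // A new client per invocation, flushed before the handler returns,
  // so no events are left queued when the execution environment is frozen.
  const analytics = new Analytics({ writeKey: '<YOUR_WRITE_KEY>' })
  try {
    analytics.track({ userId: 'user-123', event: 'User Created' })
    // ... the rest of the handler's work ...
  } finally {
    await analytics.closeAndFlush()
  }
  return { statusCode: 200 }
}
```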
We need this for testing, to empty the queue before an assertion. Even with maxEventsInBatch set to 1, the SDK still sends events asynchronously, which causes flaky tests.

@segmentio update: the team is starting the process of looking at a reimplementation of the vanilla flush function, though the requirements are a bit more complex than in the old Node SDK because of the existence of async plugins. See: #822 (comment)