Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

max_buffered is inconsistent #321

Open
cboulay opened this issue Jul 23, 2018 · 20 comments
Open

max_buffered is inconsistent #321

cboulay opened this issue Jul 23, 2018 · 20 comments

Comments

@cboulay
Copy link
Collaborator

cboulay commented Jul 23, 2018

outlet
inlet

When creating an outlet or inlet, the user can specify the maximum amount of data that can be buffered, in seconds. The data type for this argument is int32_t.

In the non-typed wrappers (Python, Matlab), users can specify floats and these will get typecast and rounded, creating problems. We can make these wrappers check the argument to make sure it is an int > 0, but this intersects with a separate issue.

In some use-cases, especially when a stream is carrying a processed signal, only the latest sample is desired and anything older is to be dropped. In these cases it is convenient to have a buffer of only 1 sample, otherwise the user is forced to fetch until there are no samples remaining then use whatever the last sample was. This has come up several times recently.

The docstring for data_receiver here suggests that max_buflen is number of seconds, as above, and indeed the seconds-version is passed directly to the data_receiver constructor during inlet creation, but if you follow how it's used then it ends up being number of samples. (Gets written to a string, then read in from the string by tcp_server as number of samples).

This is probably a bug. Can other devs please check this to make sure I didn't miss something?
@tstenner @mgrivich

Is there a minimum number of samples that must be buffered? I think it is 1, but some of the checks I see look to see if it is >= 0... and 0 causes the stream to be useless because there's no buffer.

Should max_buflen/max_buffered be changed to a float type so < 1 second can be used?
Or, probably better, can it be made to represent sample number instead of seconds?

In either case, this will require most apps to be changed.

@dmedine
Copy link
Contributor

dmedine commented Jul 24, 2018 via email

@tstenner
Copy link
Collaborator

In these cases it is convenient to have a buffer of only 1 sample, otherwise the user is forced to fetch until there are no samples remaining then use whatever the last sample was.

Pull everything in a chunk and take only the interesting samples?
We already need to break the API for the next version, so we might as well think about other solutions, i.e. adding a function to drop the all samples except for the newest N.

The docstring for data_receiver here suggests that max_buflen is number of seconds, as above, and indeed the seconds-version is passed directly to the data_receiver constructor during inlet creation, but if you follow how it's used then it ends up being number of samples.

If that's the case, I'd correct the docs and be happy the (imho) more sane interpretation for the parameter is correct.

Should max_buflen/max_buffered be changed to a float type so < 1 second can be used?
Or, probably better, can it be made to represent sample number instead of seconds?

+1 for number of samples, changing the type would require a recompilation for all apps otherwise they would push an int which would then get converted to about 0.

@tstenner
Copy link
Collaborator

I don't have time right now to modify the source code and check what the minimum number of samples in the buffer can be---probably 1 is ok but what might happen is that the connection will break off and have to reconnect. I don't know ASIO well enough to make an informed guess, though. Regardless, this buffer must be greater than 0.

I think either all samples get transmitted or they don't even reach the network level, so asio shouldn't care.

Perhaps the correct thing to do is to overload the constructor so that it could be specified as an int or a float

That will cause more problems than it solves and float

It would be a drag to have to know the sampling rate for this parameter

But number of samples works in all cases and is a lot easier to reason about in terms of speed and resource usage. Also the buffer doesn't work in seconds but rather in number of samples, so it would have to be translated somewhere anyway.

The work around is programming a timeout capability in the data acquisition loop.

Or a 'drop everything older than the last XY samples' function which would be quite easy.

@cboulay
Copy link
Collaborator Author

cboulay commented Jul 24, 2018

It would be a drag to have to know the sampling rate for this parameter

But number of samples works in all cases and is a lot easier to reason about in terms of speed and resource usage. Also the buffer doesn't work in seconds but rather in number of samples, so it would have to be translated somewhere anyway.

For the outlet this is easy because it (should!) know its sampling rate, if any. I think @dmedine was referring to the consumer/inlet, where you don't necessarily care what the sampling rate is at inlet creation time, and it is more likely that when you think about insurance against dropped connections that you're thinking in time, not samples. But I think that anyone that actually specifies this value will also be able to query the sampling rate.

It really bothers me that 'time' is an int, especially when this int (should!) always get multiplied with sampling rate.
What happens when the sampling rate is not a round number (I've seen devices like this)? Having a buffer that's a fraction of a second shorter than the specified buffer lengthduration is probably harmless.
What happens when the sampling rate is less than 1.0? Edit: If max_buffered=1 then the stream is never resolvable on the network.

It also bothers me that it is called max_buflen on the inlet, but it's not a length, it's a duration... and it's right before another argument called max_chunklen which IS a length (in number of samples).

Pull everything in a chunk and take only the interesting samples?

That is only guaranteed to work if the max chunk size is greater than the buffer size, otherwise you have to pull chunks until you come up empty then use the last N samples of the last non-empty chunk. This is a pretty common anti-pattern that real-time users are required to do.

+1 for number of samples, changing the type would require a recompilation for all apps otherwise they would push an int which would then get converted to about 0.

Even if we don't change the type, the outlet apps that have specified a buffer duration did so with the expectation that that it would guard against connection interruptions... but if we go from duration to n-samples then the buffer would end up being much much shorter than expected and would probably underrun at the slightest interruption. That being said, most apps that use the C++ API opt not to provide this argument so it defaults to 360 seconds, and we can simply change the default to 0 which gets caught during creation and is modified to int(360*fs) which is functionally equivalent to what happens now. So it's quite possible that any apps that didn't explicitly set the buffer duration will not have to be recompiled.

I think he's still on vacation, but I will try to get @chkothe 's input when he gets back.

@tstenner
Copy link
Collaborator

Pull everything in a chunk and take only the interesting samples?

That is only guaranteed to work if the max chunk size is greater than the buffer size, otherwise you have to pull chunks until you come up empty then use the last N samples of the last non-empty chunk. > This is a pretty common anti-pattern that real-time users are required to do.

Full agreement here, but that's the easiest way to do it at the moment.

That being said, most apps that use the C++ API opt not to provide this argument so it defaults to 360 seconds, and we can simply change the default to 0 which gets caught during creation and is modified to int(360*fs) which is functionally equivalent to what happens now.

Default arguments only change on compilation time, so for version n+1 we could output a warning if the parameter is 360 and remove the warning in n+2 in the hope that everyone has caught up.

@cboulay
Copy link
Collaborator Author

cboulay commented Jul 24, 2018

BTW, I just tested in Python creating an outlet with max_buffered=1 using either FS=0.9 or FS=1.0. The 0.9 never becomes visible to the inlet but the 1.0 works perfectly.

import pylsl
import time
import numpy as np

FS = 0.9
NCHANNELS = 8

info = pylsl.StreamInfo(name='Test', type='EEG', channel_count=NCHANNELS, nominal_srate=FS,
                        channel_format='float32', source_id='myuid34234')

# next make an outlet
outlet = pylsl.StreamOutlet(info, max_buffered=1)

print("now sending data...")
n_pushed = 0
t_start = time.time()
while True:
    time.sleep(max([0, (n_pushed / FS) - time.time() + t_start]))
    outlet.push_sample(np.random.rand(NCHANNELS))
    n_pushed += 1

@dmedine
Copy link
Contributor

dmedine commented Jul 24, 2018 via email

@dmedine
Copy link
Contributor

dmedine commented Jul 24, 2018

Oops! The above statement is incorrect. If this parameter is 0, the tcp_server is simply never created (https://github.com/sccn/labstreaminglayer/blob/master/LSL/liblsl/src/tcp_server.cpp#L396-L398)

@dmedine
Copy link
Contributor

dmedine commented Jul 24, 2018 via email

@dmedine
Copy link
Contributor

dmedine commented Jul 24, 2018

I also like @tstenner's suggestion that we add a function to somehow take the newest N samples on demand if requested. This is tricky to implement, and I don't want to get in the habit of adding a whole bunch of patchy features to address many different special cases, but I feel quite hesitant to change this---even though, in truth, I would guess that the majority of people that are actually specifying a non-default value here are people who want a 1 sample buffer length.

If this is in fact true, a simpler (but kludgier) solution is to add an optional argument to the inlet constructor to override the buffer length and force it to be 1 sample.

@cboulay
Copy link
Collaborator Author

cboulay commented Jul 24, 2018

RE the Python example, I guess my point was that this is a mistake that a user could conceivably make and there are currently no warnings or safe guards against it. IMO this needs to be fixed in the core, not the wrapper. If we're modifying the core lib anyway, then let's fix it the right way, as soon as we agree what that is.

If we leave it as int-seconds, then we need to fix and clarify some docstrings, raise errors when the buffer is 0, fix the problem with data_receiver, and it would be helpful to add a pull_latest_sample_<> for each of the pull_sample_<> functions.

Data receiver gets initialized with max_buflen seconds in the stream_inlet_impl here.
data_receiver constructor declaration, constructor implementation where the buffer_len argument is supposed to be in seconds but here is passed directly to the sample_queue_ initializer, which is an instance of consumer_queue, and its constructor expects number of samples.
Anyway, all of the above is correct (except for the doc strings and we should fixup all of the inconsistent argument names) because the number that actually gets passed to the stream_inlet_impl is in fact n-samples.

If we go with int-samples, then there are a bunch of externally-facing things to change, but it makes the details of the implementation much simpler. And as you were saying, if anyone is interested in using a non-default value then they should be proficient enough to be able to do other things like query the sampling rate. Edit: Not quite what you said, but I was referring to the proficiency of these power users.

@cboulay
Copy link
Collaborator Author

cboulay commented Jul 24, 2018

Also, optional arguments aren't a thing in C functions, so anything that uses the C API will need to provide these arguments explicitly. That includes at least the Python and Matlab wrappers.

@mgrivich
Copy link
Collaborator

mgrivich commented Jul 24, 2018 via email

@mgrivich
Copy link
Collaborator

mgrivich commented Jul 24, 2018 via email

@cboulay
Copy link
Collaborator Author

cboulay commented Jul 24, 2018

One of the key features of LSL is data synchronization.
Another key feature is easy inter-process communication.
A target audience of people using LSL for inter-process communication are people that are doing neural signal processing and real-time feedback (brain-computer interfaces). Many of these users that I've helped recently, and me too when I first started using LSL, didn't realize that you are by default fetching the oldest data in the buffer, which can be quite old! If your inlet app has some initialization time between the time you create the inlet and the time you fetch a sample/chunk then you're starting out behind. If you aren't carefully watching your signal processing rate vs your sampling/chunk rate then you may never catch up. In other words, it's really easy to always be giving feedback of old data and there's nothing to tell you that this is happening. This is terrible pitfall for people that are doing neurofeedback, which is a common use-case for LSL. This isn't hypothetical. This has happened.

At the very least we need some very strong hints in the doc strings and examples about how to avoid this problem.

Setting a buffer to only 1 sample seems like a very bad idea. If LSL throws away data to make room for new data, I would call that a critical bug, not a feature that should be used.

But it is designed to do that. If this is a critical bug then LSL should error-out whenever there is a buffer overflow and then people like me won't be misinformed about how LSL should be used.

Right now, I'd say just fix the wrappers so that they don't allow the user to request not integer values for buffer.

In my Python example above, I did request an integer value for the buffer (=1) on the outlet end, but the sampling rate was < 1.0 so it still failed silently. If I set the buffer to 2 (so 1.8 seconds = 1 sample rounded down) then it works again, but if I set the inlet buffer to 1 second (still an int) then you get Stream transmission broke off (Input stream error.); re-connecting... which is at least something that a user can request help on.

@mgrivich
Copy link
Collaborator

mgrivich commented Jul 24, 2018

A target audience of people using LSL for inter-process communication are people that are doing neural signal processing and real-time feedback (brain-computer interfaces). Many of these users that I've helped recently, and me too when I first started using LSL, didn't realize that you are by default fetching the oldest data in the buffer, which can be quite old! If your inlet app has some initialization time between the time you create the inlet and the time you fetch a sample/chunk then you're starting out behind. If you aren't carefully watching your signal processing rate vs your sampling/chunk rate then you may never catch up. In other words, it's really easy to always be giving feedback of old data and there's nothing to tell you that this is happening. This is terrible pitfall for people that are doing neurofeedback, which is a common use-case for LSL. This isn't hypothetical. This has happened.

The solution here is better documentation. Perhaps a FAQ and/or examples for people who care about real-time latency. If you can't pull till timestamp == 0, you haven't caught up. You can also check lsl_local_clock() vs the timestamps of incoming data to measure lag more explicitly.

In my Python example above, I did request an integer value for the buffer (=1) on the outlet end, but the sampling rate was < 1.0 so it still failed silently. If I set the buffer to 2 (so 1.8 seconds = 1 sample rounded down) then it works again, but if I set the inlet buffer to 1 second (still an int) then you get Stream transmission broke off (Input stream error.); re-connecting... which is at least something that a user can request help on.

I'd be fine with having the core library round up rather than down, so that this of error doesn't happen. When Christian wrote it, it appears that he didn't anticipate regular sampling rates of < 1 Hz.

@tstenner
Copy link
Collaborator

We are not really breaking the API. We are fixing functions that have always been bugged.

Which still crashes with combinations of programs compiled against the wrong library version so it's by definition an ABI break, since the types change the API changes as well even though most compilers will only warn about it.

I also like @tstenner's suggestion that we add a function to somehow take the newest N samples on demand if requested. This is tricky to implement, and I don't want to get in the habit of adding a whole bunch of patchy features to address many different special cases

With a queue implementation that has a queryable element count, it's (in pseudocode) a simple

queue.drop(std::max(0, queue.count() - sample_count));
return queue.pop_all();

This is in principle the same as pulling everything and discarding the older samples, but it leaves out potentially costly copies and conversions. I'd rather implement it as one drop_and_keep_newest_n_samples method so we don't get another 8 overloads for all pull functions.

Again, I don't see why this kind of functionality is desirable. Data loss is the worst, and adding functions to implement data loss seems to be asking for trouble. If the user really doesn't care about old data, they are free to pull it and not use it.

This requires several buffers to read the data, discard the oldest data and put the newest samples back together. Simple example: there are 13 samples available and you want the newest 6, so you have to pull 6 samples until there are no more samples available while rotating the current and last buffer, then copy the newest 1 sample together with the older 5 samples. It's a common use case so I'd prefer the more error proof "drop all except the latest 6 samples, pull 6 samples, done`.

@dmedine
Copy link
Contributor

dmedine commented Jul 25, 2018 via email

@tstenner
Copy link
Collaborator

Right, it's tricky to implement.

Not with the Boost Queue, see spsc_queue::read_available().

@mgrivich
Copy link
Collaborator

I would be surprised if the current implementation has significant performance overhead.

If we discover excessive copies being made, those should be replaced with pass-by-reference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants