Median calculation in Threshold #21

ElliotMebane · 2019-05-17T05:44:23Z

I noticed that the median seemed to favor the last value of the range rather than the whole range of values in the interval. I checked the code, and the median calculation only uses the final value of the loop iterator:
for (i = 0, sum = 0; (i <= 1024) && (sum < numtrue); i++)
{
    sum += buckets[i];
}
to_input = size_value(0.0f, 1024.0f, (float)i, in_ports[0].in_min, in_ports[0].in_max, 0);

It looks like the buckets need to be sorted then the middle value in the list should be chosen.

ChrisVeigl · 2019-05-17T07:36:03Z

Oops! That bug has been in there for a long time! Actually, the median value was added by a contributor and I did not really check its functionality well enough! I'll have a look when time permits.

ChrisVeigl · 2019-05-19T12:31:21Z

> It looks like the buckets need to be sorted then the middle value in the list should be chosen.

The code for the median was contributed before BrainBay was under version control.
After having a look, I'm not so sure that the calculation is wrong:

The buckets are used to avoid sorting the incoming values (to save computation effort, particularly for larger intervals). They divide the whole signal range into 1024 "bins" of equal size (sacrificing precision). For each incoming value, its associated bin is increased by one. So the for loop above IMO makes sense for finding the bottom x% or top x% of the values (represented by the bottom / top numtrue bin entries, where numtrue is the number of samples in that interval * x/100).
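The bucket approach described above can be sketched as follows. This is only an illustration, not BrainBay's actual code: `bin_index`, `bucket_percentile` and `kLastBin` are hypothetical names, and the sketch assumes that out-of-range samples are clipped to the first or last bin and that the lower edge of the matching bin is reported. With `percent` set to 50 the result approximates the median without sorting the samples.

```cpp
#include <array>
#include <vector>

// Hypothetical names; a sketch of the idea, not BrainBay's implementation.
constexpr int kLastBin = 1024;  // bins 0..1024, so 1025 entries in total

// Map a sample to its bin, clipping values outside [lo, hi].
int bin_index(float v, float lo, float hi) {
    float t = (v - lo) / (hi - lo);
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;
    return static_cast<int>(t * kLastBin);
}

// Lower edge of the first bin at which the cumulative count
// reaches `percent` percent of all samples.
float bucket_percentile(const std::vector<float>& samples,
                        float lo, float hi, float percent) {
    std::array<int, kLastBin + 1> buckets{};
    for (float v : samples) {
        ++buckets[bin_index(v, lo, hi)];
    }

    const float numtrue = static_cast<float>(samples.size()) * percent / 100.0f;

    // Stop on the bin that crosses the threshold, not one past it.
    int i = 0;
    int sum = buckets[0];
    while (i < kLastBin && sum < numtrue) {
        ++i;
        sum += buckets[i];
    }
    return lo + (hi - lo) * static_cast<float>(i) / kLastBin;
}
```

For the samples 1, 2, 3, 4, 100 with a range of 0 to 1024, `bucket_percentile(samples, 0.0f, 1024.0f, 50.0f)` returns 3, which matches the median obtained by sorting. The cost is one pass over the samples plus one pass over at most 1025 bins, at the price of bin-width precision.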

ElliotMebane · 2019-05-19T18:47:18Z

OK, I thought the traditional use of the term median was being used (middle bucket).

The fan on my VR-ready laptop engages when BrainBay runs, so I suppose all the optimization that can be done is worth it.

-- New values outside the min/max settings get clipped to the lowest/highest bucket in the incoming_data method. Not sure what impact that may have.
-- There are 1025 entries in the buckets array, FYI. The big/small adapt blocks use for loops that seem to be consistent with that length (one counts up and the other counts down), but be careful not to assume the length is 1024.
-- I'm not sure if the for loops in the bigadapt/smalladapt blocks are counting in the correct directions. The bigadapt block counts backwards until the percentage has been met. So a high percentage would trim off the bulk of the top values, returning a number on the low side of the range.
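The direction question in the last point can be illustrated with a small sketch. It assumes the adapt blocks walk the 1025-entry bucket array as described in this thread; `bin_from_bottom`, `bin_from_top`, `Buckets` and `kLastBin` are hypothetical names and not taken from the BrainBay source.

```cpp
#include <array>

// Hypothetical names; a sketch of the two loop directions only.
constexpr int kLastBin = 1024;                    // valid indices are 0..1024
using Buckets = std::array<int, kLastBin + 1>;    // 1025 entries

// Count up from bin 0 until `percent` percent of `total` samples are covered.
int bin_from_bottom(const Buckets& b, int total, float percent) {
    const float numtrue = total * percent / 100.0f;
    int i = 0;
    int sum = b[i];
    while (i < kLastBin && sum < numtrue) {
        ++i;
        sum += b[i];
    }
    return i;
}

// Count down from bin 1024 until `percent` percent of `total` samples are covered.
int bin_from_top(const Buckets& b, int total, float percent) {
    const float numtrue = total * percent / 100.0f;
    int i = kLastBin;
    int sum = b[i];
    while (i > 0 && sum < numtrue) {
        --i;
        sum += b[i];
    }
    return i;
}
```

With one sample in each of bins 0 to 99 (100 samples in total), `bin_from_top` stops at bin 90 for 10% but at bin 10 for 90%. Under these assumptions, a larger percentage in the counting-down loop does land on the low side of the range, as noted above.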
