RFC on Hypotheses Resampling #196
Conversation
rfcs/0000_hypotheses_resampling.md
# Summary

Resample hypotheses at every step in a manner inspired by particle-filters. This is the first step for Monty to interact with multiple objects and recognize compositional objects.
A link to some background reading on particle filters would be great.
Ah right. @vkakerbeck shared this video a week ago. I thought it was really helpful.
https://www.youtube.com/watch?v=NrzmH_yerBU
I can add it to the RFC if needed.
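In case it helps to have something concrete next to the video, here is a tiny, self-contained sketch of the classic particle-filter resample step (toy 1D example, not Monty code; all values are made up):

import numpy as np

rng = np.random.default_rng(0)

# Toy 1D example: particles are candidate states, weights reflect how well
# each particle explains the latest observation.
particles = rng.uniform(0.0, 10.0, size=1000)
observation = 4.2
weights = np.exp(-0.5 * ((particles - observation) / 0.5) ** 2)
weights /= weights.sum()

# Resample in proportion to the weights: likely particles get duplicated,
# unlikely ones are dropped, and the particle count stays the same.
indices = rng.choice(len(particles), size=len(particles), p=weights)
resampled = particles[indices] + rng.normal(0.0, 0.05, size=len(particles))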
Also maybe check out this (more specific to Monty): https://thousandbrainsproject.readme.io/docs/test-particle-filter-like-resampling-of-hypothesis-space
suggestion: Both of these links would be useful to include in the RFC.
Nice, thanks for putting this together! I like the idea of using the slope as a signal for sampling, i.e. even when the total evidence counts might be low.
Just a couple of comments, e.g. I think it would be worth also emphasizing the uniform sampling, because I think this will still be useful; otherwise we might not see a good slope for the new object in the first case? At least based on the second plot, it looks like it takes a while for the slope to appear.
I've also just written a high-level comment about the dataset we want to evaluate this work on, as this might be something we want to revisit.
rfcs/0000_hypotheses_resampling.md
1) For every object, at every new step, get initial hypotheses based on the observed pose.
2) If principal curvature is not defined, skip
3) Uniformly sample M percent from these new hypotheses (e.g., M = 20% = 1000 hyp)
I would mention further up in the RFC that we would also do some uniform sampling (which I think is reasonable, but I think this is the first time it is mentioned).
I've modified the summary section. Let me know if it's not clear enough.
* Here we would focus on detecting the change that triggers full reset of Monty states.
* Detecting the object change will likely be based on evidence decay and can be problematic. If an object was added and quickly changed before accumulating enough evidence, it will be difficult to detect change based on evidence decay. This will still result in incorrect initial hypotheses. Resampling without a hard reset is more general.

# Future possibilities
One thing I've been thinking about in terms of the artificial nature of this task (which is also something we've discussed before) is that this current setup is a bit like a "magic trick" where someone puts their hand in a box to feel an object with one finger, then after a couple minutes, someone discreetly replaces the object without their finger really feeling a major change. In other words, it's not very naturalistic. What we are actually trying to build towards is being able to seamlessly move from one sub-object to another if we have e.g. a logo on a mug, or a car composed of parts, and for the LM representation to update without needing a supervisory signal that it's on a new object. In the compositional case however, if the LM was 100% confident about where it was on a learned reference frame (e.g. on the "N" in the child object logo), after performing a movement onto the parent object (e.g. mug), its hypothesized location in the child object's reference frame would now be in empty space, far from any learned features that exist for that object. In that case, the evidence associated with this "off-object" location should essentially be zero by definition, and it should be straightforward to implement this.
In the above case, resampling would still be important, but I think it just emphasizes that the evidence scores could be a lot less adversarial than in the current experiment setup.
For what it's worth, in the above situation, if you moved back onto the object, I think it would be the responsibility of a higher level LM that remembers e.g. logo at location x to bias the low-level LM, rather than requiring the low-level LM to maintain a memory of the previous object. This would tie in with resampling, i.e. we wouldn't need to worry about keeping lots of sampling points for the old object if we've moved off of it.
@vkakerbeck what do you think? It almost makes me wonder whether we would be better off having the compositional dataset to evaluate these approaches on, rather than trying to shoehorn it into the current YCB object cycling setup. This is similar to some of Jeff's comments at the retreat. Maybe something we can also discuss on Monday.
Good points. I think this may relate to the mechanism for switching between matching and exploration policies we discussed briefly:
- If we have recognized an object and are now exploring it, we don't need to keep testing hypotheses, we just make predictions using our model of this object and check if these predictions are correct.
- We would exit this mode if 1) our prediction is suddenly incorrect (the artificial setup where you suddenly replace the object, or you were wrong about the object) or 2) the sensor moves to a location where the model wouldn't predict the object to exist anymore (moving off the object), which is the usual, more natural case.
- The second case would give us a model-based signal for switching back into matching mode and potentially reinitializing the entire hypothesis space.
- In my mind we are currently dealing with the case where we have actually not recognized the object yet, so we can't make model-based predictions. I mean you could argue that we can make model-based predictions using the mlh (which we do when testing that hypothesis), but it is not a reliable enough signal to be used for a hard reset.
In terms of the test bed we use, I think that we can keep using the current setup for the hypothesis resampling work since this mechanism doesn't rely on physical movement from one object onto another. It should actually be useful for recognizing a single object in general (removing reliance on the first observation and allowing us to test fewer hypotheses at any point in time).
However, it does bring up a good point on whether this item is really the most impactful one for us to tackle for the unsupervised learning milestone. We could evaluate whether it makes more sense to start with the prediction error/off-object prediction signal I outlined. In that case we would definitely need a multi-object environment setup since this relies on a physical separation between the two objects.
Lmk what you think. Happy to talk about this more in a meeting too.
Yeah I think my concern with optimizing for this current problem setup is that we may end up with solutions that aren't actually that necessary / useful. E.g. clearly the hypothesis resampling is useful, but the exact details of it might be led on a tangent by trying to deal with this adversarial (and artificial) problem setup.
- One option is to just use this as you say, but we should be cognizant of this caveat and not worry about getting super high accuracy in this setting, and just consider it as a test-bed for resampling.
- The other option is, after this RFC, to refocus on the actual dataset we want to deal with. A lot of unsupervised learning features aren't actually important if we are learning a dataset where compositional objects can be learned in isolation, in which case we can potentially refocus the items under the first milestone.
I agree with @nielsleadholm that this setup is not very naturalistic and might lead to some unnecessary solutions because we're optimizing for a much harder experiment. I like the option where we consider this a test-bed for resampling and not worry about super high accuracy. In practice, Monty should be able to always accumulate evidence even when we switch objects in place, but we don't need to spend too much time optimizing the speed of inference between objects.
I also like @vkakerbeck's distinction between model-based and model-free evidence accumulation. If we have not recognized a model, we can't really use a model to make good predictions. This is where hypotheses resampling is most useful - to seamlessly adapt to changes while still trying to recognize the object. But if we have recognized the model, we can make predictions and figure out when a "hard" reset might be needed. This would require that we continuously test the recognized model at every exploratory step. My only concern here is how do we know that a prediction error due to an unseen part of the object (e.g., previously unseen handle on a mug) should be treated as part of the object (i.e., add it to the cylinder graph) vs. trigger a prediction error reset for recognizing a new object that happens to be spatially close? Is this problem specific to learning-from-scratch?
I am a bit biased towards keeping this setup until we implement a decent hypotheses resampling LM because multiple objects would require us to think about (hierarchical) action policies to move between objects.
- If it's a compositional object, such as wheels on a car, we would use a hypotheses-based hierarchical action policy. For example, if I know the higher level LM is on the car and the lower level LM is on the front wheel, now the higher level LM can instruct the lower level LM to move to the back wheel and reset its hypotheses (and perhaps even bias the new hypotheses with extra evidence for the back wheel object as Niels mentioned)
- If we are not on a compositional object, we need to think about model-free action policies to move from one object to another. Maybe we can hard-code that in for now as a hack (i.e., after x steps move the sensor to object 2 at a known location)? Note that the "model" in "model-free" here refers to higher-level compositional model.
@nielsleadholm when you say compositional dataset, are you talking about the dinner set scenes?
Great, Viviane and I were discussing this earlier this morning and yes what we are thinking is:
- Clarify that the current work to enable a first version of compositional objects is about unsupervised inference, not learning. For now, we assume that we are going to learn sub-objects in isolation (e.g. mug and logo, see below), and so it's more about equipping Monty to handle inference where it may move from one object onto another.
- Continue the current work you are doing. Adding the ability to resample hypotheses will be important for several aspects of compositional objects. This first version would actually be best tested on the existing benchmarks by seeing whether we can get more robust inference with the same number of initial points, or equally achieve the same accuracy by starting with fewer initial points. We can still use this artificial object-swapping task as an adversarial condition to magnify the benefits of resampling, but we don't need to optimize parameters to achieve high performance on it.
- Later, create an RFC proposing how we can use more model-based resetting based on moving off of an object. I agree this will be an interesting question to explore in terms of untangling prediction errors that should inform learning vs. hypothesis resetting. E.g. maybe the default is that the hypothesis space is reset, but with some evidence accumulating that the model might be incomplete. Given enough movements with a consistent prediction error, this might trigger a more learning-style phase. But there are clearly complications to work through.
- The compositional dataset would initially be a simple one consisting of different objects with different logos on their surfaces. That way we can learn the sub-components in isolation. This ties in with simplifying the first part of research to focus on unsupervised inference, and in the future we can return to unsupervised learning to enable more complex compositional datasets.
Nice outline and great figures! Overall really like it. I just asked for clarifications in a few places. On a major note, we should talk more about the longer thread in the future possibilities section (is this really the most impactful next step?). This shouldn't keep this RFC from being merged but may determine whether it's actually the next thing to work on.
rfcs/0000_hypotheses_resampling.md
2) If principal curvature is not defined, skip
3) Uniformly sample M percent from these new hypotheses (e.g., M = 20% = 1000 hyp)
4) Sample N percent from the existing hypotheses distribution of highest evidence slope (e.g., N = 10% = 500 hyp). These sampled hypotheses should be "close" to the most likely hypotheses.
5) Replace unlikely M+N percent of existing hypotheses (e.g., 30% = 1500 hyp) with these newly sampled hypotheses. The unlikely hypotheses are chosen based on evidence change.
How is the evidence count for these new hypotheses initialized?
On a technical level, we have this large matrix of hypotheses; do you basically replace the low-evidence ones in place with the new ones and thereby not change the shape of the hypothesis space?
I was thinking we would just initialize with the mean of the evidence from the corresponding graph. It wouldn't matter when sampling new hypotheses and removing old ones because we can use evidence slope for that, but it would matter for deciding on the mlh for terminal state.
Yes, that was my plan, but I'm updating the RFC to include more flexible resampling that doesn't necessarily keep the same number of hypotheses.
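To make the in-place idea concrete, here is a rough sketch (illustrative only, not the actual `EvidenceGraphLM` code) of replacing the lowest-slope hypotheses and initializing the newcomers with the mean evidence of the graph:

import numpy as np

def replace_unlikely_hypotheses(evidence, evidence_slope, num_new):
    """Illustrative sketch: overwrite the lowest-slope hypotheses in place.

    `evidence` and `evidence_slope` are 1D arrays over the current hypothesis
    space; the space keeps its shape, only the selected slots are reused.
    """
    replace_ids = np.argsort(evidence_slope)[:num_new]
    # Newcomers start at the mean evidence of the graph so they have a
    # fighting chance, without immediately competing for the mlh.
    evidence[replace_ids] = evidence.mean()
    return replace_ids  # the caller would also overwrite poses/locations at these ids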
* Detecting the object change will likely be based on evidence decay and can be problematic. If an object was added and quickly changed before accumulating enough evidence, it will be difficult to detect change based on evidence decay. This will still result in incorrect initial hypotheses. Resampling without a hard reset is more general.

# Future possibilities
* This only works in inference assuming that we have good object models learned in isolation. It likely breaks learning from scratch experiments.
Why would you think it would break learning from scratch? Do you mean because we would never have all hypotheses go below 0 evidence and can therefore never classify no_match? I would be curious about the details of how you plan to initialize the evidence values for the new hypotheses. I think we should try to make this mechanism work in the learning setting as well. Maybe by initializing with the average evidence, which could be negative? I'd have to think about this a bit more.
I think I am more concerned about environments with multiple objects during the learning phase. If we assume objects are learned in isolation, then these pretrained graphs will work fine during inference. But in learning from scratch, don't we use the same environment for learning and inference? If we start a multi-object episode, wouldn't we have to use it during learning too? Then objects won't be learned in isolation and the graphs we use for matching might end up being graphs of multiple objects. I could be missing something here...
There is nothing that forces us to use the same environment setup during training as during evaluation. We could first show objects in isolation while not providing labels (making it a learning from scratch setup) and then show them in multi object setups. I'm not saying this would work great, even in the current unsupervised learning setup with isolated objects we often merge multiple objects into the same graph. Just saying that not providing labels doesn't keep us from showing objects in isolation initially.
I see. I had the wrong assumptions about the learning from scratch setup then.
* This only works in inference assuming that we have good object models learned in isolation. It likely breaks learning from scratch experiments.
* I manually force Monty back into the matching phase and reset its counters between episodes.
* Maybe we should brainstorm about what a terminal state should be in a multi-object environment.
* Maybe also brainstorm about ways to more seamlessly switch between matching and exploration? I don't like forcing Monty to always be in matching mode; by doing this it feels that we are making the gap between learning and inference wider.
Definitely two good topics to talk about more. I wrote my thoughts on both in earlier comments. lmk what you think.
* Here we would focus on detecting the change that triggers full reset of Monty states.
* Detecting the object change will likely be based on evidence decay and can be problematic. If an object was added and quickly changed before accumulating enough evidence, it will be difficult to detect change based on evidence decay. This will still result in incorrect initial hypotheses. Resampling without a hard reset is more general.

# Future possibilities
I think another future work item would be to test if we can use this mechanism to make Monty more efficient. Basically we don't have to initialize 1000s of hypotheses on the first step. We can initialize a small subset and then iteratively refine these hypotheses. This was one of my initial excitements about this idea.
Yeah good point! I find it interesting that you said start with fewer hypotheses and add more, because that's the opposite of what they explained here on particle filters. They started with many particles and reduced the number as the model started narrowing down the location (i.e., a simpler distribution can be represented with fewer particles). I can see good arguments for both ways actually.
I know you mentioned this as part of future work, but I think it makes sense to accommodate for it here. I've updated the RFC to allow for changing the count of hypotheses. So if we want to start with a small number and increase the hypotheses by 10% every step, we can define `hypotheses_count_ratio=1.1`. We can also control where these hypotheses come from using the `hypotheses_old_to_new_ratio` and `hypotheses_informed_to_reinforce_ratio`.
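As a purely illustrative example of the growing-count case (numbers made up):

# Start with a small hypothesis space and grow it by 10% every step.
hypotheses_count_ratio = 1.1

count = 200
for step in range(5):
    count = int(count * hypotheses_count_ratio)
    print(step, count)  # 220, 242, 266, 292, 321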
Upon reading your comment again, I see now that you said "iteratively refine" not increase 😅 Yeah makes sense!
Thanks for the great updates! I like the proposed changes to the code, as well as the parameters you've introduced. Just added a few comments.
rfcs/0000_hypotheses_resampling.md
* Number of hypotheses should not scale up with steps, if anything they should decrease. For now, any sampled hypothesis must replace an old "unlikely" hypothesis.
* Sampling of likely and unlikely hypotheses is based on evidence change instead of absolute evidence.
* Terminal state calculation is still based on accumulated evidence and `x_percent_threshold`.
* The resampling procedure only occurs if principal curvature is defined.
Yeah just elaborating on this further - for objects like the ball, we shouldn't need a ton of hypotheses to recognize them, because the poses are symmetric. So I think this fits with the suggestion that if pose is undefined, we can add some hypotheses, but rather than adding them for e.g. 8 different rotations, we can just add them for say, 4 (or even 2). If it is an object that is less symmetric than a sphere, then this should hopefully get accounted for by informed hypotheses that are sampled when we get onto more pose-defining features.
(this also couples well with Monty's approach to symmetry, i.e. it doesn't matter if the poses it samples and converges to aren't the Euler-angle equivalent of the ground-truth pose, especially if it has stored which poses are equivalent, which is a feature in the pipeline)
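To sketch that idea (illustrative only; the exact rotation counts here are assumptions, not decided values):

from scipy.spatial.transform import Rotation

def candidate_rotations(pose_defined, num_if_undefined=4):
    """Illustrative: how many rotations to hypothesize per sampled graph point."""
    if pose_defined:
        # pose-defining features leave mainly the curvature-direction flip ambiguous
        return Rotation.from_euler("z", [0, 180], degrees=True)
    # undefined pose (e.g. on a sphere): a handful of random rotations may suffice
    return Rotation.random(num_if_undefined, random_state=0)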
rfcs/0000_hypotheses_resampling.md
Resample hypotheses at every step in a manner inspired by particle-filters. This is the first step for Monty to interact with multiple objects and recognize compositional objects. The newly sampled hypotheses will come from:
1) a subset (uniformly sampled) of new hypotheses initialized based on the current step observation.
2) a set of newly sampled hypotheses from the distribution of the most likely hypotheses.
3) a subset of the old hypotheses based on the metric representing the most likely hypotheses.
I found the wording of (3) a bit confusing/hard to read for some reason. Maybe something like "A subset of the old hypotheses. Which of these are maintained is based on...".
Also for 2 and 3, my understanding is we are basing resampling on the first-order derivative of their evidence accumulation. It might be worth saying something like "... the most rapidly rising hypotheses" rather than "most likely".
rfcs/0000_hypotheses_resampling.md
3) Uniformly sample M percent from these new hypotheses (e.g., M = 20% = 1000 hyp)
4) Sample N percent from the existing hypotheses distribution of highest evidence slope (e.g., N = 10% = 500 hyp). These sampled hypotheses should be "close" to the most likely hypotheses.
5) Replace unlikely M+N percent of existing hypotheses (e.g., 30% = 1500 hyp) with these newly sampled hypotheses. The unlikely hypotheses are chosen based on evidence change.
1) If principal curvature is not defined, skip
Based on the separate discussion, this may need to be updated.
Sounds good, I've removed this from the resampling procedure and the constraints.
rfcs/0000_hypotheses_resampling.md
## High Level Code Changes

The needed modification will only change the `EvidenceLM` class. More specifically, we would be either be modifying the `_update_evidence` function
"be either be modifying" ... "defining" --> "modify either"... "define" (noticed a typo, but also a useful reminder to use active voice in writing 😅)
rfcs/0000_hypotheses_resampling.md
## High Level Code Changes

The needed modification will only change the `EvidenceLM` class. More specifically, we would be either be modifying the `_update_evidence` function
directly or defining another function that calls `_update_evidence` as one of its steps. A rough proposal of the needed changes is shown below.
For function number 8, it would be nice to refactor out the existing section below into its own function so that it can ideally be used as-is both i) in the existing `_update_evidence`, and ii) in the new setting you are considering.
Lines 952-1002 in evidence_matching.py
# Before moving we initialize the hypothesis space:
if displacements is None:
...
Yes, makes sense. I may end up rewriting and factoring out parts of `_update_evidence` in the process. I was thinking of just rewriting `_update_evidence` to support resampling and breaking it up into manageable smaller functions.
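To sketch roughly what that decomposition could look like (placeholder names, nothing here is existing Monty code):

def calculate_resampling_counts(state):
    return {"old": 0, "informed": 0, "reinforced": 0}  # stub

def select_old_hypotheses(state, counts):
    return []  # stub: keep the hypotheses with the highest evidence slope

def sample_reinforced_hypotheses(state, counts):
    return []  # stub: sample near the most rapidly rising hypotheses

def sample_informed_hypotheses(state, counts, observation):
    return []  # stub: initialize from the currently sensed pose

def update_evidence_with_resampling(state, observation):
    counts = calculate_resampling_counts(state)
    kept = select_old_hypotheses(state, counts)
    reinforced = sample_reinforced_hypotheses(state, counts)
    # kept + reinforced hypotheses get displaced and their evidence updated;
    # informed hypotheses are appended afterwards without displacement
    informed = sample_informed_hypotheses(state, counts, observation)
    return kept + reinforced + informed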
Nice, yeah that would be great. It's a pretty big method at the moment.
rfcs/0000_hypotheses_resampling.md
| Parameter | Description | Range |
| ----------|-------------|-------|
| **hypotheses_count_ratio** | A multiplier for the needed number of hypotheses at this new step | [0, inf) |
| **hypotheses_old_to_new_ratio** | How many old to new hypotheses to be added. `0` means all old, `1` means all new | [0, 1] |
| **hypotheses_informed_to_reinforce_ratio** | How many informed (sampled based on newly observed pose) to reinforced hypotheses (sampled close to existing likely hypotheses) to be added. `0` means all informed, `1` means all reinforced | [0, 1] |
I think it would be useful to define informed and reinforced earlier in the document, as these terms are used a few times but are a bit confusing without a clear description.
yeah, good idea. I've added them to the summary. Also I came up with "reinforced" and I'm not sure if it's the best term, but we can go with it for now if no objections.
rfcs/0000_hypotheses_resampling.md
def calculate_new_hypotheses_counts(
    curr_hypotheses_count, new_informed_hypotheses_count
What exactly is the `new_informed_hypotheses_count` parameter for? It would be useful to have some description here and on line 135.
These are the total number of informed hypotheses sampled based on the current pose observation. Assuming there are `x` points in the object graph, these can be `2x` if the PC is defined or `8x` if PC is undefined. When calculating the needed number of informed hypotheses (i.e., `needed_new_informed_hyp`), we should not request more than the available `new_informed_hypotheses_count`. If we request more, the difference will come from reinforced hypotheses, which are technically infinite.
Note that we do not need to actually get the informed hypotheses at this step to know how many we have, we can just use the number of points in the graph and whether the pose is defined.
I'll add some description.
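For example (just restating the arithmetic above as code):

def informed_hypotheses_count(num_graph_points, pose_defined):
    # 2 candidate rotations per point when PC is defined, 8 when undefined
    return num_graph_points * (2 if pose_defined else 8)

assert informed_hypotheses_count(1000, pose_defined=True) == 2000
assert informed_hypotheses_count(1000, pose_defined=False) == 8000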
rfcs/0000_hypotheses_resampling.md
needed_hypotheses_count * (1 - hypotheses_old_to_new_ratio),
needed_hypotheses_count * hypotheses_old_to_new_ratio,
)
if needed_old_sampled_hyp > curr_hypotheses_count:
It would be worth adding a comment here, e.g. if trying to include more old hypotheses than exist, set their count as the ceiling.
The needed modification will only change the `EvidenceLM` class. More specifically, we would be either be modifying the `_update_evidence` function
directly or defining another function that calls `_update_evidence` as one of its steps. A rough proposal of the needed changes is shown below.
It would be worth thinking about whether function 6 "Resample old and reinforced" can be broken up into two methods - at least on the surface it seems like these should probably be separated out.
Good point. The main purpose of the proposed figure is to show roughly where the changes will occur within the `matching_step`, but I imagine that the exact number of functions will change as I start implementing. I may even do all the changes in `update_evidence` as discussed above. But at least this gives you an idea of where I plan to start.
Nice, thanks for making those changes! Just a couple more very minor comments but otherwise it looks ready to merge from my perspective.
rfcs/0000_hypotheses_resampling.md
3) a subset of the old hypotheses based on the metric representing the most likely hypotheses.
1) **Informed Hypotheses:** a subset (uniformly sampled) of new hypotheses initialized based on the current step observation (i.e., sensed pose).
2) **Reinforced Hypotheses:** a set of newly sampled hypotheses from the distribution of the most rapidly rising hypotheses.
3) **Old Hypotheses:** a subset of the old hypotheses, sampled based on the most rapidly rising hypotheses.
Nice thanks for writing these descriptions, I think it would be useful to maybe say "maintained based on..." rather than "sampled based on..." for (3) - sampled to me suggests drawing from a random distribution, whereas here we are just keeping ones that already exist.
rfcs/0000_hypotheses_resampling.md
Outputs:
- `needed_old_sampled_hyp`: The number of existing hypotheses to be kept from the
  previous step. Should not exceed `curr_hypotheses_count`.
Rather than "should not", maybe describe it as "Will not exceed" - should implies to me that it is a user-specified parameter that the user needs to be mindful of setting correctly.
Really nice, thanks for the updates! I just left one more major comment around the execution order of step 1 and 2. But happy to approve this already so you can get started on implementing this.
rfcs/0000_hypotheses_resampling.md
* Sampling of likely and unlikely hypotheses is based on evidence change instead of absolute evidence.
* Terminal state calculation is still based on accumulated evidence and `x_percent_threshold`.
* The resampling procedure only occurs if principal curvature is defined.

## The Resampling Procedure

1) For every object, at every new step, get initial hypotheses based on the observed pose.
I think for efficiency reasons it could make sense to do step 2 first, and then only calculate initial hypotheses for `needed_new_informed_hyp` locations on the object.
Yes, that makes sense. This is what I planned out in the high-level code change diagram, but I forgot to update this here in the resampling procedure.
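Something roughly like this (illustrative sketch, assuming 2 rotations per graph point; names are placeholders):

import numpy as np

def sample_informed_point_ids(graph_point_ids, needed_new_informed_hyp, rng):
    """Pick only the graph points we will expand into informed hypotheses."""
    num_points = max(1, needed_new_informed_hyp // 2)  # assumes 2 rotations/point
    num_points = min(num_points, len(graph_point_ids))
    # could also be biased toward points whose stored features match the sensed ones
    return rng.choice(graph_point_ids, size=num_points, replace=False)

rng = np.random.default_rng(0)
ids = sample_informed_point_ids(np.arange(5000), needed_new_informed_hyp=1000, rng=rng)
print(len(ids))  # 500 points -> ~1000 informed hypotheses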
*Note that the two types (reinforced and informed) of resampled hypotheses are treated differently.
Unlike reinforced hypotheses, the current positions of the informed hypotheses do not need to be rotated and displaced at resampling.
Informed hypotheses are added after the reinforced (and old) hypotheses are updated/displaced.*
You still plan on adding feature evidence to the new informed hypotheses, right? (like in `_calculate_feature_evidence_for_all_nodes`)
Yes. I would need to compare with sensed features and add evidence if needed.
At this point, I'm probably better off rewriting `_update_evidence` with support for resampling.
rfcs/0000_hypotheses_resampling.md
| Parameter | Description | Range |
| ----------|-------------|-------|
| **hypotheses_count_ratio** | A multiplier for the needed number of hypotheses at this new step | [0, inf) |
Does this mean if it is 1, the number of hypotheses remains the same?
Ah I see this below now :) Could be worth adding to the description here too since you have that in the description of the other params as well and it's useful info
Yeah, I'll add that.
- `curr_hypotheses_count`: The current number of hypotheses
- `new_informed_hypotheses_count`: The total number of informed hypotheses
  sampled based on the current pose observation. Assuming there are x points
  in the object graph, these can be 2x if the PC is defined or 8x if PC is undefined.
Related to my earlier comment, I think it would be more efficient if we sampled the informed hypotheses after we calculated the counts so we don't need to calculate possible poses for all points in the graph. You could take the `needed_new_informed_hyp` and sample that number/2 points from the graph (these could even be sampled in an informed way by first picking the ones where the features actually match the sensed features).
This would remove the need for this parameter but would introduce the risk that `needed_new_informed_hyp` is larger than the number of points in the graph * 2. Is this actually something that would realistically happen? In what case would this happen? If `hypotheses_count_ratio` is >1? Do we want to support this case? And if so, could we just duplicate hypotheses with some noise added to them to fill up the number of requested hypotheses?
Yes, that's a good point. I don't plan on sampling the informed hypotheses first, but I only need the number of expected informed hypotheses at this step. We can quickly figure that out from the number of graph points and whether PC is defined. Then, I can know if the `needed_new_informed_hyp` is more than the actual informed hypotheses and handle it. I will only sample what I need from the graph later.
Yes, we run this risk if `hypotheses_count_ratio` is > 1. Another case is if we started off with undefined PC at step 1 (graph points * 8) and moved to defined PC at step 2 (graph points * 2). Even if `hypotheses_count_ratio` is < 1, we could still request too many informed hypotheses to maintain the high number of hypotheses we started with. A bit of an edge case, but could happen.
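To put rough numbers on that edge case (all values made up):

# A 1000-point graph with undefined PC at step 1, defined PC at step 2.
graph_points = 1000
step1_hypotheses = graph_points * 8                               # 8000
hypotheses_count_ratio = 0.9                                      # shrinking slightly
needed_at_step2 = int(step1_hypotheses * hypotheses_count_ratio)  # 7200
available_informed_step2 = graph_points * 2                       # only 2000 informed
print(needed_at_step2 > available_informed_step2)  # True: rest must come from old/reinforced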
rfcs/0000_hypotheses_resampling.md
2) Calculate the needed hypotheses counts to be sampled based on the defined parameters as shown [here](#The-Resampling-Count-Calculation).
3) Sample `needed_old_sampled_hyp` from the existing hypotheses based on highest evidence slope. We will keep these old hypotheses and remove the rest.
4) Uniformly sample `needed_new_informed_hyp` from the new informed hypotheses. We will add these new hypotheses.
5) Sample `needed_new_reinforced_hyp` from the existing hypotheses distribution of highest evidence slope. These sampled hypotheses should be "close" to the most likely hypotheses.
I'm not sure what you are planning exactly for how to initialize the evidence values but I think it would be worth considering using the average of the most likely hypotheses (the ones that the new hypotheses are sampled from) for this here instead of the average evidence for all hypotheses. We can use the average evidence of all hypotheses for step 4 (+evidence based on the sensed feature match)
Yes, great idea!
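A small sketch of that initialization scheme (illustrative only; the exact form would be worked out in the PR):

import numpy as np

def initial_evidence(evidence, likely_ids, feature_evidence, kind):
    """Reinforced: mean of the likely hypotheses they were sampled around.
    Informed: mean over all hypotheses plus the sensed-feature match evidence."""
    if kind == "reinforced":
        return evidence[likely_ids].mean()
    return evidence.mean() + feature_evidence  # kind == "informed"

evidence = np.array([5.0, 4.0, 1.0, 0.5])
likely_ids = np.array([0, 1])
print(initial_evidence(evidence, likely_ids, 0.0, "reinforced"))  # 4.5
print(initial_evidence(evidence, likely_ids, 0.3, "informed"))    # 2.925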
Resample hypotheses at every step in a manner inspired by particle-filters. This is the first step for Monty to interact with multiple objects and recognize compositional objects. The newly sampled hypotheses will come from:
1) **Informed Hypotheses:** a subset (uniformly sampled) of new hypotheses initialized based on the current step observation (i.e., sensed pose).
2) **Reinforced Hypotheses:** a set of newly sampled hypotheses from the distribution of the most rapidly rising hypotheses.
Is "reinforced hypotheses" a known term from particle filters? To me it sounds maybe a bit ambiguous as to what they actually are. If there is a common terminology for this already we should use this. I also find "informed hypotheses" a bit confusing in this context now (since technically all 3 types are informed, just by different things). Initially I used the term to contrast the initialization to uniformly initialized hypotheses. I get how it would be useful to keep using the same terminology so I don't have a good solution. I just noticed reading through the code that I was getting quite confused at time and the variable names don't seem so intuitive. I don't know you @nielsleadholm you have any thoughts or if we should just stick with it and add good documentation.
It is not a known term, I just needed a placeholder and this is the best I could come up with 😅
My intuition is we are reinforcing existing hypotheses that are most likely correct by adding ones that are "close" to them. Open to suggestions though..
We could call it resampled hypotheses, similar to here: https://stonesoup.readthedocs.io/en/v1.5/auto_tutorials/sampling/ResamplingTutorial.html Or offspring hypotheses (borrowing from evolutionary algorithm terminology) as it is a resampling of the "fittest" hypotheses. But yeah not sure those are much better.
Yeah both are great! I am a bit biased towards offspring because resampled sounds a bit more generic, we've used the term resampling in many other places. I'll go with offspring for now but we can change it later in the actual PR.
Wow, great write-up and an interesting read. It looks like there are quite a few ideas in here to try out, and I'm looking forward to seeing how it pans out.
I don't have much detailed feedback since my knowledge of `EvidenceLM`'s details is incomplete. I'll keep digesting it, but I'm approving so as not to hold up its merge.
Great, thanks @scottcanoe! And thanks @nielsleadholm and @vkakerbeck for the thorough and quick review.
@vkakerbeck @nielsleadholm @tristanls @codeallthethingz @scottcanoe @hlee9212 I'd like to motion for final comment period. All maintainers please state your final comments/vote. I will take a thumbs up on this comment as a vote for "merge", and thumbs down as a vote for "No merge". However feel free to also write comments to explain or provide additional feedback if needed.
LGTM
LGTM.
I added some comments. Voting to merge.
Why:
* **Faster**: we don't have to wait for high unbounded evidence to decay enough to realize that a new hypothesis is more likely. We also may not need to worry about initializing new hypotheses with mean evidence, giving them a fighting chance against other old hypotheses. Average slope is more fair in this sense.
* **Accurate resampling**: If we sample new hypotheses close to the hypotheses with high total accumulated evidence (e.g., particle filter), we could be sampling from the incorrect hypotheses (if we had just switched objects). If we sample close to the hypotheses with high evidence slope, we may converge faster.
* **Practical**: The total evidence can still be unbounded; it doesn't matter because we only consider the slope. This metric does not care about how much evidence we've accumulated already. In other words, a hypothesis with high evidence can be removed if it hasn't accumulated evidence in a while, while a consistently growing hypothesis is less likely to be removed even if it was just added.
Yes, good point. Considering the average evidence slope over the last S steps only will allow us to forget about objects that disappear in favor of other objects with rapidly rising evidence. It doesn't matter if the old object has quickly accumulated evidence in the past (i.e., before S steps). But that's another parameter to tune...
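For example, the windowed slope could look something like this (illustrative sketch):

import numpy as np

def average_evidence_slope(evidence_history, window):
    """Per-hypothesis average evidence slope over the last `window` steps.
    `evidence_history` has shape (steps, num_hypotheses)."""
    recent = evidence_history[-(window + 1):]
    return np.diff(recent, axis=0).mean(axis=0)

# Hypothesis 0 accumulated a lot early but has stalled; hypothesis 1 keeps rising.
history = np.array([[0.0, 0.0], [5.0, 0.1], [9.0, 0.3], [9.1, 0.9], [9.2, 1.6]])
print(average_evidence_slope(history, window=2))  # [0.1, 0.65] -> hypothesis 1 wins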
rfcs/0000_hypotheses_resampling.md
## The Resampling Procedure

1) Calculate the needed hypotheses counts to be sampled based on the defined parameters as shown [here](#The-Resampling-Count-Calculation).
2) Select `needed_old_sampled_hyp` from the existing hypotheses based on highest evidence slope. We will keep these old hypotheses and remove the rest.
nitpick: All of these variables have a `needed_` prefix and `_hyp` postfix, which makes them seem redundant as they don't highlight any differences.
suggestion: `old_sampled`, `new_informed`, `new_reinforced`, since the method is going to be `calculate_new_hypotheses_counts()`, providing the context that these are all hypothesis counts.
The whole function would read as follows, which I regard as easier to read:
# a multiplier for the needed number of hypotheses at this new step
hypotheses_count_ratio = 1
# 0 means all sampled from old, 1 means all sampled from new
hypotheses_old_to_new_ratio = 0.5
# 0 means all sampled from informed hypotheses, 1 means all sampled from reinforced hypotheses
hypotheses_informed_to_reinforced_ratio = 0.5
def calculate_new_hypotheses_counts(
    current, informed
):
    """
    This is a proposed function for calculating the needed count of each type of
    hypotheses ("informed", "reinforced", "old").

    Inputs:
    - `current`: The current number of hypotheses
    - `informed`: The total number of informed hypotheses
      sampled based on the current pose observation. Assuming there are x points
      in the object graph, these can be 2x if the PC is defined or 8x if PC is undefined.

    Outputs:
    - `old_sampled`: The number of existing hypotheses to be kept from the
      previous step. Will not exceed `current`.
    - `new_informed`: The number of informed hypotheses to be sampled from
      the pool of informed hypotheses. Will not exceed `informed`.
    - `new_reinforced`: The number of needed reinforced hypotheses. Can technically
      be an infinite amount.
    """
    # calculate the total number of hypotheses needed
    needed = current * hypotheses_count_ratio

    # calculate how many old and new hypotheses needed
    old_sampled, new_sampled = (
        needed * (1 - hypotheses_old_to_new_ratio),
        needed * hypotheses_old_to_new_ratio,
    )

    # needed old hypotheses should not exceed the existing hypotheses
    # if trying to sample more hypotheses, set the available count as ceiling
    if old_sampled > current:
        old_sampled = current
        new_sampled = needed - current

    # calculate how many informed and reinforced hypotheses needed
    new_informed, new_reinforced = (
        new_sampled * (1 - hypotheses_informed_to_reinforced_ratio),
        new_sampled * hypotheses_informed_to_reinforced_ratio,
    )

    # needed informed hypotheses should not exceed the available informed hypotheses
    # if trying to sample more hypotheses, set the available count as ceiling
    if new_informed > informed:
        new_informed = informed
        new_reinforced = new_sampled - informed

    return (
        int(old_sampled),
        int(new_informed),
        int(new_reinforced),
    )
calculate_new_hypotheses_counts(
    current=100, informed=100
)
Great, I've added this to the RFC. I've also changed `old_sampled` to `old_maintained` since it's not really stochastically sampled, as Niels has pointed out in another discussion.
| Parameter | Description | Range |
| ----------|-------------|-------|
| **hypotheses_count_ratio** | A multiplier for the needed number of hypotheses at this new step. `1` means maintain the same number of hypotheses. | [0, inf) |
| **hypotheses_old_to_new_ratio** | How many old to new hypotheses to be added. `0` means all old, `1` means all new | [0, 1] |
issue: I find the use of `_ratio` in `hypothesis_old_to_new_ratio` (and this table) confusing. I would expect a ratio to not be bound to [0, 1]. When I think of a ratio of old to new, I think of phrases like 2 old to 1 new (2:1), which would be 2 numerically. Maybe this could be called `hypothesis_old_to_new_range|mix|factor`?
I see your point here. Yes, the actual parameter is not a ratio, but instead it allows us to set the ratio (e.g., 0.5 would give us 1:1). I like `factor`. We can also call it `weight`, as in the weight of old to new in the hypotheses mix.
rfcs/0000_hypotheses_resampling.md
| Parameter | Description | Range |
| ----------|-------------|-------|
| **hypotheses_count_ratio** | A multiplier for the needed number of hypotheses at this new step. `1` means maintain the same number of hypotheses. | [0, inf) |
suggestion: Rename to `hypothesis_count_multiplier` as per the description.
rfcs/0000_hypotheses_resampling.md
needed_old_sampled_hyp = curr_hypotheses_count
needed_new_sampled_hyp = needed_hypotheses_count - curr_hypotheses_count

# calculate how many informed and re-enforced hypotheses needed
typo: Should be `reinforced`.
rfcs/0000_hypotheses_resampling.md
## High Level Code Changes

The needed modification will only change the `EvidenceLM` class. More specifically, we would modify either the `_update_evidence` function
typo: There is no `EvidenceLM` class. I think you mean `EvidenceGraphLM`?
This RFC is for resampling hypotheses inspired by particle filters.
View here: https://github.com/ramyamounir/tbp.monty/blob/rfc_hypotheses_resampling/rfcs/0000_hypotheses_resampling.md