Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tables without deltas don't replenish downstream tables #303

Closed
jhellerstein opened this issue Mar 7, 2013 · 5 comments
Closed

tables without deltas don't replenish downstream tables #303

jhellerstein opened this issue Mar 7, 2013 · 5 comments
Assignees

Comments

@jhellerstein
Copy link
Member

(courtesy @ilikerps)
Given a rule like

derived <+ base

and a delete from derived, we'd expect that base would "replenish" the deleted values from derived. This doesn't currently happen. See code below:

require 'rubygems'
require 'bud'

class Test
  include Bud
  state do
    table :myTable, [:str]
    table :other, [:str]
  end

  bloom do
    other <+ myTable
  end
end

t = Test.new
t.myTable <+ [["Hello!"]]
t.tick
t.tick
puts t.other.entries # prints ["Hello!"]

t.other <- [["Hello!"]]
t.tick
t.tick
puts t.myTable.entries # prints ["Hello!"]
puts t.other.entries   # prints []
@aarondav
Copy link
Contributor

aarondav commented Mar 7, 2013

And additionally, even with a delta, only the delta is added to the downstream table.

@ghost ghost assigned neilconway Mar 7, 2013
@neilconway
Copy link
Member

This might be related to #292 (which, unfortunately, seems like it will take some work to fix properly). I'll take a look...

@neilconway
Copy link
Member

Ugh, there are at least 3 different errors in the rescan/invalidate logic interacting here; similar to the problems underlying #292, but at least some of the problems are distinct.

  1. When a table receives a deletion, we mark it as invalidated; we we also want to (a) mark the table's scanner, if any, for rescan (b) mark any downstream nodes as both invalidated and needing rescan (c) mark any upstream nodes as needing rescan. Note that (b) is because downstream state should be blown away to prep for this table's rescan, whereas (c) is because we want to refill the table that received the deletion (e.g., because there might be another way to derive the deleted tuple). The problem is that the set of other elements to rescan/invalidate is associated with the table's scanner -- but only tables that appear on the RHS of rules have scanners (other would not have a scanner above). So the fix here is probably to move the set of to-rescan/to-invalidate elements to the collection itself.
  2. The code for computing the set of elements to rescan/invalidation upon deletion was just wrong; for each scanner s, the previous coding considered "what elements would need rescan/invalidation were s to need rescan?" This is the wrong hypothetical; the situation is that s has been invalidated, not that it needs rescan (e.g., if a scanner on a table is to be rescanned, we don't also need to rescan upstream nodes; whereas if the table is to be invalidated, we do).
  3. Next, the code neglected to consider the need to rescan nodes connected via <+ rules; it only did <= rules. Or at least, I suspect this is fishy; need to fix the first two issues first, which are problems regardless.

neilconway added a commit that referenced this issue Mar 9, 2013
The rescan/invalidation logic runs into several different problems when
presented with deletions to a persistent table that is downstream of other
persistent tables -- see #303. This commit fixes one of those issues: when
computing the invalidation scheme, we should consider what needs to be done when
each table is invalidated, NOT when each table is marked for rescan.

Other related issues will be addressed in subsequent commits.
neilconway added a commit that referenced this issue Mar 10, 2013
See #303. This commit fixes the case where we have a deletion on a table that
does not appear in the RHS of any rules. Since the table does not get a scanner,
the previous coding neglected to compute an invalidation/rescan scheme for this
collection.

A possible fix would be to move the necessary invalidation/rescan state from the
scanner to the collection itself. However, this is complicated -- a single
collection might have multiple scanners with different rescan requirements, for
example. For now, it seems simpler (but possibly uglier) to create a separate
list of "orphan scanners" -- i.e., scanners for collections that appear in at
least one rule LHS but no RHSs. We basically consider these scanners like normal
when computing the rescan/invalidation scheme -- but naturally we don't need to
consider them during the fixpoint computation.
@aarondav
Copy link
Contributor

Thanks for the fixes! Will this cause stdio <~ to behave more intuitively as well? It seemed to have the same problem.

@neilconway
Copy link
Member

Nope; that issue is #292 (#297 is also related), which needs fixing separately. That's on my TODO list; hopefully I'll get a chance to fix it today or tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants