Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add associated functions/methods for exposing/loading filter data in BloomFilter + CuckooFilter #125

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

DJAndries
Copy link

No description provided.

@DJAndries DJAndries changed the title Add associated functions/methods for exposing/loading bitmap data in BloomFilter Add associated functions/methods for exposing/loading filter data in BloomFilter + CuckooFilter Feb 12, 2023
Copy link
Member

@crepererum crepererum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by the interfaces: why is there a "create w/ existing" for the bloomfilter and cuckoofilter, but only the cuckoofilter as a "load" method?

I suggest we streamline the interface a bit and add a dump(&self) -> Vec<u8> method to both which includes the full state (for the cuckoofilter, this also includes n_elements) and one to restore it (maybe load(&mut self, data: &[u8])). This way we could later lift this up into a trait and have this working for multiple data structures.

Self {
bs: FixedBitSet::with_capacity_and_blocks(m, bitmap),
k,
builder: HashIterBuilder::new(m, k, buildhasher),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this check that the iterator has the correct number of elements for the given parameters?

Comment on lines +304 to +311
for (i, block) in table_succinct_blocks.into_iter().enumerate() {
assert!(
i < filter.table.block_len(),
"existing input table block length must not exceed filter table block length"
);
filter.table.set_block(i, block);
}
filter.n_elements = n_elements;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this just use filter.load_table(...) instead of a duplicate code block?

self.clear();
for (i, value) in table.into_iter().enumerate() {
let i = i as u64;
assert!(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a should_panic test that triggers this assertion.

/// and existing element count.
pub fn load_table<I: IntoIterator<Item = u64>>(&mut self, table: I, n_elements: usize) {
self.clear();
for (i, value) in table.into_iter().enumerate() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably validate that the iterator has the correct length because this depends on the filter parameters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants