RFC Request: Dataset Previews #16

Open
b5 opened this issue Aug 17, 2018 · 0 comments

b5 commented Aug 17, 2018

#14 (comment) articulates our need for "dataset previews":

"dataset preview" that is the head of a dataset, plus a sampling of the body

We want datasets to be as big as they need to be: a 1TB dataset shouldn't be anything more than a storage-space problem. But we also need fast, shorthand ways of describing datasets that move around easily.

Dataset Previews should aim to produce a size-bounded version of a dataset. The hard part is getting sampling right. A first version could be just the dataset head plus the leading entries of the body, added until the body reaches a set size limit. That leaves open the case where no preview is available at all, because even a single row exceeds that size limit.
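As a rough illustration of that starting point, here is a minimal Python sketch. Everything named in it is an assumption made for the example, not part of any existing API: the `build_preview` function, the `max_body_bytes` parameter, the shape of the returned dict, and the choice to measure row size as UTF-8 JSON bytes. It also reads "even one row exceeds the limit" as "the first row alone does not fit", which is one possible interpretation.

```python
import json
from typing import Any, Iterable, Optional


def build_preview(head: dict, rows: Iterable[Any], max_body_bytes: int) -> Optional[dict]:
    """Hypothetical helper: dataset head plus leading rows, bounded by size.

    Rows are taken in order until adding the next one would push the
    serialized body past max_body_bytes. Returns None when the very first
    row is already over the limit, meaning no preview is available.
    """
    body = []
    used = 0
    for row in rows:
        # Assumption: size is measured as the row's JSON encoding in UTF-8.
        size = len(json.dumps(row).encode("utf-8"))
        if used + size > max_body_bytes:
            if not body:
                return None  # even a single row exceeds the bound
            break
        body.append(row)
        used += size
    return {"head": head, "body": body, "body_bytes": used}


if __name__ == "__main__":
    head = {"title": "example dataset"}
    rows = ([i, "x"] for i in range(1_000_000))  # consumed lazily
    preview = build_preview(head, rows, max_body_bytes=64)
    print(len(preview["body"]), preview["body_bytes"])
```

Because `rows` is consumed lazily and the loop stops at the bound, the cost of building a preview depends on the size limit rather than on the size of the full dataset. This sketch takes only a prefix of the body; real sampling, the hard part noted above, would replace the simple in-order loop.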

This RFC should also try to articulate all of the size variations of a dataset:

  • dataset reference
  • dataset preview
  • full dataset
@b5 changed the title from "RFC Req: Dataset Previews" to "RFC Request: Dataset Previews" on Aug 17, 2018
@b5 added the backlog label on Feb 18, 2019