Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IndexedDB putAll #536

Closed
nums11 opened this issue Jul 14, 2020 · 19 comments
Closed

IndexedDB putAll #536

nums11 opened this issue Jul 14, 2020 · 19 comments
Assignees
Labels
Progress: propose closing we think it should be closed but are waiting on some feedback or consensus Review type: CG early review An early review of general direction from a Community Group Venue: Web Apps WG W3C Web Applications Working Group

Comments

@nums11
Copy link

nums11 commented Jul 14, 2020

Saluton TAG!

I'm requesting a TAG review of IndexedDB putAll.

Add a new endpoint to the IndexedDB API that will provide developers with a way to bulk-insert many records into an IndexedDB object store.

Further details:

  • [ ✓ ] I have reviewed the TAG's API Design Principles
  • The group where the incubation/design work on this is being done (or is intended to be done in the future): Google
  • The group where standardization of this work is intended to be done ("unknown" if not known): unknown
  • Existing major pieces of multi-stakeholder review or discussion of this design: Feature Request: Allow batch adding records (from an Array) w3c/IndexedDB#69
  • Major unresolved issues with or opposition to this design: N/A
  • This work is being funded by: N/A

You should also know that...

N/A

We'd prefer the TAG provide feedback as (please delete all but the desired option):

☂️ open a single issue in our GitHub repo for the entire review

@nums11 nums11 added Progress: untriaged Review type: CG early review An early review of general direction from a Community Group labels Jul 14, 2020
@kenchris kenchris self-assigned this Jul 15, 2020
@torgo torgo self-assigned this Jul 15, 2020
@torgo torgo added this to the 2020-07-20-week milestone Jul 15, 2020
@kenchris
Copy link

Why is there All in the name. You are just doing bulk insert, that might not be the same as "all" from developer point of view.

@dmurph
Copy link

dmurph commented Jul 15, 2020

I believe it was chosen to match getAll, and it's functionality seems to match other instances of putAll in languages (all java collections implementation of putAll, "insert all" for sql, etc).

What do you think developers will think putAll means, if not to put all of the arguments into the database? I'm not aware of another interpretation, but it could definitely exist :)

@cynthia
Copy link
Member

cynthia commented Jul 20, 2020

I don't have a strong opinion on what is better, but aside from putAll(), putMultiple() and putBatch() also seem to exist as potential options in different contexts. Explicitly indicating that the function is intended for multiple entries does make perfect sense though.

Question 1) The explainer suggests that a later key-value pair clobbering an early key-value pair is fine; but this feels like it might obstruct debugging for cases where this happens due to human/code errors. (Imagine debugging something where the 3rd entry is clobbered by the 499844th entry out of 1M inserts, and you are trying to figure out why) Would this make sense to throw? (or optionally throw?)

Question 2) The examples seem to suggest this is a synchronous API; given that this may be used to batch insert lots of data that could potentially block - would there be a async path for this feature?

@nums11
Copy link
Author

nums11 commented Jul 20, 2020

Thanks for the input :). Inserting a record with a duplicate key using put overrides the previous record so I assumed that putAll would have the same behavior for consistency.

Sorry for the miscommunication, this API is asynchronous and uses the existing 'onsuccess' and 'onerror' event handling present in IndexedDB. I've updated the explainer to be more clear on this.

@nums11
Copy link
Author

nums11 commented Jul 20, 2020

Additionally, IndexedDB already has a separate API endpoint called add() that throws an error on duplicate key insertion so I tried to stay away from that. In the future considerations section I mentioned an addAll() which could possibly have similar behavior to putAll() except it would throw an error on duplicate insertion like add().

@cynthia
Copy link
Member

cynthia commented Jul 20, 2020

this API is asynchronous and uses the existing 'onsuccess' and 'onerror' event handling present in IndexedDB. I've updated the explainer to be more clear on this.

Thanks for following up on this. There is an inherent risk of inconsistency in the API surface, but could a non-event based asynchronous design be considered here? The event based design I believe that IDB has predates idiomatic async patterns in JS, so it made sense back then but it feels a bit strange to continue with that pattern for a new-ish API.

In the future considerations section I mentioned an addAll() which could possibly have similar behavior to putAll() except it would throw an error on duplicate insertion like add().

We'll look into this and provide further feedback. Thank you for the info.

@dmurph
Copy link

dmurph commented Jul 20, 2020

this API is asynchronous and uses the existing 'onsuccess' and 'onerror' event handling present in IndexedDB. I've updated the explainer to be more clear on this.

Thanks for following up on this. There is an inherent risk of inconsistency in the API surface, but could a non-event based asynchronous design be considered here? The event based design I believe that IDB has predates idiomatic async patterns in JS, so it made sense back then but it feels a bit strange to continue with that pattern for a new-ish API.

I think this is definitely an option. The main hurdle, in my opinion, is event ordering. If this was a promise, how would we order the execution of this promise vs all of the other events? We have a lot of this defined by spec, and it could get complicated.

@inexorabletash you probably have the best context here. How difficult would it be to make this API return, say, a Promise spec-wise?

I know implementation-wise this would be a little bit rough. We would probably have to still use IDBRequest internally, and have that hold onto the promise resolve, to keep ordering correct with our auto-wrapper.

In the future considerations section I mentioned an addAll() which could possibly have similar behavior to putAll() except it would throw an error on duplicate insertion like add().

We'll look into this and provide further feedback. Thank you for the info.

@inexorabletash
Copy link

There has been some exploration of a new API based on top of Indexed DB that introduces a Promise-based async API rather than event-based async API kv-storage but that incubation has not progressed.

Incorporating Promises into Indexed DB is non-trivial, see for example https://github.com/inexorabletash/indexeddb-promises which describes some of the complexity around event loop / transaction integration. In short, doing it for a single method would introduce confusion for developers (as it would differ from other methods) and as the transaction activation mechanism would need to be redesigned to accommodate it (transaction activation is closely tied to the event loop, not microtasks.)

Practically speaking, Indexed DB is predominantly used indirectly by users who instead select libraries, many of which wrap the usage in promises. It's important to make sure that additions to the API surface can be integrated into libraries - i.e. by following the same transaction model.

@annevk
Copy link
Member

annevk commented Jul 21, 2020

I'm a little worried that the motivation here comes down to a slow binding layer. Is that the case?

Or is it really the primitive of bulk-insert-or-fail? I.e., either all the given records get inserted or none get inserted.

@nums11
Copy link
Author

nums11 commented Jul 21, 2020

To answer your question, the primary motivations are partner requests and customer feedback. We hope for some small performance improvements but those aren't certain and would most likely come from amortized costs of certain operations and this will have to be verified through testing.

It would be such that all the records get inserted or none get inserted.

@kbsspl
Copy link

kbsspl commented Jul 28, 2020

would a putAll accepting options with a returnDuplicatekeys:true which onsuccess return a set of keys which existed prior and were updated be useful.

@dmurph
Copy link

dmurph commented Jul 28, 2020

That's a great question. We actually are interested in the ideas:

  1. putAll returns nothing as a result
  2. putAll returns all keys as a result

The duplicate keys idea could be an add-on to the second option.

What would you expect of the above return values?

@kbsspl
Copy link

kbsspl commented Aug 6, 2020

I would expect 1. to be a complete success with all data either added or updated.
For returning results, both, the updated and failed sets would be helpful.

apologies for the delay in responding.

@dmurph
Copy link

dmurph commented Aug 6, 2020

The API is being designed as all-or-nothing, so if one result fails then the whole set fails. Does this match your expectation? There shouldn't be some-failed-some-updated results right now, basically, as then the whole api call would fail anyways (and return an error event)

It sounds like you would expect the list of keys, which we can do.

@kbsspl
Copy link

kbsspl commented Aug 7, 2020

I find both, all-or-nothing, or partial updates having genuine use-cases, but personally would expect a set of keys with partial commits happening.

additional thoughts -
At the risk of increasing the complexity of the putAll usage early on. I would prefer it to be parametrized.

parameters -
allOrPart = All/Part- the approach to take.
returnKeys = Success/Failure/Both - the sets to return

"All" - In this approach while the data itself is not committed.
For "All", "Success" returns all keys updated successfully (which is not very useful for an "All" approach and maybe ignored). "Failure" returns the keys which failed to update for whatever reason. "Both" returns whatever keys were successful and whatever keys failed, to help mitigate.

"Part" - In this approach, any data which could be successfully updated, gets committed.
For "Part" - "Success" returns only all keys updated successfully, "Failure", only the failed keys, "Both" returns both the success and failed key sets.

Another point that I can think of is that the definition of "Success" itself may need to be clarified.
Should everything that was "updated" irrelevant of an addition or update be labeled a "Success". OR Are only "updates" considered a success.

And what is the definition of an "update"
Is updating data without any difference in the incoming and existing sets -("key_1", {title: "Quarry Memories", author: "Fred"}) updated the ObjectStore which previously had ("key_1", {title: "Quarry Memories", author: "Fred"}) an update ?
vs
updating ("key_1", {title: "Quarry Memories", author: "Fred"}) to ("key_1", {title: "Quarry Memories original value", author: "Fred"}) in the ObjectStore

I am not sure how well thought out the comment is, so please ignore or provide feedback to help me understand and see context if I'm not making sense..

@dmurph
Copy link

dmurph commented Aug 7, 2020

Thanks for the feedback!

I guess the main difficulty here with the functionality that you're mentioning is that it conflicts with the existing patterns of the IndexedDB API. Currently all calls to get or modify data in the database follows an asynchronous pattern where the result of the API call is an IDBRequest object, and then the developer listens to a 'success' or 'error' event on that request object.

We don't have any API calls that can 'partially succeed', they all either succeed totally or fail totally. All requests are associated with a transaction, which again either totally atomically commits, or fails & is aborted (and all changes are not in the database).

Regarding updating as changing vs not changing, this also isn't an existing feature of the database. putAll is trying to model a bulk-inserting of put, so at least initially we want to model what put does. The one restriction we DO provide is put vs add, where add will error if a record already exists at the given key. It wouldn't be too hard for us to also create an addAll as well with our current implementation, not sure about others.

The feature knowing if a record has been 'changed' or not might be valuable, but we haven't received any feedback that people want this. We only have feedback that people want a putAll.

To keep things consistent with the existing API patterns, and to keep this change simple, I think we want to have putAll* be all-or-nothing, and have no extra options. I think we should probably return the array of keys that were committed - this is useful for people who have auto-increment on.

If you have a use cases for the behaviors above, definitely let us know. The only one that might be 'faster' if the API supported it would probably be the constraint for telling the user if the value was updated or not. Allowing some puts to fail is already supported by just using the put API as it already exists.

@kbsspl
Copy link

kbsspl commented Aug 10, 2020

thank you for such a detailed response. I understand the considerations better now.

@cynthia cynthia added the Progress: propose closing we think it should be closed but are waiting on some feedback or consensus label Aug 18, 2020
@plinss
Copy link
Member

plinss commented Aug 19, 2020

Overall we're satisfied with this and ready to close. We just wanted to check if other projects that build on top of IndexedDB, such as PouchDB have been consulted and had an opportunity to provide feedback.

@torgo
Copy link
Member

torgo commented Sep 22, 2020

Discussed and agreed to close at our f2f.

@torgo torgo closed this as completed Sep 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Progress: propose closing we think it should be closed but are waiting on some feedback or consensus Review type: CG early review An early review of general direction from a Community Group Venue: Web Apps WG W3C Web Applications Working Group
Projects
None yet
Development

No branches or pull requests

9 participants