Updated review of WebNN API #933

dontcallmedom · 2024-02-15T15:19:24Z

I'm requesting an updated TAG review of WebNN API - previous TAG review: #771

Since the initial Candidate Recommendation Snapshot and the previous TAG review, the Working Group has gathered further implementation experience and added new operations and data types needed for well-known transformers webmachinelearning/webnn#375. In addition, the group has removed selected features informed by this implementation experience: higher-level operations that can be expressed in terms of lower-level primitives in a performant manner, and support for synchronous execution. The group has also updated the specification to use modern authoring conventions to improve interoperability and precision of normative definitions and is developing a new feature, a webmachinelearning/webnn#482, to improve performance and interoperability between the WebNN, WebGPU APIs and purpose-built hardware for ML.

The removal of support for synchronous execution is in-line with TAG's guidance (removal discussed in #531 and moving toward JSPI that is coming finally.

Explainer¹ (minimally containing user needs and example code): https://github.com/webmachinelearning/webnn/blob/main/explainer.md
Specification URL: https://www.w3.org/TR/webnn/
Tests: https://github.com/web-platform-tests/wpt/tree/master/webnn
User research: N/A
Security and Privacy self-review²: Self-Review Questionnaire: Security and Privacy webmachinelearning/webnn#119
GitHub repo (if you prefer feedback filed there): https://github.com/webmachinelearning/webnn/
Primary contacts (and their relationship to the specification):
- Anssi Kostiainen, @anssiko, Intel (chair)
- Dominique Hazael-Massieux, @dontcallmedom, W3C (staff contact)
- Ningxin Hu, @huningxin, Intel (editor)
- Dwayne Robinson, @fdwr, Microsoft (editor)
Organization(s)/project(s) driving the specification: W3C Web Machine Learning Working Group
Key pieces of existing multi-stakeholder (e.g. developers, implementers, civil society) support, review or discussion of this specification:
- Chromium comments: https://chromestatus.com/feature/5738583487938560 (in development)
- Mozilla comments: N/A
- WebKit comments: N/A (but Apple recently joined the WG)

Further details:

I have reviewed the TAG's Web Platform Design Principles
Relevant time constraints or deadlines: expecting to republish as Candidate Recommendation Snapshot in Q1 2024
The group where the work on this specification is currently being done: W3C Web Machine Learning Working Group
Major unresolved issues with or opposition to this specification:
This work is being funded by:

You should also know that...

[please tell us anything you think is relevant to this review]

We'd prefer the TAG provide feedback as open issues in our GitHub repo for each point of feedback

anssiko · 2024-02-29T08:24:45Z

We've issued a CfC to advance with the CR Snapshot publication mid-March noting in this CfC the TAG delta review is currently in flight. We expect this issue to be looked at in the context of the transition request.

As outlined in this issue, your earlier feedback (removal of sync APIs) has been addressed. The rest of the changes since your last review are evolutionary informed by implementation experience. Specifically, we are not expecting you to do another "full" review.

If the group doesn't hear any concerns from you it plans to proceed with the publication. Thank you for your review comments (#771 (comment)) that motivated the removal of the sync APIs.

matatk · 2024-04-03T07:57:38Z

Thanks @anssiko, @dontcallmedom for the review request. We were wondering, could you clarify the changes around transformers? We note you've added new data types and operations in support of them (is there a list?) - did you also add/remove any transformers?

dontcallmedom · 2024-04-03T08:02:41Z

the list of operators added (and the removal of a redundant one) is in webmachinelearning/webnn#478 (comment) based on the detailed analysis made in webmachinelearning/webnn#375 (comment)

anssiko · 2024-04-03T08:32:43Z

@matatk you may also find the updated use cases for transformers webmachinelearning/webnn#507 helpful -- these use cases motivated the new ops discussed in the above-mentioned issue, also linked from the SOTD.

To provide further context on the removal: one op (squeeze) was removed from the initial list of considered transformer ops because it was found out it can be expressed in terms of an existing lower-level op (reshape) in a performant manner. The emulation path for squeeze is presented informatively in the specification.

Please let us know if you have any further questions.

anssiko · 2024-04-25T10:43:16Z

For full disclosure and to close the loop on this review:

A new CR Snapshot (history) was published recently. Thank you for your questions and reviews (plural). We've already received two rounds of reviews from the TAG given we've hit the CRS milestone twice for this spec and appreciate your insights and persistence in working with us as we further evolve this specification. We look forward to another delta review with you as appropriate.

If you have further review comments now or at any time do not hesitate to reach out to our group. We will consider all suggestions regardless of the spec milestone we're targeting. We're currently iterating on CRDs and plan to publish a new CRS approximately every 6-12 months.

matatk · 2024-04-29T18:37:59Z

Hi @anssiko. Thank you for providing the context and info on recent changes, and for the publishing and cadence info. We are still looking into a few things on this review (noting that the 2024-04-29 version is now the current one, as you mentioned). We'll reply on this thread with any additional thoughts.

matatk · 2024-05-23T10:49:29Z

We discussed WebNN earlier this week. We're generally happy with the way this is going. However, in previous discussions on this in the TAG, @cynthia expressed a concern regarding the threading approach - that it's possible that an intensive model running on the GPU could disrupt normal site/content rendering, and that would manifest as things like delays in requestAnimationFrame(). Is this something you have considered?

RafaelCintron · 2024-05-26T23:12:51Z

@matatk and @cynthia with Chromium on Windows, the WebNN context runs on a separate command queue from the Chromium compositor. Depending on the device, the ML work may run on a separate chip than the one which performs 3D work. Even when it runs on the same chip, the ML work is multi-tasked with other work on the system.

As with other web platform features (WebGL, WebGPU, 2D Canvas, CSS blurs, etc) overtaxing the system will eventually affect requestAnimationFrame. Web developers need to be responsible with the content they build.

torgo · 2024-06-10T09:25:54Z

Hi @RafaelCintron - thanks for this detailed response. We're just discussing in our TAG breakout today. Can we just clarify 2 points:

You say "Chromium on Windows" but does this equally apply to other platforms - particularly mobile platforms? Is there implementation guidance pertaining to this in the spec?
We agree that the performance issues we've highlighted may equally apply to other web technologies such as WebGL as you've pointed out. Since we're building a new technology into the web, we have an opportunity to do something to improve the status quo.The WebGPU spec contains specific language in reference to denial of service attacks which seems related to our concerns. Would it be appropriate for WebNN spec to contain similar platform requirements, or at least point to this part of the WebGPU spec?

reillyeon · 2024-06-10T21:44:32Z

@torgo, a challenge here is that WebNN supports more than just GPU compute. What @RafaelCintron mentioned makes sense as a concrete mitigation when the MLContext is configured to prefer GPU compute and we need to coordinate with the browser's WebGPU engine anyways for interop purposes (e.g. passing buffers between WebNN graphs and WebGPU shaders). We have the most implementation experience with that scenario when using DirectML on Windows but are actively prototyping with other frameworks such as Core ML on macOS.

When using the CPU or a dedicated ML accelerator the types of potential resource contention and their mitigations are different. I think a general statement similar to WebGPU's reference to denial of service attacks makes sense to add to WebNN as well, with the understanding that exactly how the mitigations work will be implementation- and configuration-dependent. Implementations should use whatever mechanisms are available from the platform (such as the watchdogs mentioned by WebGPU) to prevent sites from using an unfair amount of system resources but in the end these are shared resources and the use of any compute API will affect overall performance on a fully-loaded system.

cynthia · 2024-06-12T01:30:42Z

Having some guidance in non-normative text, specifically around the different DoS vectors and mitigations would be helpful.

torgo · 2024-06-26T21:19:45Z

Hi - thanks again for bringing this to us. We appreciate that you've been responsive to our feedback. We still have some concerns, but considering the current status of the work, we are planning to close the current review with a 'satisfied with concerns' label.

Our main concern is: has this API considered the full range of hardware that it might need to run on? We see this running on CPUs without neural processing extensions, GPUs without extensions, CPUs with extensions, GPUs with extensions, and dedicated ML hardware. What steps have you taken to ensure that this runs across all of these targets, considering the range of hardware that exists and might exist?

Our second and related concern is about multi-implementer support. If this is going to be ubiquitous as an approach to "do NN on the web" then it really needs to be implemented across different platforms and different hardware.

We encourage you to consider these issues as the spec and technology continues to evolve.

This is to address feedback from the TAG review: w3ctag/design-reviews#933

This is to address feedback from the TAG review: w3ctag/design-reviews#933 Co-authored-by: Dwayne Robinson <[email protected]>

anssiko · 2024-10-24T14:00:21Z

We've made updates to the specification that we believe address the remaining concerns you had, namely:

Added resource contention considerations, and noted denial of service considerations similar to WebGPU apply: Add resource contention considerations webmachinelearning/webnn#765
To demonstrate this API considers the full range of hardware and has multi-implementer support, we've explicitly linked our Implementation Status tracker to the specification header. This tracker presents implementation status across multiple OSes and three backends that abstract out CPU, GPU and NPU device types: Add Implementation Status to metadata webmachinelearning/webnn#769

Please let us know if there are any remaining concerns. Thank you!

dontcallmedom added the Progress: untriaged label Feb 15, 2024

anssiko mentioned this issue Feb 15, 2024

Wide review tracker webmachinelearning/webnn#239

Closed

25 tasks

torgo added Priority: urgent and removed Progress: untriaged labels Mar 21, 2024

torgo self-assigned this Mar 21, 2024

maxpassion self-assigned this Mar 21, 2024

torgo assigned matatk Mar 21, 2024

torgo added Review type: horizontal review A review request from a WG seeking horizontal review to transition a document along the REC track Review type: small delta Topic: Machine Learning Not AI Mode: breakout Work done during a time-limited breakout session labels Mar 21, 2024

torgo added this to the 2024-03-25-week milestone Mar 21, 2024

plinss modified the milestones: 2024-03-25-week, 2024-03-25-week:a Mar 25, 2024

matatk added the Progress: pending external feedback The TAG is waiting on response to comments/questions asked by the TAG during the review label Apr 3, 2024

plinss removed this from the 2024-03-25-week:a milestone Apr 15, 2024

plinss added Progress: in progress and removed Progress: pending external feedback The TAG is waiting on response to comments/questions asked by the TAG during the review labels Apr 15, 2024

plinss added this to the 2024-04-15-week:a milestone Apr 15, 2024

torgo modified the milestones: 2024-04-15-week:a, 2024-04-22-week:a Apr 21, 2024

plinss modified the milestones: 2024-04-22-week:a, 2024-04-29-week:a Apr 29, 2024

rhiaro added the Venue: WebML CG label May 6, 2024

rhiaro added Focus: Security (pending) Focus: Privacy (pending) labels May 6, 2024

plinss modified the milestones: 2024-04-29-week:a, 2024-05-13-week:a May 13, 2024

torgo modified the milestones: 2024-05-13-week:a, 2024-05-20-week:a May 19, 2024

plinss modified the milestones: 2024-05-20-week:a, 2024-05-27-week:e May 27, 2024

plinss modified the milestones: 2024-05-27-week:e, 2024-06-10-week:a Jun 10, 2024

torgo modified the milestones: 2024-06-10-week:a, 2024-06-17-week:c Jun 16, 2024

plinss modified the milestones: 2024-06-17-week:c, 2024-06-24-week:e Jun 24, 2024

torgo closed this as completed Jun 26, 2024

torgo added Progress: review complete Resolution: satisfied with concerns The TAG is satisfied with this work overall but requires changes and removed Progress: in progress labels Jun 26, 2024

anssiko mentioned this issue Aug 23, 2024

WebML WG - TPAC 2024 agenda webmachinelearning/meetings#25

Open

anssiko added a commit to webmachinelearning/webnn that referenced this issue Oct 10, 2024

Add resource contention considerations

8d71e18

This is to address feedback from the TAG review: w3ctag/design-reviews#933

anssiko mentioned this issue Oct 10, 2024

Add resource contention considerations webmachinelearning/webnn#765

Merged

anssiko mentioned this issue Oct 21, 2024

Add Implementation Status to metadata webmachinelearning/webnn#769

Merged

anssiko added a commit to webmachinelearning/webnn that referenced this issue Oct 24, 2024

Add resource contention considerations (#765)

2cc59c5

This is to address feedback from the TAG review: w3ctag/design-reviews#933 Co-authored-by: Dwayne Robinson <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Updated review of WebNN API #933

Updated review of WebNN API #933

dontcallmedom commented Feb 15, 2024 •

edited

Loading

anssiko commented Feb 29, 2024

matatk commented Apr 3, 2024

dontcallmedom commented Apr 3, 2024

anssiko commented Apr 3, 2024

anssiko commented Apr 25, 2024

matatk commented Apr 29, 2024

matatk commented May 23, 2024

RafaelCintron commented May 26, 2024

torgo commented Jun 10, 2024

reillyeon commented Jun 10, 2024

cynthia commented Jun 12, 2024

torgo commented Jun 26, 2024

anssiko commented Oct 24, 2024

Updated review of WebNN API #933

Updated review of WebNN API #933

Comments

dontcallmedom commented Feb 15, 2024 • edited Loading

anssiko commented Feb 29, 2024

matatk commented Apr 3, 2024

dontcallmedom commented Apr 3, 2024

anssiko commented Apr 3, 2024

anssiko commented Apr 25, 2024

matatk commented Apr 29, 2024

matatk commented May 23, 2024

RafaelCintron commented May 26, 2024

torgo commented Jun 10, 2024

reillyeon commented Jun 10, 2024

cynthia commented Jun 12, 2024

torgo commented Jun 26, 2024

anssiko commented Oct 24, 2024

dontcallmedom commented Feb 15, 2024 •

edited

Loading