Add quota exceeded error details and input usage APIs #43

Merged
merged 7 commits into from
Mar 7, 2025
67 changes: 67 additions & 0 deletions README.md
@@ -175,6 +175,73 @@ console.log(summary); // will be in Chinese

If the `outputLanguage` is not supplied, the default behavior is to produce the output in "the same language as the input". For the multilingual input case, what this means is left implementation-defined for now, and implementations should err on the side of rejecting with a `"NotSupportedError"` `DOMException`. For this reason, it's strongly recommended that developers supply `outputLanguage`.

### Too-large inputs

It's possible that the inputs given for summarizing and rewriting might be too large for the underlying machine learning model to handle. The same can even be the case for strings that are usually smaller, such as the writing task for the writer API, or the context given to all APIs.

Whenever an API call fails because its input is too large, it rejects with a `QuotaExceededError`. This is a proposed new exception type that subclasses `DOMException` and replaces the web platform's existing `"QuotaExceededError"` `DOMException`; see [whatwg/webidl#1465](https://github.com/whatwg/webidl/pull/1465) for the proposal. For our purposes, the important part is that it has the following properties:

* `requested`: how many tokens the input consists of
* `quota`: how many tokens were available (which will be less than `requested`)

("[Tokens](https://arxiv.org/abs/2404.08335)" are the way that current language models process their input, and the exact mapping of strings to tokens is implementation-defined. We believe this API is relatively future-proof, since even if the technology moves away from current tokenization strategies, there will still be some notion of requested and quota we could use, such as normal JavaScript string length.)

This allows detecting failures due to overlarge inputs and giving clear feedback to the user, with code such as the following:

```js
const summarizer = await ai.summarizer.create();

try {
  console.log(await summarizer.summarize(potentiallyLargeInput));
} catch (e) {
  if (e.name === "QuotaExceededError") {
    console.error(`Input too large! You tried to summarize ${e.requested} tokens, but only ${e.quota} were available.`);

    // Or maybe:
    console.error(`Input too large! It's ${e.requested / e.quota}x as large as the maximum possible input size.`);
  }
}
```
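
The error-to-message step can also be factored out so that unrelated errors are not silently swallowed. The following is a minimal sketch, not part of the proposed API: `describeQuotaError()` and `summarizeOrExplain()` are hypothetical helper names, and the fallback branch assumes that some implementation might not yet expose `requested` and `quota`.

```js
// Hypothetical helper: returns a user-facing message for a QuotaExceededError,
// or null if the error is something else.
function describeQuotaError(e) {
  if (e?.name !== "QuotaExceededError") {
    return null;
  }

  const { requested, quota } = e;
  // Assumption: fall back to a generic message if the proposed properties are missing.
  if (typeof requested !== "number" || typeof quota !== "number") {
    return "Input too large.";
  }

  return `Input too large: ${requested} tokens requested, ${quota} available (${requested - quota} over the limit).`;
}

// Hypothetical wrapper: `summarizer` is an object created by ai.summarizer.create().
async function summarizeOrExplain(summarizer, input) {
  try {
    return await summarizer.summarize(input);
  } catch (e) {
    const message = describeQuotaError(e);
    if (message === null) {
      throw e; // Not a quota problem, so let the caller handle it.
    }
    console.error(message);
    return null;
  }
}
```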

Note that all of the following methods can reject (or error the relevant stream) with this type of exception:

* `ai.summarizer.create()`, if `sharedContext` is too large;

* `ai.summarizer.summarize()`/`summarizeStreaming()`, if the combination of the creation-time `sharedContext`, the current method call's `input` argument, and the current method call's `context` is too large;

* Similarly for writer creation / writing, and rewriter creation / rewriting.
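
For the streaming variants, the exception shows up while the stream is being read, so the `try`/`catch` needs to wrap the read loop. The following is a minimal sketch under two assumptions: that the stream can be consumed with `for await`, and that each chunk is a new piece of text to append. `readAll()` is a hypothetical helper, not part of the proposed API.

```js
// Hypothetical helper: reads an async-iterable stream of string chunks.
// Resolves with the collected text, or with the quota details if the
// stream errors with a QuotaExceededError.
async function readAll(stream) {
  let text = "";
  try {
    for await (const chunk of stream) {
      text += chunk; // Assumption: chunks are deltas, not cumulative text.
    }
    return { ok: true, text };
  } catch (e) {
    if (e?.name !== "QuotaExceededError") {
      throw e; // Unrelated failure, so rethrow it.
    }
    return { ok: false, requested: e.requested, quota: e.quota };
  }
}

// Possible usage, with `summarizer` created by ai.summarizer.create():
//   const result = await readAll(summarizer.summarizeStreaming(input));
//   if (!result.ok) console.error(`Too large: ${result.requested} > ${result.quota}`);
```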

In some cases, instead of providing errors after the fact, the developer needs to be able to communicate to the user how close they are to the limit. For this, they can use the `inputQuota` property and the `measureInputUsage()` method on the summarizer/writer/rewriter objects:

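```js
const rewriter = await ai.rewriter.create();
meterEl.max = rewriter.inputQuota;

// The listener must be async so that it can await the measurement.
textbox.addEventListener("input", async () => {
  const usage = await rewriter.measureInputUsage(textbox.value);
  meterEl.value = usage;
  // Compare the raw usage: a <meter> element clamps `value` to `max`.
  submitButton.disabled = usage > rewriter.inputQuota;
});

submitButton.addEventListener("click", async () => {
  // rewrite() returns a promise, so await it before logging.
  console.log(await rewriter.rewrite(textbox.value));
});
```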

Developers should be careful not to overuse this API, however, as each call requires a round trip to the language model. For example, the following code is bad, because it performs two round trips with the same input:

```js
// DO NOT DO THIS

const usage = await rewriter.measureInputUsage(input);
if (usage < rewriter.inputQuota) {
  console.log(await rewriter.rewrite(input));
} else {
  console.error(`Input too large!`);
}
```

If you're planning to call `rewrite()` anyway, then using a pattern like the one that opened this section, which catches `QuotaExceededError`s, is more efficient than using `measureInputUsage()` plus a conditional call to `rewrite()`.
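
In the live-meter case, where `measureInputUsage()` is the right tool, one way to limit round trips is to measure only after the user pauses typing. The following is a minimal sketch of that idea. `debounce()` and `attachDebouncedMeter()` are hypothetical helpers, not part of the proposed API, and the 300 ms default delay is an arbitrary assumption.

```js
// Hypothetical helper: delays calling `fn` until `delayMs` has passed
// without another call, then invokes it with the most recent arguments.
function debounce(fn, delayMs) {
  let timer;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), delayMs);
  };
}

// Hypothetical wiring: `rewriter` comes from ai.rewriter.create(); the other
// arguments are the elements used in the meter example earlier in this section.
function attachDebouncedMeter(rewriter, textbox, meterEl, submitButton, delayMs = 300) {
  const update = debounce(async (text) => {
    const usage = await rewriter.measureInputUsage(text);
    // Drop the result if the text changed while the measurement was in flight.
    if (text !== textbox.value) {
      return;
    }
    meterEl.value = usage;
    submitButton.disabled = usage > rewriter.inputQuota;
  }, delayMs);

  textbox.addEventListener("input", () => update(textbox.value));
}
```

With this wiring, a burst of keystrokes results in a single measurement once typing pauses, at the cost of the meter lagging slightly behind the text.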

### Testing available options before creation

All APIs are customizable during their `create()` calls, with various options. In addition to the language options above, the others are given in more detail in [the spec](https://webmachinelearning.github.io/writing-assistance-apis/).