-
Notifications
You must be signed in to change notification settings - Fork 87
Description
From: w3ctag/design-reviews#1093 (comment)
We see that hybrid or remote inference are design goals For example, user agents or the underlying platform could decide to process a prompt on a server based on the prompt's complexity, device capabilities, resource consumption, or environmental conditions.
However, the API only incorporates the complexity of downloading local models. We'd like you to pay more attention to the complexity of running inference in the cloud. One big example is that cloud models are often subscription services, and so the user needs to be in charge of how much of their quota the site gets to use. The location of inference (local, edge, cloud) could also have a significant impact on response quality and privacy, and sites may need to be able to respond to those differences. Please look into what API changes will be needed to fully support hybrid and cloud-based execution.