<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<title>Client-Edge-Cloud coordination Use Cases and Requirements</title>
<style>
.two-cols {
display: grid;
grid-template-columns: 1fr 1fr;
}
table {
border-collapse:collapse;
}
table,th, td {
border: 1px solid #666;
}
td {
padding:2px 15px;
}
</style>
<script async class="remove" src="https://www.w3.org/Tools/respec/respec-w3c"></script>
<script class="remove">
var respecConfig = {
specStatus: "ED",
copyrightStart: "2021",
edDraftURI: "https://w3c.github.io/edge-computing-web-exploration/",
github: "https://github.com/w3c/edge-computing-web-exploration",
latestVersion: null,
noRecTrack: true,
editors: [{
name: "Dapeng(Max) Liu",
companyURL: "http://www.alibabagroup.com/en/global/home",
company: "Alibaba Group"
},
{
name: "Michael McCool",
companyURL: "https://www.intel.com/",
company:"Intel"
},
{
name: "Song Xu",
companyURL: "https://www.chinamobileltd.com",
company:"China Mobile"
}
],
group: "web-networks"
};
</script>
</head>
<body>
<section id='abstract'>
<p>
This document introduces the use cases and requirements of a client-edge-cloud coordination mechanism and its standardization.
</p>
</section>
<section id='sotd'>
<p>
This is still a work in progress. The proposal is being incubated in the <a href="https://github.com/w3c/web-networks/">W3C Web & Networks Interest Group</a>.
</p>
</section>
<section>
<h2>Introduction</h2>
<p>With the rapid development of cloud computing technology, the centralized cloud is evolving towards a distributed "edge cloud" that allows developers to deploy their code as FaaS
close to the user's location. One such service is Alibaba Cloud's <a href="https://www.alibabacloud.com/help/en/dynamic-route-for-cdn/latest/er-overview">EdgeRoutine</a>
service.</p>
<p>
With the rapid adoption of new technologies such as machine learning and IoT in client-side applications, clients may also need to perform compute-intensive work. For example, machine
learning inference can be done on the client side; the Taobao mobile app leverages client-side inference for user face detection, among other features.
W3C is also working on the <a href="https://www.w3.org/groups/wg/webmachinelearning">WebNN</a> standard, which allows client-side developers to leverage the machine learning acceleration
hardware that resides in client devices.
</p>
<p>
To improve client-side application performance, there is a trend to offload compute-intensive work to the edge cloud, as in cloud apps and cloud gaming.
However, current approaches could be further optimized by a mechanism for Client-Edge-Cloud coordination.
This document discusses the use cases and requirements of such a coordination mechanism and its standardization.
</p>
</section>
<section id="terminology">
<h2>Terminology</h2>
<p>This document uses the following terms with the specific meanings defined here.
Where possible these meanings are consistent with common usage.
However, note that common usage of some of these terms has
multiple, ambiguous, or inconsistent meanings;
the definitions here take precedence.
When terms are used with the specific meanings defined here they are
capitalized.
When possible, references to existing standards defining these terms are given.
</p>
<dl>
<!-- Example, consistent with ReSpec
<dt><dfn data-lt="term" class="lint-ignore">Term</dfn></dt>
<dd>Defn.</dd>
-->
<dt><dfn class="lint-ignore">Edge</dfn></dt>
<dd>
The periphery of a network.
</dd>
<dt><dfn class="lint-ignore">FaaS</dfn></dt>
<dd>Function as a Service.
A service provided by a Computing Resource that can execute a stateless computation.
</dd>
<dt><dfn class="lint-ignore">Cloud</dfn></dt>
<dd>
A set of managed services that are
designed to be interchangeable, scalable, and location-independent.
</dd>
<dt><dfn class="lint-ignore">Cloud Resources</dfn></dt>
<dd>
A set of managed Computing Resources available in the Cloud.
</dd>
<dt><dfn class="lint-ignore">Edge Cloud</dfn></dt>
<dd>
A set of Edge Resources managed as an extension of the Cloud.
Such resources use similar abstractions and management APIs as a cloud
but typically will add mechanisms to manage location and latency,
and will typically be deployed in a less centralized, more location and
latency-sensitive manner than a typical Cloud.
</dd>
<dt><dfn class="lint-ignore">Edge Resource</dfn></dt>
<dd>
A Computing Resource located on the Edge, that is, near the periphery of a network.
Note: this definition does not
necessarily include "endpoints" such as IoT sensors.
It refers specifically to computers
that can make Computing Resources available to others on the network.
</dd>
<dt><dfn class="lint-ignore">Migration</dfn></dt>
<dd>
The ability to move a workload from one Computing Resource to another.
See also Live Migration and Application-Directed Migration, which are subclasses.
</dd>
<dt><dfn class="lint-ignore">Live Migration</dfn></dt>
<dd>
The ability to transparently move a running workload from one Computing Resource to another.
This includes transparent migration of state and updates to references, so that the
application that invoked the resource does not need to manage the transition and
does not need to be made aware of it. Such migration needs to be implemented with
minimum impact on quality-of-service factors such as latency of responses.
See also the more general definition of Migration.
</dd>
<dt><dfn class="lint-ignore">Application-Directed Migration</dfn></dt>
<dd>
The ability to move a running workload from one Computing Resource to another under
the control of an application. In this version of Migration, the controlling
application or the Workload itself needs to manage the orderly transfer of state
from one Computing Resource to another, may also have to explicitly update
references, and will have to explicitly manage quality-of-service factors such
as latency of response.
See also the more general definition of Migration.
</dd>
<dt><dfn class="lint-ignore">Computing Resource</dfn></dt>
<dd>
Any computer which can be used to execute a Workload, and may
include Edge, Cloud, or Client computers.
</dd>
<dt><dfn class="lint-ignore">Client Computer</dfn></dt>
<dd>
A Computing Resource used directly by an end user, such as a laptop or desktop.
Such a Computing Resource may also act as an Edge Resource if it provides
Computing Resources to other systems on the network.
</dd>
<dt><dfn class="lint-ignore">CDN</dfn></dt>
<dd>
Content Distribution Network.
A specialized network and set of computers targeted
at caching and delivering content with low latency.
May also host Edge Resources.
</dd>
<dt><dfn class="lint-ignore">MEC</dfn></dt>
<dd>
<a href="https://www.etsi.org/technologies/multi-access-edge-computing">Multi-access Edge Computing</a>.
A form of Edge Computing
based on Computing Resources
typically hosted within a cellular network's infrastructure.
</dd>
<dt><dfn class="lint-ignore">Workload</dfn></dt>
<dd>
A packaged definition of the compute work required to be executed
on a Computing Resource.
For example, a workload might be a container image, a script, or WASM.
</dd>
</dl>
</section>
<section>
<h2>Stakeholders and Business Models</h2>
<p>Different stakeholders with an interest in edge computing will have different
motivations and priorities. In this section we present a categorization of the different
kinds of stakeholders and their business models. As we present use cases and discuss
proposals, we can then relate them to the motivating drivers of different stakeholders.
Note that some stakeholders may belong to more than one category.
</p>
<table>
<thead>
<tr>
<th>Abbr.</th>
<th>Category</th>
<th>Business Model</th>
<th>Motivation</th>
</tr>
</thead>
<tbody>
<tr>
<td>BWSR</td>
<td>Browser Vendor</td>
<td>OSS - supported by other business (e.g. CSP, ads/search)</td>
<td>More applications can use web</td>
</tr>
<tr>
<td>CSP</td>
<td>Cloud Service Provider</td>
<td>Usage or subscription, account based (service provider pays)</td>
<td>Offer edge computing service.</td>
</tr>
<tr>
<td>CDN</td>
<td>Content Distribution Network</td>
<td>Usage or subscription, account based (service provider pays)</td>
<td>Offer edge computing service</td>
</tr>
<tr>
<td>ISP</td>
<td>Internet Service Provider</td>
<td>Subscription/rental; HW sales in some cases</td>
<td>Offer edge computing service</td>
</tr>
<tr>
<td>HW</td>
<td>Hardware Vendor</td>
<td>Sale or rental</td>
<td>Desktops/servers as private edge computers</td>
</tr>
<tr>
<td>NET</td>
<td>Mobile Network Provider (MEC)</td>
<td>Usage or subscription, account based (user pays)</td>
<td>Offer compute utility service</td>
</tr>
<tr>
<td>OS</td>
<td>Operating System Vendor</td>
<td>Sale or subscriptions to OS licenses; HW co-sales</td>
<td>HW co-sales for edge computers</td>
</tr>
<tr>
<td>APPL</td>
<td>Application Developer</td>
<td>Sale or subscription to software licenses (or in some cases, ad supported)</td>
<td>Avoid limitations of client and/or cloud platforms</td>
</tr>
<tr>
<td>SVC</td>
<td>Web Service (API) Provider</td>
<td>Usage or subscription, account based (user pays)</td>
<td>Improved deployment options; increased usage</td>
</tr>
<tr>
<td>USER</td>
<td>End User</td>
<td>Direct payment, bundled cost, or private HW</td>
<td>Improved performance, lower latency</td>
</tr>
</tbody>
</table>
</section>
<section>
<h2>Use Cases</h2>
<p>
Client-side applications can be broadly classified into the following categories:
</p>
<dl>
<dt>Render-intensive applications</dt>
<dd>
Client-side applications whose main task is to fetch content from the backend server and render it in the front end.
For example, news and social media web applications and mobile applications belong to this category.
</dd>
<dt>Compute-intensive applications</dt>
<dd>
Client-side applications whose main task is to perform compute-intensive work on the client side. For example, mobile gaming applications
need to calculate object locations and other complex parameters based on user interaction and then render the results on the client side.
</dd>
<dt>Hybrid applications</dt>
<dd>
Applications whose main task includes both render-intensive and compute-intensive work. For example, a modern e-commerce mobile application
leverages client-side machine learning inference for AR/VR-type user experiences while also fetching dynamic content based
on user preferences.
</dd>
<dt>Mobile/static clients</dt>
<dd>
Some client-side applications remain static most of the time; for example, a camera for traffic monitoring and analysis does not require mobility support.
Others change location continuously; for example, an application running on a connected or self-driving vehicle
changes location rapidly with the vehicle.
</dd>
</dl>
<p>
The use cases in the following sections are classified into the following categories based on workload type:
</p>
<section>
<h3>Accelerated workloads</h3>
<p>
For this category of use cases, the client-side application leverages the edge cloud to accelerate certain workloads by offloading them to the edge.
</p>
<section id="UC-CA">
<h4>Cloud App</h4>
<p>
Cloud App is a new form of application that utilizes cloud computing technology to move client-side workloads to the edge cloud and the central cloud. User interaction happens
on the client device, while computing and rendering happen in the edge cloud. This can accelerate the client-side application's performance, lower the client's hardware requirements, and reduce cost.
</p>
<p>
As one example of a Cloud App, Alibaba Group's Tmall Genie smart speaker leverages the edge cloud to offload compute-intensive and accelerated workloads from the client side.
</p>
<p>
The client, the central cloud, and the edge cloud work together in a coordinated way for a Cloud App. Typically, the control and orchestration functions are located in the central cloud,
the compute-intensive functions are located in the edge cloud, and the user interaction and display functions are located in the client.
</p>
<figure>
<img alt="CloudApp" src="images/CloudApp.png" width="600">
<figcaption>
Cloud App Architecture
</figcaption>
</figure>
</section>
<section id="UC-VR">
<h4>VR/AR Acceleration</h4>
<p>
VR/AR devices such as VR/AR glasses normally have limited hardware resources, so it is preferable to offload compute-intensive tasks to the edge for acceleration and
reduced delay, since the edge server is deployed near the user's location.
</p>
<p>
Note: this could be generalized to "acceleration of low-latency tasks". Some other examples might include
game physics simulation or CAD tools (in a business environment). The latter might add confidentiality
constraints (a business user may want to offload to on-premises computers).
We may also want to clarify that this pattern is for local communication to/from the client.
See also "Streaming Acceleration", where the communication is in-line with an existing network
connection.
</p>
</section>
<section id="UC-CG">
<h4>Cloud Gaming</h4>
<p>
Cloud gaming is a game mode that leverages cloud/edge computing. In cloud gaming, the games run on the cloud side, and after rendering the game images are compressed and transmitted to users as a video stream over the network.
The cloud gaming user receives the game video stream and sends control commands to the cloud to control the game.
</p>
<p>Taking the Click-and-Play scenario as an example: since all rendering and control commands are offloaded to the edge, cloud gaming users do not need to install the game locally; they just click the game and play it smoothly.</p>
<p>
By offloading the gaming workload to the edge and making full use of its computing power and low-latency network, cloud gaming can provide a smoother experience.
</p>
</section>
<section id="UC-SA">
<h4>Streaming Acceleration</h4>
<p>In the case of video acceleration, we may want to offload work to a location that has both
sufficient compute performance and a position on the existing network path. Specifically, consider a low-performance
client that wants to compress video or do background removal as part of a teleconference.
It could connect over a local high-performance wireless network to a local edge computer (perhaps
an enhanced ISP gateway box) that would then perform the video processing and forward the processed
video to the network.
</p>
<p>
We may also want to clarify that this pattern is for communication in-line with an existing
network.
See also "VR/AR", where the communication is to/from the offload target.
</p>
<section id="UC-VC">
<h5>Online Video Conference</h5>
<p>
One special case of streaming acceleration is the online video conference application. An online video conference system provides real-time translation and subtitle services.
These use AI/machine learning technology and are compute intensive; real-time translation is also very delay sensitive.
</p>
<p>
The online video conference application may be installed on PC terminals or mobile terminals.
PC terminals have enough computing resources and disk storage to install the application, so the compute-intensive work can be done on the PC terminal, providing an ultra-low-latency user experience.
</p>
<p>
Mobile terminals have limited disk storage and computing capability, so it is not possible to run the compute-intensive task on them. In this case,
the task can be offloaded to the edge, again providing an ultra-low-latency user experience.
</p>
<p>
In this use case it is preferred that the online video conference application offload the compute-intensive task according to terminal capability and edge resource availability,
so that the service provider can offer a consistent user experience on different terminals.
</p>
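<p>
The capability-based decision above can be sketched as follows. This is a minimal, hypothetical illustration: the capability fields and the threshold are assumptions, not part of any existing standard, and a real implementation would also perform actual edge discovery.
</p>

```javascript
// Hypothetical sketch: decide where to run the translation task based on
// terminal capability and edge availability, as described above.
function chooseExecutionSite({ deviceMemoryGB, hasGPU, edgeAvailable }) {
  const capableTerminal = deviceMemoryGB >= 8 && hasGPU; // e.g. a PC terminal
  if (capableTerminal) return "client";   // run translation locally
  if (edgeAvailable) return "edge";       // constrained mobile terminal: offload
  return "client-degraded";               // no edge reachable: reduced local model
}
```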
</section>
</section>
<section id="UC-MLA">
<h4>Machine Learning Acceleration</h4>
<p>
Machine learning inference can be done on the client side to reduce latency. W3C is working on the <a href="https://www.w3.org/groups/wg/webmachinelearning">WebNN</a> standard, which
allows client-side developers to leverage the machine learning acceleration hardware residing in client devices.
</p>
<p>
Client devices may have different hardware capabilities. For example, some mobile phones have very limited hardware resources and no GPU,
making it very difficult for them to run machine learning code.
In this scenario, it is preferable to offload the machine learning code to the edge cloud, which greatly improves the user experience.
</p>
<section id="UC-IVPU">
<h5>Image and Video Processing and Understanding</h5>
<p>
One special case of machine learning acceleration is image and video processing and understanding. Some mobile image/video processing applications need machine learning
algorithms for image/video analysis and processing, which normally require an NPU chipset on the terminal for good performance.
For terminals without an NPU chipset, it is preferable to offload the compute-intensive machine learning workload to the edge cloud, providing a consistent user experience
across different terminals.
</p>
</section>
<section id="UC-PWMP">
<h5>Professional Web-based Media Production</h5>
<p>
Another special case of machine learning acceleration is professional Web-based media production.
Processing and rendering media is a complex task: for example, a video editing application needs to do image processing, video editing, audio editing, and so on,
so it has high performance requirements.
</p>
<p>
Professional Web-based media production relies heavily on web-based media editing tools, which can be used for AI cutting, AI editing, and AI transcoding, and to publish videos to the cloud.
Since the edge cloud has more powerful computing capability and is close to the user's location, offloading the expensive rendering process to the edge lets web apps render media more quickly
and provide a better user experience.
</p>
<figure>
<img alt="WebMedia" src="images/web_based_media.png" width="600">
<figcaption>
Web-based media production
</figcaption>
</figure>
</section>
</section>
<section>
<h3>Robustness of Workload Acceleration for Certain Applications</h3>
<p>
For applications that offload some of their functionality to the edge cloud, if the edge cloud becomes unavailable due to client mobility or other reasons,
it is preferred that the functionality be handed back to the client. More complex rules could be designed to improve the robustness of the application.
</p>
<section id="UC-LVB">
<h4>Live video broadcasting mobile application</h4>
<p>
One example of such a use case: for a mobile application (for example, a live video broadcasting application) that leverages the edge cloud for computing and/or machine learning
acceleration, when certain conditions (edge cloud availability, network conditions, etc.) for workload acceleration by offloading are not met, the application shall be able to migrate the
workload back to the client side to ensure its robustness and availability.
</p>
</section>
<section id="UC-ALPR">
<h4>Automatic License Plate Recognition</h4>
<p>
Another example is Automatic License Plate Recognition (ALPR). For ALPR applications, offline processing can provide a 90% recognition rate, while online processing on the edge improves the recognition rate to 99%.
It is therefore preferred to offload the compute-intensive recognition task to the edge when the network connection is stable; if the network becomes unstable or is lost, the offloaded
task should move back to the terminal to guarantee the availability of the service.
</p>
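<p>
The fallback policy above can be sketched as follows. This is a hypothetical illustration: the two recognizers and the network-stability signal are stand-ins supplied by the caller, not an existing API.
</p>

```javascript
// Minimal sketch: use the edge recognizer while the connection is stable,
// and migrate back to the on-terminal recognizer when it is not.
async function recognizePlate(frame, { networkStable, edgeRecognize, localRecognize }) {
  if (networkStable) {
    try {
      return await edgeRecognize(frame); // higher recognition rate on the edge
    } catch {
      // connection broke mid-task: fall through to local processing
    }
  }
  return localRecognize(frame); // lower rate, but keeps the service available
}
```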
</section>
</section>
</section>
<section>
<h3>IoT workloads</h3>
<section id="UC-RNA">
<h4>Robot Navigation Acceleration</h4>
<p>
Consider a robot navigating in a home using Visual SLAM. In this case the robot has limited
performance due to cost and power constraints. So it wishes to offload the video processing work
to another computer. However, the video is sensitive private data so the user would prefer that
it does not leave the user's premises, and would like to offload the processing to an existing
desktop computer or an enhanced gateway. Latency may also be a concern (the robot needs the
results immediately to make navigation decisions, for example to avoid a wire or other obstacle
on the floor).
</p>
<p>
Note: in general, there are other cases where IoT devices may want to offload work to another
computer. Video processing, however, is of special interest because of its high data and processing
requirements and privacy constraints.
</p>
</section>
</section>
<section id="UC-PW">
<h3>Persistent workloads</h3>
<p>
In some cases it may be desirable to offload a task from a browser that continues to run even when the
browser application is not active. This could be used, for example, to monitor a condition and send
a notification when that condition is met. As a sub-category of this use case, the offloaded task
might monitor IoT devices and, instead of or in addition to sending a notification,
might be used for automation.
Such an offloaded task might also execute long-running computational
tasks such as machine learning or data indexing.
</p>
<p>
Persistent tasks require a mechanism to manage their lifetime using expiry dates or explicit controls.
In the case of applying this to IoT orchestration, there is also the issue of granting access rights
to such offloaded tasks, for example access to a LAN, to specific IoT devices on that LAN, and to the
data they generate.
</p>
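<p>
The lifetime management described above could look like the following sketch, in which an expiry-based registry reaps persistent tasks whose owner has not renewed them. All names are illustrative assumptions, not an existing API.
</p>

```javascript
// Hypothetical edge-side registry of persistent tasks with expiry dates.
const tasks = new Map();

// Register a task with a time-to-live; the client must renew it to keep it alive.
function registerTask(id, ttlMs, now = Date.now()) {
  tasks.set(id, { expiresAt: now + ttlMs });
}

// Explicit renewal extends the task's lifetime.
function renewTask(id, ttlMs, now = Date.now()) {
  const task = tasks.get(id);
  if (task) task.expiresAt = now + ttlMs;
}

// Called periodically on the edge: clean up tasks whose expiry has passed.
function sweep(now = Date.now()) {
  for (const [id, task] of tasks) {
    if (task.expiresAt <= now) tasks.delete(id);
  }
}
```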
</section>
</section>
<section id="gap">
<h2>Gap Analysis</h2>
<section>
<h3>Common approaches for offloading</h3>
<p>
Currently, there are two common approaches to offloading. One is to send the code in the request from the client to the edge server; the other is to fetch the code from inner file repositories on the edge and execute it there. Both work, but each approach has downsides.
</p>
<p><b>Sending code from the client to the edge</b></p>
<p>In this approach, the client sends its local code to the edge, which executes it. The downside is obvious: more data is transferred, which may increase network latency, and handling that data puts more strain on resource-constrained end devices. Moreover, some code is sensitive, so data security is also a significant issue.</p>
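<p>
A minimal sketch of this approach, assuming a hypothetical edge execution endpoint (the endpoint and request shape are illustrative, not an existing API):
</p>

```javascript
// Build the request payload: the function travels as source text, which is
// what makes this approach sensitive to payload size and code exposure.
function buildOffloadPayload(fn, args) {
  return JSON.stringify({ code: fn.toString(), args });
}

// Post the payload to a hypothetical edge execution endpoint and return
// the result computed on the edge.
async function offloadToEdge(endpoint, fn, args) {
  const response = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildOffloadPayload(fn, args),
  });
  if (!response.ok) {
    throw new Error(`Edge execution failed: ${response.status}`);
  }
  return (await response.json()).result;
}
```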
<p><b>Fetching code from file repositories and executing on the edge</b></p>
<figure>
<img alt="EdgeFetch" src="images/edge_fetching_codes.png" width="600">
<figcaption>
Fetching codes from inner file repositories
</figcaption>
</figure>
<p>In this approach, the client leverages a user-defined offloading library to send the proper parameters to the offloading server, which fetches the specified code from inner repositories and executes it.</p>
<p>The downside of this approach is that additional file repositories are needed, and developers have to upload the code to the repository and ensure that the client and the edge run the same code version.</p>
<p>Meanwhile, since the offloading library plays an important role here, developers are responsible for creating a robust offloading policy to discover and connect with edge nodes, decide which parts of the code can be offloaded, and decide when to offload. This puts more strain on the developer and affects the overall programming experience and productivity.</p>
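<p>
The repository-based approach can be sketched as follows; the <code>Map</code> stands in for a real artifact store, and all names are illustrative assumptions:
</p>

```javascript
// Hypothetical edge-side sketch: the client sends only a workload id and
// parameters; the edge resolves the id against an inner repository and
// executes the matching code.
const repository = new Map();

// Developers must pre-publish each workload and keep versions in sync with
// the client -- the downside noted above.
function publishWorkload(id, version, fn) {
  repository.set(`${id}@${version}`, fn);
}

// Execute a previously published workload by id and version.
function executeWorkload(id, version, params) {
  const fn = repository.get(`${id}@${version}`);
  if (!fn) throw new Error(`Workload ${id}@${version} not found in repository`);
  return fn(...params);
}
```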
</section>
<section>
<h3>Conclusion</h3>
<p>
The two approaches discussed above are broadly similar; the main difference is how the code is delivered to the edge. Beyond the downsides already mentioned, some common issues and pain points remain to be addressed:
</p>
<ul>
<li>Discrepancies between the client runtime and the edge runtime, as well as between different edge nodes, may cause execution failures. A unified runtime therefore needs to be considered; a WebAssembly runtime may be a good choice.</li>
<li>Capabilities for discovering and connecting with edge nodes are needed.</li>
<li>Some parts of a program might be sensitive, and the developer might not want them sent over the internet. Developers need the ability to configure which parts can be offloaded.</li>
<li>When certain conditions are met, for example when there is no edge server in proximity to the user or connectivity is lost, code should be executed locally instead of offloaded. That is, the developer declares what might be offloaded to a server but does not need to decide or care about when it is offloaded.</li>
<li>Security and privacy are important; secure communication mechanisms are needed.</li>
</ul>
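<p>
Taken together, these gaps suggest an offloading runtime along the following lines. This is a hypothetical sketch: the discovery and offload capabilities are supplied as stand-ins, since no standard API for them currently exists.
</p>

```javascript
// Hypothetical sketch: the developer marks a function as offloadable; the
// runtime decides when to offload, falling back to local execution when no
// edge node is reachable or the offload fails mid-flight.
async function runOffloadable(fn, args, { discoverEdgeNode, offload }) {
  let node = null;
  try {
    node = await discoverEdgeNode();
  } catch {
    node = null; // discovery failed: treat as "no edge in proximity"
  }
  if (node === null) {
    return fn(...args); // execute locally, transparently to the caller
  }
  try {
    return await offload(node, fn, args);
  } catch {
    return fn(...args); // edge failed mid-flight: local fallback
  }
}
```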
<p>
There is no W3C standard currently available to address the above gaps. To achieve interoperability between different vendors, standardization is needed.
</p>
</section>
</section>
<section>
<h2>Requirements</h2>
<section>
<h3>General Requirements</h3>
<p>
The following are a set of high-level requirements,
cross-referenced with related use cases.
</p>
<table>
<thead>
<tr>
<th>Name</th>
<th>Description</th>
<th>Use Cases</th>
</tr>
</thead>
<tbody>
<tr>
<td>Performance</td>
<td>The overall performance of an application using offload,
as measured by user responsiveness or time to completion
of computational work as appropriate,
should be improved.</td>
<td>TBD</td>
</tr>
<tr>
<td>Scalability</td>
<td>Efficient implementation in a virtualized cluster environment
(i.e. a cloud system) should be achievable.</td>
<td>TBD</td>
</tr>
<tr>
<td>Flexibility</td>
<td>The solution should allow a variety of compute resources
from different providers to be used.</td>
<td>TBD</td>
</tr>
<tr>
<td>Compatibility</td>
<td>The proposed standards should be as consistent as possible
with existing web standards to maximize adoption.</td>
<td>TBD</td>
</tr>
<tr>
<td>Resiliency</td>
<td>The solution should allow adaptation to changing circumstances
such as changes in relative performance, network connectivity,
or failure of a remote Computing Resource.</td>
<td>TBD</td>
</tr>
<tr>
<td>Security</td>
<td>The standards should be consistent with existing
security expectations for web applications.</td>
<td>TBD</td>
</tr>
<tr>
<td>Privacy</td>
<td>The standards should be consistent with existing
privacy expectations for web applications.</td>
<td>TBD</td>
</tr>
<tr>
<td>Control</td>
<td>The use of resources should ultimately be under the
control of the entity responsible for paying for their use.</td>
<td>TBD</td>
</tr>
</tbody>
</table>
</section>
<section>
<h3>Detailed Requirements</h3>
<p>
Some more detailed requirements are listed below,
cross-referenced with related use cases and related
high-level requirements.
</p>
<table>
<thead>
<tr>
<th>General Requirements</th>
<th>Name</th>
<th>Description</th>
<th>Use Cases</th>
</tr>
</thead>
<tbody>
<tr>
<td>Performance</td>
<td>R1: Client Offload.</td>
<td>Client should be able to offload computing intensive work
to an edge resource.</td>
<td>Use case 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 2.10, 2.11, 2.12, 2.13; </td>
</tr>
<tr>
<td>Resiliency</td>
<td>R2a: Application-Directed Migration.</td>
<td>The application should be able to explicitly manage migration
of work between computing resources.
This may include temporarily running a workload on
multiple computing resources to hide transfer latency.
</td>
<td>Use case (refs TBD) ;</td>
</tr>
<tr>
<td>Resiliency</td>
<td>R2b: Live Migration.</td>
<td>The edge cloud should be able to transparently migrate
live (running) work between computing resources.
This includes between edge resources, cloud resources,
and back to the client, as necessary.
If the workload is stateful,
this includes state capture and transfer.</td>
<td>Use case 2.1, 2.2, 2.5;</td>
</tr>
<tr>
<td>Flexibility</td>
<td>R3: Discovery.</td>
<td>A client should be able to dynamically enumerate available
edge resources.</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Flexibility</td>
<td>R4: Selection.</td>
<td>A client should be able to select between available resources,
including making a decision about whether offload is appropriate
(e.g. running on the client may be the best choice).
This selection may be automatic or application-directed,
and may require metadata or measurements
of the performance and latency of edge resources,
and may be static or dynamic.
To do: perhaps break variants down into separate sub-requirements.
Also, it should be made clear how this differs
from the out-of-scope issue "Offload policy".
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Flexibility</td>
<td>R5: Packaging.</td>
<td>A workload should be packaged so it can be executed on
a variety of edge resources.
This means either platform independence OR a means to
negotiate which workloads can run where.</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Flexibility</td>
<td>R6: Persistence.</td>
<td>It should be possible for a workload to be run "in the background",
possibly event-driven, even if the client is not active.
This also implies lifetime management (cleaning
up workloads under some conditions,
such as if the client has not connected for a certain amount
of time, etc.)</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Security, Privacy</td>
<td>R7: Confidentiality and Integrity</td>
<td>The client should be able to control and protect the data used
by an offloaded workload.
Note: this may result in constraints upon the selection of offload targets, but
it also means data needs to be protected in transit, at rest, etc.
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Control</td>
<td>R8: Resource Management.</td>
<td>The client should be able to control
the use of resources by an
offloaded workload on a per-application basis.
Note: If an edge resource has a usage charge, for example,
a client may want to set quotas on offload,
and some applications may need more resources than others.
This may also require a negotiation,
e.g. a workload may have minimum requirements,
making offload mandatory on limited clients.
This is partially about QoS as it relates to performance
(making sure a minimum amount of resources is available)
but is also about controlling charges (so a web app does
not abuse the edge resources paid for by a client).
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Scalability</td>
<td>R9: Statelessness</td>
<td>It should be possible to identify workloads that
are stateless so they can be run in a more scalable
manner, using FaaS cloud mechanisms.
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Compatibility</td>
<td>R10: Stateful</td>
<td>It should be possible to run stateful workloads,
to be compatible with existing client-side
programming model expectations.
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Performance</td>
<td>R11: Parallelism</td>
<td>It should be possible to run multiple workloads in
parallel and/or express parallelism within a single workload.
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Performance</td>
<td>R12: Asynchronous</td>
<td>The API for communicating with a running workload
should be non-blocking (asynchronous) to hide the
latency of remote communication and allow the
main (user interface) thread to run in parallel with the
workload (even if the workload is being run on the client).
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Security</td>
<td>R13: Sandboxing</td>
<td>A workload should be specified and packaged in such
a way that it can be run in a sandboxed environment and its
access to resources can be managed.
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Performance, Compatibility</td>
<td>R14: Acceleration</td>
<td>A workload should have (managed) access to accelerated
computing resources when appropriate, such as AI accelerators.
Note: Since the availability of these resources may vary between
compute resources, this needs to be taken into account
when estimating performance and selecting a compute
resource to use for offload. Access to such resources
should use web standards, e.g. standard WASM/WASI APIs.
</td>
<td>Use case (refs TBD);</td>
</tr>
</tbody>
</table>
</section>
</section>
<section>
<h2>Architecture Proposals</h2>
<p>
This document proposes different architectures that address the needs identified above.</p>
<section>
<h3>Seamless code sharing across client/edge/cloud</h3>
<p>This architecture gives the client, the edge, and the central cloud a common code execution environment,
allowing a task to run on the client, the edge, the cloud, or any combination of them in a coordinated way.
</p>
<p>
The proposed high-level architecture is shown in the following figure:
</p>
<figure>
<img alt="ClientEdgeArchitecture" src="images/Client_Edge_Architecture_v3.png" width="1000">
<figcaption>
Proposed High Level Architecture
</figcaption>
</figure>
<p>
In this architecture, workload code on the client side can be offloaded to the edge cloud, handed over to the central cloud, and handed back to the client.
The high-level procedure is as follows:
</p>
<ol>
<li>
The client-side application encapsulates the code to be offloaded into an offload code module and loads that module using the API specific to the runtime environment.
There may be different types of runtime environment, for example, a WebAssembly or JavaScript runtime.
The offloaded workload is written for the specific runtime.
</li>
<li>
The client's runtime dispatch policy module queries the dispatch policy from the Offload Management Module, which is located in the Edge cloud or the central cloud.
</li>
<li>
The Offload Management Module sends the dispatch policy to the client-side application.
</li>
<li>
The client-side runtime sends the offloaded code to the target Edge cloud or Central cloud according to the dispatch policy.
</li>
<li>
The target Edge cloud or Central cloud executes the offloaded workload and returns the result to the client application.
</li>
<li>
If offloading is not feasible under certain conditions, for example, if the network connection between the client and the Edge cloud is unstable or
there are not enough resources in the Edge cloud, the offloaded workload should be handed back to the client side to ensure the availability of the client application.
</li>
</ol>
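<p>
The dispatch decision in this procedure can be illustrated with a minimal
JavaScript sketch. The shape of the dispatch policy and the
<code>selectTarget</code> helper are assumptions made for this example;
they are not part of any existing or proposed API.
</p>

```javascript
// Hypothetical shape of a dispatch policy, as the Offload Management
// Module might return it to the client (an assumption for illustration).
const examplePolicy = {
  preferredTarget: "edge",   // "client" | "edge" | "cloud"
  fallbackTarget: "client",  // used when the preferred target is unavailable
  maxNetworkLatencyMs: 50,   // offload only if the link is fast enough
};

// Decide where to run the workload; fall back to the client when the
// target is unreachable or the network is too slow (the last step above).
function selectTarget(policy, { targetReachable, networkLatencyMs }) {
  if (!targetReachable || networkLatencyMs > policy.maxNetworkLatencyMs) {
    return policy.fallbackTarget;
  }
  return policy.preferredTarget;
}
```

<p>
For example, with the policy above, a reachable edge over a 20&nbsp;ms link
selects the edge, while an unstable or slow connection keeps the workload
on the client.
</p>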
</section>
<section>
<h3>Distributed Workers</h3>
<p>The Web already has a set of standards for additional
threads of computation managed by application
code loaded into the browser: Web Workers and Service Workers.
Workers already support a message-passing style of communication
which would allow them to execute remotely.
This architectural option proposes extending workers to support
edge computing as follows:
</p>
<ul>
<li>Compute Utility services would be made available on the
network, providing the
capability to execute Worker payloads.
For fallback purposes, the client itself would also offer
a Compute Utility to support local execution when needed.
It would also be possible for the origin (the server) to host a Compute
Utility. However, in general Compute Utilities could be
hosted at other locations, including in desktops on the LAN,
within a local cloud, or within edge infrastructure.
</li>
<li>Application developers would continue to use existing
APIs for Workers, but could optionally provide metadata about
performance and memory requirements which could be used to
select an appropriate execution target.
</li>
<li>Browsers would collect metadata about available Compute
Utility services, including latency and performance, and
would select an appropriate target for each Worker.
The user would have browser controls over this behavior,
including the ability to specify particular offload
targets or to force local execution when appropriate.
Note: the reason it is suggested that the browser makes
the decision and not the application is to prevent fingerprinting
and associated privacy risks.
Metadata about available Compute Utilities might otherwise
be used to infer location. The proposed architecture hides
this information by default from the application while
still supporting intelligent selection of offload targets.
</li>
<li>Once a Compute Utility is selected, the browser would
automatically (transparent to the application) use the
network API of the Compute Utility to load and execute the
workload for the Worker.
Note that in the existing Worker API, a URL of
a Javascript workload is provided to the Worker.
</li>
<li>The Compute Utility itself would be responsible for downloading
the workload; it would not have to be downloaded to the
browser and then uploaded. Also, a WASM workload could be used,
bootstrapping from the standard Javascript workload.
In general, the
workload execution environment should be the same as the
normal Worker execution environment. However, access to
accelerators to support performance and other advanced capabilities
will become more important in this context.
</li>
</ul>
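<p>
As a minimal sketch of the second bullet above, application code would keep
the existing Worker API and optionally attach resource metadata. The
<code>compute</code> option and its fields below are hypothetical
extensions, not part of the current Worker specification.
</p>

```javascript
// Build a Worker options object carrying optional resource hints that the
// browser could use when selecting an execution target. The "compute"
// block is a hypothetical extension (an assumption for this sketch).
function workerOptionsWithHints(hints = {}) {
  return {
    type: "module",  // standard option: load the workload as a module
    compute: {       // hypothetical extension block
      minMemoryMB: hints.minMemoryMB ?? 64,
      preferAccelerator: hints.preferAccelerator ?? false,
    },
  };
}

// Application code would remain a one-line change, e.g.:
//   const w = new Worker("workload.js",
//                        workerOptionsWithHints({ minMemoryMB: 256 }));
//   w.postMessage(inputData);  // message passing is unchanged
```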
<p>
The proposed high level architecture is shown in the following figure.
The browser discovers Compute Utilities using
a Discovery mechanism. Discovery services in each discovered
Compute Utility return metadata, which then allows the
selection of a Compute Utility for each workload.
A Workload Management service is then
used to load and run a packaged workload for the worker.
</p>
<figure>
<img alt="Distributed Worker Architecture" src="images/DW.png" width="600">
<figcaption>
Proposed High Level Architecture: Distributed Workers
</figcaption>
</figure>
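<p>
The selection step described above can be sketched as follows. The metadata
fields returned by a Compute Utility's Discovery service, and the simple
cost model used here, are assumptions for illustration only.
</p>

```javascript
// Example metadata, as Discovery might report it for each Compute Utility
// (the field names are assumptions for this sketch).
const discovered = [
  { name: "local",      latencyMs: 0,  relativePerformance: 1.0 },
  { name: "lan-edge",   latencyMs: 5,  relativePerformance: 4.0 },
  { name: "metro-edge", latencyMs: 20, relativePerformance: 8.0 },
];

// Pick the utility with the best estimated completion time, modeled
// (simplistically) as round-trip latency plus scaled compute time.
function selectUtility(utilities, workloadComputeMs) {
  let best = null;
  let bestCost = Infinity;
  for (const u of utilities) {
    const cost = 2 * u.latencyMs + workloadComputeMs / u.relativePerformance;
    if (cost < bestCost) {
      bestCost = cost;
      best = u;
    }
  }
  return best;
}
```

<p>
With these example numbers, a short 10&nbsp;ms task stays local, while a
1000&nbsp;ms task is offloaded to the faster remote utility despite its
higher latency.
</p>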
</section>
</section>
<section>
<h2>Standardization Proposals</h2>
<section>
<h3>WebAssembly as unified runtime</h3>
<p>This proposal extends the WebAssembly runtime and uses it on both the client side and the edge cloud side as a unified runtime.</p>
<figure>
<img src="images/WebAssemblyRuntime.png" alt="WebAssemblyRuntime" width="600">
<figcaption>
WebAssembly as a Unified Runtime Architecture
</figcaption>
</figure>
<p>
The proposed solution includes the following parts:
</p>