<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<title>Client-Edge-Cloud coordination Use Cases and Requirements</title>
<style>
.two-cols {
display: grid;
grid-template-columns: 1fr 1fr;
}
table {
border-collapse:collapse;
}
table,th, td {
border: 1px solid #666;
}
td {
padding:2px 15px;
}
</style>
<script async class="remove" src="https://www.w3.org/Tools/respec/respec-w3c"></script>
<script class="remove">
var respecConfig = {
specStatus: "ED",
copyrightStart: "2021",
edDraftURI: "https://w3c.github.io/edge-computing-web-exploration/",
github: "https://github.com/w3c/edge-computing-web-exploration",
latestVersion: null,
noRecTrack: true,
editors: [{
name: "Dapeng(Max) Liu",
companyURL: "http://www.alibabagroup.com/en/global/home",
company: "Alibaba Group"
},
{
name: "Michael McCool",
companyURL: "https://www.intel.com/",
company:"Intel"
},
{
name: "Song Xu",
companyURL: "https://www.chinamobileltd.com",
company:"China Mobile"
}
],
group: "web-networks"
};
</script>
</head>
<body>
<section id='abstract'>
<p>
This document introduces the use cases and requirements of a client-edge-cloud coordination mechanism and its standardization.
</p>
</section>
<section id='sotd'>
<p>
This is still a work in progress. The proposal is being incubated in the <a href="https://github.com/w3c/web-networks/">W3C Web & Networks Interest Group</a>.
</p>
</section>
<section>
<h2>Introduction</h2>
<p>With the rapid development of cloud computing technology, the centralized cloud is evolving towards a distributed "edge cloud" that allows developers to deploy their code as FaaS
close to the user's location. One such service is Alibaba Cloud's <a href="https://www.alibabacloud.com/help/en/dynamic-route-for-cdn/latest/er-overview">EdgeRoutine</a>
service.</p>
<p>
With the rapid adoption of new technologies such as machine learning and IoT in client-side applications, clients may also need to perform compute-intensive work. For example, machine
learning inference can be done on the client side; the Taobao mobile app leverages client-side inference for user face detection, among other features.
W3C is also working on the <a href="https://www.w3.org/groups/wg/webmachinelearning">WebNN</a> standard, which allows client-side developers to leverage the machine learning acceleration
hardware that resides in client devices.
</p>
<p>
To improve client-side application performance, there is a trend to offload compute-intensive work to the edge cloud, as in cloud apps and cloud gaming.
However, current approaches could be further optimized by a mechanism for Client-Edge-Cloud coordination.
This document discusses the use cases and requirements of such a coordination mechanism and its standardization.
</p>
</section>
<section id="terminology">
<h2>Terminology</h2>
<p>This document uses the following terms with the specific meanings defined here.
Where possible these meanings are consistent with common usage.
However, note that common usage of some of these terms has
multiple, ambiguous, or inconsistent meanings;
the definitions here take precedence.
When terms are used with the specific meanings defined here they are
capitalized.
When possible, references to existing standards defining these terms are given.
</p>
<dl>
<!-- Example, consistent with ReSpec
<dt><dfn data-lt="term" class="lint-ignore">Term</dfn></dt>
<dd>Defn.</dd>
-->
<dt><dfn class="lint-ignore">Edge</dfn></dt>
<dd>
The periphery of a network.
</dd>
<dt><dfn class="lint-ignore">FaaS</dfn></dt>
<dd>Function as a Service.
A service provided by a Computing Resource that can execute a stateless computation.
</dd>
<dt><dfn class="lint-ignore">Cloud</dfn></dt>
<dd>
A set of managed services that are
designed to be interchangeable, scalable, and location-independent.
</dd>
<dt><dfn class="lint-ignore">Cloud Resources</dfn></dt>
<dd>
A set of managed Computing Resources available in the Cloud.
</dd>
<dt><dfn class="lint-ignore">Edge Cloud</dfn></dt>
<dd>
A set of Edge Resources managed as an extension of the Cloud.
Such resources use similar abstractions and management APIs as a cloud
but typically will add mechanisms to manage location and latency,
and will typically be deployed in a less centralized, more location and
latency-sensitive manner than a typical Cloud.
</dd>
<dt><dfn class="lint-ignore">Edge Resource</dfn></dt>
<dd>
A Computing Resource located on the Edge, that is, near the periphery of a network.
Note: this definition does not
necessarily include "endpoints" such as IoT sensors.
It refers specifically to computers
that can make Computing Resources available to others on the network.
</dd>
<dt><dfn class="lint-ignore">Migration</dfn></dt>
<dd>
The ability to move a workload from one Computing Resource to another.
See also Live Migration and Application-Directed Migration, which are subclasses.
</dd>
<dt><dfn class="lint-ignore">Live Migration</dfn></dt>
<dd>
The ability to transparently move a running workload from one Computing Resource to another.
This includes transparent migration of state and updates to references, so that the
application that invoked the resource does not need to manage the transition and
does not need to be made aware of it. Such migration needs to be implemented with
minimum impact on quality-of-service factors such as latency of responses.
See also the more general definition of Migration.
</dd>
<dt><dfn class="lint-ignore">Application-Directed Migration</dfn></dt>
<dd>
The ability to move a running workload from one Computing Resource to another under
the control of an application. In this version of Migration, the controlling
application or the Workload itself needs to manage the orderly transfer of state
from one Computing Resource to another, may also have to explicitly update
references, and will have to explicitly manage quality-of-service factors such
as latency of response.
See also the more general definition of Migration.
</dd>
<dt><dfn class="lint-ignore">Computing Resource</dfn></dt>
<dd>
Any computer which can be used to execute a Workload, and may
include Edge, Cloud, or Client computers.
</dd>
<dt><dfn class="lint-ignore">Client Computer</dfn></dt>
<dd>
A Computing Resource used directly by an end user, such as a laptop or desktop.
Such a Computing Resource may also act as an Edge Resource if it provides
Computing Resources to other systems on the network.
</dd>
<dt><dfn class="lint-ignore">CDN</dfn></dt>
<dd>
Content Distribution Network.
A specialized network and set of computers targeted
at caching and delivering content with low latency.
May also host Edge Resources.
</dd>
<dt><dfn class="lint-ignore">MEC</dfn></dt>
<dd>
<a href="https://www.etsi.org/technologies/multi-access-edge-computing">Multi-access Edge Computing</a>.
A form of Edge Computing
based on Computing Resources
typically hosted within a cellular network's infrastructure.
</dd>
<dt><dfn class="lint-ignore">Workload</dfn></dt>
<dd>
A packaged definition of the compute work required to be executed
on a Computing Resource.
For example, a workload might be a container image, a script, or WASM.
</dd>
</dl>
</section>
<section>
<h2>Stakeholders and Business Models</h2>
<p>Different stakeholders with an interest in edge computing will have different
motivations and priorities. In this section we present a categorization of the different
kinds of stakeholders and their business models. As we present use cases and discuss
proposals, we can then relate them to the motivating drivers of different stakeholders.
Note that some stakeholders may belong to more than one category.
</p>
<table>
<thead>
<tr>
<th>Abbr.</th>
<th>Category</th>
<th>Business Model</th>
<th>Motivation</th>
</tr>
</thead>
<tbody>
<tr>
<td>BWSR</td>
<td>Browser Vendor</td>
<td>OSS - supported by other business (e.g. CSP, ads/search)</td>
<td>More applications can use web</td>
</tr>
<tr>
<td>CSP</td>
<td>Cloud Service Provider</td>
<td>Usage or subscription, account based (service provider pays)</td>
<td>Offer edge computing service.</td>
</tr>
<tr>
<td>CDN</td>
<td>Content Distribution Network</td>
<td>Usage or subscription, account based (service provider pays)</td>
<td>Offer edge computing service</td>
</tr>
<tr>
<td>ISP</td>
<td>Internet Service Provider</td>
<td>Subscription/rental; HW sales in some cases</td>
<td>Offer edge computing service</td>
</tr>
<tr>
<td>HW</td>
<td>Hardware Vendor</td>
<td>Sale or rental</td>
<td>Desktops/servers as private edge computers</td>
</tr>
<tr>
<td>NET</td>
<td>Mobile Network Provider (MEC)</td>
<td>Usage or subscription, account based (user pays)</td>
<td>Offer compute utility service</td>
</tr>
<tr>
<td>OS</td>
<td>Operating System Vendor</td>
<td>Sale or subscriptions to OS licenses; HW co-sales</td>
<td>HW co-sales for edge computers</td>
</tr>
<tr>
<td>APPL</td>
<td>Application Developer</td>
<td>Sale or subscription to software licenses (or in some cases, ad supported)</td>
<td>Avoid limitations of client and/or cloud platforms</td>
</tr>
<tr>
<td>SVC</td>
<td>Web Service (API) Provider</td>
<td>Usage or subscription, account based (user pays)</td>
<td>Improved deployment options; increased usage</td>
</tr>
<tr>
<td>USER</td>
<td>End User</td>
<td>Direct payment, bundled cost, or private HW</td>
<td>Improved performance, lower latency</td>
</tr>
</tbody>
</table>
</section>
<section>
<h2>Use Cases</h2>
<p>
Client-side applications can be broadly classified into the following categories:
</p>
<dl>
<dt>Render-intensive applications</dt>
<dd>
Client-side applications whose main task is to fetch content from the backend server and render it in the front end.
For example, news and social media web applications and mobile applications belong to this category.
</dd>
<dt>Compute-intensive applications</dt>
<dd>
Client-side applications whose main task is to perform compute-intensive work on the client side. For example, mobile gaming applications
need to calculate object locations and other complex parameters based on user interaction and then render the results on the client side.
</dd>
<dt>Hybrid applications</dt>
<dd>
Applications whose main task includes both render-intensive and compute-intensive work. For example, a modern e-commerce mobile application
leverages client-side machine learning inference for AR/VR-type user experiences while also fetching dynamic content based
on user preferences.
</dd>
<dt>Mobile/static clients</dt>
<dd>
Some client-side applications remain static most of the time; for example, a camera for traffic monitoring and analysis does not require mobility support.
Others change location continuously; for example, an application running on a connected or self-driving vehicle
changes location rapidly with the vehicle.
</dd>
</dl>
<p>
The use cases in the following sections are classified into the following categories based on workload type:
</p>
<section>
<h3>Accelerated workloads</h3>
<p>
For this category of use cases, the client-side application leverages the edge cloud to accelerate certain workloads by offloading them to the edge.
</p>
<section id="UC-CA">
<h4>Cloud App</h4>
<p>
Cloud App is a new form of application that utilizes cloud computing technology to move client-side workloads to the edge cloud and the central cloud. User interaction happens
on the client device, while computing and rendering happen in the edge cloud. This can accelerate the client-side application's performance, lower the client's hardware requirements, and reduce cost.
</p>
<p>
As one example of a Cloud App, Alibaba Group's Tmall Genie smart speaker leverages the edge cloud to offload compute-intensive and accelerated workloads from the client side.
</p>
<p>
The client, the central cloud, and the edge cloud work together in a coordinated way for a Cloud App. Typically, the control and orchestration functions are located in the central cloud,
the compute-intensive functions are located in the edge cloud, and the user interaction and display functions are located in the client.
</p>
<figure>
<img alt="CloudApp" src="images/CloudApp.png" width="600">
<figcaption>
Cloud App Architecture
</figcaption>
</figure>
</section>
<section id="UC-VR">
<h4>VR/AR Acceleration</h4>
<p>
VR/AR devices such as VR/AR glasses normally have limited hardware resources, so it is preferable to offload compute-intensive tasks to the edge for acceleration and
reduced delay, since the edge server is deployed near the user's location.
</p>
<p>
Note: this could be generalized to "acceleration of low-latency tasks". Some other examples might include
game physics simulation or CAD tools (in a business environment). The latter might add confidentiality
constraints (a business user may want to offload to on-premises computers).
We may also want to clarify that this pattern is for local communication to/from the client.
See also "Streaming Acceleration", where the communication is in-line with an existing network
connection.
</p>
</section>
<section id="UC-CG">
<h4>Cloud Gaming</h4>
<p>
Cloud gaming is a game mode that leverages cloud/edge computing. In cloud gaming, the games run on the cloud side, and after rendering the game images are compressed and transmitted to users as a video stream over the network.
The cloud gaming user receives the game video stream and sends control commands to the cloud to control the game.
</p>
<p>Taking the Click-and-Play scenario as an example: since all rendering and control commands are offloaded to the edge, cloud gaming users do not need to install the game locally; they just click the game and play it smoothly.</p>
<p>
By offloading the gaming workload to the edge and making full use of its computing power and low-latency network, cloud gaming can provide a smoother experience.
</p>
</section>
<section id="UC-SA">
<h4>Streaming Acceleration</h4>
<p>In the case of video acceleration, we may want to offload work to a location that has both
sufficient compute performance and a position on the existing network path. Specifically, consider a low-performance
client that wants to compress video or do background removal as part of a teleconference.
It could connect over a local high-performance wireless network to a local edge computer (perhaps
an enhanced ISP gateway box) that would then perform the video processing and forward the processed
video to the network.
</p>
<p>
We may also want to clarify that this pattern is for communication in-line with an existing
network.
See also "VR/AR", where the communication is to/from the offload target.
</p>
<section id="UC-VC">
<h5>Online Video Conference</h5>
<p>
One special case of streaming acceleration is the online video conference application. An online video conference system provides real-time translation and subtitle services.
These use AI/machine learning technology and are compute intensive; real-time translation is also very delay sensitive.
</p>
<p>
The online video conference application may be installed on PC terminals or mobile terminals.
PC terminals have enough computing resources and disk storage to install the application, so the compute-intensive work can be done on the PC terminal, providing an ultra-low-latency user experience.
</p>
<p>
Mobile terminals have limited disk storage and computing capability, so it is not possible to run the compute-intensive task on them. In this case,
the task can be offloaded to the edge, again providing an ultra-low-latency user experience.
</p>
<p>
In this use case it is preferred that the online video conference application offload the compute-intensive task according to terminal capability and edge resource availability,
so that the service provider can offer a consistent user experience on different terminals.
</p>
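<p>
The capability-based decision above can be sketched as follows. This is a minimal, hypothetical illustration: the capability fields and the threshold are assumptions, not part of any existing standard, and a real implementation would also perform actual edge discovery.
</p>

```javascript
// Hypothetical sketch: decide where to run the translation task based on
// terminal capability and edge availability, as described above.
function chooseExecutionSite({ deviceMemoryGB, hasGPU, edgeAvailable }) {
  const capableTerminal = deviceMemoryGB >= 8 && hasGPU; // e.g. a PC terminal
  if (capableTerminal) return "client";   // run translation locally
  if (edgeAvailable) return "edge";       // constrained mobile terminal: offload
  return "client-degraded";               // no edge reachable: reduced local model
}
```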
</section>
</section>
<section id="UC-MLA">
<h4>Machine Learning Acceleration</h4>
<p>
Machine learning inference can be done on the client side to reduce latency. W3C is working on the <a href="https://www.w3.org/groups/wg/webmachinelearning">WebNN</a> standard, which
allows client-side developers to leverage the machine learning acceleration hardware residing in client devices.
</p>
<p>
Client devices may have different hardware capabilities. For example, some mobile phones have very limited hardware resources and no GPU,
making it very difficult for them to run machine learning code.
In this scenario, it is preferable to offload the machine learning code to the edge cloud, which greatly improves the user experience.
</p>
<section id="UC-IVPU">
<h5>Image and Video Processing and Understanding</h5>
<p>
One special case of machine learning acceleration is image and video processing and understanding. Some mobile image/video processing applications need machine learning
algorithms for image/video analysis and processing, which normally require an NPU chipset on the terminal for good performance.
For terminals without an NPU chipset, it is preferable to offload the compute-intensive machine learning workload to the edge cloud, providing a consistent user experience
across different terminals.
</p>
</section>
<section id="UC-PWMP">
<h5>Professional Web-based Media Production</h5>
<p>
Another special case of machine learning acceleration is professional Web-based media production.
Processing and rendering media is a complex task: for example, a video editing application needs to do image processing, video editing, audio editing, and so on,
so it has high performance requirements.
</p>
<p>
Professional Web-based media production relies heavily on web-based media editing tools, which can be used for AI cutting, AI editing, and AI transcoding, and to publish videos to the cloud.
Since the edge cloud has more powerful computing capability and is close to the user's location, offloading the expensive rendering process to the edge lets web apps render media more quickly
and provide a better user experience.
</p>
<figure>
<img alt="WebMedia" src="images/web_based_media.png" width="600">
<figcaption>
Web-based media production
</figcaption>
</figure>
</section>
</section>
<section>
<h3>Robustness of Workload Acceleration for Certain Applications</h3>
<p>
For applications that offload some of their functionality to the edge cloud, if the edge cloud becomes unavailable due to client mobility or other reasons,
it is preferred that the functionality be handed back to the client. More complex rules could be designed to improve the robustness of the application.
</p>
<section id="UC-LVB">
<h4>Live video broadcasting mobile application</h4>
<p>
One example of such a use case: for a mobile application (for example, a live video broadcasting application) that leverages the edge cloud for computing and/or machine learning
acceleration, when certain conditions (edge cloud availability, network conditions, etc.) for workload acceleration by offloading are not met, the application shall be able to migrate the
workload back to the client side to ensure its robustness and availability.
</p>
</section>
<section id="UC-ALPR">
<h4>Automatic License Plate Recognition</h4>
<p>
Another example is Automatic License Plate Recognition (ALPR). For ALPR applications, offline processing can provide a 90% recognition rate, while online processing on the edge improves the recognition rate to 99%.
It is therefore preferred to offload the compute-intensive recognition task to the edge when the network connection is stable; if the network becomes unstable or is lost, the offloaded
task should move back to the terminal to guarantee the availability of the service.
</p>
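<p>
The fallback policy above can be sketched as follows. This is a hypothetical illustration: the two recognizers and the network-stability signal are stand-ins supplied by the caller, not an existing API.
</p>

```javascript
// Minimal sketch: use the edge recognizer while the connection is stable,
// and migrate back to the on-terminal recognizer when it is not.
async function recognizePlate(frame, { networkStable, edgeRecognize, localRecognize }) {
  if (networkStable) {
    try {
      return await edgeRecognize(frame); // higher recognition rate on the edge
    } catch {
      // connection broke mid-task: fall through to local processing
    }
  }
  return localRecognize(frame); // lower rate, but keeps the service available
}
```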
</section>
</section>
</section>
<section>
<h3>IoT workloads</h3>
<section id="UC-RNA">
<h4>Robot Navigation Acceleration</h4>
<p>
Consider a robot navigating in a home using Visual SLAM. In this case the robot has limited
performance due to cost and power constraints. So it wishes to offload the video processing work
to another computer. However, the video is sensitive private data so the user would prefer that
it does not leave the user's premises, and would like to offload the processing to an existing
desktop computer or an enhanced gateway. Latency may also be a concern (the robot needs the
results immediately to make navigation decisions, for example to avoid a wire or other obstacle
on the floor).
</p>
<p>
Note: in general, there are other cases where IoT devices may want to offload work to another
computer. Video processing, however, is of special interest because of its high data and processing
requirements and privacy constraints.
</p>
</section>
</section>
<section id="UC-PW">
<h3>Persistent workloads</h3>
<p>
In some cases it may be desirable to offload a task from a browser that continues to run even when the
browser application is not active. This could be used, for example, to monitor a condition and send
a notification when that condition is met. As a sub-category of this use case, the offloaded task
might monitor IoT devices and, instead of or in addition to sending a notification,
might be used for automation.
Such an offloaded task might also execute long-running computational
tasks such as machine learning or data indexing.
</p>
<p>
Persistent tasks require a mechanism to manage their lifetime using expiry dates or explicit controls.
In the case of applying this to IoT orchestration, there is also the issue of granting access rights
to such offloaded tasks, for example access to a LAN, to specific IoT devices on that LAN, and to the
data they generate.
</p>
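<p>
The lifetime management described above could look like the following sketch, in which an expiry-based registry reaps persistent tasks whose owner has not renewed them. All names are illustrative assumptions, not an existing API.
</p>

```javascript
// Hypothetical edge-side registry of persistent tasks with expiry dates.
const tasks = new Map();

// Register a task with a time-to-live; the client must renew it to keep it alive.
function registerTask(id, ttlMs, now = Date.now()) {
  tasks.set(id, { expiresAt: now + ttlMs });
}

// Explicit renewal extends the task's lifetime.
function renewTask(id, ttlMs, now = Date.now()) {
  const task = tasks.get(id);
  if (task) task.expiresAt = now + ttlMs;
}

// Called periodically on the edge: clean up tasks whose expiry has passed.
function sweep(now = Date.now()) {
  for (const [id, task] of tasks) {
    if (task.expiresAt <= now) tasks.delete(id);
  }
}
```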
</section>
</section>
<section id="gap">
<h2>Gap Analysis</h2>
<section>
<h3>Common approaches for offloading</h3>
<p>
Currently, there are two common approaches to offloading. One is to send the code in the request from the client to the edge server; the other is to fetch the code from inner file repositories on the edge and execute it there. Both work, but each approach has downsides.
</p>
<p><b>Sending code from the client to the edge</b></p>
<p>In this approach, the client sends its local code to the edge, which executes it. The downside is obvious: more data is transferred, which may increase network latency, and handling that data puts more strain on resource-constrained end devices. Moreover, some code is sensitive, so data security is also a significant issue.</p>
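<p>
A minimal sketch of this approach, assuming a hypothetical edge execution endpoint (the endpoint and request shape are illustrative, not an existing API):
</p>

```javascript
// Build the request payload: the function travels as source text, which is
// what makes this approach sensitive to payload size and code exposure.
function buildOffloadPayload(fn, args) {
  return JSON.stringify({ code: fn.toString(), args });
}

// Post the payload to a hypothetical edge execution endpoint and return
// the result computed on the edge.
async function offloadToEdge(endpoint, fn, args) {
  const response = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildOffloadPayload(fn, args),
  });
  if (!response.ok) {
    throw new Error(`Edge execution failed: ${response.status}`);
  }
  return (await response.json()).result;
}
```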
<p><b>Fetching code from file repositories and executing on the edge</b></p>
<figure>
<img alt="EdgeFetch" src="images/edge_fetching_codes.png" width="600">
<figcaption>
Fetching codes from inner file repositories
</figcaption>
</figure>
<p>In this approach, the client leverages a user-defined offloading library to send the proper parameters to the offloading server, which fetches the specified code from inner repositories and executes it.</p>
<p>The downside of this approach is that additional file repositories are needed, and developers have to upload the code to the repository and ensure that the client and the edge run the same code version.</p>
<p>Meanwhile, since the offloading library plays an important role here, developers are responsible for creating a robust offloading policy to discover and connect with edge nodes, decide which parts of the code can be offloaded, and decide when to offload. This puts more strain on the developer and affects the overall programming experience and productivity.</p>
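<p>
The repository-based approach can be sketched as follows; the <code>Map</code> stands in for a real artifact store, and all names are illustrative assumptions:
</p>

```javascript
// Hypothetical edge-side sketch: the client sends only a workload id and
// parameters; the edge resolves the id against an inner repository and
// executes the matching code.
const repository = new Map();

// Developers must pre-publish each workload and keep versions in sync with
// the client -- the downside noted above.
function publishWorkload(id, version, fn) {
  repository.set(`${id}@${version}`, fn);
}

// Execute a previously published workload by id and version.
function executeWorkload(id, version, params) {
  const fn = repository.get(`${id}@${version}`);
  if (!fn) throw new Error(`Workload ${id}@${version} not found in repository`);
  return fn(...params);
}
```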
</section>
<section>
<h3>Conclusion</h3>
<p>
The two approaches discussed above are broadly similar; the main difference is how the code is delivered to the edge. Beyond the downsides already mentioned, some common issues and pain points remain to be addressed:
</p>
<ul>
<li>Discrepancies between the client runtime and the edge runtime, as well as between different edge nodes, may cause execution failures. A unified runtime therefore needs to be considered; a WebAssembly runtime may be a good choice.</li>
<li>Capabilities for discovering and connecting with edge nodes are needed.</li>
<li>Some parts of a program might be sensitive, and the developer might not want them sent over the internet. Developers need the ability to configure which parts can be offloaded.</li>
<li>When certain conditions are met, for example when there is no edge server in proximity to the user or connectivity is lost, code should be executed locally instead of offloaded. That is, the developer declares what might be offloaded to a server but does not need to decide or care about when it is offloaded.</li>
<li>Security and privacy are important; secure communication mechanisms are needed.</li>
</ul>
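<p>
Taken together, these gaps suggest an offloading runtime along the following lines. This is a hypothetical sketch: the discovery and offload capabilities are supplied as stand-ins, since no standard API for them currently exists.
</p>

```javascript
// Hypothetical sketch: the developer marks a function as offloadable; the
// runtime decides when to offload, falling back to local execution when no
// edge node is reachable or the offload fails mid-flight.
async function runOffloadable(fn, args, { discoverEdgeNode, offload }) {
  let node = null;
  try {
    node = await discoverEdgeNode();
  } catch {
    node = null; // discovery failed: treat as "no edge in proximity"
  }
  if (node === null) {
    return fn(...args); // execute locally, transparently to the caller
  }
  try {
    return await offload(node, fn, args);
  } catch {
    return fn(...args); // edge failed mid-flight: local fallback
  }
}
```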
<p>
There is no W3C standard currently available to address the above gaps. To achieve interoperability between different vendors, standardization is needed.
</p>
</section>
</section>
<section>
<h2>Requirements</h2>
<section>
<h3>General Requirements</h3>
<p>
The following are a set of high-level requirements,
cross-referenced with related use cases.
</p>
<table>
<thead>
<tr>
<th>Name</th>
<th>Description</th>
<th>Use Cases</th>
</tr>
</thead>
<tbody>
<tr>
<td>Performance</td>
<td>The overall performance of an application using offload,
as measured by user responsiveness or time to completion
of computational work as appropriate,
should be improved.</td>
<td>TBD</td>
</tr>
<tr>
<td>Scalability</td>
<td>Efficient implementation in a virtualized cluster environment
(i.e. a cloud system) should be achievable.</td>
<td>TBD</td>
</tr>
<tr>
<td>Flexibility</td>
<td>The solution should allow a variety of compute resources
from different providers to be used.</td>
<td>TBD</td>
</tr>
<tr>
<td>Compatibility</td>
<td>The proposed standards should be as consistent as possible
with existing web standards to maximize adoption.</td>
<td>TBD</td>
</tr>
<tr>
<td>Resiliency</td>
<td>The solution should allow adaptation to changing circumstances
such as changes in relative performance, network connectivity,
or failure of a remote Computing Resource.</td>
<td>TBD</td>
</tr>
<tr>
<td>Security</td>
<td>The standards should be consistent with existing
security expectations for web applications.</td>
<td>TBD</td>
</tr>
<tr>
<td>Privacy</td>
<td>The standards should be consistent with existing
privacy expectations for web applications.</td>
<td>TBD</td>
</tr>
<tr>
<td>Control</td>
<td>The use of resources should ultimately be under the
control of the entity responsible for paying for their use.</td>
<td>TBD</td>
</tr>
</tbody>
</table>
</section>
<section>
<h3>Detailed Requirements</h3>
<p>
Some more detailed requirements are listed below,
cross-referenced with related use cases and related
high-level requirements.
</p>
<table>
<thead>
<tr>
<th>General Requirements</th>
<th>Name</th>
<th>Description</th>
<th>Use Cases</th>
</tr>
</thead>
<tbody>
<tr>
<td>Performance</td>
<td>R1: Client Offload.</td>
<td>Client should be able to offload computing intensive work
to an edge resource.</td>
<td>Use case 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 2.10, 2.11, 2.12, 2.13; </td>
</tr>
<tr>
<td>Resiliency</td>
<td>R2a: Application-Directed Migration.</td>
<td>The application should be able to explicitly manage migration
of work between computing resources.
This may include temporarily running a workload on
multiple computing resources to hide transfer latency.
</td>
<td>Use case (refs TBD) ;</td>
</tr>
<tr>
<td>Resiliency</td>
<td>R2b: Live Migration.</td>
<td>The edge cloud should be able to transparently migrate
live (running) work between computing resources.
This includes between edge resources, cloud resources,
and back to the client, as necessary.
If the workload is stateful,
this includes state capture and transfer.</td>
<td>Use case 2.1, 2.2, 2.5;</td>
</tr>
<tr>
<td>Flexibility</td>
<td>R3: Discovery.</td>
<td>A client should be able to dynamically enumerate available
edge resources.</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Flexibility</td>
<td>R4: Selection.</td>
<td>A client should be able to select between available resources,
including making a decision about whether offload is appropriate
(e.g. running on the client may be the best choice).
This selection may be automatic or application-directed,
and may require metadata or measurements
of the performance and latency of edge resources,
and may be static or dynamic.
To do: perhaps break variants down into separate sub-requirements.
Also, it should be made clear how this differs
from the out-of-scope issue "Offload policy".
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Flexibility</td>
<td>R5: Packaging.</td>
<td>A workload should be packaged so it can be executed on
a variety of edge resources.
This means either platform independence OR a means to
negotiate which workloads can run where.</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Flexibility</td>
<td>R6: Persistence.</td>
<td>It should be possible for a workload to be run "in the background",
possibly event-driven, even if the client is not active.
This also implies lifetime management (cleaning
up workloads under some conditions,
such as if the client has not connected for a certain amount
of time, etc.)</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Security, Privacy</td>
<td>R7: Confidentiality and Integrity</td>
<td>The client should be able to control and protect the data used
by an offloaded workload.
Note: this may result in constraints upon the selection of offload targets, but
it also means data needs to be protected in transit, at rest, etc.
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Control</td>
<td>R8: Resource Management.</td>
<td>The client should be able to control
the use of resources by an
offloaded workload on a per-application basis.
Note: If an edge resource has a usage charge, for example,
a client may want to set quotas on offload,
and some applications may need more resources than others.
This may also require a negotiation,
e.g. a workload may have minimum requirements,
making offload mandatory on limited clients.
This is partially about QoS as it relates to performance
(making sure a minimum amount of resources is available)
but is also about controlling charges (so a web app does
not abuse the edge resources paid for by a client).
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Scalability</td>
<td>R9: Statelessness</td>
<td>It should be possible to identify workloads that
are stateless so they can be run in a more scalable
manner, using FaaS cloud mechanisms.
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Compatibility</td>
<td>R10: Stateful</td>
<td>It should be possible to run stateful workloads,
to be compatible with existing client-side
programming model expectations.
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Performance</td>
<td>R11: Parallelism</td>
<td>It should be possible to run multiple workloads in
parallel and/or express parallelism within a single workload.
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Performance</td>
<td>R12: Asynchronous</td>
<td>The API for communicating with a running workload
should be non-blocking (asynchronous) to hide the
latency of remote communication and allow the
main (user interface) thread to run in parallel with the
workload (even if the workload is being run on the client).
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Security</td>
<td>R13: Sandboxing</td>
<td>A workload should be specified and packaged in such
a way that it can be run in a sandboxed environment and its
access to resources can be managed.
</td>
<td>Use case (refs TBD);</td>
</tr>
<tr>
<td>Performance, Compatibility</td>
<td>R14: Acceleration</td>
<td>A workload should have (managed) access to accelerated
computing resources when appropriate, such as AI accelerators.
Note: Since the availability of these resources may vary between
compute resources, this needs to be taken into account
when estimating performance and selecting a compute
resource to use for offload. Access to such resources
should use web standards, e.g. standard WASM/WASI APIs.
</td>
<td>Use case (refs TBD);</td>
</tr>
</tbody>
</table>
</section>
</section>
<section>
<h2>Architecture Proposals</h2>
<p>
This document proposes different architectures that address the needs identified above.</p>
<section>
<h3>Seamless code sharing across client/edge/cloud</h3>
<p>This architecture gives the client, the edge, and the central cloud a common code execution environment,
allowing a task to run on the client, the edge, the cloud, or any combination of them in a coordinated way.
</p>
<p>
The proposed high-level architecture is shown in the following figure:
</p>
<figure>
<img alt="ClientEdgeArchitecture" src="images/Client_Edge_Architecture_v3.png" width="1000">
<figcaption>
Proposed High Level Architecture
</figcaption>
</figure>
<p>
In this architecture, workload code on the client side can be offloaded to the edge cloud, handed over to the central cloud, and handed back to the client.
The high-level procedure is as follows:
</p>
<ol>
<li>
The client-side application encapsulates the code to be offloaded into an offload code module and loads that module using the API specific to the runtime environment.
There may be different types of runtime environment, for example, a WebAssembly or JavaScript runtime.
The offloaded workload is written for the specific runtime.
</li>
<li>
The client's runtime dispatch policy module queries the dispatch policy from the Offload Management Module, which is located in the Edge cloud or the central cloud.
</li>
<li>
The Offload Management Module sends the dispatch policy to the client-side application.
</li>
<li>
The client-side runtime sends the offloaded code to the target Edge cloud or Central cloud according to the dispatch policy.
</li>
<li>
The target Edge cloud or Central cloud executes the offloaded workload and returns the result to the client application.
</li>
<li>
If offloading is not feasible under certain conditions, for example, if the network connection between the client and the Edge cloud is unstable or
there are not enough resources in the Edge cloud, the offloaded workload should be handed back to the client side to ensure the availability of the client application.
</li>
</ol>
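<p>
The dispatch decision in this procedure can be illustrated with a minimal
JavaScript sketch. The shape of the dispatch policy and the
<code>selectTarget</code> helper are assumptions made for this example;
they are not part of any existing or proposed API.
</p>

```javascript
// Hypothetical shape of a dispatch policy, as the Offload Management
// Module might return it to the client (an assumption for illustration).
const examplePolicy = {
  preferredTarget: "edge",   // "client" | "edge" | "cloud"
  fallbackTarget: "client",  // used when the preferred target is unavailable
  maxNetworkLatencyMs: 50,   // offload only if the link is fast enough
};

// Decide where to run the workload; fall back to the client when the
// target is unreachable or the network is too slow (the last step above).
function selectTarget(policy, { targetReachable, networkLatencyMs }) {
  if (!targetReachable || networkLatencyMs > policy.maxNetworkLatencyMs) {
    return policy.fallbackTarget;
  }
  return policy.preferredTarget;
}
```

<p>
For example, with the policy above, a reachable edge over a 20&nbsp;ms link
selects the edge, while an unstable or slow connection keeps the workload
on the client.
</p>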
</section>
<section>
<h3>Distributed Workers</h3>
<p>The Web already has a set of standards for additional
threads of computation managed by application
code loaded into the browser: Web Workers and Service Workers.
Workers already support a message-passing style of communication
which would allow them to execute remotely.
This architectural option proposes extending workers to support
edge computing as follows:
</p>
<ul>
<li>Compute Utility services would be made available on the
network, providing the
capability to execute Worker payloads.
For fallback purposes, the client itself would also offer
a Compute Utility to support local execution when needed.
It would also be possible for the origin (the server) to host a Compute
Utility. However, in general Compute Utilities could be
hosted at other locations, including in desktops on the LAN,
within a local cloud, or within edge infrastructure.
</li>
<li>Application developers would continue to use existing
APIs for Workers, but could optionally provide metadata about
performance and memory requirements which could be used to
select an appropriate execution target.
</li>
<li>Browsers would collect metadata about available Compute
Utility services, including latency and performance, and
would select an appropriate target for each Worker.
The user would have browser controls over this behavior,
including the ability to specify particular offload
targets or to force local execution when appropriate.
Note: the reason it is suggested that the browser makes
the decision and not the application is to prevent fingerprinting
and associated privacy risks.
Metadata about available Compute Utilities might otherwise
be used to infer location. The proposed architecture hides
this information by default from the application while
still supporting intelligent selection of offload targets.
</li>
<li>Once a Compute Utility is selected, the browser would
automatically (transparent to the application) use the
network API of the Compute Utility to load and execute the
workload for the Worker.
Note that in the existing Worker API, a URL of
a Javascript workload is provided to the Worker.
</li>
<li>The Compute Utility itself would be responsible for downloading
the workload; it would not have to be downloaded to the
browser and then uploaded. Also, a WASM workload could be used,
bootstrapping from the standard Javascript workload.
In general, the
workload execution environment should be the same as the
normal Worker execution environment. However, access to
accelerators to support performance and other advanced capabilities
will become more important in this context.
</li>
</ul>
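<p>
As a minimal sketch of the second bullet above, application code would keep
the existing Worker API and optionally attach resource metadata. The
<code>compute</code> option and its fields below are hypothetical
extensions, not part of the current Worker specification.
</p>

```javascript
// Build a Worker options object carrying optional resource hints that the
// browser could use when selecting an execution target. The "compute"
// block is a hypothetical extension (an assumption for this sketch).
function workerOptionsWithHints(hints = {}) {
  return {
    type: "module",  // standard option: load the workload as a module
    compute: {       // hypothetical extension block
      minMemoryMB: hints.minMemoryMB ?? 64,
      preferAccelerator: hints.preferAccelerator ?? false,
    },
  };
}

// Application code would remain a one-line change, e.g.:
//   const w = new Worker("workload.js",
//                        workerOptionsWithHints({ minMemoryMB: 256 }));
//   w.postMessage(inputData);  // message passing is unchanged
```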
<p>
The proposed high level architecture is shown in the following figure.
The browser discovers Compute Utilities using
a Discovery mechanism. Discovery services in each discovered
Compute Utility return metadata, which then allows the
selection of a Compute Utility for each workload.
A Workload Management service is then
used to load and run a packaged workload for the worker.
</p>
<figure>
<img alt="Distributed Worker Architecture" src="images/DW.png" width="600">
<figcaption>
Proposed High Level Architecture: Distributed Workers
</figcaption>
</figure>
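<p>
The selection step described above can be sketched as follows. The metadata
fields returned by a Compute Utility's Discovery service, and the simple
cost model used here, are assumptions for illustration only.
</p>

```javascript
// Example metadata, as Discovery might report it for each Compute Utility
// (the field names are assumptions for this sketch).
const discovered = [
  { name: "local",      latencyMs: 0,  relativePerformance: 1.0 },
  { name: "lan-edge",   latencyMs: 5,  relativePerformance: 4.0 },
  { name: "metro-edge", latencyMs: 20, relativePerformance: 8.0 },
];

// Pick the utility with the best estimated completion time, modeled
// (simplistically) as round-trip latency plus scaled compute time.
function selectUtility(utilities, workloadComputeMs) {
  let best = null;
  let bestCost = Infinity;
  for (const u of utilities) {
    const cost = 2 * u.latencyMs + workloadComputeMs / u.relativePerformance;
    if (cost < bestCost) {
      bestCost = cost;
      best = u;
    }
  }
  return best;
}
```

<p>
With these example numbers, a short 10&nbsp;ms task stays local, while a
1000&nbsp;ms task is offloaded to the faster remote utility despite its
higher latency.
</p>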
</section>
</section>
<section>
<h2>Standardization Proposals</h2>
<section>
<h3>WebAssembly as unified runtime</h3>
<p>This proposal extends the WebAssembly runtime and uses it on both the client side and the edge cloud side as a unified runtime.</p>
<figure>
<img src="images/WebAssemblyRuntime.png" alt="WebAssemblyRuntime" width="600">
<figcaption>
WebAssembly as a Unified Runtime Architecture
</figcaption>
</figure>
<p>
The proposed solution includes the following parts:
</p>