DA Job #252
Replies: 1 comment
-
I like the idea. For a.com in the SharePoint/Milo world, some of the background/heavy processing required for localization, floodgate, graybox, etc. is done using AdobeIO (AIO). There are two parts to it: (1) bulk file operations against SharePoint, and (2) bulk preview/publish operations in Helix.
For (1), SharePoint does not provide a way to perform bulk actions (e.g., bulk copy, bulk move) via the MS Graph API; everything happens at the per-file level. So the AIO app handles these operations in the background by performing the action file by file.
For (2), the AIO app pushes some of the bulk processing tasks (e.g., bulk preview, bulk publish) to Helix using Helix's bulk APIs and uses the job-status API to check on progress. For a.com use cases on DA, both of the above can be achieved similarly with AIO. That said, it would definitely be good if (1) were available OOTB in DA-as-a-product, so that large copy/move/delete operations are handled effectively with background server-side processing (perhaps similar to how bulk jobs are handled in Helix).
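The "push the work to a bulk API, then poll the job-status API" pattern described above can be sketched as a small client helper. This is a minimal sketch, not the real AIO or Helix client: `getStatus` stands in for a fetch against a job-status endpoint, and the `JobStatus` field names are assumptions, not the actual Helix response shape.

```typescript
// Hypothetical sketch: poll a job-status endpoint until the job settles.
// `JobStatus` and its `state` values are assumptions for illustration;
// the real Helix job-status response shape may differ.
type JobStatus = { state: "created" | "running" | "stopped"; progress?: number };

async function waitForJob(
  getStatus: () => Promise<JobStatus>, // stand-in for a fetch to the job-status API
  intervalMs = 1000,
  maxAttempts = 60,
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus();
    if (status.state === "stopped") return status; // job finished (or was aborted)
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("job did not finish in time");
}
```

The timeout keeps a UI from polling forever if a job hangs; the interval and attempt budget would be tuned to how long bulk operations typically run.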
-
@sukamat made a good point that if something takes longer than a few minutes, we should probably move to a server-side approach.
Separately, @bstopp raised an issue with `da-admin` passing down continuation tokens for larger tasks. My pushback was that we are not in a place to offload longer tasks.

Putting these two things together, I think we're converging on the need for a job system where we can offload some of these longer operations, similar to Helix's job system: https://www.aem.live/docs/admin.html#tag/job
We have a hard limit in CF of 1,000 sub-requests per request. I think a continuation token is OK for the immediate need, but it's not sustainable long term. I also think that as @sukamat goes down the path of these longer-running operations, we can start to justify a larger investment (in terms of time/energy) in this space.
In Sunil's case, he's effectively automating the copying of content between repos. I think this could be a generic solution wrapped into a job + worker sub-request type of setup. Any UI could simply poll to see when these longer operations have finished. This would be much more reliable than what we have today.
My high-level thought: if you ask da-admin to do something, and da-admin detects it's going to be larger than 1,000, it kicks out to da-job (or whatever) to finish the task. The client then receives a 202 + a job id it can follow up with. What's nice about this is that we don't have to change our API surface, and clients can handle 200 / 201 / 202 / 204 depending on the response.
A real-world use case:

1. I ask to move/copy a folder. As a client, I don't know how big it is; I just need it moved.
2. da-admin runs a listObjects command and gets continuation tokens back... uh oh, more than 1,000.
3. da-admin blows through all the continuation tokens to get the current list of objects impacted (maybe 10,000). What's nice here is that we capture the state of the world at request time. This is helpful for scenarios where you may want to copy something inside itself; if you don't get this list up front, you could run into recursion issues.
4. da-admin then kicks off da-job requests and sends back the relevant job information.