.actor().call() method to set the correct timeout, show the progress in status message, and stream logs #632

Open
@mtrunkat

Description

I was trying out the https://apify.com/jakub.kopecky/llmstxt-generator Actor, and the experience was not great for the following reasons:

Timeout

The Actor above was started with a timeout of 18,000 seconds, but the WCC (Website Content Crawler, apify/website-content-crawler) it calls is triggered with a default timeout of 360,000. So the original Actor may time out while the WCC run continues. IMHO, in this case, the timeout of the called Actor should be set to the time remaining for the original Actor.

There might be cases when this is not appropriate, so this behavior could be opt-in or out.
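
A minimal sketch of the timeout logic, assuming the calling run knows its own deadline as a timezone-aware datetime (as far as I know, the Python SDK exposes it as `timeout_at` in `Actor.get_env()`). The helper name `remaining_timeout` and the `inherit` opt-out flag are hypothetical and not part of any existing API:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def remaining_timeout(
    timeout_at: Optional[datetime],
    requested: Optional[timedelta] = None,
    *,
    inherit: bool = True,
    now: Optional[datetime] = None,
) -> Optional[timedelta]:
    """Return the timeout to use for the called Actor.

    Hypothetical helper: caps the requested timeout at the time the
    calling run has left. `timeout_at` must be timezone-aware.
    Pass `inherit=False` to opt out of the capping.
    """
    if not inherit or timeout_at is None:
        # Opt-out, or the calling run has no deadline: keep the request as is.
        return requested
    now = now or datetime.now(timezone.utc)
    # Never return a negative duration if the deadline has already passed.
    left = max(timeout_at - now, timedelta(0))
    if requested is None:
        return left
    return min(requested, left)
```

With 17,000 seconds left on the calling run (an illustrative number), a requested 360,000 would be capped at 17,000 seconds. A zero result means the caller is already past its deadline; it should be handled explicitly rather than passed on, since a zero timeout may be interpreted as "no timeout".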

Logs

The Actor calls WCC underneath, which may take a long time to finish for a large website. This means that the Actor seems to get stuck on the following log:

2025-01-23T13:56:06.535Z ACTOR: Pulling Docker image of build OQWIcf5rmeLt4icyd from repository.
2025-01-23T13:56:08.308Z ACTOR: Creating Docker container.
2025-01-23T13:56:08.850Z ACTOR: Starting Docker container.
2025-01-23T13:56:11.052Z [apify] INFO  Initializing Actor...
2025-01-23T13:56:11.054Z [apify] INFO  System info ({"apify_sdk_version": "2.1.0", "apify_client_version": "1.8.1", "crawlee_version": "0.4.5", "python_version": "3.12.8", "os": "linux"})
2025-01-23T13:56:11.119Z [apify] INFO  Starting the "apify/website-content-crawler" actor for URL: https://docs.apify.com/

So, I am thinking about improving the .actor().call() method in the SDK/client so that developers can optionally stream the log of the Actor called via .call() to provide progress/context info.
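
A rough sketch of what the relaying part could look like, assuming the called run's log is available as an iterable of text lines. How that stream is obtained is left out on purpose, and `relay_log` and its parameters are hypothetical names:

```python
from typing import Callable, Iterable


def relay_log(
    lines: Iterable[str],
    source: str,
    emit: Callable[[str], None] = print,
) -> int:
    """Forward log lines from a called Actor, tagged with their source.

    `lines` stands in for whatever the client's log stream yields.
    Returns the number of lines that were forwarded.
    """
    count = 0
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Prefix each line so it is clear which Actor produced it.
        emit(f"[{source}] {line}")
        count += 1
    return count
```

Prefixing every line with the called Actor's name keeps the two logs distinguishable when they are interleaved in the parent run's output.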

Status message

Finally, the Actor displays a dummy status message that does not communicate progress. The call could automatically update the status message, in this case for example with:

Running Website Content Crawler: processed 235/7876
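
A small sketch of how such a message could be built. The function name `progress_status` is hypothetical, and where the processed/total counts come from is an assumption left open here; the result would then be passed to something like the SDK's `Actor.set_status_message()`:

```python
from typing import Optional


def progress_status(
    actor_title: str,
    processed: int,
    total: Optional[int] = None,
) -> str:
    """Build a status message describing the called Actor's progress.

    Hypothetical helper; `total` may be unknown while a crawl is
    still discovering pages, in which case it is left out.
    """
    if total is None:
        return f"Running {actor_title}: processed {processed}"
    return f"Running {actor_title}: processed {processed}/{total}"
```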

Metadata

Assignees

No one assigned

Labels

t-tooling: Issues with this label are in the ownership of the tooling team.
