Replies: 5 comments 1 reply
-
The branch monitor-execinfo-exp proposes an executor API to implement either the pure-polling or the hybrid notify-and-pull approaches (the pure-push model wouldn't involve the executor at all). Two public methods are to be implemented by concrete executors: # Compute list of files to monitor based on dispatch id and node id
def get_files_to_monitor(self) -> List[str]:
raise NotImplementedError
# The user overrides this method to retrieve at most `size` bytes from a
# remote file starting at the specified position (in bytes).
async def poll_file(self, path: str, starting_pos: int, size: int = -1) -> str:
raise NotImplementedError The executor plugin then starts monitoring each specified file at the beginning of |
Beta Was this translation helpful? Give feedback.
-
Demo script: import covalent as ct
import time
from dask.distributed import LocalCluster
from covalent.executor import DaskExecutor
lc = LocalCluster()
dask_exec = DaskExecutor(lc.scheduler_address)
@ct.electron(executor=dask_exec)
def write_msg(n_iter):
for i in range(n_iter):
with open("/tmp/stdout.log", "a") as f:
print("Hello", file=f)
time.sleep(3)
return 0
@ct.lattice(workflow_executor="local")
def workflow(n_iter):
return write_msg(n_iter) Try calling
|
Beta Was this translation helpful? Give feedback.
-
@cjao I like the pure polling method; it's the simplest to implement in the short term |
Beta Was this translation helpful? Give feedback.
-
I definitely think this is a great idea! |
Beta Was this translation helpful? Give feedback.
-
The Covalent dispatcher currently retrieves
stdout
andstderr
from a task only after the task has finished running. In addition, it does not provide a standard way to retrieve other log files from the executor that may be of interest, such as Slurm logs or profiling data.The purpose of this discussion is to sketch out some possible APIs for monitoring a specified set of files in the executor during task execution and streaming their updates to the dispatcher, either continuously or according to a specified polling frequency. These executors will usually be remote. For simplicity, let us also assume that the files to be monitored are append-only. This is a reasonable assumption for most log files.
Possible approaches:
/api/update_result
). Each POST would reference thedispatch_id
andnode_id
, and contain the file delta in its body. This approach assumes that the executor has outgoing network connectivity (e.g. can reach the dispatcher).Pull:
Push:
A hybrid approach might be for the executor to merely notify that the dispatcher that an update has occurred, which prompts the dispatcher to then pull the updates from the executor. This would avoid the periodic polling while reducing the security requirements.
Beta Was this translation helpful? Give feedback.
All reactions