Batch jobs fail to start with a bash error "unary operator expected" #1389

Open
@tuulos

Description

Over the years, we have received sporadic reports of @batch jobs failing with nothing shown on the Metaflow console, while CloudWatch contains messages like:

| Timestamp | Message |
| -- | -- |
| 2023-05-03T13:58:18.077-07:00 | Setting up task environment. |
| 2023-05-03T13:58:24.321-07:00 | bash: line 1: [: -le: unary operator expected |
| 2023-05-03T13:58:24.321-07:00 | bash: line 1: [: -gt: unary operator expected |
| 2023-05-03T13:58:24.322-07:00 | tar: job.tar: Cannot open: No such file or directory |
| 2023-05-03T13:58:24.323-07:00 | tar: Error is not recoverable: exiting now |
| 2023-05-03T13:58:24.336-07:00 | /usr/local/bin/python: Error while finding module specification for 'metaflow.mflog.save_logs' (ModuleNotFoundError |

This seems to happen when Metaflow fails to install its dependencies (awscli, pip, etc.) in the entrypoint, e.g. because upstream package repositories are unresponsive. The issue typically resolves itself after a while.
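For context, the `unary operator expected` message is what bash prints when an unquoted variable in a `[ ... ]` test expands to nothing, which would be consistent with an earlier command producing no output. The snippet below is only a hypothetical reproduction, not Metaflow's actual entrypoint code; the variable name `count` and the threshold `5` are assumptions made for illustration.

```shell
# Hypothetical reproduction; this is not Metaflow's actual entrypoint code.
# `count` stands in for the output of a command that printed nothing,
# e.g. because the tool it relies on was never installed.
count=""

# Unquoted, the empty value disappears and bash evaluates `[ -le 5 ]`.
# Capture stderr so the message can be inspected.
err=$(LC_ALL=C count="$count" bash -c '[ $count -le 5 ]' 2>&1) && status=0 || status=$?
echo "exit status: $status"
echo "stderr: $err"

# Quoting the expansion and supplying a default keeps the test well-formed
# even when the value is empty.
safe_status=0
LC_ALL=C count="$count" bash -c '[ "${count:-0}" -le 5 ]' || safe_status=$?
echo "guarded exit status: $safe_status"
```

The first test fails with a non-zero status and the same `[: -le: unary operator expected` text seen in CloudWatch, while the guarded form succeeds. Quoting alone does not address the root cause (the missing dependency), but it shows why the log is so uninformative.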

At a minimum, we could provide a better error message.
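One possible shape for such a message is a fail-fast check that runs before any step depending on the installed tools. This is a minimal sketch under assumptions: the helper name `require_tool` and the wording of the message are hypothetical and do not come from Metaflow.

```shell
# Hypothetical helper; the name and the message wording are assumptions.
# Returns 0 if the tool is on PATH, otherwise prints an explicit error to
# stderr and returns 1.
require_tool() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "Bootstrap error: required tool '$1' is not available." \
         "Dependency installation may have failed, for example because" \
         "an upstream package repository was unresponsive." >&2
    return 1
  fi
}

# Possible usage: stop early instead of letting later commands fail obscurely.
# require_tool aws || exit 1
```

Checking up front would turn a chain of unrelated-looking `bash` and `tar` errors into a single line that names the likely cause.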

Labels: bug, enhancement