Use DANE version with no-tar option (WIP) #48

Open
wants to merge 14 commits into
base: main


Veldhoen
Member

@Veldhoen Veldhoen commented Jul 26, 2024

Instead of tarring the worker's output before uploading to S3, this PR uploads the files separately, using key prefixes to preserve the directory structure. It relies mostly on a proposed update to DANE core: https://github.com/CLARIAH/DANE/tree/23-update-s3_util-to-allow-for-untarred-uploading
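
For reviewers who want a concrete picture of "files added separately with prefixes", here is a minimal sketch of the idea. It is not the code from the DANE branch: the function names `iter_upload_targets` and `upload_untarred` are hypothetical, and `s3_client` is assumed to be a boto3 S3 client (only its `upload_file` method is used).

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def iter_upload_targets(output_dir: Path, prefix: str) -> Iterator[Tuple[Path, str]]:
    """Yield (local_file, s3_key) pairs; the key mirrors the local directory layout."""
    clean_prefix = prefix.strip("/")
    for path in sorted(output_dir.rglob("*")):
        if not path.is_file():
            continue  # directories only exist implicitly in S3, via the key
        relative = path.relative_to(output_dir).as_posix()
        yield path, f"{clean_prefix}/{relative}" if clean_prefix else relative


def upload_untarred(s3_client, bucket: str, output_dir: Path, prefix: str) -> List[str]:
    """Upload every file under output_dir as its own object and return the keys.

    Hypothetical helper: s3_client is assumed to behave like a boto3 S3 client.
    """
    keys = []
    for local_file, key in iter_upload_targets(output_dir, prefix):
        s3_client.upload_file(str(local_file), bucket, key)
        keys.append(key)
    return keys
```

Because S3 has no real directories, the structure survives only through the key names, so anything consuming the output would need to list objects by prefix rather than download a single archive.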

  • Make sure you have a .env file with AWS credentials that allow writing to the bucket specified in the config
  • Either run locally:
    • Install the environment with poetry install
    • Copy ./config/config.yml to the root of the repo (.) and adjust the necessary settings: BASE_MOUNT and PATHS, S3_endpoint_URL, TRANSFER_ON_COMPLETION
    • Run poetry run python worker.py --run-test-file
  • Or run containerized:
    • Build the image: docker build -t dane-visual-feature-extraction-worker .
    • Adjust the config in ./config/config.yml (set S3_endpoint_URL and TRANSFER_ON_COMPLETION)
    • Run docker-compose up
  • Run once with TAR_OUTPUT: false and once with TAR_OUTPUT: true, and confirm that the output is transferred in the appropriate format in each case
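
The final check can be done by eye in the bucket, but a small script makes it repeatable. The sketch below is one possible way to do it, not part of this PR: `list_keys` and `transfer_format` are hypothetical helpers, `s3_client` is assumed to be a boto3 S3 client, and the archive suffixes in `TAR_SUFFIXES` are a guess at how the tarred output is named.

```python
from typing import Iterable, List

# Assumed archive suffixes; adjust to whatever the worker actually produces.
TAR_SUFFIXES = (".tar", ".tar.gz", ".tgz")


def list_keys(s3_client, bucket: str, prefix: str) -> List[str]:
    """Collect all object keys under a prefix (s3_client: assumed boto3 S3 client)."""
    keys: List[str] = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from the page when nothing matches the prefix.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


def transfer_format(keys: Iterable[str]) -> str:
    """Classify a key listing as 'tar', 'untarred', 'mixed' or 'empty'."""
    keys = list(keys)
    if not keys:
        return "empty"
    tarred = [key for key in keys if key.endswith(TAR_SUFFIXES)]
    if len(tarred) == len(keys):
        return "tar"
    if not tarred:
        return "untarred"
    return "mixed"
```

With TAR_OUTPUT: true one would expect `transfer_format(list_keys(...))` to report `"tar"`, and with TAR_OUTPUT: false `"untarred"`; a `"mixed"` result probably means output from an earlier run is still sitting under the same prefix.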

Bonus points: check out the DANE PR too (https://github.com/CLARIAH/DANE/tree/23-update-s3_util-to-allow-for-untarred-uploading)

@Veldhoen Veldhoen linked an issue Jul 26, 2024 that may be closed by this pull request
@Veldhoen Veldhoen marked this pull request as ready for review August 6, 2024 14:36
@Veldhoen Veldhoen requested a review from KleinRana August 6, 2024 14:42
Successfully merging this pull request may close these issues.

Dont tar