From 8f490b02c34559f2b7106d194f18c19a6aab2f22 Mon Sep 17 00:00:00 2001
From: Rico Chiu
Date: Wed, 15 May 2024 00:36:00 -0500
Subject: [PATCH] [DOCFIX] Add warning for distributedCp limitations

There are known limitations in using the `fs distributedCp` command;
explicitly call them out in the docs, ex.
https://docs.alluxio.io/ee-da/user/stable/en/operation/User-CLI.html#distributedcp

pr-link: Alluxio/alluxio#18608
change-id: cid-be4807b887956808990a5094b4afee63780bfd84
---
 docs/en/operation/User-CLI.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/docs/en/operation/User-CLI.md b/docs/en/operation/User-CLI.md
index c42558427d4b..171e69820418 100644
--- a/docs/en/operation/User-CLI.md
+++ b/docs/en/operation/User-CLI.md
@@ -847,6 +847,12 @@ Please wait for command submission to finish..
 Submitted migrate job successfully, jobControlId = JOB_CONTROL_ID_2
 ```
 
+Please note below are known limitations for the distributed copy command.
+- Limited Scalability: No more than 1 million total number of files should be moved concurrently. Note that a copy job may stay active for a short period after the last file is copied.
+- Manual Integrity Validation: Verification between source and destination files relies on the response code from the underlying data lake storage. In case the response code is unreliable, we recommend manual verification of source and destination checksums.
+- Manual Cleanup: In certain failure scenarios, a user may need to manually remove partially written contents in destination directories and restart the failed jobs.
+- Limited Observability: Status checks are limited to using the command line for each job individually.
+
 ### du
 
 The `du` command outputs the total size and amount stored in Alluxio of files and folders.