lhsm_remove policy with a UUID and a missing FID #95

Open
guilbaults opened this issue May 28, 2018 · 0 comments

Currently Robinhood does not seem to support an lhsm_remove policy on a file that was removed from Lustre and stored on the HSM backend using a UUID.

Robinhood issues an lfs hsm_remove against the POSIX path where the file was located on Lustre before it was removed with rm. Since this path/FID no longer exists, the copytool cannot extract the UUID from the xattr and fails.

Possible solutions:

  • Robinhood could call an external program and pass it the UUID stored in the database, so the entry can be deleted from the HSM storage directly. This could be configured like the rebind_cmd in the config file.
  • Since rbh-undelete works with UUIDs, it could be used to recreate a temporary FID and issue the hsm_remove on that FID through the normal HSM event queue. That file does not need to be located in a user directory; it could live in a dot folder under the Lustre mount point. The standard copytool would then be able to read the UUID from the xattr on that hidden file and delete it from the backend. After the hsm_remove completes, the temporary FID should also be deleted, or expire after x days.
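
A rough sketch of the first option, in Python. It assumes a hypothetical command template with a `{uuid}` placeholder (Robinhood does not provide this today, and the template syntax, the function name and the example program path are all made up for illustration). It only shows how the UUID from the database could be expanded into an argument list for an external removal program.

```python
import shlex


def build_remove_cmd(template, uuid):
    """Expand a hypothetical '{uuid}' placeholder into an argv list.

    'template' stands in for a config value in the spirit of rebind_cmd;
    the '{uuid}' placeholder is an assumption, not an existing option.
    """
    if not uuid:
        raise ValueError("no UUID recorded for this entry")
    # Split the template first, then substitute, so the UUID always
    # stays a single argument whatever characters it contains.
    return [arg.replace("{uuid}", uuid) for arg in shlex.split(template)]
```

The resulting list could be handed to `subprocess.run(argv, check=True)`, which avoids going through a shell. For example, `build_remove_cmd("/usr/local/bin/hsm_rm --uuid {uuid}", "abc-123")` gives `["/usr/local/bin/hsm_rm", "--uuid", "abc-123"]`.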

Example:

Robinhood sends the equivalent of lfs hsm_remove on a FID that no longer exists: the file was deleted from Lustre while still having a copy in the HSM backend.
robinhood_actions.log:

2018/05/28 10:47:21 [9391/15] lhsm_remove success for '/project/archive/sigui4/test1', matching rule 'default', rm_time 7.2min ago | size=104857600, rm_time=1527518412

The copytool receives the request but cannot find the FID to get the UUID from the xattr on the inode. Logs from lhsmtool_cmd stdout:

1527518855.378281 lhsmtool_cmd[30344]: Running REMOVE command: '/root/sigui4/env/bin/python3.6 /root/sigui4/ct_tsm.py --remove --fid=[0x200001b73:0x5fc:0x0] --lustre-root=/project' 
[...]
FileNotFoundError: [Errno 2] No such file or directory: b'/project/.lustre/fid/0x200001b73:0x5fc:0x0'

lhsmtool_cmd was patched to add the remove command.
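
The lookup that fails in the log can be sketched as follows. This is not the code of ct_tsm.py, which is not shown in this issue. The xattr name `trusted.lhsm.uuid` and both function names are assumptions for illustration. The sketch resolves the FID under `<lustre-root>/.lustre/fid/` and returns None when the FID is gone, so a caller could fall back to a UUID supplied some other way.

```python
import os

# Assumption: illustrative xattr name; the issue does not say which
# xattr the copytool reads the UUID from.
UUID_XATTR = "trusted.lhsm.uuid"


def fid_path(lustre_root, fid):
    """Map a FID such as '[0x200001b73:0x5fc:0x0]' to its .lustre/fid path."""
    return os.path.join(lustre_root, ".lustre", "fid", fid.strip("[]"))


def read_uuid(lustre_root, fid, xattr=UUID_XATTR):
    """Return the UUID stored in the xattr, or None if the FID no longer exists."""
    try:
        return os.getxattr(fid_path(lustre_root, fid), xattr).decode()
    except FileNotFoundError:
        # Same condition as the traceback above: the file was removed
        # from Lustre, so there is no inode left to read the xattr from.
        return None
```

Other errors, such as an existing file without the xattr, are deliberately left to propagate.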
