Feature: Object Lifecycle Management for Gateway Mode NasXL #27

Open

wants to merge 23 commits into base: iternity-rb

Conversation

rluetzner
Collaborator

Description

This allows us to use ILM in combination with Gateway Mode NasXL.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Optimization (provides speedup with no functional changes)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • Fixes a regression (If yes, please add commit-id or PR # here)
  • Documentation updated
  • Unit tests added/updated

aweisser and others added 23 commits December 17, 2021 20:52
The only reason the useful profiling function of the admin API
doesn't work in gateway mode is that gateway-main doesn't initialize
the global variable globalLocalNodeName.

We simply use the same initialization logic as in server mode.
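For context, a minimal sketch of the kind of initialization meant here, assuming the gateway knows its listen address; initLocalNodeName and the fallback host are illustrative, only globalLocalNodeName comes from the commit text.

```go
import "net"

// globalLocalNodeName stands in for the MinIO global of the same name.
var globalLocalNodeName string

// initLocalNodeName mirrors the server-mode logic in spirit: derive
// "host:port" from the address the process listens on.
func initLocalNodeName(listenAddr string) error {
	host, port, err := net.SplitHostPort(listenAddr)
	if err != nil {
		return err
	}
	if host == "" {
		host = "127.0.0.1" // assumed fallback for ":9000"-style addresses
	}
	globalLocalNodeName = net.JoinHostPort(host, port)
	return nil
}
```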
* Set MINIO_GATEWAY_SCANNER_CANDIDATE to 'on' to make a gateway instance take part in the leader election for running the data scanner process
* The active leader executes the data scanner process
* Leader election is implemented with the etcd concurrency API (see the sketch below)
* TODO: remove info logs
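A compressed sketch of how leader election with the etcd concurrency API typically looks; the key prefix, node name, and runDataScanner call are placeholders, not the exact values used in this PR.

```go
import (
	"context"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

// runScannerIfLeader blocks in Campaign until this instance becomes leader,
// then runs the data scanner.
func runScannerIfLeader(ctx context.Context, endpoints []string, nodeName string) error {
	cli, err := clientv3.New(clientv3.Config{Endpoints: endpoints})
	if err != nil {
		return err
	}
	defer cli.Close()

	sess, err := concurrency.NewSession(cli)
	if err != nil {
		return err
	}
	defer sess.Close()

	election := concurrency.NewElection(sess, "/minio/scanner-leader")
	if err := election.Campaign(ctx, nodeName); err != nil {
		return err
	}
	// This instance is now the leader; run the scanner until ctx is done.
	runDataScanner(ctx) // placeholder for the scanner entry point
	return nil
}
```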
In erasure or server mode MinIO keeps track of changes
and stores the object paths that have been updated in a
bloom filter that is shared between the server instances.
Changes to this filter are published via RPC notification.

The data scanner uses this information to decide whether a
folder needs to be scanned.

Because nas(xl) mode currently does not support RPC
notifications, we decided to disable checking of this filter
within the data scanner.
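A sketch of the shape of this change, assuming a bloom-filter handle with a containsDir-style lookup; both names are stand-ins for the actual scanner types.

```go
// dirFilter is a stand-in for the scanner's bloom-filter handle.
type dirFilter interface {
	containsDir(folder string) bool
}

// shouldScanFolder illustrates the decision described above: in nas(xl)
// gateway mode the filter never receives RPC updates, so it is ignored
// and every folder is scanned.
func shouldScanFolder(isNASGateway bool, filter dirFilter, folder string) bool {
	if isNASGateway || filter == nil {
		return true // no usable filter: scan unconditionally
	}
	return filter.containsDir(folder)
}
```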
…ateDirCycles

The value of globalDataScannerStartDelay can be set via env 'MINIO_SCANNER_START_DELAY_SECONDS'.
The value of globalDataUsageUpdateDirCycles can be set via env 'MINIO_USAGE_UPDATE_DIR_CYCLES'.
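A sketch of how the two environment variables could be parsed; the global variable types and defaults below are assumptions.

```go
import (
	"os"
	"strconv"
	"time"
)

// Globals named as in the commit message; types and defaults are assumed.
var (
	globalDataScannerStartDelay    = 1 * time.Minute
	globalDataUsageUpdateDirCycles = uint(16)
)

func applyScannerEnv() {
	if v := os.Getenv("MINIO_SCANNER_START_DELAY_SECONDS"); v != "" {
		if secs, err := strconv.Atoi(v); err == nil {
			globalDataScannerStartDelay = time.Duration(secs) * time.Second
		}
	}
	if v := os.Getenv("MINIO_USAGE_UPDATE_DIR_CYCLES"); v != "" {
		if cycles, err := strconv.ParseUint(v, 10, 32); err == nil {
			globalDataUsageUpdateDirCycles = uint(cycles)
		}
	}
}
```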
* Activation of workers (BackgroundTransition, BackgroundExpiry, TierDeletionJournal)
* Implementation of TransitionObject and RestoreObject in fs-v1.go
* Linting
* Activated router endpoints for tier configuration
* Initializing the subsystem
* Handler methods now allow nasxl mode
* Only works in single-server mode, because tier config changes are not yet propagated to other servers
The response code returned on success did not
match the AWS spec. See 'Responses' (200 OK vs. 202 Accepted):

https://docs.aws.amazon.com/AmazonS3/latest/API/API_RestoreObject.html
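A tiny sketch of the two success paths per the linked RestoreObject spec; the alreadyRestored flag is a stand-in for whatever the handler actually checks.

```go
import "net/http"

// writeRestoreSuccess: 200 OK if a restored copy already exists,
// 202 Accepted if the restore was just initiated.
func writeRestoreSuccess(w http.ResponseWriter, alreadyRestored bool) {
	if alreadyRestored {
		w.WriteHeader(http.StatusOK) // restored copy already present
		return
	}
	w.WriteHeader(http.StatusAccepted) // restore accepted and in progress
}
```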
Updates to tier configs on server A now trigger server B to reload its
tier configs. This is done using a watch on an etcd key. Each time an
instance updates a tier config, it updates the etcd key and all other
instances reload their tier configs.
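A sketch of that mechanism; the key name and the reload callback are placeholders.

```go
import (
	"context"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// watchTierConfig: every instance watches one etcd key and reloads its
// tier configuration whenever the key changes.
func watchTierConfig(ctx context.Context, cli *clientv3.Client, reload func()) {
	for resp := range cli.Watch(ctx, "minio/tier-config-updated") {
		for range resp.Events {
			reload()
		}
	}
}

// After writing a new tier config, the updating instance bumps the key, e.g.:
//   cli.Put(ctx, "minio/tier-config-updated", time.Now().String())
```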
Restoring files that had been uploaded as multipart did not
work with the mc client. The mc client triggered completion of the
multipart upload and at the same time polled for the status of the
final object via GetObjectInfo.

The locks in GetObjectInfo (fs-v1.go L:875 and L:888) in conjunction
with the locks of complete multipart upload (fs-v1-multipart.go
L:604 and L:773) caused a deadlock at fs-v1-multipart.go L:773.

The status-polling request was stuck in fs-v1.go L:888 due to the
filesystem write lock held by the complete multipart upload request.
Thus the polling request did not release the NS read lock, which caused
the completion request to get stuck in fs-v1-multipart.go L:773.

I have moved the filesystem write lock to the same location as the
NS lock.
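Illustrative only, not the actual fs-v1 code: the deadlock came from the two paths taking the namespace (NS) lock and the filesystem lock in different orders; moving the filesystem lock next to the NS lock gives both paths one consistent order and removes the cycle.

```go
import "sync"

type objectLocks struct {
	ns sync.RWMutex // per-object namespace lock
	fs sync.RWMutex // filesystem metadata lock
}

func (l *objectLocks) completeMultipartUpload(finish func()) {
	l.ns.Lock() // 1. NS write lock
	defer l.ns.Unlock()
	l.fs.Lock() // 2. filesystem write lock, taken right after the NS lock
	defer l.fs.Unlock()
	finish()
}

func (l *objectLocks) getObjectInfo(read func()) {
	l.ns.RLock() // same order: NS first ...
	defer l.ns.RUnlock()
	l.fs.RLock() // ... then filesystem
	defer l.fs.RUnlock()
	read()
}
```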
Restoring files that had been uploaded as multipart resulted in
the deletion of the file in the remote tier and an empty xl.meta.

This was caused by a missing metadata property (the restore-status header).
The same property was already set in the non-multipart restore process.
I have applied it to both kinds of restore process: multipart and
non-multipart.
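A sketch of the missing piece; the key and value follow the S3 x-amz-restore convention, while how MinIO stores it internally is assumed here.

```go
import (
	"fmt"
	"net/http"
	"time"
)

// setRestoreStatus records the restore status in the object metadata so the
// restored object is not treated as still transitioned. Called from both the
// multipart and the non-multipart restore paths.
func setRestoreStatus(meta map[string]string, expiry time.Time) {
	meta["x-amz-restore"] = fmt.Sprintf(
		`ongoing-request="false", expiry-date="%s"`,
		expiry.UTC().Format(http.TimeFormat))
}
```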
The meta lock file used to prevent the parallel creation of
buckets with the same name was not deleted after
successfully creating the bucket.

1. The meta file is now deleted during cleanup. As the cleanup
method only deletes newly created meta files when an error
is passed to the function, I introduced a kind of pseudo
error. This takes care of deleting the xl.meta file.
-- seems a bit hacky though

2. Although the xl.meta is now deleted, the folder of the
pseudo bucket still remains in the meta tmp folder. So I changed
the .lck file's location to be directly within the
meta tmp folder and not within a subfolder (pseudo bucket).
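A simplified model of the intent (paths and naming are assumptions, not the actual cleanup code): the .lck file lives directly in the meta tmp folder and is removed whether bucket creation succeeds or fails, so neither the lock file nor a pseudo-bucket folder is left behind.

```go
import (
	"os"
	"path/filepath"
)

func withBucketCreateLock(metaTmpDir, bucket string, create func() error) error {
	lockPath := filepath.Join(metaTmpDir, bucket+".lck")
	f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_EXCL, 0o644)
	if err != nil {
		return err // another instance is creating the same bucket
	}
	f.Close()
	defer os.Remove(lockPath) // clean up on success and on error
	return create()
}
```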
Removed env 'MINIO_SCANNER_START_DELAY_SECONDS' as this does not work
correctly and there is already a working solution in place with
env 'MINIO_SCANNER_CYCLE'.
This change allows users to limit the maximum number of noncurrent
versions of an object.

To enable this rule you need the following *ilm.json*:
```
cat >> ilm.json <<EOF
{
    "Rules": [
        {
            "ID": "test-max-noncurrent",
            "Status": "Enabled",
            "Filter": {
                "Prefix": "user-uploads/"
            },
            "NoncurrentVersionExpiration": {
                "MaxNoncurrentVersions": 5
            }
        }
    ]
}
EOF
mc ilm import myminio/mybucket < ilm.json
```
- Rename MaxNoncurrentVersions tag to NewerNoncurrentVersions

Note: We apply overlapping NewerNoncurrentVersions rules such that
we honor the highest among applicable limits, e.g. if two overlapping rules
are configured with 2 and 3 noncurrent versions to be retained, we
will retain 3 (see the sketch after this list).

- Expire newer noncurrent versions after noncurrent days
- MinIO extension: allow noncurrent days to be zero, allowing expiry
  of noncurrent version as soon as more than configured
  NewerNoncurrentVersions are present.
- Allow NewerNoncurrentVersions rules on object-locked buckets
- No x-amz-expiration when NewerNoncurrentVersions configured
- ComputeAction should skip rules with NewerNoncurrentVersions > 0
- Add unit tests for lifecycle.ComputeAction
- Support lifecycle rules with MaxNoncurrentVersions
- Extend ExpectedExpiryTime to work with zero days
- Fix all-time comparisons to be relative to UTC
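The sketch below restates the overlap rule from the note above: among all rules matching an object, the highest retention count wins. The function name and input are simplified for illustration.

```go
// effectiveNewerNoncurrentVersions returns the highest retention limit among
// the applicable NewerNoncurrentVersions rules.
func effectiveNewerNoncurrentVersions(limits []int) int {
	max := 0
	for _, n := range limits {
		if n > max {
			max = n
		}
	}
	return max
}
```

With the two overlapping rules from the note (2 and 3), this yields 3, i.e. three noncurrent versions are retained.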
Caused by 'LatestModTime: versions[0].ModTime' when the versions
slice was empty.

Instances now observe the name of the current leader and do not restart
scanning if their name does not match the leader name. In this case the former
leader starts campaigning for the leader role.

In addition, the campaign winner rechecks the leader name after the call
to Campaign().

To create a random name for each instance, the randString function has been
moved from the tests to utils.go.
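A sketch of that behaviour on top of the etcd concurrency API: observe the current leader, only campaign if this instance is not the leader, and recheck the leader name after Campaign() returns. The function name is illustrative; nodeName is the instance's random name.

```go
import (
	"context"

	"go.etcd.io/etcd/client/v3/concurrency"
)

// ensureLeadership reports whether this instance should (re)start scanning.
func ensureLeadership(ctx context.Context, e *concurrency.Election, nodeName string) (bool, error) {
	// Observe the current leader, if any.
	if resp, err := e.Leader(ctx); err == nil && len(resp.Kvs) > 0 {
		if string(resp.Kvs[0].Value) == nodeName {
			return true, nil // already the leader: keep scanning, do not restart
		}
	}
	// Not the leader: campaign for the role.
	if err := e.Campaign(ctx, nodeName); err != nil {
		return false, err
	}
	// Recheck who actually holds the leadership after Campaign() returns.
	resp, err := e.Leader(ctx)
	if err != nil {
		return false, err
	}
	return len(resp.Kvs) > 0 && string(resp.Kvs[0].Value) == nodeName, nil
}
```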
@rluetzner
Collaborator Author

@tristanessquare, well, this was stupid of me. Now we have a pull request, but I can no longer review it. 🤣
I'll figure out a way.

Labels: None yet
Projects: None yet
4 participants