Allow reading restored GFR/GDA objects (#434)
Signed-off-by: Vlad Volodkin <[email protected]>
Vlad Volodkin committed Aug 30, 2023
1 parent 73a27c1 commit a04bec8
Showing 15 changed files with 458 additions and 132 deletions.
33 changes: 1 addition & 32 deletions Cargo.lock

4 changes: 2 additions & 2 deletions doc/CONFIGURATION.md
@@ -18,7 +18,7 @@ Mountpoint uses the same [credentials configuration options](https://docs.aws.am
We recommend you use short-term AWS credentials whenever possible. Mountpoint supports several options for short-term AWS credentials:
* When running Mountpoint on an Amazon EC2 instance, you can [associate an IAM role with your instance](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html) using an instance profile, and Mountpoint will automatically assume that IAM role.
* When running Mountpoint in an Amazon ECS task, you can similarly [associate an IAM role with the task](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html) for Mountpoint to automatically assume.
* When running Mountpoint in an Amazon ECS task, you can similarly [associate an IAM role with the task](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html) for Mountpoint to automatically assume.
* Otherwise, you can [acquire temporary AWS credentials for an IAM role](https://docs.aws.amazon.com/cli/latest/userguide/cli-authentication-short-term.html) from the AWS Console or with the `aws sts assume-role` AWS CLI command, and store them in the `~/.aws/credentials` file.

If you need to use long-term AWS credentials, you can [store them in the configuration and credentials files](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html) in `~/.aws`, or [specify them with environment variables](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html) (`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`).
@@ -164,7 +164,7 @@ Amazon S3 offers a [range of storage classes](https://aws.amazon.com/s3/storage-

For the full list of possible storage classes, see the [PutObject documentation](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html#AmazonS3-PutObject-request-header-StorageClass) in the Amazon S3 User Guide.

Mountpoint supports reading existing objects from your S3 bucket when they are stored in any instant-retrieval storage class. You cannot use Mountpoint to read objects stored in the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes, or the Archive Access or Deep Archive Access tiers of S3 Intelligent-Tiering. This limitation exists even if you have restored the object. However, you can still use Mountpoint to write new objects into these storage classes or S3 Intelligent-Tiering.
Mountpoint supports reading existing objects from your S3 bucket when they are stored in any instant-retrieval storage class. You cannot use Mountpoint to read objects stored in the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes, or the Archive Access or Deep Archive Access tiers of S3 Intelligent-Tiering, unless they've been [restored](https://docs.aws.amazon.com/AmazonS3/latest/userguide/restoring-objects.html). You can use Mountpoint to write new objects into these storage classes or S3 Intelligent-Tiering.

### File and directory permissions

6 changes: 3 additions & 3 deletions doc/SEMANTICS.md
@@ -19,7 +19,7 @@ By default, Mountpoint does not allow deleting existing objects with commands li

You cannot rename an existing file using Mountpoint.

Objects in the S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive storage classes, and the Archive Access and Deep Archive Access tiers of S3 Intelligent-Tiering, are not accessible with Mountpoint even if they have been restored. To access these objects with Mountpoint, copy them to another storage class first.
Objects in the S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive storage classes, and the Archive Access and Deep Archive Access tiers of S3 Intelligent-Tiering, are only accessible with Mountpoint if they have been restored. To access these objects with Mountpoint, [restore](https://docs.aws.amazon.com/AmazonS3/latest/userguide/restoring-objects.html) them first.

## Directories

@@ -108,13 +108,13 @@ S3 places fewer restrictions on [valid object keys](https://docs.aws.amazon.com/
* `blue/`
* `blue/image.jpg`
* `red/`

then mounting your bucket would give a file system with a `blue` directory containing an `image.jpg` file, and an empty `red` directory. The `blue/` and `red/` objects will not be accessible. Note that the S3 Console creates zero-byte objects like `blue/` and `red/` when creating directories in a bucket, and so these directories will work as expected.
* Files will be shadowed by directories with the same name. For example, if your bucket has the following object keys:

* `blue`
* `blue/image.jpg`

then mounting your bucket would give a file system with a `blue` directory, containing the file `image.jpg`. The `blue` object will not be accessible. Deleting the key `blue/image.jpg` will remove the `blue` directory, and cause the `blue` file to become visible.

We test Mountpoint against these restrictions using a [reference model](https://github.com/awslabs/mountpoint-s3/blob/main/mountpoint-s3/tests/reftests/reference.rs) that programmatically encodes the expected mapping between S3 objects and file system structure.
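The two key-mapping examples above can be sketched as a toy function that lists only the top-level entries implied by a set of keys. This is an illustrative simplification, not the linked reference model: the names `EntryKind` and `top_level_entries` are hypothetical, and the sketch ignores nested levels and skips keys with an empty first component.

```rust
use std::collections::BTreeMap;

/// Kind of a top-level entry in the toy file system view (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EntryKind {
    File,
    Directory,
}

/// Lists the top-level entries implied by a set of object keys.
/// A directory always wins over a file with the same name.
pub fn top_level_entries(keys: &[&str]) -> BTreeMap<String, EntryKind> {
    let mut entries = BTreeMap::new();
    for key in keys {
        match key.split_once('/') {
            // Any key containing `/` implies a directory named after its first
            // component; this overwrites (shadows) a file of the same name.
            Some((dir, _rest)) if !dir.is_empty() => {
                entries.insert(dir.to_owned(), EntryKind::Directory);
            }
            // Keys starting with `/` are skipped in this sketch.
            Some(_) => {}
            // A key without `/` is a file unless a directory already claimed the name.
            None if !key.is_empty() => {
                entries.entry((*key).to_owned()).or_insert(EntryKind::File);
            }
            None => {}
        }
    }
    entries
}
```

With the keys `blue` and `blue/image.jpg`, the sketch reports `blue` as a directory; with `blue` alone, it reports a file, matching the behaviour described after `blue/image.jpg` is deleted.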
52 changes: 44 additions & 8 deletions mountpoint-s3-client/src/mock_client.rs
@@ -4,6 +4,7 @@ use std::ops::Range;
use std::pin::Pin;
use std::sync::{Arc, RwLock};
use std::task::{Context, Poll};
use std::time::{Duration, SystemTime};

use async_trait::async_trait;
use futures::{Stream, StreamExt};
@@ -20,7 +21,7 @@ use crate::object_client::{
ObjectClient, ObjectClientError, ObjectClientResult, ObjectInfo, PutObjectError, PutObjectParams, PutObjectResult,
UploadReview, UploadReviewPart,
};
use crate::{Checksum, ETag, ObjectAttribute, PutObjectRequest};
use crate::{Checksum, ETag, ObjectAttribute, PutObjectRequest, RestoreStatus};

pub const RAMP_MODULUS: usize = 251; // Largest prime under 256
static_assertions::const_assert!((RAMP_MODULUS > 0) && (RAMP_MODULUS <= 256));
@@ -56,12 +57,12 @@ pub struct MockClientConfig {
#[derive(Debug)]
pub struct MockClient {
config: MockClientConfig,
objects: Arc<RwLock<BTreeMap<String, Arc<MockObject>>>>,
objects: Arc<RwLock<BTreeMap<String, MockObject>>>,
in_progress_uploads: Arc<RwLock<BTreeSet<String>>>,
}

fn add_object(objects: &Arc<RwLock<BTreeMap<String, Arc<MockObject>>>>, key: &str, value: MockObject) {
objects.write().unwrap().insert(key.to_owned(), Arc::new(value));
fn add_object(objects: &Arc<RwLock<BTreeMap<String, MockObject>>>, key: &str, value: MockObject) {
objects.write().unwrap().insert(key.to_owned(), value);
}

impl MockClient {
@@ -108,13 +109,38 @@ impl MockClient {
Err(MockClientError("object not found".into()))
}
}

    /// Returns error if object does not exist
    pub fn restore_object(&self, key: &str) -> Result<(), MockClientError> {
        match self.objects.write().unwrap().get_mut(key) {
            Some(mock_object) => {
                mock_object.restore_status = Some(RestoreStatus::Restored {
                    expiry: SystemTime::now() + Duration::from_secs(3600),
                });
                Ok(())
            }
            None => Err(MockClientError("object not found".into())),
        }
    }

    pub fn is_object_restored(&self, key: &str) -> Result<bool, MockClientError> {
        if let Some(mock_object) = self.objects.read().unwrap().get(key) {
            Ok(matches!(
                mock_object.restore_status,
                Some(RestoreStatus::Restored { expiry: _ })
            ))
        } else {
            Err(MockClientError("object not found".into()))
        }
    }
}
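
The restore helpers above can be exercised in isolation with a stripped-down stand-in for the mock client. This is a minimal sketch, not the real `MockClient`: `ToyStore` and `ToyObject` are hypothetical names, errors are plain strings, and the local `RestoreStatus` only mirrors the enum added in `object_client.rs`. Storing objects by value rather than behind `Arc` is what lets `get_mut` update the status in place here, which is presumably why the commit changes the map's value type.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;
use std::time::{Duration, SystemTime};

/// Local mirror of the restore status enum from the diff.
#[derive(Debug, Clone, Copy)]
pub enum RestoreStatus {
    InProgress,
    Restored { expiry: SystemTime },
}

/// Hypothetical stand-in for `MockObject`, keeping only the restore status.
#[derive(Debug, Clone, Default)]
pub struct ToyObject {
    pub restore_status: Option<RestoreStatus>,
}

/// Hypothetical stand-in for `MockClient`.
#[derive(Debug, Default)]
pub struct ToyStore {
    objects: RwLock<BTreeMap<String, ToyObject>>,
}

impl ToyStore {
    /// Adds an object that has not been restored.
    pub fn add_object(&self, key: &str) {
        self.objects.write().unwrap().insert(key.to_owned(), ToyObject::default());
    }

    /// Marks the object as restored for one hour; fails if the key is unknown.
    pub fn restore_object(&self, key: &str) -> Result<(), String> {
        let mut objects = self.objects.write().unwrap();
        let object = objects.get_mut(key).ok_or_else(|| "object not found".to_owned())?;
        object.restore_status = Some(RestoreStatus::Restored {
            expiry: SystemTime::now() + Duration::from_secs(3600),
        });
        Ok(())
    }

    /// Reports whether the object carries a `Restored` status; fails if the key is unknown.
    pub fn is_object_restored(&self, key: &str) -> Result<bool, String> {
        let objects = self.objects.read().unwrap();
        let object = objects.get(key).ok_or_else(|| "object not found".to_owned())?;
        Ok(matches!(object.restore_status, Some(RestoreStatus::Restored { .. })))
    }
}
```

A test would typically add an object, assert that `is_object_restored` returns `Ok(false)`, call `restore_object`, and then assert `Ok(true)`.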

#[derive(Clone)]
pub struct MockObject {
generator: Arc<dyn Fn(u64, usize) -> Box<[u8]> + Send + Sync>,
size: usize,
storage_class: Option<String>,
restore_status: Option<RestoreStatus>,
last_modified: OffsetDateTime,
etag: ETag,
}
@@ -131,6 +157,7 @@ impl MockObject {
size: bytes.len(),
generator: Arc::new(move |offset, size| bytes[offset as usize..offset as usize + size].into()),
storage_class: None,
restore_status: None,
last_modified: OffsetDateTime::now_utc(),
etag,
}
@@ -141,6 +168,7 @@ impl MockObject {
generator: Arc::new(move |_offset, size| vec![v; size].into_boxed_slice()),
size,
storage_class: None,
restore_status: None,
last_modified: OffsetDateTime::now_utc(),
etag,
}
@@ -161,6 +189,7 @@ impl MockObject {
}),
size,
storage_class: None,
restore_status: None,
last_modified: OffsetDateTime::now_utc(),
etag,
}
@@ -174,6 +203,10 @@ impl MockObject {
self.storage_class = storage_class;
}

pub fn set_restored(&mut self, restore_status: Option<RestoreStatus>) {
self.restore_status = restore_status;
}

pub fn len(&self) -> usize {
self.size
}
@@ -200,13 +233,14 @@ impl std::fmt::Debug for MockObject {
.field("storage_class", &self.storage_class)
.field("last_modified", &self.last_modified)
.field("etag", &self.etag)
.field("restored", &self.restore_status)
.finish()
}
}

#[derive(Debug)]
pub struct GetObjectResult {
object: Arc<MockObject>,
object: MockObject,
next_offset: u64,
length: usize,
part_size: usize,
@@ -316,7 +350,7 @@ impl ObjectClient for MockClient {
};

Ok(GetObjectResult {
object: Arc::clone(object),
object: object.clone(),
next_offset,
length,
part_size: self.config.part_size,
@@ -347,6 +381,7 @@ impl ObjectClient for MockClient {
last_modified: object.last_modified,
etag: object.etag.as_str().to_string(),
storage_class: object.storage_class.clone(),
restore_status: object.restore_status,
},
})
} else {
@@ -442,6 +477,7 @@ impl ObjectClient for MockClient {
last_modified: object.last_modified,
etag: object.etag.as_str().to_string(),
storage_class: object.storage_class.clone(),
restore_status: object.restore_status,
});
}
}
@@ -523,7 +559,7 @@ pub struct MockPutObjectRequest {
buffer: Vec<u8>,
part_size: usize,
params: PutObjectParams,
objects: Arc<RwLock<BTreeMap<String, Arc<MockObject>>>>,
objects: Arc<RwLock<BTreeMap<String, MockObject>>>,
in_progress_uploads: Arc<RwLock<BTreeSet<String>>>,
}

@@ -532,7 +568,7 @@ impl MockPutObjectRequest {
key: &str,
part_size: usize,
params: &PutObjectParams,
objects: &Arc<RwLock<BTreeMap<String, Arc<MockObject>>>>,
objects: &Arc<RwLock<BTreeMap<String, MockObject>>>,
in_progress_uploads: &Arc<RwLock<BTreeSet<String>>>,
) -> Self {
in_progress_uploads.write().unwrap().insert(key.to_owned());
17 changes: 17 additions & 0 deletions mountpoint-s3-client/src/object_client.rs
@@ -2,6 +2,7 @@ use async_trait::async_trait;
use auto_impl::auto_impl;
use futures::Stream;
use std::str::FromStr;
use std::time::SystemTime;
use std::{
fmt::{self, Debug},
ops::Range,
@@ -315,6 +316,19 @@ pub enum PutObjectError {
NoSuchBucket,
}

/// Restoration status for S3 objects in the GLACIER or DEEP_ARCHIVE storage class.
/// See https://docs.aws.amazon.com/AmazonS3/latest/userguide/restoring-objects.html#restore-archived-objects-status for more details.
#[derive(Debug, Clone, Copy)]
pub enum RestoreStatus {
    /// S3 returns this status after it has accepted a restoration request but has not completed it yet.
    /// Objects with this status are not readable.
    InProgress,

    /// This status means that restoration is fully completed. Note that restored objects are stored only
    /// for the number of days that was specified in the request.
    Restored { expiry: SystemTime },
}
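
One way a caller might combine the storage class with this status is sketched below. It is an assumption-laden illustration, not the check Mountpoint performs: `is_readable` is a hypothetical helper, the class names are taken from the doc comment above, and comparing `expiry` against the current time is this sketch's own choice, since the diff does not show whether Mountpoint does so.

```rust
use std::time::SystemTime;

/// Local mirror of the restore status enum from the diff.
#[derive(Debug, Clone, Copy)]
pub enum RestoreStatus {
    InProgress,
    Restored { expiry: SystemTime },
}

/// Hypothetical helper: decides whether an object could be read at time `now`.
pub fn is_readable(
    storage_class: Option<&str>,
    restore_status: Option<RestoreStatus>,
    now: SystemTime,
) -> bool {
    // Only the two archival classes named in the doc comment need a restore.
    let archived = matches!(storage_class, Some("GLACIER") | Some("DEEP_ARCHIVE"));
    if !archived {
        return true;
    }
    match restore_status {
        // Assumption of this sketch: a restored copy counts only until it expires.
        Some(RestoreStatus::Restored { expiry }) => expiry > now,
        // A pending restore, or no restore at all, leaves the object unreadable.
        Some(RestoreStatus::InProgress) | None => false,
    }
}
```

Passing `now` as a parameter instead of calling `SystemTime::now()` inside the function keeps the rule deterministic in tests.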

/// Metadata about a single S3 object.
/// See https://docs.aws.amazon.com/AmazonS3/latest/API/API_Object.html for more details.
#[derive(Debug)]
@@ -333,6 +347,9 @@ pub struct ObjectInfo {
/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_HeadObject.html#API_HeadObject_Examples
pub storage_class: Option<String>,

/// Objects with GLACIER or DEEP_ARCHIVE storage classes are only accessible after restoration
pub restore_status: Option<RestoreStatus>,

/// Entity tag of this object.
pub etag: String,
}