Hello everyone,
I am using SageMaker connected to an S3 bucket. Downloading the data and tagging it with Dolma worked without any issues. However, the final mixing step fails with the following error:
[2024-03-26T11:38:37Z INFO dolma::s3_util] Listing objects in bucket=h-datasets, prefix=pretraining/wikipedia/v0/documents/ thread '<unnamed>' panicked at src/s3_util.rs:249:18: called 'Result::unwrap()' on an 'Err' value: ServiceError(ServiceError { source: Unhandled(Unhandled { source: ErrorMetadata { code: Some("PermanentRedirect"), message: Some("The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint."), extras: Some({"aws_request_id": "M0KWX3F3J6VJX7FS", "s3_extended_request_id": "dHPizkgVIoHo4PH0GgJQVt+LvOXiFOKdN7JKqIP6pDWGqvGLmAxrY4Ct7JSCA3geVAkkJxOCpwo="}) }, meta: ErrorMetadata { code: Some("PermanentRedirect"), message: Some("The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint."), extras: Some({"aws_request_id": "M0KWX3F3J6VJX7FS", "s3_extended_request_id": "dHPizkgVIoHo4PH0GgJQVt+LvOXiFOKdN7JKqIP6pDWGqvGLmAxrY4Ct7JSCA3geVAkkJxOCpwo="}) } }), raw: Response { inner: Response { status: 301, version: HTTP/1.1, headers: {"x-amz-bucket-region": "us-west-2", "x-amz-request-id": "M0KWX3F3J6VJX7FS", "x-amz-id-2": "dHPizkgVIoHo4PH0GgJQVt+LvOXiFOKdN7JKqIP6pDWGqvGLmAxrY4Ct7JSCA3geVAkkJxOCpwo=", "content-type": "application/xml", "transfer-encoding": "chunked", "date": "Tue, 26 Mar 2024 11:38:36 GMT", "server": "AmazonS3"}, body: SdkBody { inner: Once(Some(b"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Error><Code>PermanentRedirect</Code><Message>The bucket you are attempting to access must be addressed using the specified endpoint. 
Please send all future requests to this endpoint.</Message><Endpoint>h-datasets.s3-us-west-2.amazonaws.com</Endpoint><Bucket>h-datasets</Bucket><RequestId>M0KWX3F3J6VJX7FS</RequestId><HostId>dHPizkgVIoHo4PH0GgJQVt+LvOXiFOKdN7JKqIP6pDWGqvGLmAxrY4Ct7JSCA3geVAkkJxOCpwo=</HostId></Error>")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag { contents: ["aws_smithy_http::connection::CaptureSmithyConnection", "aws_http::user_agent::AwsUserAgent", "aws_sdk_s3::endpoint::Params", "aws_credential_types::credentials_impl::Credentials", "aws_types::region::SigningRegion", "aws_types::region::Region", "aws_credential_types::cache::SharedCredentialsCache", "aws_smithy_types::endpoint::Endpoint", "aws_sig_auth::middleware::Signature", "aws_sig_auth::signer::OperationSigningConfig", "aws_types::SigningService", "alloc::vec::Vec<http::version::Version>", "aws_smithy_http::operation::Metadata"] }, poisoned: false, .. }) } }) note: run with 'RUST_BACKTRACE=1' environment variable to display a backtrace
Permissions appear to be fine: I can list the objects using my credentials, which are stored in aws/credentials and also exported as environment variables.
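One detail in the log points away from permissions: the 301 response carries an `x-amz-bucket-region` header (`us-west-2`) and an `<Endpoint>` element, which suggests the request was sent to a different region's endpoint than the one the bucket lives in. Below is a minimal sketch for pulling those two values out of the panic text. The helper name `bucket_region_from_error` and the shortened `sample` string are hypothetical, made up for illustration; how Dolma's mixer chooses its S3 client region is not shown in the log, so treat the region mismatch as a hypothesis rather than a confirmed cause.

```python
import re


def bucket_region_from_error(log_text: str) -> dict:
    """Extract the bucket region and endpoint reported in an S3 PermanentRedirect error.

    Hypothetical helper: it only pattern-matches the text of the error message
    and does not contact AWS.
    """
    # The raw response headers are printed as "name": "value" pairs.
    region = re.search(r'"x-amz-bucket-region":\s*"([^"]+)"', log_text)
    # The XML error body names the endpoint the bucket must be addressed through.
    endpoint = re.search(r"<Endpoint>([^<]+)</Endpoint>", log_text)
    return {
        "region": region.group(1) if region else None,
        "endpoint": endpoint.group(1) if endpoint else None,
    }


# Shortened excerpt of the error above (only the two relevant fragments are kept).
sample = (
    'headers: {"x-amz-bucket-region": "us-west-2", "server": "AmazonS3"}, '
    "body: <Endpoint>h-datasets.s3-us-west-2.amazonaws.com</Endpoint>"
)

print(bucket_region_from_error(sample))
```

If the region reported here differs from the region my environment is configured for, that mismatch would be the first thing to rule out.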
Does anyone have any suggestions? Thank you for your time.