Adjust S3DynamoDBLogStore to ScyllaDB's Alternator #2411

Open
2 of 9 tasks
rbushri opened this issue Dec 28, 2023 · 0 comments · May be fixed by #2410
Labels
enhancement New feature or request


rbushri commented Dec 28, 2023

Feature request

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • S3DynamoDBLogStore
  • Other (fill in here)

Overview

Adjust the S3DynamoDBLogStore to be compatible with ScyllaDB's Alternator.

Motivation

This change aims to provide a cloud-agnostic solution to the Delta Lake on S3 multiple-writers problem by using ScyllaDB's Alternator, ScyllaDB's DynamoDB-compatible API. It offers an open-source option for S3 and S3-compatible storage that lacks putIfAbsent functionality.
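
For context, the multiple-writers problem comes down to an atomic put-if-absent on the next log file: when two writers race to commit the same version, only one may succeed. The sketch below illustrates those semantics with a hypothetical in-memory store. `InMemoryCommitStore` and `put_if_absent` are illustrative names, not Delta Lake or ScyllaDB APIs, and a real deployment would rely on a conditional write against the external table instead of a local lock.

```python
import threading


class InMemoryCommitStore:
    """Hypothetical stand-in for a store that supports put-if-absent."""

    def __init__(self):
        self._entries = {}
        self._lock = threading.Lock()

    def put_if_absent(self, key, value):
        """Store value under key only if key is unused; return True on success."""
        with self._lock:
            if key in self._entries:
                return False
            self._entries[key] = value
            return True

    def get(self, key):
        return self._entries.get(key)


store = InMemoryCommitStore()
# Delta log entries are named by a zero-padded 20-digit version number.
commit_file = f"_delta_log/{1:020d}.json"

first = store.put_if_absent(commit_file, "writer-A")   # wins the race
second = store.put_if_absent(commit_file, "writer-B")  # rejected, key exists
print(first, second, store.get(commit_file))
```

The losing writer sees `False` and would have to re-read the log and retry with the next version, which is the behavior plain object storage without put-if-absent cannot guarantee.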

Further details

The implementation adds an abstraction layer for the DynamoDB LogStore (io.delta.storage.BaseDynamoDBLogStore) with two implementations:

  1. io.delta.storage.DynamoDBLogStore - for DynamoDB (no configuration changes are required for the DynamoDB implementation).
  2. io.delta.storage.S3ScyllaDBLogStore - for ScyllaDB.

The configuration for ScyllaDB is as follows:

    spark.delta.logStore.s3a.impl=io.delta.storage.S3ScyllaDBLogStore
    spark.io.delta.storage.S3ScyllaDBLogStore.ddb.endpoint=<ScyllaDB Alternator cluster endpoint>
    spark.io.delta.storage.S3ScyllaDBLogStore.credentials.provider=<AWSCredentialsProvider used by the client; default: DefaultAWSCredentialsProviderChain>
    spark.io.delta.storage.S3ScyllaDBLogStore.ddb.tableName=<name of the Scylla table to use; default: delta_log>
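
The settings listed here could be assembled programmatically, for example to pass them to spark-submit. This is a minimal sketch that assumes the property names given in this issue, which come from the proposed PR and may change before merge. The helpers `scylla_logstore_conf` and `to_submit_args` and the endpoint URL are hypothetical.

```python
# Class name as proposed in this issue; not part of a released Delta version.
LOG_STORE_CLASS = "io.delta.storage.S3ScyllaDBLogStore"
PREFIX = f"spark.{LOG_STORE_CLASS}"


def scylla_logstore_conf(endpoint, table_name=None, credentials_provider=None):
    """Build the Spark properties for the proposed ScyllaDB-backed log store.

    Optional settings are omitted when not given, so the defaults described
    in the issue (delta_log, DefaultAWSCredentialsProviderChain) apply.
    """
    conf = {
        "spark.delta.logStore.s3a.impl": LOG_STORE_CLASS,
        f"{PREFIX}.ddb.endpoint": endpoint,
    }
    if table_name is not None:
        conf[f"{PREFIX}.ddb.tableName"] = table_name
    if credentials_provider is not None:
        conf[f"{PREFIX}.credentials.provider"] = credentials_provider
    return conf


def to_submit_args(conf):
    """Render the properties as spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Placeholder endpoint; replace with the Alternator cluster's real address.
conf = scylla_logstore_conf("http://alternator.example.internal:8000")
print(" ".join(to_submit_args(conf)))
```

Keeping the optional keys out of the dictionary unless they are set means the log store's own defaults stay in effect, rather than being duplicated in client code.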

I've opened PR #2410 to introduce this configuration.

Willingness to contribute

The Delta Lake Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?

  • Yes. I can contribute this feature independently.
  • Yes. I would be willing to contribute this feature with guidance from the Delta Lake community.
  • No. I cannot contribute this feature at this time.
rbushri added the enhancement label on Dec 28, 2023