Can't create iceberg source using MinIO and Hive Metastore. #18679

Closed
viethqb opened this issue Sep 24, 2024 · 3 comments · Fixed by #19111

@viethqb
viethqb commented Sep 24, 2024

Describe the bug

I run RisingWave on Kubernetes with the following configuration:

---
apiVersion: risingwave.risingwavelabs.com/v1alpha1
kind: RisingWave
metadata:
  name: risingwave
  namespace: dataplatform
spec:
  # frontendServiceType: NodePort
  image: risingwavelabs/risingwave:v2.0.0
  metaStore:
    etcd:
      endpoint: operator-etcd:2379
  stateStore:
    dataDirectory: risingwave
    minio:
      credentials:
        secretName: minio-credentials
      bucket: dataplatform
      endpoint: https://minio.xxx.xxx.xxx
  components:
    meta:
      nodeGroups:
        - replicas: 1
          name: ""
          template:
            spec:
              volumes:
                - name: heap
                  emptyDir:
                    sizeLimit: 1Gi
              volumeMounts:
                - mountPath: /heap
                  name: heap
              env:
                - name: MALLOC_CONF
                  value: prof:true,lg_prof_interval=-1,lg_prof_sample=20,prof_prefix:/heap/
                - name: RW_HEAP_PROFILING_DIR
                  value: /heap
              # resources:
              #   limits:
              #     cpu: 1
              #     memory: 2Gi
              #   requests:
              #     cpu: 1
              #     memory: 2Gi
    compactor:
      nodeGroups:
        - replicas: 1
          name: ""
          template:
            spec:
              volumes:
                - name: heap
                  emptyDir:
                    sizeLimit: 1Gi
              volumeMounts:
                - mountPath: /heap
                  name: heap
              env:
                - name: MALLOC_CONF
                  value: prof:true,lg_prof_interval=-1,lg_prof_sample=20,prof_prefix:/heap/
                - name: RW_HEAP_PROFILING_DIR
                  value: /heap
    frontend:
      nodeGroups:
        - replicas: 1
          name: ""
          template:
            spec:
              # resources:
                # limits:
                #   cpu: 1
                #   memory: 2Gi
                # requests:
                #   cpu: 1
                #   memory: 2Gi
    compute:
      nodeGroups:
        - replicas: 1
          name: ""
          template:
            spec:
              volumes:
                - name: heap
                  emptyDir:
                    sizeLimit: 1Gi
              volumeMounts:
                - mountPath: /heap
                  name: heap
              env:
                - name: MALLOC_CONF
                  value: prof:true,lg_prof_interval=-1,lg_prof_sample=20,prof_prefix:/heap/
                - name: RW_HEAP_PROFILING_DIR
                  value: /heap
              # resources:
              #   limits:
              #     cpu: 2
              #     memory: 4Gi # Memory limit will be set to `RW_TOTAL_MEMORY_BYTES`
              #   requests:
              #     cpu: 2
              #     memory: 4Gi

I create an Iceberg source with the following SQL statement:

create source source_demo_hive
with (
    connector = 'iceberg',
    catalog.type = 'hive',
    catalog.uri = 'thrift://hive-metastore:9083',
    warehouse.path = 's3://dataplatform/lakehouse/',
    s3.endpoint = 'https://minio.xxx.xxx.xxx',
    s3.access.key = 'xxxxxxxxxx',
    s3.secret.key = 'xxxxxxxxxxx',
    s3.region = 'ap-southeast-1',
    catalog.name = 'lakehouse',
    database.name = 'demo',
    table.name = 'demo_table'
); 

This fails with two errors, one after the other:

  1. ERROR 1: "Unable to load region from system settings. Region must be specified either via environment variable (AWS_REGION)". Setting s3.region = 'ap-southeast-1' has no effect. I worked around this by setting the environment variable AWS_REGION=ap-southeast-1.
  2. ERROR 2: After getting past error 1, I get "Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: dataplatform.minio.xxx.xxx.xxx". Path-style URL access must be enabled, but no such option is available for the Iceberg source.
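
For reference, the environment-variable workaround for error 1 might look like the fragment below in the RisingWave manifest. This is a sketch under assumptions: the issue does not say which components received AWS_REGION, so the frontend node group is used here only as an illustration, and the same `env` entry may be needed on other components depending on where the Iceberg catalog is loaded.

```yaml
# Sketch only: which component needs AWS_REGION is an assumption, not stated in the issue.
spec:
  components:
    frontend:
      nodeGroups:
        - replicas: 1
          name: ""
          template:
            spec:
              env:
                # Named in the error message as a place the AWS SDK looks for the region
                - name: AWS_REGION
                  value: ap-southeast-1
```

This mirrors the `env` lists already present in the manifest above, so it can be merged into an existing node group rather than replacing it.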

Expectation:

  1. Fix s3.region so that it takes effect.
  2. Add an s3.path.style.access option for the Iceberg source.

Error message/log

No response

To Reproduce

No response

Expected behavior

No response

How did you deploy RisingWave?

No response

The version of RisingWave

No response

Additional context

No response

viethqb added the type/bug label on Sep 24, 2024
github-actions bot added this to the release-2.1 milestone on Sep 24, 2024
viethqb changed the title from "I can't create iceberg source when using MinIO and Hive Metastore." to "Can't create iceberg source using MinIO and Hive Metastore." on Sep 25, 2024
fuyufjh modified the milestones: release-2.1, release-2.2 on Oct 17, 2024
@hienphdev

Setting the property s3.path.style.access = true seems to work for the second error.
The first error could be a bug: s3.region does not appear to be forwarded into the catalog properties. See https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/connector_common/iceberg/mod.rs#L167-L201

@viethqb
Author

viethqb commented Oct 21, 2024

> Setting the property s3.path.style.access = true seems to work for the second error. The first error could be a bug: s3.region does not appear to be forwarded into the catalog properties. See https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/connector_common/iceberg/mod.rs#L167-L201

I tried the property s3.path.style.access = true. It works with the Iceberg sink but not with the Iceberg source.

@chenzl25
Contributor

> I tried the property s3.path.style.access = true. It works with the Iceberg sink but not with the Iceberg source.

Got it, we'll migrate this configuration to the Iceberg source later.
