AWSSDKPandas lambda layer does not include pyarrow dependencies to support encryption_configuration parameter #3149

Open
@everlastingcrown

Describe the bug

Methods such as awswrangler.s3.to_parquet() accept an encryption_configuration parameter (for encrypting the data), which relies on classes from pyarrow.parquet.encryption.

However, that module cannot be imported from the "AWSSDKPandas-Python311" AWS Lambda Layer.

Running import pyarrow.parquet.encryption in an AWS Lambda function on Python 3.11 fails with the following error:

Runtime.ImportModuleError: Unable to import module 'index': No module named 'pyarrow._parquet_encryption'

How to Reproduce

Deploy an AWS Lambda with the following configuration:

  • Runtime: Python 3.11
  • Layers: [AWSSDKPandas-Python311:17]

The lambda code is:

import pyarrow.parquet.encryption
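
Because the bare import above fails while the module is loading, the function never reaches a handler. A minimal diagnostic sketch that reports which pyarrow modules the layer actually ships might look like the following. The helper name module_available and the handler name handler are assumptions made for illustration; only the module names come from the error message above.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located, without importing `name` itself."""
    try:
        # find_spec imports parent packages of a dotted name, so a missing
        # parent surfaces here as ModuleNotFoundError (an ImportError).
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def handler(event, context):
    # Hypothetical handler: returns a map of module name to availability
    # instead of failing with Runtime.ImportModuleError at import time.
    names = [
        "pyarrow",
        "pyarrow.parquet",
        "pyarrow._parquet_encryption",
    ]
    return {name: module_available(name) for name in names}
```

Invoking this function with the layer attached should show whether pyarrow._parquet_encryption is the only missing piece.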

Expected behavior

The pyarrow.parquet.encryption import should succeed, so that the module can be used to build the encryption_configuration argument.
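
Until the import works in the layer, calling code could guard it and fail with a clearer message. This is only a sketch of one possible guard, not a fix for the layer itself; require_module is a hypothetical helper, not part of awswrangler or pyarrow.

```python
import importlib


def require_module(name: str, hint: str = ""):
    """Import `name` and return it, or raise a RuntimeError with a readable hint."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        message = f"Optional dependency {name!r} is unavailable. {hint}".strip()
        raise RuntimeError(message) from exc
```

For example, require_module("pyarrow.parquet.encryption", "The Lambda layer may not ship it.") could be called inside the handler, only on the code path that needs encryption.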

Your project

No response

Screenshots

No response

OS

Linux

Python version

3.11

AWS SDK for pandas version

17 (Lambda layer version, i.e. AWSSDKPandas-Python311:17)

Additional context

The encryption_configuration feature is a relatively recent addition (Feb 2024): #2642. I suspect the layer build omits this part of pyarrow, including the pyarrow._parquet_encryption module named in the error, to reduce the total size.

Labels

bug
