Not able to use spacy "/opt/en_core_web_sm-2.2.5" #448

Open
willianfalbo opened this issue Oct 18, 2024 · 1 comment

willianfalbo commented Oct 18, 2024

Hey guys, I've been trying to use the python38-spacy:42 and python38-spacy_model_en_small:1 layers together, but loading the model fails. Could you please help me?

Here is my template yaml file:

# template.yaml

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31

Resources:
  GetWordCounts:
    Type: AWS::Serverless::Function
    Properties:
      Handler: word-counts/app.lambda_handler
      Runtime: python3.8
      CodeUri: .
      Timeout: 30
      Layers:
        - arn:aws:lambda:us-east-1:770693421928:layer:Klayers-python38-spacy:42
        - arn:aws:lambda:us-east-1:770693421928:layer:Klayers-python38-spacy_model_en_small:1
      Events:
        ApiGateway:
          Type: Api
          Properties:
            Path: /word-counts
            Method: get

Here is a simple handler for the requests:

# word-counts/app.py

import json
import spacy

nlp = spacy.load("/opt/en_core_web_sm-2.2.5")

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

To start the API locally, I run the following command:

sam local start-api --profile my-profile

Then, it fails when I do a GET request to that endpoint, like:

# GET http://localhost:3000/word-counts

/opt/python/spacy/util.py:717: UserWarning: [W094] Model 'en_core_web_sm' (2.2.5) specifies an under-constrained spaCy version requirement: >=2.2.2. This can lead to compatibility problems with older versions, or as new spaCy versions are released, because the model may say it's compatible when it's not. Consider changing the "spacy_version" in your meta.json to a version range, with a lower and upper pin. For example: >=3.0.6,<3.1.0
  warnings.warn(warn_msg)
[ERROR] OSError: [E053] Could not read config.cfg from /opt/en_core_web_sm-2.2.5/config.cfg
Traceback (most recent call last):
  File "/var/lang/lib/python3.8/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/var/lang/lib/python3.8/imp.py", line 171, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 702, in _load
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/var/task/word-counts/app.py", line 4, in <module>
    nlp = spacy.load("/opt/en_core_web_sm-2.2.5")
  File "/opt/python/spacy/__init__.py", line 50, in load
    return util.load_model(
  File "/opt/python/spacy/util.py", line 326, in load_model
    return load_model_from_path(Path(name), **kwargs)
  File "/opt/python/spacy/util.py", line 390, in load_model_from_path
    config = load_config(config_path, overrides=dict_to_dot(config))
  File "/opt/python/spacy/util.py", line 547, in load_config
    raise IOError(Errors.E053.format(path=config_path, name="config.cfg"))
18 Oct 2024 02:08:37,264 [ERROR] (rapid) Init failed error=Runtime exited with error: exit status 1 InvokeID=
18 Oct 2024 02:08:37,268 [ERROR] (rapid) Invoke failed InvokeID=7a3bcb82-b685-49c2-80f5-8cf98d53bd1f error=Runtime exited with error: exit status 1
18 Oct 2024 02:08:37,268 [ERROR] (rapid) Invoke DONE failed: Sandbox.Failure

I would appreciate any help. Thanks

@keithrozario
Owner

Sorry, this is a spaCy-specific issue, and it's been a while since I tried this.

Found a similar issue here:
explosion/spaCy#7453

It might fix your issue, and you'll probably have better luck asking there.
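
One way to narrow this down before asking there: the traceback shows the spaCy in the layer looking for config.cfg inside the model directory, while the W094 warning shows it did read the model's meta.json (model version 2.2.5, requirement >=2.2.2). That pattern suggests, though I haven't verified it against these layers, that the spaCy layer is a 3.x build and the model layer is packaged in the older format. A minimal sketch that inspects the model directory without importing spaCy is below. The function names `describe_model_dir` and `looks_like_pre_v3_model` are made up for illustration, and the "no config.cfg means older model" rule is only a heuristic.

```python
import json
from pathlib import Path


def describe_model_dir(model_dir):
    """Summarize a spaCy model directory without importing spaCy.

    Hypothetical helper: it only reads the files mentioned in the
    warning (meta.json) and the error (config.cfg) from this issue.
    """
    path = Path(model_dir)
    meta_path = path / "meta.json"
    meta = {}
    if meta_path.is_file():
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
    return {
        "exists": path.is_dir(),
        "has_config_cfg": (path / "config.cfg").is_file(),
        "model_version": meta.get("version"),
        "spacy_requirement": meta.get("spacy_version"),
    }


def looks_like_pre_v3_model(info):
    """Heuristic (an assumption): meta.json is readable but config.cfg is missing."""
    return (
        info["exists"]
        and not info["has_config_cfg"]
        and info["model_version"] is not None
    )


if __name__ == "__main__":
    # Path taken from the issue; it only exists inside the Lambda environment.
    info = describe_model_dir("/opt/en_core_web_sm-2.2.5")
    print(info)
    print("pre-v3 layout suspected:", looks_like_pre_v3_model(info))
```

If the second line prints True inside the Lambda environment, the spaCy layer and the model layer probably don't match, and the fix would be to pair a model built for the same spaCy major version as the library (or an older spaCy layer with this model).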

Tip: For large packages like spaCy, it's probably better to use container images instead of Lambda layers. This project predates Lambda's support for container images, which is why we tried supporting spaCy as a layer for a while.
