Added ML Model APIs #733

Merged on Jan 9, 2025 (17 commits)

nathaliellenaa (Contributor) commented Dec 13, 2024

Description

Added missing ML Model APIs.

Issues Resolved

Part of opensearch-project/opensearch-py#867.

ML Model APIs to add

  • GET /_plugins/_ml/models/{model_id}
  • POST /_plugins/_ml/models/meta
  • POST /_plugins/_ml/models/_register_meta
  • POST /_plugins/_ml/models/_search
  • POST /_plugins/_ml/models/_undeploy
  • POST /_plugins/_ml/models/_unload
  • POST /_plugins/_ml/models/{model_id}/_unload
  • POST /_plugins/_ml/models/_upload
  • POST /_plugins/_ml/models/{model_id}/_load
  • POST /_plugins/_ml/models/{model_id}/_predict
  • POST /_plugins/_ml/models/{model_id}/chunk/{chunk_number}
  • POST /_plugins/_ml/models/{model_id}/upload_chunk/{chunk_number}
  • PUT /_plugins/_ml/models/{model_id}

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions bot commented Dec 13, 2024

Changes Analysis

Commit SHA: 7fa9b34
Comparing To SHA: 2b44e52

API Changes

Summary

├─┬Paths
│ ├──[➕] path (5331:3)
│ ├──[➕] path (5162:3)
│ ├──[➕] path (5405:3)
│ ├──[➕] path (5212:3)
│ ├──[➕] path (5349:3)
│ ├──[➕] path (5385:3)
│ ├──[➕] path (5426:3)
│ ├──[➕] path (5248:3)
│ ├──[➕] path (5445:3)
│ ├──[➕] path (5229:3)
│ ├─┬/_plugins/_ml/models/{model_id}
│ │ ├──[➕] get (5284:7)
│ │ ├──[➕] put (5299:7)
│ │ └─┬DELETE
│ │   └─┬Extensions
│ │     └──[➕] x-version-added (5271:24)
│ ├─┬/_plugins/_ml/models/_register
│ │ └─┬POST
│ │   ├─┬Requestbody
│ │   │ └─┬application/json
│ │   │   └─┬Schema
│ │   │     ├──[➖] required (27703:17)❌ 
│ │   │     ├─┬model_format
│ │   │     │ └──[🔀] $ref (34650:13)❌ 
│ │   │     ├─┬model_group_id
│ │   │     │ └──[🔀] $ref (34429:13)❌ 
│ │   │     └─┬version
│ │   │       └──[🔀] $ref (36018:13)❌ 
│ │   └─┬Extensions
│ │     └──[➕] x-version-added (5149:24)
│ ├─┬/_plugins/_ml/models/_search
│ │ ├──[➕] post (5197:7)
│ │ └─┬GET
│ │   └─┬Extensions
│ │     └──[➕] x-version-added (5183:24)
│ ├─┬/_plugins/_ml/models/{model_id}/_deploy
│ │ └─┬POST
│ │   └─┬Extensions
│ │     └──[➕] x-version-added (5319:24)
│ └─┬/_plugins/_ml/models/{model_id}/_undeploy
│   └─┬POST
│     ├──[➕] requestBody (28245:7)❌ 
│     └─┬Extensions
│       └──[➕] x-version-added (5371:24)
└─┬Components
  ├──[➕] requestBodies (27847:7)
  ├──[➕] requestBodies (28261:7)
  ├──[➕] requestBodies (28347:7)
  ├──[➕] requestBodies (28245:7)
  ├──[➕] requestBodies (27921:7)
  ├──[➕] requestBodies (28091:7)
  ├──[➕] requestBodies (27998:7)
  ├──[➕] responses (31740:7)
  ├──[➕] responses (31634:7)
  ├──[➕] responses (31467:7)
  ├──[➕] responses (31746:7)
  ├──[➕] responses (31424:7)
  ├──[➕] responses (31581:7)
  ├──[➕] responses (31607:7)
  ├──[➕] responses (31758:7)
  ├──[➕] responses (31569:7)
  ├──[➕] responses (31710:7)
  ├──[➕] requestBodies (28416:7)
  ├──[➕] requestBodies (28405:7)
  ├──[➕] parameters (23589:7)
  ├──[➕] parameters (23667:7)
  ├──[➕] parameters (23480:7)
  ├──[➕] parameters (23673:7)
  ├──[➕] parameters (23487:7)
  ├──[➕] parameters (23637:7)
  ├──[➕] parameters (23595:7)
  ├──[➕] parameters (23680:7)
  ├──[➕] parameters (23577:7)
  ├──[➕] schemas (55376:7)
  ├──[➕] schemas (56371:7)
  ├──[➕] schemas (55825:7)
  ├──[➕] schemas (55555:7)
  ├──[➕] schemas (55818:7)
  ├──[➕] schemas (56379:7)
  ├──[➕] schemas (55407:7)
  ├──[➕] schemas (55782:7)
  ├──[➕] schemas (56376:7)
  ├──[➕] schemas (56390:7)
  ├──[➕] schemas (34650:7)
  ├──[➕] schemas (55865:7)
  └─┬ml._common___Source
    └─┬model_format
      └──[🔀] $ref (34650:13)❌ 

Document Element | Total Changes | Breaking Changes
paths            | 23            | 5
components       | 41            | 1
  • BREAKING Changes: 6 out of 64
  • Modifications: 4
  • Removals: 1
  • Additions: 59
  • Breaking Removals: 1
  • Breaking Modifications: 4
  • Breaking Additions: 1

Report

The full API changes report is available at: https://github.com/opensearch-project/opensearch-api-specification/actions/runs/12698226238/artifacts/2409613270

API Coverage

              | Before       | After         | Δ
Covered (%)   | 630 (61.7 %) | 643 (62.98 %) | 13 (1.28 %)
Uncovered (%) | 391 (38.3 %) | 378 (37.02 %) | -13 (-1.28 %)
Unknown       | 43           | 43            | 0
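
The percentages in the coverage table can be reproduced with a little arithmetic. The sketch below assumes the percentage is taken over covered plus uncovered operations (1,021 in total) and that the 43 unknown operations are left out; this is inferred from the numbers and is not stated by the report. `coverage_pct` is a hypothetical helper name.

```python
def coverage_pct(covered: int, uncovered: int) -> float:
    """Covered share of all known operations, as a percentage rounded to 2 places."""
    return round(100 * covered / (covered + uncovered), 2)


# Counts taken from the coverage table (unknown operations are excluded by assumption).
before = coverage_pct(630, 391)
after = coverage_pct(643, 378)
delta = round(after - before, 2)

print(before, after, delta)  # prints: 61.7 62.98 1.28
```

The 13 newly covered operations match the 13 paths listed under "ML Model APIs to add" in the description.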

github-actions bot commented Dec 13, 2024

Spec Test Coverage Analysis

Total | Tested
573   | 572 (99.83 %)

dblock (Member) left a comment

Looking good. I have some smallish nits. Iterate to green (validation is failing, etc.)

Review threads (outdated, resolved): CHANGELOG.md (2), tests/plugins/ml/ml/models.yaml (2)

dblock (Member) commented Dec 13, 2024

Btw, there's a whole bunch of others missing per #168. They don't all need to be done at the same time.

[Screenshot 2024-12-13 at 5:07:47 PM]

nathaliellenaa (Contributor Author) commented

Yes, I'm planning to create some small PRs based on the APIs category (model, model groups, connector, agents, etc.) I will also add more missing Model APIs to this PR.

nathaliellenaa changed the title from "Added missing ML Model APIs to the spec along with the tests" to "Added ML Model APIs" on Dec 14, 2024

dhrubo-os commented

DCO is missing @nathaliellenaa

dblock (Member) left a comment

Looking good. Some nits below.

In tests, try to get rid of multiple-paths-detected: false as much as possible by moving setup into prologues and teardown into epilogues. You should only need that set if there's a need to call an unrelated API (e.g. wait on a task) to complete the test. Otherwise, chapters should only contain calls to the API being tested.

Review threads (outdated, resolved): CHANGELOG.md, spec/namespaces/ml.yaml (5), tests/plugins/ml/models/load.yaml (2)

nathaliellenaa (Contributor Author) commented

The APIs POST /_plugins/_ml/models/{model_id}/{version}/_register and POST /_plugins/_ml/models/{model_id}/{version}/_upload will be deprecated, since customers now use only the POST /_plugins/_ml/models/_register API. Both APIs are also non-functional in the current and previous versions. I believe we should refrain from adding them to the specification to avoid future deprecation work on the client side.

dblock (Member) left a comment

Good work.

Go through tests and make sure only the API being tested is in the chapters and everything else is in prologues or epilogues, and that the naming is consistent.

Review threads (outdated, resolved): CHANGELOG.md, tests/plugins/ml/models/create_metadata.yaml, tests/plugins/ml/models/predict.yaml (3), tests/plugins/ml/models/undeploy_specific_models.yaml (2), tests/plugins/ml/models/unload.yaml, tests/plugins/ml/models/unload_specific_models.yaml, tests/plugins/ml/models/upload.yaml

dblock (Member) left a comment

Looks like tests fail consistently in CI, https://github.com/opensearch-project/opensearch-api-specification/actions/runs/12520890851/job/34927206366?pr=733, maybe needs longer retries or maybe there's a real problem.

dblock (Member) left a comment

To be consistent, tests/plugins/ml/models should be tests/plugins/ml/ml/models. I know the double ml looks weird there, but the folder ml is the name of the test suite (like default), and from there the path is the API path; all the APIs being tested are under _ml/models.

nathaliellenaa commented Dec 30, 2024

> Looks like tests fail consistently in CI, https://github.com/opensearch-project/opensearch-api-specification/actions/runs/12520890851/job/34927206366?pr=733, maybe needs longer retries or maybe there's a real problem.

The tests pass when get_completed_deploy_model_task is in the chapters section, but they fail when it is moved to the prologues section. I'll look deeper into this issue.

Update: It seems that when the deploy task state is still CREATED, the test immediately calls predict on the model instead of retrying until the state is RUNNING or COMPLETED. I suspected that it was picking up the state from the earlier model registration, but I verified that get_completed_deploy_model_task uses the correct deploy model task ID and task type.

@dblock Do you have any idea what causes this issue?

[INFO] => GET /_plugins/_ml/tasks/6I1wGZQBkjzF5RCKw0hU ({}) [application/json] 
[INFO] <= 200 (application/json; charset=UTF-8) | {
  "model_id": "541wGZQBkjzF5RCKdUiw",
  "task_type": "DEPLOY_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "CREATED",
  "worker_node": [
    "kLV6JVKsRpCfUABzKFFbsQ",
    "Wwv5YlwPRCeqFoTtk0-3Cg",
    "1Mel2oTKSH2Sxvy5W16QvQ",
    "VodBQqJ-TluOEBf4o5h72Q"
  ],
  "create_time": 1735593608020,
  "last_update_time": 1735593608020,
  "is_async": true
}
[INFO] $ {
  "outputs": {
    "model_id": "541wGZQBkjzF5RCKdUiw"
  }
}
[INFO] => POST /_plugins/_ml/models/541wGZQBkjzF5RCKdUiw/_predict ({}) [application/json] {
  "query_text": "The best selling book series in history is Harry Potter",
  "text_docs": [
    "Harry Potter is written by J.K. Rowling",
    "The Great Gatsby is a story of wealth and tragedy",
    "The Lord of the Rings is an epic high fantasy novel",
    "The best selling book series in history is Harry Potter"
  ]
}
[INFO] <= 400 (application/json) | {
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Model not ready yet. Please deploy the model first."
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Model not ready yet. Please deploy the model first."
  },
  "status": 400
}

dblock commented Dec 31, 2024

I don't believe retry can do this:

      until:
        - path: payload.state
          equal: COMPLETED

Retry schema is here. You want to wait till the response payload is COMPLETED, so:

response:
  payload:
    state: COMPLETED

Then retry will keep retrying until that's true.

    retry:
      count: 3
      wait: 10000

There's a bug in the schema: it should have loudly rejected until, which doesn't exist. Please open a bug, and feel free to fix it.
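
As a rough illustration of the retry behavior described in this comment, the sketch below re-issues a request until the response contains the expected payload, retrying up to `count` times with `wait_ms` milliseconds in between. It is not the test runner's actual implementation: `matches`, `run_with_retry`, and the injected `send` and `sleep` callables are hypothetical names, and treating `count` as the number of retries after the first attempt is an assumption.

```python
import time
from typing import Any, Callable, Tuple


def matches(expected: Any, actual: Any) -> bool:
    """True if every key/value in `expected` also appears in `actual` (recursively)."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and matches(value, actual[key])
            for key, value in expected.items()
        )
    return expected == actual


def run_with_retry(
    send: Callable[[], dict],
    expected: dict,
    count: int = 3,
    wait_ms: int = 10000,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[dict, int]:
    """Call `send` until its payload matches `expected` or retries run out."""
    attempts = 0
    while True:
        payload = send()
        attempts += 1
        # Stop on a match, or once the first attempt plus `count` retries are used up.
        if matches(expected, payload) or attempts > count:
            return payload, attempts
        sleep(wait_ms / 1000)


# Simulated task endpoint: the state advances on each call (hypothetical data).
states = iter(["CREATED", "RUNNING", "COMPLETED"])
payload, attempts = run_with_retry(
    send=lambda: {"task_type": "DEPLOY_MODEL", "state": next(states)},
    expected={"state": "COMPLETED"},
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(payload["state"], attempts)  # prints: COMPLETED 3
```

The expected payload is matched as a subset here, so extra fields in the response such as task_type do not cause a mismatch.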

nathaliellenaa (Contributor Author) commented

> Retry schema is here. You want to wait till the response payload is COMPLETED, so:
>
> response:
>   payload:
>     state: COMPLETED

I checked the error, and get_completed_deploy_model_task still has the same issue: the test immediately predicts with the model without waiting for the deploy task state to be COMPLETED. It seems that prologues don't have a response property (ref). Should get_completed_deploy_model_task be moved to the chapters section then?

dblock commented Jan 2, 2025

> It seems that prologues don't have a response property (ref). Should get_completed_deploy_model_task be moved to the chapters section then?

I think the right fix would be to add support for payload properties into the prologue. Want to give it a shot?

nathaliellenaa commented Jan 2, 2025

> I think the right fix would be to add support for payload properties into the prologue. Want to give it a shot?

Sure, I'll try this approach.

Update: I added the response payload properties to the prologue, but the same issue persists.

dblock commented Jan 7, 2025

> > I think the right fix would be to add support for payload properties into the prologue. Want to give it a shot?
>
> Sure, I'll try this approach.
>
> Update: I added the response payload properties to the prologue, but the same issue persists.

Add that code here, I can try to help. Also since #767 we have logs from CI so it might be helpful to look at those.

dblock (Member) left a comment

We were not actually checking prologue and epilogue payloads; that check is implemented in #772.
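
To see why an unchecked payload defeats the retry, consider the hedged sketch below. It assumes that a step passes whenever its checks pass and that a retry is only triggered by a failed check; `step_passes` and its `check_payload` flag are hypothetical and do not mirror the test runner's code or the change in #772. The sample response is modeled on the log earlier in this thread: HTTP 200 with the deploy task still in CREATED.

```python
def step_passes(response: dict, expected: dict, check_payload: bool) -> bool:
    """Evaluate one step; the payload is only compared when `check_payload` is set."""
    if response["status"] != expected["status"]:
        return False
    if not check_payload:
        # Status-only check: any 200 passes, so nothing ever triggers a retry.
        return True
    return all(
        response["payload"].get(key) == value
        for key, value in expected["payload"].items()
    )


response = {"status": 200, "payload": {"task_type": "DEPLOY_MODEL", "state": "CREATED"}}
expected = {"status": 200, "payload": {"state": "COMPLETED"}}

unchecked = step_passes(response, expected, check_payload=False)
checked = step_passes(response, expected, check_payload=True)
print(unchecked, checked)  # prints: True False
```

With the payload ignored, the prologue passes on its first 200 response and the story moves straight on to predict, which is consistent with the "Model not ready yet" error in the log.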

Review threads (outdated, resolved): .github/workflows/test-spec.yml

dblock commented Jan 9, 2025

@nathaliellenaa #772 was merged, rebase?

nathaliellenaa and others added 16 commits January 9, 2025 09:49
…o models/search.yaml, resolved conflicts and updated CHANGELOG

Signed-off-by: Nathalie Jonathan <[email protected]>
…project#732)

* Fixed /_search/scroll.

Signed-off-by: dblock <[email protected]>

* Added tests for GET and POST /_search.

Signed-off-by: dblock <[email protected]>

* Added a test for GET /_search/pipeline and DELETE /_search/pipeline/{id}.

Signed-off-by: dblock <[email protected]>

* Added missing _search/point_in_time tests.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
Signed-off-by: Nathalie Jonathan <[email protected]>
* Added tests for /_validate/query.

Signed-off-by: dblock <[email protected]>

* Added retry for opensearch-project#738.

Signed-off-by: dblock <[email protected]>

---------

Signed-off-by: dblock <[email protected]>
Signed-off-by: Nathalie Jonathan <[email protected]>
Signed-off-by: Nathalie Jonathan <[email protected]>
…own to epilogues in predict.yaml and load.yaml, updated CHANGELOG format, updated API description, 'model_group_id' ID type, 'version' parameter, and made 'model_format' a type of its own in ml.yaml.

Signed-off-by: Nathalie Jonathan <[email protected]>
…t for deprecated model metadata creation API.

Signed-off-by: Nathalie Jonathan <[email protected]>
…pload to create_metadata.yaml, updated CHANGELOG.

Signed-off-by: Nathalie Jonathan <[email protected]>
Signed-off-by: Nathalie Jonathan <[email protected]>
Signed-off-by: Nathalie Jonathan <[email protected]>
…cate map keys in ml._common.yaml, removed excluded parts and until property in the test files.

Signed-off-by: Nathalie Jonathan <[email protected]>
…prologue, attempted to fix errors in predict.yaml.

Signed-off-by: Nathalie Jonathan <[email protected]>
…y in prologues of predict.yaml, undeploy.yaml, unload.yaml, added version for ML Model APIs.

Signed-off-by: Nathalie Jonathan <[email protected]>
nathaliellenaa commented Jan 9, 2025

Looks like the implementation in #772 fixes the issue, thanks @dblock.

dblock merged commit c1651ec into opensearch-project:main on Jan 9, 2025
30 checks passed
nathaliellenaa deleted the add-ml-models-api branch on January 9, 2025 21:41

dblock commented Jan 9, 2025

Looks like at least 1 flaky test: #776
