diff --git a/.pylintrc b/.pylintrc index 5453b9ddb..7e364b960 100644 --- a/.pylintrc +++ b/.pylintrc @@ -510,7 +510,7 @@ int-import-graph= # Force import order to recognize a module as part of the standard # compatibility libraries. -known-standard-library=posixpath +known-standard-library=posixpath,typing,typing_extensions # Force import order to recognize a module as part of a third party library. known-third-party=enchant,cornice_swagger,cwltool,cwt,docker diff --git a/CHANGES.rst b/CHANGES.rst index 8dc0c6dc5..24b956d7d 100644 --- a/CHANGES.rst +++ b/CHANGES.rst @@ -12,11 +12,26 @@ Changes Changes: -------- -- No change. +- Add support of `Process` revisions (resolves `#107 `_). +- Add ``PATCH /processes/{processID}`` request, allowing ``MINOR`` and ``PATCH`` level modifications that can be + applied to an existing `Process` in order to revise non-execution critical information. Level ``PATCH`` is used to + identify changes with no impact on execution whatsoever, only affecting metadata such as its documented description. + Level ``MINOR`` is used to update components that affect only execution *methodology* (e.g.: sync/async) or `Process` + retrieval, but that do not directly impact *what* is executed (i.e.: the `Application Package` does not change). +- Add ``PUT /processes/{processID}`` request, allowing ``MAJOR`` revision to essentially redeploy a new `Process`, + but leaving some form of relationship with older versions by reusing the same `Process` ID. This ``MAJOR`` update + level implies a relatively critical change to execute the `Process`, such as the addition, removal or modification + of an input or output, directly impacting the `Application Package` definition and parameters the `Process` offers. +- Add support of ``{processID}:{version}`` representation in request path and ``processID`` of the `Job` definition + to reference the specific `Process` revisions when fetching a `Process` description or a `Job` status. 
+- Add search query ``version`` and ``revisions`` parameters to allow description of a specific `Process` revision, or + listing all of its version history. +- Add more entries in ``links`` referring to `Process` revisions whenever applicable. Fixes: ------ -- No change. +- Fix invalid ``minimum`` and ``maximum`` OpenAPI fields that were defined as ``minLength`` and ``maxLength`` + (duplicate definitions) for `Process` description and deployment schema validation. .. _changes_4.19.0: diff --git a/README.rst b/README.rst index 15c4bf041..f56e392f8 100644 --- a/README.rst +++ b/README.rst @@ -302,7 +302,7 @@ It is part of `PAVICS`_ and `Birdhouse`_ ecosystems and is available within the .. _ogc-eo-apps-pilot-er: http://docs.opengeospatial.org/per/20-045.html .. |ogc-best-practices-eo-apppkg| replace:: OGC Best Practice for Earth Observation Application Package .. _ogc-best-practices-eo-apppkg: https://docs.ogc.org/bp/20-089r1.html -.. |ogc-api-proc-ext-part2| replace:: `OGC API - Processes - Part 2: Deploy, Replace, Undeploy`_ (DRU) extension +.. |ogc-api-proc-part2| replace:: `OGC API - Processes - Part 2: Deploy, Replace, Undeploy`_ (DRU) extension .. _`OGC API - Processes - Part 2: Deploy, Replace, Undeploy`: https://github.com/opengeospatial/ogcapi-processes/tree/master/extensions/deploy_replace_undeploy .. |ogc-apppkg| replace:: `OGC Application Package` ..
_ogc-apppkg: https://github.com/opengeospatial/ogcapi-processes/blob/master/extensions/deploy_replace_undeploy/standard/openapi/schemas/ogcapppkg.yaml diff --git a/docs/examples/update-process-minor.http b/docs/examples/update-process-minor.http new file mode 100644 index 000000000..600bb38b0 --- /dev/null +++ b/docs/examples/update-process-minor.http @@ -0,0 +1,9 @@ +PATCH /processes/test-process HTTP/1.1 +Host: weaver.example.com +Content-Type: application/json + +{ + "description": "process async only", + "jobControlOptions": ["async-execute"], + "version": "1.4.0" +} diff --git a/docs/examples/update-process-patch.http b/docs/examples/update-process-patch.http new file mode 100644 index 000000000..cd46a2b75 --- /dev/null +++ b/docs/examples/update-process-patch.http @@ -0,0 +1,17 @@ +PATCH /processes/test-process:1.2.3 HTTP/1.1 +Host: weaver.example.com +Content-Type: application/json + +{ + "description": "new description", + "inputs": { + "input": { + "description": "modified input description" + } + }, + "outputs": { + "output": { + "title": "modified title" + } + } +} diff --git a/docs/source/processes.rst b/docs/source/processes.rst index ff9a8eae5..2cee945ac 100644 --- a/docs/source/processes.rst +++ b/docs/source/processes.rst @@ -413,6 +413,133 @@ that define the process references and expected inputs/outputs. .. _`Provider requests`: https://pavics-weaver.readthedocs.io/en/latest/api.html#tag/Providers .. _`Process requests`: https://pavics-weaver.readthedocs.io/en/latest/api.html#tag/Processes +.. versionchanged:: 4.20 +With the addition of :term:`Process` revisions (see :ref:`Update Operation ` request. The undeploy operation consists of a ``DELETE`` request targeting the +specific ``{WEAVER_URL}/processes/{processID}`` to be removed. + +.. note:: + The :term:`Process` must be accessible by the user considering any visibility configuration to perform this step. + See :ref:`proc_op_deploy` section for details. + +..
versionadded:: 4.20 + +Starting from version `4.20 `_, a :term:`Process` can be replaced or +updated using respectively the ``PUT`` and ``PATCH`` requests onto the specific ``{WEAVER_URL}/processes/{processID}`` +location of the reference to modify. + +.. note:: + The :term:`Process` partial update operation (using ``PATCH``) is specific to `Weaver` only. + |ogc-api-proc-part2|_ only mandates the definition of the ``PUT`` request for full override of a :term:`Process`. + +When a :term:`Process` is modified using the ``PATCH`` operation, only the new definitions need to be provided, and +unspecified items are transferred over from the referenced :term:`Process` (i.e.: the previous revision). Using either +the ``PUT`` or ``PATCH`` requests, previous revisions can be referenced using two formats: + +- ``{processID}:{version}`` as request path parameters (instead of the usual ``{processID}`` only) +- ``{processID}`` in the request path combined with ``?version={version}`` query parameter + +`Weaver` employs ``MAJOR.MINOR.PATCH`` semantic versioning to maintain revisions of updated or replaced :term:`Process` +definitions. The next revision number to employ for update or replacement can either be provided explicitly in the +request body using a ``version``, or be omitted. When omitted, the next revision will be guessed automatically based +on the previous available revision according to the level of changes required. In either case, the resolved ``version`` +will have to be available and respect the expected update level to be accepted as a new valid :term:`Process` revision. +The applicable revision level depends on the contents being modified by the submitted request body fields according +to the following table. When a combination of the below items occurs, the higher update level is required.
+ ++-------------+-----------+------------------------------------+------------------------------------------------------+ +| HTTP Method | Level | Change | Examples | ++=============+===========+====================================+======================================================+ +| ``PATCH`` | ``PATCH`` | Modifications to metadata | - :term:`Process` ``description``, ``title`` strings | +| | | not impacting the :term:`Process` | - :term:`Process` ``keywords``, ``metadata`` lists | +| | | execution or definition. | - inputs/outputs ``description``, ``title`` strings | +| | | | - inputs/outputs ``keywords``, ``metadata`` lists | ++-------------+-----------+------------------------------------+------------------------------------------------------+ +| ``PATCH`` | ``MINOR`` | Modification that impacts *how* | - :term:`Process` ``jobControlOptions`` (async/sync) | +| | | the :term:`Process` could be | - :term:`Process` ``outputTransmission`` (ref/value) | +| | | executed, but not its definition. | - :term:`Process` ``visibility`` | ++-------------+-----------+------------------------------------+------------------------------------------------------+ +| ``PUT`` | ``MAJOR`` | Modification that impacts *what* | - Any :term:`Application Package` modification | +| | | the :term:`Process` executes. | - Any inputs/outputs change (formats, occurs, type) | +| | | | - Any inputs/outputs addition or removal | ++-------------+-----------+------------------------------------+------------------------------------------------------+ + +.. note:: + For all applicable fields of updating a :term:`Process`, refer to the schema of |update-req|_. + For replacing a :term:`Process`, refer instead to the schema of |replace-req|_. The replacement request contents + are extremely similar to the :ref:`Deploy ` schema since the full :term:`Process` definition must + be provided. 
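The version-acceptance rule described around the table above can be sketched as a small standalone function. This is illustrative only, not `Weaver`'s actual implementation (the real ``VersionLevel`` and ``is_update_version`` live in ``weaver.utils``); it simply mirrors the behavior documented here and exercised by the unit tests later in this diff: a candidate ``version`` is accepted at a given level when it is higher than every known revision sharing the corresponding prefix (and, for ``MINOR``/``PATCH``, when such a series already exists to update from).

```python
from enum import Enum


class VersionLevel(Enum):
    MAJOR = "major"
    MINOR = "minor"
    PATCH = "patch"


def is_update_version(version, versions, level):
    # type: (str, list, VersionLevel) -> bool
    """
    Check if the candidate ``version`` is a valid next revision at ``level``
    relative to the known ``versions`` (``MAJOR.MINOR.PATCH`` strings).
    """
    ver = tuple(int(part) for part in version.split("."))
    known = [tuple(int(part) for part in v.split(".")) for v in versions]
    if level is VersionLevel.MAJOR:
        # a MAJOR revision must go above every known version
        return ver[0] > max(v[0] for v in known)
    if level is VersionLevel.MINOR:
        # a MINOR revision must extend an existing MAJOR series,
        # going above every known MINOR within that series
        group = [v for v in known if v[0] == ver[0]]
        return bool(group) and ver[1] > max(v[1] for v in group)
    # a PATCH revision must extend an existing MAJOR.MINOR series,
    # going above every known PATCH within that series
    group = [v for v in known if v[:2] == ver[:2]]
    return bool(group) and ver[2] > max(v[2] for v in group)


versions = ["0.1.2", "1.0.3", "1.2.0", "1.2.3", "1.2.4", "1.3.1"]
assert is_update_version("1.2.5", versions, VersionLevel.PATCH)      # next PATCH after 1.2.4
assert not is_update_version("1.3.0", versions, VersionLevel.PATCH)  # 1.3.1 already exists above it
assert is_update_version("1.4.0", versions, VersionLevel.MINOR)      # skipping 1.3.x is permitted
assert not is_update_version("2.0.0", versions, VersionLevel.MINOR)  # no 2.x series to update from
assert is_update_version("2.0.0", versions, VersionLevel.MAJOR)
```

Note how this matches the skipping rule described below: ``1.4.0`` is a valid ``MINOR`` update even though ``1.3.x`` revisions were skipped, because no higher ``MINOR`` exists within the ``1.x`` series.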
+ +For example, if the ``test-process:1.2.3`` was previously deployed, and is the active latest revision of that +:term:`Process`, submitting the below request body will produce a ``PATCH`` revision as ``test-process:1.2.4``. + +.. literalinclude:: ../examples/update-process-patch.http + :language: http + :caption: Sample request for ``PATCH`` revision + +Here, only metadata is adjusted and there is no risk of impacting produced results or execution methods of the +:term:`Process`. An external user would probably not even notice the :term:`Process` changed, which is why ``PATCH`` +is reasonable in this case. Notice that the ``version`` is not explicitly provided in the body. It is guessed +automatically from the modified contents. Also, the example demonstrates how :term:`Process`-level and +inputs/outputs-level metadata can be updated. + +Similarly, the following request would produce a ``MINOR`` revision of ``test-process``. Since both ``PATCH`` and +``MINOR`` level contents are defined for update, the higher ``MINOR`` revision is required. In this case, ``MINOR`` is +required because ``jobControlOptions`` (forced to asynchronous execution for following versions) would break any future +request made by users who expect the :term:`Process` to run (or support) synchronous execution. + +Notice that this time, the :term:`Process` reference does not indicate the revision in the path (no ``:1.2.4`` part). +This automatically resolves to the updated revision ``test-process:1.2.4`` that became the new latest revision following +our previous ``PATCH`` request. + +.. literalinclude:: ../examples/update-process-minor.http + :language: http + :caption: Sample request for ``MINOR`` revision + +In this case, the desired ``version`` (``1.4.0``) is also specified explicitly in the body.
Since the updated number +(``MINOR = 4``) matches the expected update level from the above table and respects a higher level than the reference +``1.2.4`` :term:`Process`, this revision value will be accepted (instead of the auto-resolved ``1.3.0`` otherwise). Note +that if ``2.4.0`` was specified instead, the version would be refused, as `Weaver` does not consider this modification +to be worth a ``MAJOR`` revision, and tries to keep version levels consistent. Skipping numbers (i.e.: ``1.3.0`` in this +case) is permitted as long as there are no other versions of the same level above it (i.e.: ``1.4.0`` would be refused +if ``1.5.0`` existed). This allows some flexibility with revisions in case users want to use specific numbering +values that have more meaning to them. It is recommended to let `Weaver` auto-update version values between updates if +this level of fine-grained control is not required. + +.. note:: + To avoid conflicting definitions, a :term:`Process` cannot be :ref:`Deployed ` directly using a + ``{processID}:{version}`` reference. Deployments are expected as the *first revision* and should only include the + ``{processID}`` portion as their identifier. + +If the user desires a specific version to deploy, the ``PUT`` request should be used with the appropriate ``version`` +within the request body. It is, however, up to the user to provide the full definition of that :term:`Process`, +as the ``PUT`` request will completely replace the previous definition rather than transfer over previous updates +(i.e.: ``PATCH`` requests). + +Even when a :term:`Process` is *"replaced"* using ``PUT``, the older revision is not actually removed and undeployed +(``DELETE`` request). It is therefore still possible to refer to the old revision using explicit references with the +corresponding ``version``.
`Weaver` keeps track of revisions by corresponding ``{processID}`` entries such that if +the latest revision is undeployed, the previous revision will automatically become the latest once again. For complete +replacement, the user should instead perform a ``DELETE`` of all existing revisions (to avoid conflicts) followed by a +new :ref:`Deploy ` request. + .. _proc_op_execute: Execution of a process (Execute) @@ -483,6 +610,14 @@ and parametrization of various input/output combinations. Let's employ the follo .. |exec-api| replace:: OpenAPI Execute .. _exec-api: `exec-req`_ + +.. versionchanged:: 4.20 + +With the addition of :term:`Process` revisions (see :ref:`Update Operation `_ (latest) ======================================================================== -replace = +replace = `Unreleased `_ (latest) ======================================================================== - + Changes: -------- - No change. - + Fixes: ------ - No change. - + .. _changes_{new_version}: - + `{new_version} `_ ({now:%%Y-%%m-%%d}) ======================================================================== @@ -42,14 +42,14 @@ search = LABEL version="{current_version}" replace = LABEL version="{new_version}" [tool:pytest] -addopts = +addopts = --strict-markers --tb=native weaver/ log_cli = false log_level = DEBUG python_files = test_*.py -markers = +markers = cli: mark test as related to CLI operations testbed14: mark test as 'testbed14' validation functional: mark test as functionality validation @@ -67,7 +67,7 @@ lines_between_types = 0 combine_as_imports = true default_section = FIRSTPARTY sections = FUTURE,STDLIB,THIRDPARTY,FIRSTPARTY,LOCALFOLDER -known_standard_library = posixpath +known_standard_library = posixpath,typing,typing_extensions known_first_party = weaver known_third_party = cornice_swagger,cwltool,cwt,docker skip = *.egg*,build,env,src,venv @@ -80,7 +80,7 @@ targets = . 
[flake8] ignore = E126,E226,E402,F401,W503,W504 max-line-length = 120 -exclude = +exclude = src, .git, __pycache__, @@ -105,14 +105,14 @@ add_select = D201,D213 branch = true source = ./ include = weaver/* -omit = +omit = setup.py docs/* tests/* *_mako [coverage:report] -exclude_lines = +exclude_lines = pragma: no cover raise AssertionError raise NotImplementedError diff --git a/tests/functional/test_cli.py b/tests/functional/test_cli.py index b6033a770..5276c310d 100644 --- a/tests/functional/test_cli.py +++ b/tests/functional/test_cli.py @@ -245,6 +245,10 @@ def test_undeploy(self): def test_describe(self): result = mocked_sub_requests(self.app, self.client.describe, self.test_process["Echo"]) + assert self.test_payload["Echo"]["processDescription"]["process"]["version"] == "1.0", ( + "Original version submitted should be partial." + ) + assert result.success # see deployment file for details that are expected here assert result.body["id"] == self.test_process["Echo"] @@ -261,7 +265,9 @@ def test_describe(self): assert result.body["outputs"]["output"]["description"] == "Output file with echo message." assert result.body["outputs"]["output"]["formats"] == [{"default": True, "mediaType": ContentType.TEXT_PLAIN}] assert "undefined" not in result.message, "CLI should not have confused process description as response detail." - assert "description" not in result.body, "CLI should not have overridden the process description field." + assert result.body["description"] == ( + "Dummy process that simply echo's back the input message for testing purposes." + ), "CLI should not have overridden the process description field." 
def run_execute_inputs_schema_variant(self, inputs_param, process="Echo", preload=False, location=False, expect_success=True, mock_exec=True): diff --git a/tests/functional/utils.py b/tests/functional/utils.py index c9c19b201..037666e17 100644 --- a/tests/functional/utils.py +++ b/tests/functional/utils.py @@ -32,8 +32,9 @@ if TYPE_CHECKING: from typing import Any, Dict, Optional, Union + from typing_extensions import Literal - from weaver.typedefs import AnyRequestMethod, AnyResponseType, JSON, Literal, SettingsType + from weaver.typedefs import AnyRequestMethod, AnyResponseType, JSON, SettingsType ReferenceType = Literal["deploy", "describe", "execute", "package"] diff --git a/tests/test_datatype.py b/tests/test_datatype.py index d853c5a5b..bb6c5950c 100644 --- a/tests/test_datatype.py +++ b/tests/test_datatype.py @@ -169,3 +169,15 @@ def test_process_io_schema_ignore_uri(): # if 'default format' "schema" exists, it must not cause an error when parsing the object wps_proc = proc_obj.wps() assert any(isinstance(out.json.get("schema"), str) for out in wps_proc.outputs) + + +@pytest.mark.parametrize("process_id,result", [ + ("urn:test:1.2.3", ("urn:test", "1.2.3")), + ("urn:uuid:process:test", ("urn:uuid:process:test", None)), + ("urn:test:random-test:1.3.4", ("urn:test:random-test", "1.3.4")), + ("random-test:1", ("random-test", "1")), + ("random-test:1.3.4", ("random-test", "1.3.4")), + ("random-test:not.a.version", ("random-test:not.a.version", None)), +]) +def test_process_split_version(process_id, result): + assert Process.split_version(process_id) == result diff --git a/tests/test_opensearch.py b/tests/test_opensearch.py index 1d3049bb6..2f4814eb8 100644 --- a/tests/test_opensearch.py +++ b/tests/test_opensearch.py @@ -177,6 +177,9 @@ def _get_mocked(req): stack.enter_context(mock.patch("weaver.wps_restapi.processes.processes.get_db", side_effect=MockDB)) stack.enter_context(mock.patch("weaver.wps_restapi.processes.utils.get_db", side_effect=MockDB)) 
stack.enter_context(mock.patch("weaver.wps_restapi.processes.utils.get_settings", side_effect=_get_mocked)) + stack.enter_context(mock.patch("weaver.database.get_settings", side_effect=_get_mocked)) + stack.enter_context(mock.patch("weaver.database.mongodb.get_settings", side_effect=_get_mocked)) + stack.enter_context(mock.patch("weaver.datatype.get_settings", side_effect=_get_mocked)) stack.enter_context(mock.patch("weaver.processes.utils.get_db", side_effect=MockDB)) stack.enter_context(mock.patch("weaver.processes.utils.get_settings", side_effect=_get_mocked)) # given diff --git a/tests/test_utils.py b/tests/test_utils.py index 80b4d6a54..5a996b25e 100644 --- a/tests/test_utils.py +++ b/tests/test_utils.py @@ -4,6 +4,7 @@ import inspect import json import os +import random import shutil import tempfile import uuid @@ -35,6 +36,7 @@ from weaver.status import JOB_STATUS_CATEGORIES, STATUS_PYWPS_IDS, STATUS_PYWPS_MAP, Status, StatusCompliant, map_status from weaver.utils import ( NullType, + VersionLevel, apply_number_with_unit, assert_sane_name, bytes2str, @@ -46,6 +48,7 @@ get_sane_name, get_ssl_verify_option, get_url_without_query, + is_update_version, is_valid_url, localize_datetime, make_dirs, @@ -96,6 +99,54 @@ def test_is_url_valid(): assert is_valid_url(None) is False +def test_is_update_version(): + versions = [ + "0.1.2", + "1.0.3", + "1.2.0", + "1.2.3", + "1.2.4", + "1.3.1", + ] + random.shuffle(versions) # function must not depend on input order + + assert not is_update_version("0.1.0", versions, VersionLevel.PATCH) + assert not is_update_version("1.0.1", versions, VersionLevel.PATCH) + assert not is_update_version("1.2.1", versions, VersionLevel.PATCH) + assert not is_update_version("1.2.3", versions, VersionLevel.PATCH) + assert not is_update_version("1.3.0", versions, VersionLevel.PATCH) + assert not is_update_version("1.3.1", versions, VersionLevel.PATCH) + assert not is_update_version("1.4.5", versions, VersionLevel.PATCH) # no 1.4.x to update from 
+ + assert not is_update_version("0.1.0", versions, VersionLevel.MINOR) + assert not is_update_version("0.1.4", versions, VersionLevel.MINOR) + assert not is_update_version("1.2.5", versions, VersionLevel.MINOR) + assert not is_update_version("1.3.2", versions, VersionLevel.MINOR) + assert not is_update_version("2.0.0", versions, VersionLevel.MINOR) # no 2.x to update from + assert not is_update_version("2.1.3", versions, VersionLevel.MINOR) + + assert not is_update_version("0.1.0", versions, VersionLevel.MAJOR) + assert not is_update_version("0.1.4", versions, VersionLevel.MAJOR) + assert not is_update_version("0.2.0", versions, VersionLevel.MAJOR) + assert not is_update_version("0.2.9", versions, VersionLevel.MAJOR) + assert not is_update_version("1.2.5", versions, VersionLevel.MAJOR) + assert not is_update_version("1.3.2", versions, VersionLevel.MAJOR) + assert not is_update_version("1.4.0", versions, VersionLevel.MAJOR) + + assert is_update_version("0.1.3", versions, VersionLevel.PATCH) + assert is_update_version("1.2.5", versions, VersionLevel.PATCH) + assert is_update_version("1.3.2", versions, VersionLevel.PATCH) + + assert is_update_version("0.2.0", versions, VersionLevel.MINOR) + assert is_update_version("0.2.1", versions, VersionLevel.MINOR) + assert is_update_version("0.3.0", versions, VersionLevel.MINOR) + assert is_update_version("1.4.0", versions, VersionLevel.MINOR) + assert is_update_version("1.5.0", versions, VersionLevel.MINOR) + + assert is_update_version("2.0.0", versions, VersionLevel.MAJOR) + assert is_update_version("2.1.3", versions, VersionLevel.MAJOR) + + def test_get_url_without_query(): url_h = "http://some-host.com/wps" url_q = f"{url_h}?service=WPS" diff --git a/tests/wps_restapi/test_jobs.py b/tests/wps_restapi/test_jobs.py index bf8e24378..5c6e9f894 100644 --- a/tests/wps_restapi/test_jobs.py +++ b/tests/wps_restapi/test_jobs.py @@ -909,7 +909,7 @@ def test_jobs_list_schema_not_required_fields(self): """ uri = 
sd.openapi_json_service.path resp = self.app.get(uri, headers=self.json_headers) - schema_prefix = sd.GetJobsQueries.__name__ + schema_prefix = sd.GetProcessJobsQuery.__name__ assert not resp.json["parameters"][f"{schema_prefix}.page"]["required"] assert not resp.json["parameters"][f"{schema_prefix}.limit"]["required"] diff --git a/tests/wps_restapi/test_processes.py b/tests/wps_restapi/test_processes.py index 7ebb27c89..82b986069 100644 --- a/tests/wps_restapi/test_processes.py +++ b/tests/wps_restapi/test_processes.py @@ -1,5 +1,4 @@ # pylint: disable=R1729 # ignore non-generator representation employed for displaying test log results - import base64 import contextlib import copy @@ -7,6 +6,7 @@ import os import tempfile import unittest +import uuid from copy import deepcopy from typing import TYPE_CHECKING @@ -45,10 +45,10 @@ from weaver.wps_restapi import swagger_definitions as sd if TYPE_CHECKING: - from typing import Optional + from typing import List, Optional, Tuple from weaver.processes.constants import ProcessSchemaType - from weaver.typedefs import CWL, JSON + from weaver.typedefs import CWL, JSON, AnyHeadersContainer, AnyVersion # pylint: disable=C0103,invalid-name @@ -270,6 +270,107 @@ def test_get_processes_bad_request_paging_providers(self): assert resp.status_code == 400 assert "ListingInvalidParameter" in resp.json["error"] + def deploy_process_revisions(self, process_id): + # type: (str) -> List[str] + """ + Generates some revisions of a given process. 
+ """ + versions = ["1.2.0"] + cwl, _ = self.deploy_process_CWL_direct(ContentType.APP_JSON, process_id=process_id, version=versions[0]) + data = {"title": "first revision", "version": "1.2.3"} + resp = self.app.patch_json(f"/processes/{process_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + versions.append(data["version"]) + data = {"title": "second revision", "version": "1.2.5"} + resp = self.app.patch_json(f"/processes/{process_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + versions.append(data["version"]) + data = {"title": "third revision", "version": "1.3.2", "jobControlOptions": [ExecuteControlOption.SYNC]} + resp = self.app.patch_json(f"/processes/{process_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + versions.append(data["version"]) + data = {"title": "fourth revision", "version": "1.3.4"} + resp = self.app.patch_json(f"/processes/{process_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + versions.append(data["version"]) + data = copy.deepcopy(cwl) # type: JSON + data.update({"version": "2.0.0", "inputs": {"message": {"type": "string"}}}) + resp = self.app.put_json(f"/processes/{process_id}", params=data, headers=self.json_headers) + assert resp.status_code == 201 + versions.append(data["version"]) + data = {"value": Visibility.PUBLIC} # must make visible otherwise will not be listed/retrievable + resp = self.app.put_json(f"/processes/{process_id}/visibility", params=data, headers=self.json_headers) + assert resp.status_code == 200 + return sorted(versions) + + def test_get_processes_with_tagged_revisions(self): + """ + Listing of mixed processes with and without revisions. + + .. 
versionadded:: 4.20 + """ + path = get_path_kvp("/processes", revisions=True, detail=False) + resp = self.app.get(path, headers=self.json_headers, expect_errors=True) + assert resp.status_code == 200 + body = resp.json + proc_no_revs = body["processes"] + assert len(proc_no_revs) > 0, "cannot test mixed no-revision/with-revisions listing without prior processes" + + # create some processes with different combinations of revisions, no-version, single-version + proc1_id = "first-process" + proc1_versions = self.deploy_process_revisions(proc1_id) + proc1_tags = [f"{proc1_id}:{ver}" for ver in proc1_versions] + proc2_id = "other-process" + proc2_versions = self.deploy_process_revisions(proc2_id) + proc2_tags = [f"{proc2_id}:{ver}" for ver in proc2_versions] + proc_total = len(proc_no_revs) + len(proc1_versions) + len(proc2_versions) + + path = get_path_kvp("/processes", revisions=True, detail=False) + resp = self.app.get(path, headers=self.json_headers, expect_errors=True) + assert resp.status_code == 200 + body = resp.json + assert len(body["processes"]) == proc_total + assert body["processes"] == sorted(proc_no_revs + proc1_tags + proc2_tags) + + path = get_path_kvp("/processes", revisions=True, detail=True) + resp = self.app.get(path, headers=self.json_headers, expect_errors=True) + assert resp.status_code == 200 + body = resp.json + assert len(body["processes"]) == proc_total + proc_result = [(proc["id"], proc["version"]) for proc in body["processes"]] + proc_expect = [(proc_id, "0.0.0") for proc_id in proc_no_revs] + proc_expect += [(tag, ver) for tag, ver in zip(proc1_tags, proc1_versions)] + proc_expect += [(tag, ver) for tag, ver in zip(proc2_tags, proc2_versions)] + assert proc_result == sorted(proc_expect) + + def test_get_processes_with_history_revisions(self): + """ + When requesting specific process ID with revisions, version history of this process is listed. + + .. 
versionadded:: 4.20 + """ + p_id = "test-process-history-revision" + versions = self.deploy_process_revisions(p_id) + revisions = [f"{p_id}:{ver}" for ver in versions] + + path = get_path_kvp("/processes", process=p_id, revisions=True, detail=False) + resp = self.app.get(path, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert "processes" in body and len(body["processes"]) > 0 + assert body["processes"] == revisions, ( + "sorted processes by version with tagged representation expected when requesting revisions" + ) + + path = get_path_kvp("/processes", process=p_id, revisions=True, detail=True) + resp = self.app.get(path, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert "processes" in body and len(body["processes"]) > 0 + result = [(proc["id"], proc["version"]) for proc in body["processes"]] + expect = list(zip(revisions, versions)) + assert result == expect + @mocked_remote_server_requests_wps1([ resources.TEST_REMOTE_SERVER_URL, resources.TEST_REMOTE_PROCESS_GETCAP_WPS1_XML, @@ -558,11 +659,11 @@ def assert_deployed_wps3(response_json, expected_process_id, assert_io=True): assert proc["outputs"][0]["formats"][0]["mediaType"] == ContentType.APP_JSON def deploy_process_make_visible_and_fetch_deployed(self, - deploy_payload, - expected_process_id, - headers=None, - assert_io=True, - ): + deploy_payload, # type: JSON + expected_process_id, # type: str + headers=None, # type: Optional[AnyHeadersContainer] + assert_io=True, # type: bool + ): # type: (...) -> JSON """ Deploy, make visible and obtain process description. 
@@ -597,6 +698,7 @@ def deploy_process_make_visible_and_fetch_deployed(self, return resp.json def get_application_package(self, process_id): + # type: (str) -> CWL resp = self.app.get(f"/processes/{process_id}/package", headers=self.json_headers) assert resp.status_code == 200 return resp.json @@ -665,10 +767,14 @@ def test_deploy_process_CWL_direct_raised_missing_id(self): assert resp.status_code == 400 assert "'Deploy.DeployCWL.id': 'Missing required field.'" in resp.json["cause"] - def deploy_process_CWL_direct(self, content_type, graph_count=0): - p_id = "test-direct-cwl-json" + def deploy_process_CWL_direct(self, + content_type, # type: ContentType + graph_count=0, # type: int + process_id="test-direct-cwl-json", # type: str + version=None, # type: Optional[AnyVersion] + ): # type: (...) -> Tuple[CWL, JSON] cwl_core = { - "id": p_id, + "id": process_id, "class": "CommandLineTool", "baseCommand": ["python3", "-V"], "inputs": {}, @@ -684,6 +790,8 @@ def deploy_process_CWL_direct(self, content_type, graph_count=0): cwl = {} cwl_base = {"cwlVersion": "v1.0"} cwl.update(cwl_base) + if version: + cwl["version"] = version if graph_count: cwl["$graph"] = [cwl_core] * graph_count else: @@ -691,8 +799,8 @@ def deploy_process_CWL_direct(self, content_type, graph_count=0): if "yaml" in content_type: cwl = yaml.safe_dump(cwl, sort_keys=False) headers = {"Content-Type": content_type} - desc = self.deploy_process_make_visible_and_fetch_deployed(cwl, p_id, headers=headers, assert_io=False) - pkg = self.get_application_package(p_id) + desc = self.deploy_process_make_visible_and_fetch_deployed(cwl, process_id, headers=headers, assert_io=False) + pkg = self.get_application_package(process_id) assert desc["deploymentProfile"] == "http://www.opengis.net/profiles/eoc/dockerizedApplication" # once parsed, CWL I/O are converted to listing form @@ -707,7 +815,7 @@ def deploy_process_CWL_direct(self, content_type, graph_count=0): # process description should have been generated with 
relevant I/O proc = desc["process"] - assert proc["id"] == p_id + assert proc["id"] == process_id assert proc["inputs"] == [] assert proc["outputs"] == [{ "id": "output", @@ -715,6 +823,7 @@ def deploy_process_CWL_direct(self, content_type, graph_count=0): "schema": {"type": "string", "contentMediaType": "text/plain"}, "formats": [{"default": True, "mediaType": "text/plain"}] }] + return cwl, desc def test_deploy_process_CWL_direct_JSON(self): self.deploy_process_CWL_direct(ContentType.APP_CWL_JSON) @@ -1088,6 +1197,453 @@ def test_deploy_process_WPS3_DescribeProcess_owsContext(self): def test_deploy_process_WPS3_DescribeProcess_executionUnit(self): raise NotImplementedError + def test_deploy_process_with_revision_invalid(self): + """ + Ensure that new deployment directly using a ``{processID}:{version}`` reference is not allowed. + + This nomenclature is reserved for revisions accomplished with PUT or PATCH requests with controlled versioning. + + .. versionadded:: 4.20 + """ + cwl = { + "cwlVersion": "v1.0", + "class": "CommandLineTool", + "baseCommand": ["python3", "-V"], + "inputs": {}, + "outputs": { + "output": { + "type": "File", + "outputBinding": { + "glob": "stdout.log" + }, + } + }, + } + + headers = {"Content-Type": ContentType.APP_CWL_JSON, "Accept": ContentType.APP_JSON} + data = copy.deepcopy(cwl) + data["id"] = "invalid-process:1.2.3" + resp = self.app.post_json("/processes", params=data, headers=headers, expect_errors=True) + assert resp.status_code in [400, 422] + assert "invalid" in resp.json["description"] + + data = { + "processDescription": {"process": {"id": "invalid-process:1.2.3"}}, + "executionUnit": [{"unit": cwl}], + "deploymentProfileName": "http://www.opengis.net/profiles/eoc/dockerizedApplication", + } + resp = self.app.post_json("/processes", params=data, headers=self.json_headers, expect_errors=True) + assert resp.status_code in [400, 422] + assert "invalid" in resp.json["description"] + + def test_update_process_not_found(self): +
resp = self.app.patch_json("/processes/not-found", params={}, headers=self.json_headers, expect_errors=True) + assert resp.status_code == 404 + + def test_update_process_no_data(self): + """ + Error expected if no data is provided for an update request. + + .. versionadded:: 4.20 + """ + p_id = "test-update-no-data" + self.deploy_process_CWL_direct(ContentType.APP_JSON, process_id=p_id) + resp = self.app.patch_json(f"/processes/{p_id}", params={}, headers=self.json_headers, expect_errors=True) + assert resp.status_code == 400 + assert resp.json["title"] == "Failed process parameter update." + + data = {"description": None, "title": None} + resp = self.app.patch_json(f"/processes/{p_id}", params=data, headers=self.json_headers, expect_errors=True) + assert resp.status_code == 400 + assert resp.json["title"] == "Failed process parameter update." + + data = {"unknown-field": "new content"} + resp = self.app.patch_json(f"/processes/{p_id}", params=data, headers=self.json_headers, expect_errors=True) + assert resp.status_code == 400 + assert resp.json["title"] == "Failed process parameter update." + + def test_update_process_latest_valid(self): + """ + Update the current process revision with new metadata (making it an older revision while a new one becomes the latest). + + The change should be marked as a PATCH revision. + + .. versionadded:: 4.20 + """ + p_id = "test-update-cwl-json" + _, desc = self.deploy_process_CWL_direct(ContentType.APP_JSON, process_id=p_id) + assert desc["process"]["version"] is None, "No version provided should be reported as such." 
+ data = { + "description": "New description", + "title": "Another title", + } + resp = self.app.patch_json(f"/processes/{p_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["processSummary"]["title"] == data["title"] + assert body["processSummary"]["description"] == data["description"] + + resp = self.app.get(f"/processes/{p_id}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["title"] == data["title"] + assert body["description"] == data["description"] + assert body["version"] == "0.0.1", ( + "PATCH revision expected. Since previous did not have a version, " + "it should be assumed 0.0.0, making this revision 0.0.1." + ) + + def test_update_process_older_valid(self): + """ + Update an older process (already a previous revision) with new metadata. + + The older and updated process references must then be adjusted to ensure that fetching by process name only + returns the new latest definition, while specific process tag returns the expected revision. + + .. 
versionadded:: 4.20 + """ + p_id = "test-update-cwl-json" + version = "1.2.3" + self.deploy_process_CWL_direct(ContentType.APP_JSON, process_id=p_id, version=version) + resp = self.app.get(f"/processes/{p_id}", headers=self.json_headers) + assert resp.status_code == 200 + assert resp.json["version"] == version + assert "description" not in resp.json + + data = { + "description": "New description", + "title": "Another title", + } + resp = self.app.patch_json(f"/processes/{p_id}:{version}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + + resp = self.app.get(f"/processes/{p_id}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["version"] == "1.2.4", "Patch update expected" + assert body["title"] == data["title"] + assert body["description"] == data["description"] + + data = { + "title": "Another change with version", + "version": "1.2.7", # doesn't have to be the one right after latest (as long as greater than 1.2.4) + } + resp = self.app.patch_json(f"/processes/{p_id}:{version}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + + # check that previous 'latest' can be fetched by specific version + resp = self.app.get(f"/processes/{p_id}:{version}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["id"] == f"{p_id}:{version}" + assert body["version"] == version + assert "description" not in body + + # check final result with both explicit '1.2.7' version and new 'latest' + for p_ref in [p_id, f"{p_id}:1.2.7"]: + resp = self.app.get(f"/processes/{p_ref}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["version"] == data["version"], "Specific version update expected" + assert body["title"] == data["title"] + assert "description" not in body, ( + "Not modified since no new value, value from reference process must be used. 
" + "Must not make use of the intermediate '1.2.4' version, since '1.2.3' was explicitly requested for update." + ) + + def test_update_process_auto_revision(self): + """ + When updating a process, if no version is explicitly provided, the next one is automatically applied. + + The next version depends on the level of changes implied. The proper semantic level should be bumped according + to the information that gets updated. + """ + p_id = "test-process-auto-revision" + cwl, _ = self.deploy_process_CWL_direct(ContentType.APP_JSON, process_id=p_id, version="1.0") + + data = {"title": "new title"} + resp = self.app.patch_json(f"/processes/{p_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["processSummary"]["title"] == data["title"] + assert body["processSummary"]["version"] == "1.0.1", "only metadata updated, PATCH auto-revision expected" + + old_title = data["title"] + data = {"description": "modify description"} + resp = self.app.patch_json(f"/processes/{p_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["processSummary"]["title"] == old_title + assert body["processSummary"]["version"] == "1.0.2", "only metadata updated, PATCH auto-revision expected" + assert body["processSummary"]["description"] == data["description"] + assert body["processSummary"]["jobControlOptions"] == [ExecuteControlOption.ASYNC] # default, validate for next + + old_desc = data["description"] + data = {"jobControlOptions": ExecuteControlOption.values(), "title": "any exec control"} + resp = self.app.patch_json(f"/processes/{p_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["processSummary"]["title"] == data["title"] + assert body["processSummary"]["version"] == "1.1.0", "MINOR revision expected for change that affects execute" + assert body["processSummary"]["description"] == old_desc + assert 
body["processSummary"]["jobControlOptions"] == data["jobControlOptions"] + + old_title = data["title"] + old_jco = data["jobControlOptions"] + data = {"outputs": {"output": {"title": "the output"}}} + resp = self.app.patch_json(f"/processes/{p_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["processSummary"]["title"] == old_title + assert body["processSummary"]["version"] == "1.1.1", "only metadata updated, PATCH auto-revision expected" + assert body["processSummary"]["description"] == old_desc + assert body["processSummary"]["jobControlOptions"] == old_jco + resp = self.app.get(f"/processes/{p_id}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["outputs"]["output"]["title"] == data["outputs"]["output"]["title"] + + cwl["inputs"] = {"message": {"type": "string"}} + cwl.pop("version", None) # make sure none specified, let MAJOR auto-revision with latest + data = { + "processDescription": {"process": {"id": p_id, "visibility": Visibility.PUBLIC}}, + "executionUnit": [{"unit": cwl}] + } + resp = self.app.put_json(f"/processes/{p_id}", params=data, headers=self.json_headers) + assert resp.status_code == 201 + body = resp.json + assert "title" not in body["processSummary"] # everything resets because PUT replaces, not updated like PATCH + assert "description" not in body["processSummary"] + assert body["processSummary"]["version"] == "2.0.0", "full process updated, MAJOR auto-revision expected" + assert body["processSummary"]["jobControlOptions"] == ExecuteControlOption.values() + resp = self.app.get(f"/processes/{p_id}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert "message" in body["inputs"] + assert "description" not in body["outputs"]["output"] + assert body["outputs"]["output"]["title"] == "output", "default title generated from ID since none provided" + + data = { # validate mixed format use and distinct 
PATCH/MINOR level changes + "title": "mixed format", + "outputs": [{"id": "output", "title": "updated output title", "description": "new description added"}], + "inputs": {"message": {"description": "message input"}}, + } + resp = self.app.patch_json(f"/processes/{p_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["processSummary"]["title"] == data["title"] + assert body["processSummary"]["version"] == "2.0.1", "only metadata updated, PATCH auto-revision expected" + resp = self.app.get(f"/processes/{p_id}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["version"] == "2.0.1", "only metadata updated, PATCH auto-revision expected" + assert body["inputs"]["message"]["description"] == data["inputs"]["message"]["description"] + assert body["outputs"]["output"]["title"] == data["outputs"][0]["title"] + assert body["outputs"]["output"]["description"] == data["outputs"][0]["description"] + + def test_update_process_jobs_adjusted(self): + """ + Validate that given a valid process update, associated jobs update their references to preserve links. + + If links were not updated with the new tagged revision, older jobs would refer to the updated (latest) process, + which might not make sense according to the level of modifications applied for this process. + + .. 
versionadded:: 4.20 + """ + p_id = "test-update-job-refs" + self.deploy_process_CWL_direct(ContentType.APP_JSON, process_id=p_id) + job = self.job_store.save_job(task_id=uuid.uuid4(), process=p_id, access=Visibility.PUBLIC) + + # verify that job initially refers to "latest" process + path = f"/jobs/{job.id}" + resp = self.app.get(path, headers=self.json_headers) + body = resp.json + assert "processID" in body and body["processID"] == p_id + path = get_path_kvp(f"/processes/{p_id}/jobs", detail=False) + resp = self.app.get(path, headers=self.json_headers) + body = resp.json + assert len(body["jobs"]) == 1 and str(job.id) in body["jobs"] + + # update process + data = { + "description": "New description", + "title": "Another title", + } + resp = self.app.patch_json(f"/processes/{p_id}", params=data, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert "version" in body["processSummary"] and body["processSummary"]["version"] not in [None, "0.0.0"] + + # verify job was updated with new reference + path = f"/jobs/{job.id}" + resp = self.app.get(path, headers=self.json_headers) + body = resp.json + assert "processID" in body and body["processID"] == f"{p_id}:0.0.0" + + path = get_path_kvp(f"/processes/{p_id}/jobs", detail=False) + resp = self.app.get(path, headers=self.json_headers) + body = resp.json + assert len(body["jobs"]) == 0 + path = get_path_kvp(f"/processes/{p_id}:0.0.0/jobs", detail=False) + resp = self.app.get(path, headers=self.json_headers) + body = resp.json + assert len(body["jobs"]) == 1 and str(job.id) in body["jobs"] + + def test_replace_process_valid(self): + """ + Redeploy a process by replacing its definition (MAJOR revision update). + + Validate both different deploy formats (CWL, OAS, OGC) and different resolution methods of the target version, + based on the auto-resolved latest process when omitted or on an explicit version specification in the payload. + + .. 
versionadded:: 4.20 + """ + # first deploy uses direct CWL, the following update uses the OGC-AppPackage schema, and the last uses the OpenAPI schema + # validate that distinct deployment schema does not pose a problem to parse update contents + p_id = "test-replace-cwl" + v1 = "1.2.3" + cwl_v1, desc_v1 = self.deploy_process_CWL_direct(ContentType.APP_JSON, process_id=p_id, version=v1) + assert desc_v1["process"]["version"] == v1 + assert desc_v1["process"]["inputs"] == [] + + cwl_v2 = copy.deepcopy(cwl_v1) + cwl_v2.pop("id", None) # ensure no reference + cwl_v2.pop("version", None) # avoid conflict + cwl_v2["inputs"]["test"] = {"type": "string"} # type: ignore + data = { + "processDescription": {"process": { + # must include ID in deploy payload for counter validation against updated process + # (error otherwise since reusing same Deploy schema that requires it) + "id": p_id, + # make public to allow retrieving the process + # since we "override" the revision with PUT, omitting this would make the new version private + "visibility": Visibility.PUBLIC, + # new information to apply, validate that it is also considered, not just the package redefinition + "title": "Updated CWL" + }}, + "executionUnit": [{"unit": cwl_v2}], + "deploymentProfileName": "http://www.opengis.net/profiles/eoc/dockerizedApplication", + } + v2 = "2.0.0" # not explicitly provided, expected resolved MAJOR update for revision + resp = self.app.put_json(f"/processes/{p_id}", params=data, headers=self.json_headers) + assert resp.status_code == 201 + body = resp.json + assert body["processSummary"]["title"] == data["processDescription"]["process"]["title"], ( + "Even though MAJOR update for CWL is accomplished, other fields that usually correspond to MINOR changes " + "should also be applied at the same time since the operation replaces the entire process definition (PUT)." 
+ ) + assert ( + "description" not in body["processSummary"] or # if undefined, dropped from body + body["processSummary"]["description"] != desc_v1["description"] # just in case, check otherwise + ), ( + "Description should not have remained from previous version since this is a replacement (PUT), " + "not a revision update (PATCH)." + ) + + path = get_path_kvp(f"/processes/{p_id}:{v1}", schema=ProcessSchema.OGC) + resp = self.app.get(path, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert "title" not in body, "should be missing as in original definition" + assert "description" not in body, "should be missing as in original definition" + assert body["version"] == v1 + assert body["inputs"] == {}, "empty mapping due to OGC schema, no input as in original definition" + + resp = self.app.get(f"/processes/{p_id}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["title"] == data["processDescription"]["process"]["title"] + assert "description" not in body + assert body["version"] == v2, f"Since no version was specified, next MAJOR version after {v1} was expected." 
+ assert len(body["inputs"]) == 1 and "test" in body["inputs"] + + # redeploy with explicit version + cwl_v3 = copy.deepcopy(cwl_v2) + cwl_v3.pop("version", None) # avoid conflict + # need to provide basic input definition in CWL to avoid dropping it when checked against payload definitions + # add extra literal data domain information in OAS structure + cwl_v3["inputs"]["number"] = {"type": "int"} # type: ignore + v3 = "4.3.2" # does not necessarily need to be the next one + data = { + "processDescription": { + "id": p_id, # required to fulfill schema validation, must omit 'version' part and match request path ID + "version": "4.3.2", # explicitly provided to avoid auto-bump to '3.0.0' + # use OAS representation in this case to validate it is still valid using update request + "inputs": {"number": {"schema": {"type": "integer", "minimum": 1, "maximum": 3}}}, + "visibility": Visibility.PUBLIC, # ensure we can retrieve the description later + }, + "executionUnit": [{"unit": cwl_v3}], + } + # don't need to refer to "latest" process since we provide an explicit version that is available + resp = self.app.put_json(f"/processes/{p_id}:{v1}", params=data, headers=self.json_headers) + assert resp.status_code == 201 + + # check all versions are properly resolved + resp = self.app.get(f"/processes/{p_id}:{v1}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["version"] == v1 + assert body["id"] == f"{p_id}:{v1}" + resp = self.app.get(f"/processes/{p_id}:{v2}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["version"] == v2 + assert body["id"] == f"{p_id}:{v2}" + resp = self.app.get(f"/processes/{p_id}:{v3}", headers=self.json_headers) # explicitly the latest by version + assert resp.status_code == 200 + body = resp.json + assert body["version"] == v3 + assert body["id"] == p_id + resp = self.app.get(f"/processes/{p_id}", headers=self.json_headers) # latest implicitly + assert 
resp.status_code == 200 + body = resp.json + assert body["version"] == v3 + assert body["id"] == p_id + + def test_delete_process_revision(self): + """ + Process revisions can be deleted (undeployed) just like any other process. + + In the event that the revision to delete happens to be the active latest, the remaining revision with the + highest semantic version should become the new latest revision. + + .. versionadded:: 4.20 + """ + p_id = "test-delete-process-revision" + versions = self.deploy_process_revisions(p_id) + + # delete a process revision + del_ver = versions[3] # pick any not latest + path = f"/processes/{p_id}:{del_ver}" + resp = self.app.delete_json(path, headers=self.json_headers) + assert resp.status_code == 200 + + # check that revision was properly removed + resp = self.app.get(path, headers=self.json_headers, expect_errors=True) + assert resp.status_code == 404 + path = get_path_kvp("/processes", detail=False, revisions=True, process=p_id) + resp = self.app.get(path, headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["processes"] == [f"{p_id}:{ver}" for ver in versions if ver != del_ver] + + # check that latest version was not affected since it wasn't the latest that was deleted + resp = self.app.get(f"/processes/{p_id}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["version"] == versions[-1] + + # delete latest to validate it gets updated with the version before it + latest_ver = versions[-1] + path = f"/processes/{p_id}:{latest_ver}" + resp = self.app.delete_json(path, headers=self.json_headers) + assert resp.status_code == 200 + + resp = self.app.get(f"/processes/{p_id}", headers=self.json_headers) + assert resp.status_code == 200 + body = resp.json + assert body["version"] == versions[-2], "new latest should be the version before the previously removed latest" + def test_delete_process_success(self): path = f"/processes/{self.process_public.identifier}" resp = 
self.app.delete_json(path, headers=self.json_headers) diff --git a/tests/wps_restapi/test_swagger_definitions.py b/tests/wps_restapi/test_swagger_definitions.py new file mode 100644 index 000000000..f01c50c59 --- /dev/null +++ b/tests/wps_restapi/test_swagger_definitions.py @@ -0,0 +1,55 @@ +import uuid + +import colander +import pytest + +from weaver.wps_restapi import swagger_definitions as sd + + +def test_process_id_with_version_tag_deploy_invalid(): + """ + Validate that a process ID with a version label is not allowed as a definition during deployment or update. + + To take advantage of auto-resolution of unique :meth:`StoreProcesses.fetch_by_id` with version references injected + in the process ID stored in the database, deployment and description of processes must not allow it to avoid + conflicts. The storage should take care of replacing the ID value transparently after it was resolved. + """ + test_id_version_invalid = [ + "process:1.2.3", + "test-process:4.5.6", + "other_process:1", + "invalid-process:1_2_3", + f"{uuid.uuid4()}:7.8.9", + ] + for test_id in test_id_version_invalid: + with pytest.raises(colander.Invalid): + sd.ProcessIdentifier().deserialize(test_id) + for test_id in test_id_version_invalid: + test_id = test_id.split(":", 1)[0] + assert sd.ProcessIdentifier().deserialize(test_id) == test_id + + +def test_process_id_with_version_tag_get_valid(): + """ + Validate that a process ID with a tagged version is permitted as a request path parameter to retrieve it. 
+ """ + test_id_version_valid = [ + "test-ok", + "test-ok1", + "test-ok:1", + "test-ok:1.2.3", + "also_ok:1.3", + ] + test_id_version_invalid = [ + "no-:1.2.3", + "not-ok1.2.3", + "no:", + "not-ok:", + "not-ok11:", + "not-ok1.2.3:", + ] + for test_id in test_id_version_invalid: + with pytest.raises(colander.Invalid): + sd.ProcessIdentifierTag().deserialize(test_id) + for test_id in test_id_version_valid: + assert sd.ProcessIdentifierTag().deserialize(test_id) == test_id diff --git a/weaver/database/__init__.py b/weaver/database/__init__.py index 8d4b8e6a6..3700557af 100644 --- a/weaver/database/__init__.py +++ b/weaver/database/__init__.py @@ -1,6 +1,7 @@ import logging from typing import TYPE_CHECKING +from pyramid.request import Request from pyramid.settings import asbool from weaver.database.mongodb import MongoDatabase @@ -28,6 +29,10 @@ def get_db(container=None, reset_connection=False): It is preferable to provide a registry reference to reuse any available connection whenever possible. Giving application settings will require establishing a new connection. 
""" + if not reset_connection and isinstance(container, Request): + db = getattr(container, "db", None) + if isinstance(db, MongoDatabase): + return db registry = get_registry(container, nothrow=True) if not reset_connection and registry and isinstance(getattr(registry, "db", None), MongoDatabase): return registry.db diff --git a/weaver/database/base.py b/weaver/database/base.py index 6b5a12dd3..7e5caf61f 100644 --- a/weaver/database/base.py +++ b/weaver/database/base.py @@ -4,10 +4,11 @@ from weaver.store.base import StoreInterface if TYPE_CHECKING: - from typing import Any + from typing import Any, Type, Union + from typing_extensions import Literal from weaver.store.base import StoreBills, StoreJobs, StoreProcesses, StoreQuotes, StoreServices, StoreVault - from weaver.typedefs import AnySettingsContainer, JSON, Literal, Type, Union + from weaver.typedefs import AnySettingsContainer, JSON AnyStore = Union[ StoreBills, @@ -64,6 +65,36 @@ def _get_store_type(store_type): return store_type raise TypeError(f"Unsupported store type selector: [{store_type}] ({type(store_type)})") + @overload + def get_store(self, store_type): + # type: (StoreBillsSelector) -> StoreBills + ... + + @overload + def get_store(self, store_type): + # type: (StoreQuotesSelector) -> StoreQuotes + ... + + @overload + def get_store(self, store_type): + # type: (StoreJobsSelector) -> StoreJobs + ... + + @overload + def get_store(self, store_type): + # type: (StoreProcessesSelector) -> StoreProcesses + ... + + @overload + def get_store(self, store_type): + # type: (StoreServicesSelector) -> StoreServices + ... + + @overload + def get_store(self, store_type): + # type: (StoreVaultSelector) -> StoreVault + ... 
+ @overload def get_store(self, store_type, *store_args, **store_kwargs): # type: (StoreBillsSelector, *Any, **Any) -> StoreBills diff --git a/weaver/datatype.py b/weaver/datatype.py index 4d849b62e..01e1b85db 100644 --- a/weaver/datatype.py +++ b/weaver/datatype.py @@ -37,8 +37,11 @@ from weaver.processes.types import ProcessType from weaver.quotation.status import QuoteStatus from weaver.status import JOB_STATUS_CATEGORIES, Status, StatusCategory, map_status +from weaver.store.base import StoreProcesses from weaver.utils import localize_datetime # for backward compatibility of previously saved jobs not time-locale-aware from weaver.utils import ( + VersionFormat, + as_version_major_minor_patch, fully_qualified_name, get_job_log_msg, get_log_date_fmt, @@ -54,7 +57,7 @@ from weaver.wps_restapi.utils import get_wps_restapi_base_url if TYPE_CHECKING: - from typing import Any, Callable, Dict, IO, List, Optional, Union + from typing import Any, Callable, Dict, IO, Iterator, List, Optional, Tuple, Union from owslib.wps import WebProcessingService @@ -68,6 +71,7 @@ AnyProcess, AnySettingsContainer, AnyUUID, + AnyVersion, ExecutionInputs, ExecutionOutputs, Number, @@ -126,6 +130,21 @@ def __repr__(self): _repr = dict.__repr__(self) return f"{_type} ({_repr})" + @classmethod + def properties(cls, fget=True, fset=True): + # type: (bool, bool) -> Iterator[str] + """ + Get names of properties stored in the object, optionally filtered by read-only or write-only conditions. 
+ """ + return iter( + name for name, prop in inspect.getmembers(cls) + if not name.startswith("_") and isinstance(prop, property) and ( + (fget and fset and prop.fget is not None and prop.fset is not None) or + (fget and not fset and prop.fget is not None and prop.fset is None) or + (not fget and fset and prop.fget is None and prop.fset is not None) + ) + ) + def dict(self): # type: () -> AnyParams """ @@ -462,9 +481,9 @@ def summary(self, container, fetch=True, ignore=False): When metadata fetching is disabled, the generated summary will contain only information available locally. - :param container: employed to retrieve application settings. - :param fetch: indicates whether metadata should be fetched from remote. - :param ignore: indicates if failing metadata retrieval/parsing should be silently discarded or raised. + :param container: Employed to retrieve application settings. + :param fetch: Indicates whether metadata should be fetched from remote. + :param ignore: Indicates if failing metadata retrieval/parsing should be silently discarded or raised. :return: generated summary information. :raises ServiceParsingError: If the target service provider is not reachable, content is not parsable or any other error related to @@ -581,7 +600,7 @@ def check_accessible(self, settings, ignore=True): class Job(Base): """ - Dictionary that contains OWS service jobs. + Dictionary that contains :term:`Job` details for local :term:`Process` or remote :term:`OWS` execution. It always has ``id`` and ``task_id`` keys. 
""" @@ -1765,40 +1784,94 @@ def __init__(self, *args, **kwargs): if "id" not in self and "identifier" not in self: raise TypeError("'id' OR 'identifier' is required") if "id" not in self: - self["id"] = self.pop("identifier") + self.id = self.pop("identifier") if "package" not in self: raise TypeError("'package' is required") - @property - def id(self): + def _get_id(self): # type: () -> str return dict.__getitem__(self, "id") + def _set_id(self, _id): + # type: (str) -> None + self["id"] = _id + + id = identifier = property(fget=_get_id, fset=_set_id, doc=( + "Unique process identifier with optional version number if it corresponds to an older revision." + )) + + @classmethod + def split_version(cls, process_id): + # type: (str) -> Tuple[str, Optional[str]] + """ + Split the tagged version from the :term:`Process` identifier considering any required special handling. + + :returns: Process ID (only) and the version if any was available in the tagged reference. + """ + # note:: + # Consider 'urn:...' prefix that could cause ':' characters to be present although maybe no version in ID? + # Not currently permitted due to schema validation parsing of ID on deploy, but could become permitted... + result = process_id.rsplit(":", 1) + if len(result) != 2: + return process_id, None + p_id, version = result + return (p_id, version) if all(str.isnumeric(part) for part in version.split(".")) else (process_id, None) + + @property + def latest(self): + # type: () -> bool + """ + Checks if this :term:`Process` corresponds to the latest revision. + """ + # if ID loaded from DB contains a version, it is not the latest by design + return self.split_version(self.id)[-1] is None + @property - def identifier(self): + def name(self): # type: () -> str + """ + Obtain only the :term:`Process` name portion of the unique identifier. 
+ """ + if self.version: + return self.split_version(self.id)[0] return self.id - @identifier.setter - def identifier(self, value): - # type: (str) -> None - self["id"] = value + @property + def tag(self): + # type: () -> str + """ + Full identifier including the version for a unique reference. + """ + proc_id = self.split_version(self.id)[0] + # bw-compat, if no version available, no update was applied (single deploy) + # there is no need to define a tag as only one result can be found + # on next (if any) update request, this revision will be updated with a default version + if self.version is None: + return proc_id + version = as_version_major_minor_patch(self.version, VersionFormat.STRING) + return f"{proc_id}:{version}" @property def title(self): # type: () -> str return self.get("title", self.id) - @property - def abstract(self): - # type: () -> str - return self.get("abstract", "") + @title.setter + def title(self, title): + # type: (str) -> None + self["title"] = title - @property - def description(self): - # OGC-API-Processes v1 field representation + def _get_desc(self): + # type: () -> str + # OGC-API-Processes v1 field representation uses 'description' # bw-compat with existing processes that defined it as abstract - return self.abstract or self.get("description", "") + return self.get("abstract", "") or self.get("description", "") + + def _set_desc(self, description): + # type: (str) -> None + self["abstract"] = description + + description = abstract = property(fget=_get_desc, fset=_set_desc, doc="Process description.") @property def keywords(self): @@ -1809,16 +1882,33 @@ def keywords(self): self["keywords"] = keywords return dict.__getitem__(self, "keywords") + @keywords.setter + def keywords(self, keywords): + # type: (List[str]) -> None + self["keywords"] = list(set(sd.KeywordList().deserialize(keywords))) + @property def metadata(self): # type: () -> List[Metadata] return self.get("metadata", []) + @metadata.setter + def metadata(self, metadata): 
+ # type: (List[Metadata]) -> None + self["metadata"] = sd.MetadataList().deserialize(metadata) + @property def version(self): # type: () -> Optional[str] return self.get("version") + @version.setter + def version(self, version): + # type: (AnyVersion) -> None + if not isinstance(version, str): + version = as_version_major_minor_patch(version, VersionFormat.STRING) + self["version"] = sd.Version().deserialize(version) + @property def inputs(self): # type: () -> Optional[List[Dict[str, JSON]]] @@ -1856,6 +1946,13 @@ def inputs(self): _input["schema"] = self._decode(input_schema) return inputs + @inputs.setter + def inputs(self, inputs): + # type: (List[Dict[str, JSON]]) -> None + if not isinstance(inputs, list): + raise TypeError("Inputs container expected as list to normalize process definitions.") + self["inputs"] = inputs + @property def outputs(self): # type: () -> Optional[List[Dict[str, JSON]]] @@ -1885,6 +1982,13 @@ def outputs(self): _output["schema"] = self._decode(output_schema) return outputs + @outputs.setter + def outputs(self, outputs): + # type: (List[Dict[str, JSON]]) -> None + if not isinstance(outputs, list): + raise TypeError("Outputs container expected as list to normalize process definitions.") + self["outputs"] = outputs + @property def jobControlOptions(self): # noqa: N802 # type: () -> List[AnyExecuteControlOption] @@ -1892,7 +1996,6 @@ def jobControlOptions(self): # noqa: N802 Control options that indicate which :term:`Job` execution modes are supported by the :term:`Process`. .. note:: - There are no official mentions about the ordering of ``jobControlOptions``. Nevertheless, it is often expected that the first item can be considered the default mode when none is requested explicitly (at execution time). 
With the definition of execution mode through the ``Prefer`` @@ -2103,6 +2206,7 @@ def params(self): "keywords": self.keywords, "metadata": self.metadata, "version": self.version, + "additional_links": self.additional_links, # escape potential OpenAPI JSON $ref in 'schema' also used by Mongo BSON "inputs": [self._encode(_input) for _input in self.inputs or []], "outputs": [self._encode(_output) for _output in self.outputs or []], @@ -2151,6 +2255,12 @@ def json(self): """ return sd.Process().deserialize(self.dict()) + _links = property( + fget=lambda self: self.get("_links", []), + fset=lambda self, value: dict.__setitem__(self, "_links", value), + doc="Cache pre-computed links." + ) + def links(self, container=None): # type: (Optional[AnySettingsContainer]) -> List[Link] """ @@ -2158,12 +2268,18 @@ def links(self, container=None): :param container: object that helps retrieve instance details, namely the host URL. """ + from weaver.database import get_db + + if self._links: # save re-computation time if already done + return self._links + proc_desc = self.href(container) proc_list = proc_desc.rsplit("/", 1)[0] jobs_list = proc_desc + sd.jobs_service.path proc_exec = proc_desc + "/execution" + proc_self = (proc_list + "/" + self.tag) if self.version else proc_desc links = [ - {"href": proc_desc, "rel": "self", "title": "Current process description."}, + {"href": proc_self, "rel": "self", "title": "Current process description."}, {"href": proc_desc, "rel": "process-meta", "title": "Process definition."}, {"href": proc_exec, "rel": "http://www.opengis.net/def/rel/ogc/1.0/execute", "title": "Process execution endpoint for job submission."}, @@ -2173,6 +2289,28 @@ def links(self, container=None): "title": "List of job executions corresponding to this process."}, {"href": proc_list, "rel": "up", "title": "List of processes registered under the service."}, ] + if self.version: + proc_tag = f"{proc_list}/{self.tag}" + proc_hist = 
f"{proc_list}?detail=false&revisions=true&process={self.id}" + links.extend([ + {"href": proc_tag, "rel": "working-copy", "title": "Tagged version of this process description."}, + {"href": proc_desc, "rel": "latest-version", "title": "Most recent revision of this process."}, + {"href": proc_hist, "rel": "version-history", "title": "Listing of all revisions of this process."}, + ]) + versions = get_db(container).get_store(StoreProcesses).find_versions(self.name, VersionFormat.OBJECT) + proc_ver = as_version_major_minor_patch(self.version, VersionFormat.OBJECT) + prev_ver = list(filter(lambda ver: ver < proc_ver, versions)) + next_ver = list(filter(lambda ver: ver > proc_ver, versions)) + if prev_ver: + proc_prev = f"{proc_desc}:{prev_ver[-1]!s}" + links.append( + {"href": proc_prev, "rel": "predecessor-version", "title": "Previous revision of this process."} + ) + if next_ver: + proc_next = f"{proc_desc}:{next_ver[0]!s}" + links.append( + {"href": proc_next, "rel": "successor-version", "title": "Next revision of this process."} + ) if self.service: api_base_url = proc_list.rsplit("/", 1)[0] wps_base_url = self.processEndpointWPS1.split("?")[0] @@ -2192,11 +2330,35 @@ def links(self, container=None): link.setdefault("hreflang", AcceptLanguage.EN_CA) # add user-provided additional links, no type/hreflang added since we cannot guess them known_links = {link.get("rel") for link in links} - extra_links = self.get("additional_links", []) + extra_links = self.additional_links extra_links = [link for link in extra_links if link.get("rel") not in known_links] links.extend(extra_links) + self._links = links return links + @property + def additional_links(self): + # type: () -> List[Link] + return self.get("additional_links", []) + + @additional_links.setter + def additional_links(self, links): + # type: (List[Link]) -> None + links = sd.LinkList().deserialize(links) + self["additional_links"] = [] # don't flag an existing rel that is about to be overridden as conflicting + 
self._links = [] # need recompute + all_rel = [link["rel"] for link in self.links()] + for link in links: + rel = link["rel"] + if rel in all_rel: + raise ValueError( + f"Value of '{type(self).__name__}.additional_links' is not valid. " + f"Unique links relations are required but '{rel}' is already taken." + ) + all_rel.append(rel) + self["additional_links"] = links + self._links = [] # need recompute on future call + def href(self, container=None): # type: (Optional[AnySettingsContainer]) -> str """ @@ -2257,12 +2419,17 @@ def offering(self, schema=ProcessSchema.OGC): # process fields directly at root + I/O as mappings return sd.ProcessDescriptionOGC().deserialize(process) - def summary(self): - # type: () -> JSON + def summary(self, revision=False): + # type: (bool) -> JSON """ Obtains the JSON serializable summary representation of the process. + + :param revision: Replace the process identifier by the complete tag representation. """ - return sd.ProcessSummary().deserialize(self.dict()) + data = self.dict() + if revision: + data["id"] = self.tag + return sd.ProcessSummary().deserialize(data) @staticmethod def from_wps(wps_process, **extra_params): diff --git a/weaver/processes/convert.py b/weaver/processes/convert.py index 387787f46..648df6978 100644 --- a/weaver/processes/convert.py +++ b/weaver/processes/convert.py @@ -2609,9 +2609,9 @@ def normalize_ordered_io(io_section, order_hints=None): First, converts I/O definitions defined as dictionary to an equivalent :class:`list` representation, in order to work only with a single representation method. The :class:`list` is chosen over :class:`dict` because - sequences can enforce a specific order, while mapping have no particular order. The list representation ensures - that I/O order is preserved when written to file and reloaded afterwards regardless of each server and/or library's - implementation of the mapping container.
+ sequences can enforce a specific order, while mappings (when saved as :term:`JSON` or :term:`YAML`) have no specific + order. The list representation ensures that I/O order is preserved when written to file and reloaded afterwards + regardless of the server's and/or library's implementation of the mapping container. If this function fails to correctly order any I/O or cannot correctly guarantee such result because of the provided parameters (e.g.: no hints given when required), the result will not break nor change the final processing behaviour diff --git a/weaver/processes/execution.py b/weaver/processes/execution.py index 7f7172045..14351d3e8 100644 --- a/weaver/processes/execution.py +++ b/weaver/processes/execution.py @@ -559,7 +559,7 @@ def submit_job(request, reference, tags=None): lang = request.accept_language.header_value # can only preemptively check if local process if isinstance(reference, Process): service_url = reference.processEndpointWPS1 - process_id = reference.id + process_id = reference.identifier # explicit 'id:version' process revision if available, otherwise simply 'id' visibility = reference.visibility is_workflow = reference.type == ProcessType.WORKFLOW is_local = True diff --git a/weaver/processes/utils.py b/weaver/processes/utils.py index a3250c77a..1b1da9033 100644 --- a/weaver/processes/utils.py +++ b/weaver/processes/utils.py @@ -1,3 +1,4 @@ +import copy import logging import os import pathlib @@ -17,8 +18,7 @@ HTTPForbidden, HTTPNotFound, HTTPOk, - HTTPUnprocessableEntity, - HTTPUnsupportedMediaType + HTTPUnprocessableEntity ) from pyramid.settings import asbool @@ -43,76 +43,115 @@ ServiceNotFound, log_unhandled_exceptions ) -from weaver.formats import ContentType +from weaver.formats import ContentType, repr_json +from weaver.processes.convert import get_field, normalize_ordered_io, set_field from weaver.processes.types import ProcessType -from weaver.store.base import StoreProcesses, StoreServices +from weaver.store.base import
StoreJobs, StoreProcesses, StoreServices from weaver.utils import ( + VersionFormat, + VersionLevel, + as_version_major_minor_patch, fully_qualified_name, generate_diff, + get_any_id, get_header, get_sane_name, get_settings, - get_url_without_query + get_url_without_query, + is_update_version ) from weaver.visibility import Visibility from weaver.wps.utils import get_wps_client from weaver.wps_restapi import swagger_definitions as sd -from weaver.wps_restapi.utils import get_wps_restapi_base_url +from weaver.wps_restapi.processes.utils import resolve_process_tag +from weaver.wps_restapi.utils import get_wps_restapi_base_url, parse_content LOGGER = logging.getLogger(__name__) if TYPE_CHECKING: from typing import List, Optional, Tuple, Union - from pyramid.request import Request - from weaver.typedefs import ( AnyHeadersContainer, AnyRegistryContainer, AnyRequestType, AnySettingsContainer, + AnyVersion, CWL, FileSystemPathType, JSON, + Literal, + NotRequired, + PyramidRequest, Number, - SettingsType + SettingsType, + TypedDict ) + UpdateFieldListMethod = Literal["append", "override"] + UpdateFieldListSpec = TypedDict("UpdateFieldListSpec", { + "source": str, + "target": NotRequired[str], + "unique": NotRequired[bool], + "method": UpdateFieldListMethod, + }, total=True) + UpdateFields = List[Union[str, UpdateFieldListSpec]] + # FIXME: # https://github.com/crim-ca/weaver/issues/215 # define common Exception classes that won't require this type of conversion -def get_process(process_id=None, request=None, settings=None, store=None): - # type: (Optional[str], Optional[Request], Optional[SettingsType], Optional[StoreProcesses]) -> Process +def get_process(process_id=None, request=None, settings=None, revision=True): + # type: (Optional[str], Optional[PyramidRequest], Optional[SettingsType], bool) -> Process """ Obtain the specified process and validate information, returning appropriate HTTP error if invalid.
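The revised ``get_process`` accepts tagged references of the form ``{processID}:{version}``. A minimal standalone sketch of how such a tag can be split is shown below; the helper name and regular expression here are illustrative assumptions, not necessarily Weaver's actual ``Process.split_version`` implementation:

```python
import re
from typing import Optional, Tuple

# hypothetical pattern: process ID, a colon, then a 1- to 3-part numeric version
TAG_VERSION_RE = re.compile(r"^(.*?):(\d+(?:\.\d+){0,2})$")


def split_version(process_id):
    # type: (str) -> Tuple[str, Optional[str]]
    """Split a tagged process reference into its ID and optional semantic version."""
    match = TAG_VERSION_RE.match(process_id)
    if match:
        return match.group(1), match.group(2)
    return process_id, None  # no version tag, latest revision is implied
```

With this, a path segment such as ``echo-process:1.2.0`` resolves to the ``echo-process`` definition at revision ``1.2.0``, while a plain ``echo-process`` falls through to the latest revision.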
Process identifier must be provided from either the request path definition or literal ID. Database must be retrievable from either the request, underlying settings, or direct store reference. + .. versionchanged:: 4.20 + Process identifier can also be an 'id:version' tag. Also, the request query parameter 'version' can be used. + If using the :paramref:`process_id` explicitly instead of the request, a versioned :term:`Process` reference + MUST employ the tagged representation to resolve the appropriate :term:`Process` revision. + Different parameter combinations are intended to be used as needed or more appropriate, such that redundant operations can be reduced where some objects are already fetched from previous operations. + + :param process_id: Explicit :term:`Process` identifier to employ for lookup. + :param request: When no explicit ID specified, try to find information from the request. + :param settings: + Application settings for database connection. Can be guessed from local thread or request object if not given. + :param revision: + When parsing the :term:`Process` ID (either explicit or from request), indicate if any tagged revision + specifier should be used or dropped. 
""" - if process_id is None and request is not None: - process_id = request.matchdict.get("process_id") - if store is None: - store = get_db(settings or request).get_store(StoreProcesses) + store = get_db(settings or request).get_store(StoreProcesses) try: + if process_id is None and request is not None: + process_id = resolve_process_tag(request) + if not revision: + process_id = Process.split_version(process_id)[0] process = store.fetch_by_id(process_id, visibility=Visibility.PUBLIC) return process - except (InvalidIdentifierValue, MissingIdentifierValue) as ex: - raise HTTPBadRequest(str(ex)) + except (InvalidIdentifierValue, MissingIdentifierValue, colander.Invalid) as exc: + msg = getattr(exc, "msg", str(exc)) + raise HTTPBadRequest(json={ + "type": "InvalidIdentifierValue", + "title": "Process ID is invalid.", + "description": "Failed schema validation to retrieve process.", + "cause": f"Invalid schema: [{msg}]", + "error": exc.__class__.__name__, + "value": str(process_id) + }) except ProcessNotAccessible: raise HTTPForbidden(f"Process with ID '{process_id!s}' is not accessible.") except ProcessNotFound: raise ProcessNotFound(json={ "title": "NoSuchProcess", "type": "http://www.opengis.net/def/exceptions/ogcapi-processes-1/1.0/no-such-process", - "detail": "Process with specified reference identifier does not exist.", + "detail": sd.NotFoundProcessResponse.description, "status": ProcessNotFound.code, "cause": str(process_id) }) - except colander.Invalid as ex: - raise HTTPBadRequest(f"Invalid schema:\n[{ex!r}].") def map_progress(progress, range_min, range_max): @@ -257,37 +296,8 @@ def _validate_deploy_process_info(process_info, reference, package, settings, he raise HTTPUnprocessableEntity(detail=msg) -def _load_payload(payload, content_type): - # type: (Union[JSON, str], ContentType) -> Union[JSON, CWL] - """ - Load the request payload with validation of expected content type. 
- """ - try: - content_type = sd.DeployContentType().deserialize(content_type) - if isinstance(payload, str): - payload = yaml.safe_load(payload) - if not isinstance(payload, dict): - raise TypeError("Not a valid JSON body for process deployment.") - except colander.Invalid as exc: - raise HTTPUnsupportedMediaType(json={ - "title": "Unsupported Media Type", - "type": "UnsupportedMediaType", - "detail": str(exc), - "status": HTTPUnsupportedMediaType.code, - "cause": str(content_type), - }) - except Exception as exc: - raise HTTPBadRequest(json={ - "title": "Bad Request", - "type": "BadRequest", - "detail": "Unable to parse process deployment content.", - "status": HTTPBadRequest.code, - "cause": str(exc), - }) - return payload - - # FIXME: supported nested process and $graph multi-deployment (https://github.com/crim-ca/weaver/issues/56) +# see also: https://www.commonwl.org/v1.2/CommandLineTool.html#Packed_documents def resolve_cwl_graph(package): # type: (CWL) -> CWL if "$graph" in package and isinstance(package["$graph"], list) and len(package["$graph"]) == 1: @@ -298,7 +308,7 @@ def resolve_cwl_graph(package): def deploy_process_from_payload(payload, container, overwrite=False): # pylint: disable=R1260,too-complex - # type: (Union[JSON, str], Union[AnySettingsContainer, AnyRequestType], bool) -> HTTPException + # type: (Union[JSON, str], Union[AnySettingsContainer, AnyRequestType], Union[bool, Process]) -> HTTPException """ Deploy the process after resolution of all references and validation of the parameters from payload definition. @@ -309,7 +319,12 @@ def deploy_process_from_payload(payload, container, overwrite=False): # pylint: :param container: Container to retrieve application settings. If it is a ``request``-like object, additional parameters may be used to identify the payload schema. - :param overwrite: Whether to allow override of an existing process definition if conflict occurs. 
+ :param overwrite: + In case of a pure deployment (from scratch), indicates (using :class:`bool`) whether to allow override of + an existing process definition if conflict occurs. No versioning is applied in this case (full replacement). + In case of an update deployment (from previous), indicates which process is to be replaced with the updated version. + The new version should not conflict with another existing process version. If the payload doesn't provide a new + version, the next `MAJOR` version from the specified overwrite process is used to define the new revision. :returns: HTTPOk if the process registration was successful. :raises HTTPException: for any invalid process deployment step. """ @@ -317,7 +332,12 @@ def deploy_process_from_payload(payload, container, overwrite=False): # pylint: c_type = ContentType.get(get_header("Content-Type", headers), default=ContentType.APP_OGC_PKG_JSON) # use deepcopy of to remove any circular dependencies before writing to mongodb or any updates to the payload - payload = _load_payload(payload, c_type) + payload = parse_content( + request=None, + content=payload, + content_type=c_type, + content_type_schema=sd.DeployContentType, + ) payload_copy = deepcopy(payload) payload = _check_deploy(payload) @@ -346,7 +366,7 @@ def deploy_process_from_payload(payload, container, overwrite=False): # pylint: reference = content.get("href") found = isinstance(reference, str) elif c_type in (list(ContentType.ANY_CWL) + [ContentType.APP_JSON]) and "cwlVersion" in payload: - process_info = {} + process_info = {"version": payload.pop("version", None)} package = resolve_cwl_graph(payload) found = True else: # ogc-apppkg type, but no explicit check since used by default (backward compat) @@ -407,9 +427,6 @@ def deploy_process_from_payload(payload, container, overwrite=False): # pylint: process_info["owsContext"] = {"offering": {"content": {"href": str(reference)}}} elif isinstance(ows_context, dict): process_info["owsContext"] = ows_context -
bw-compat abstract/description (see: ProcessDeployment schema) - if "description" not in process_info or not process_info["description"]: - process_info["description"] = process_info.get("abstract", "") # if user provided additional links that have valid schema, # process them separately since links are generated dynamically from API settings per process # don't leave them there as they would be seen as if the 'Process' class generated the field @@ -418,32 +435,384 @@ def deploy_process_from_payload(payload, container, overwrite=False): # pylint: # FIXME: handle colander invalid directly in tween (https://github.com/crim-ca/weaver/issues/112) try: - store = get_db(container).get_store(StoreProcesses) - process = Process(process_info) - sd.ProcessSummary().deserialize(process) # make if fail before save if invalid - store.save_process(process, overwrite=overwrite) - process_summary = process.summary() - except ProcessRegistrationError as exc: - raise HTTPConflict(detail=str(exc)) + process = Process(process_info) # if 'version' was provided in deploy info, it will be added as hint here + if isinstance(overwrite, Process): + process_summary = _update_deploy_process_version(process, overwrite, VersionLevel.MAJOR, container) + else: + process_summary = _save_deploy_process(process, overwrite, container) except ValueError as exc: LOGGER.error("Failed schema validation of deployed process summary:\n%s", exc) raise HTTPBadRequest(detail=str(exc)) + except HTTPException: + raise + data = { + "description": sd.OkPostProcessesResponse.description, + "processSummary": process_summary, + "deploymentDone": True, + "links": process.links(container), + } + if deployment_profile_name: + data["deploymentProfileName"] = deployment_profile_name + return HTTPCreated(json=data) + + +def _save_deploy_process(process, override, container): + # type: (Process, bool, AnySettingsContainer) -> JSON + """ + Store the :class:`Process` to database with error handling and appropriate message 
reporting the problem. + """ + try: + sd.ProcessSummary().deserialize(process) # make it fail before save if invalid, then apply for real + db = get_db(container) + store = db.get_store(StoreProcesses) + new_process = store.save_process(process, overwrite=override) + process_summary = new_process.summary() + except ProcessRegistrationError as exc: + raise HTTPConflict(json={ + "type": "ProcessRegistrationError", + "title": "Process definition conflict.", + "detail": str(exc), + "status": HTTPConflict.code, + "cause": {"process_id": process.id}, + }) except colander.Invalid as exc: - LOGGER.error("Failed schema validation of deployed process summary:\n%s", exc) + LOGGER.error("Failed schema validation of updated process summary:\n%s", exc) raise HTTPBadRequest(json={ - "description": "Failed schema validation of deployed process summary.", + "description": "Failed schema validation of process summary.", "cause": f"Invalid schema: [{exc.msg or exc!s}]", "error": exc.__class__.__name__, "value": exc.value }) + return process_summary + + +def _update_deploy_process_version(process, process_overwrite, update_level, container=None): + # type: (Process, Process, VersionLevel, Optional[AnySettingsContainer]) -> JSON + """ + Handle all necessary update operations of a :term:`Process` definition. + + Validate that any specified version for :term:`Process` deployment is valid against any other existing versions. + Perform any necessary database adjustments to replace the old :term:`Process` references for the creation of the + updated :term:`Process` to ensure all versions and links remain valid against their original references. + + :param process: Desired new process definition. + :param process_overwrite: Old process from which update of the definition in database could be required. + :param update_level: + Minimum semantic version level required for this update operation. 
+ If the new :term:`Process` definition did not provide a version explicitly, this level will be used to + automatically infer the next revision number based on the old :term:`Process` reference. + :param container: Any container to retrieve a database connection. + :returns: Process summary with definition retrieved from storage (saved) after all operations were applied. + :raises HTTPException: Relevant error is raised in the event of an erroneous process definition (old or new). + """ + if not process.mutable: + raise HTTPForbidden(json={ + "type": "ProcessImmutable", + "title": "Process immutable.", + "detail": "Cannot update an immutable process.", + "status": HTTPForbidden.code, + "cause": {"mutable": False} + }) + + if process.name != process_overwrite.name: + raise HTTPBadRequest(json={ + "type": "InvalidParameterValue", + "title": "Invalid process identifier.", + "detail": "Specified process identifier in payload definition does not match expected ID in request path.", + "status": HTTPBadRequest.code, + "cause": {"pathProcessID": process_overwrite.name, "bodyProcessID": process.name} + }) + + db = get_db(container) + store = db.get_store(StoreProcesses) + + # if no new version was specified, simply take the next one relevant for the update level + # then check that new version is within available range against target process for required update level + new_version = process.version if process.version else _bump_process_version(process_overwrite.version, update_level) + taken_versions = store.find_versions(process.id, VersionFormat.STRING) # string format for output if error + if not is_update_version(new_version, taken_versions, update_level): + new_version = as_version_major_minor_patch(new_version, VersionFormat.STRING) + ref_version = as_version_major_minor_patch(process_overwrite.version, VersionFormat.STRING) + if new_version in taken_versions: + http_error = HTTPConflict + message = "Process version conflicts with already taken revisions."
+ else: + http_error = HTTPUnprocessableEntity + message = "Semantic version is not of appropriate update level for requested changes." + raise http_error(json={ + "type": "InvalidParameterValue", + "title": "Invalid version value.", + "detail": message, + "status": http_error.code, + "cause": { + "revisions": taken_versions, + "reference": ref_version, + "version": new_version, + "change": update_level, + } + }) + + old_version = None + op_override = process_overwrite + try: + # if source process for update is not the latest, no need to rewrite 'id:version' since it is not only 'id' + # otherwise, replace latest process 'id' by explicit 'id:version' + if process_overwrite.latest: + pid_only = process_overwrite.name + old_version = process_overwrite.version + old_process = store.update_version(pid_only, old_version) + process_tag = old_process.tag + # since 'id' reference changes from old to new process, + # reflect the change in any job that could refer to it + job_store = db.get_store(StoreJobs) + n_updated = job_store.batch_update_jobs({"process": pid_only}, {"process": process_tag}) + LOGGER.debug("Updated %s jobs from process [%s] to old revision [%s]", n_updated, pid_only, process_tag) + op_override = False # make sure no conflict when saving process afterward + process.version = new_version + process_summary = _save_deploy_process(process, op_override, container) + # add more version information to already handled error to better report the real conflict of revisions if any + except (ProcessRegistrationError, HTTPConflict): + if old_version is not None: + old_version = as_version_major_minor_patch(old_version, VersionFormat.STRING) + if new_version is not None: + new_version = as_version_major_minor_patch(new_version, VersionFormat.STRING) + raise HTTPConflict(json={ + "type": "ProcessRegistrationError", + "title": "Process definition conflict.", + "detail": "Failed update of process conflicting with another definition or revision.", + "status": 
HTTPConflict.code, + "cause": {"process_id": process.id, "old_version": old_version, "new_version": new_version}, + }) + return process_summary + + +def _bump_process_version(version, update_level): + # type: (AnyVersion, VersionLevel) -> AnyVersion + """ + Obtain the relevant version with specified level incremented by one. + """ + new_version = list(as_version_major_minor_patch(version, VersionFormat.PARTS)) + if update_level == VersionLevel.PATCH: + new_version[2] += 1 + elif update_level == VersionLevel.MINOR: + new_version[1] += 1 + new_version[2] = 0 + elif update_level == VersionLevel.MAJOR: + new_version[0] += 1 + new_version[1] = 0 + new_version[2] = 0 + return new_version + + +def _apply_process_metadata(process, update_data): # pylint: disable=R1260,too-complex # FIXME + # type: (Process, JSON) -> VersionLevel + """ + Apply requested changes for update of the :term:`Process`. + + Assumes that update data was pre-validated with appropriate schema validation to guarantee relevant typings + and formats are applied for expected fields. Validation of fields metadata with their specific conditions is + accomplished when attempting to apply changes. + + .. seealso:: + Schema :class:`sd.PatchProcessBodySchema` describes specific field handling based on unspecified value, null + or empty-list. Corresponding update levels required for fields are also provided in this schema definition. + + :param process: Process to modify. Can be the latest or a previously tagged version. + :param update_data: Fields with updated data to apply to the process. + :return: Applicable update level based on updates to be applied. 
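The ``_bump_process_version`` helper above increments the requested semantic-version level and zeroes the lower ones. A standalone sketch of the same idea, simplified to plain integer triples and independent of Weaver's ``as_version_major_minor_patch`` conversion:

```python
def bump_version(version, level):
    # type: (list, str) -> list
    """Increment the given semver level ('major'/'minor'/'patch'), resetting lower levels to zero."""
    major, minor, patch = (list(version) + [0, 0, 0])[:3]  # pad missing minor/patch with zeros
    if level == "patch":
        patch += 1
    elif level == "minor":
        minor, patch = minor + 1, 0
    elif level == "major":
        major, minor, patch = major + 1, 0, 0
    return [major, minor, patch]
```

For example, a ``PATCH``-level metadata edit of revision ``1.2.3`` yields ``1.2.4``, while a ``MAJOR`` redeployment of the same process yields ``2.0.0``.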
+ """ + patch_update_fields = [ + "title", + "description", + dict(source="keywords", method="append", unique=True), + dict(source="metadata", method="append"), + dict(source="links", method="append", target="additional_links"), + ] + minor_update_fields = [ + dict(source="jobControlOptions", method="override", unique=True), + dict(source="outputTransmission", method="override", unique=True), + "visibility", + ] + update_level = VersionLevel.PATCH # metadata only, elevate to MINOR if corresponding fields changed + field = value = None # any last set item that raises an unexpected error can be reported in exception handler + + def _apply_change(data, dest, name, update_fields): + # type: (JSON, Union[Process, JSON], str, UpdateFields) -> bool + """ + Apply sub-changes to relevant destination container. + + :param data: New information changes to be applied. + :param dest: Target location to set new value changes. + :param name: Target location name for error reporting. + :param update_fields: Fields that can be updated, with extra specifications on how to handle them. + :return: Status indicating if any change was applied. 
+ """ + nonlocal field, value # propagate outside function + + any_change = False + for source in update_fields: + target = source + method = None + unique = False + if isinstance(source, dict): + src = source["source"] + target = source.get("target", src) + method = source.get("method", None) + unique = source.get("unique", False) + source = src + value = data.get(source) + if value is None: + continue + field = f"{name}.{target}" + # list appends new content unless explicitly empty to reset + # list override always replace full content + if isinstance(value, list) and method == "append": + if not len(value): + current = get_field(dest, target, default=[]) + if current != value: + set_field(dest, target, []) + any_change = True + else: + current = get_field(dest, source, default=[]) + merged = copy.deepcopy(current) + merged.extend(value) + if unique: + merged = list(dict.fromkeys(merged)) # not set to preserve order + if current != merged: + set_field(dest, target, current) + any_change = True + else: + current = get_field(dest, target, default=None) + if unique and isinstance(value, list): + value = list(dict.fromkeys(value)) # not set to preserve order + if current != value: + set_field(dest, target, value) + any_change = True + return any_change + + try: + any_update_inputs = any_update_outputs = False + + inputs = update_data.get("inputs") + if inputs: + inputs_current = process.inputs + inputs_changes = normalize_ordered_io(inputs, order_hints=inputs_current) + inputs_updated = {get_any_id(i, pop=True): i for i in copy.deepcopy(inputs_current)} + inputs_allowed = list(inputs_updated) + for input_data in inputs_changes: + input_id = get_any_id(input_data) + if input_id not in inputs_allowed: + raise HTTPUnprocessableEntity(json={ + "type": "InvalidParameterValue", + "title": "Unknown input identifier.", + "detail": "Process update parameters specified an input unknown to this process.", + "status": HTTPUnprocessableEntity.code, + "cause": {"input.id": 
input_id, "inputs": inputs_allowed}, + }) + input_def = inputs_updated[input_id] + input_name = f"process.inputs[{input_id}]" + any_update_inputs |= _apply_change(input_data, input_def, input_name, patch_update_fields) + + # early exit if nothing was updated when fields were specified expecting something to be applied + # avoid potentially indicating that update was accomplished when it would not be + if not any_update_inputs: + raise HTTPBadRequest(json={ + "type": "InvalidParameterValue", + "title": "Failed process input parameter update.", + "detail": "Provided parameters not applicable for update or no changed values could be detected.", + "value": repr_json(update_data, force_string=False), + }) + field = "process.inputs" + value = normalize_ordered_io(inputs_updated, order_hints=inputs_current) + process.inputs = value + + outputs = update_data.get("outputs") + if outputs: + outputs_current = process.outputs + outputs_changes = normalize_ordered_io(outputs, order_hints=outputs_current) + outputs_updated = {get_any_id(o, pop=True): o for o in copy.deepcopy(outputs_current)} + outputs_allowed = list(outputs_updated) + for output_data in outputs_changes: + output_id = get_any_id(output_data) + if output_id not in outputs_allowed: + raise HTTPUnprocessableEntity(json={ + "type": "InvalidParameterValue", + "title": "Unknown output identifier.", + "detail": "Process update parameters specified an output unknown to this process.", + "status": HTTPUnprocessableEntity.code, + "cause": {"output.id": output_id, "outputs": outputs_allowed}, + }) + output_def = outputs_updated[output_id] + output_name = f"process.outputs[{output_id}]" + any_update_outputs |= _apply_change(output_data, output_def, output_name, patch_update_fields) + + # early exit if nothing was updated when fields were specified expecting something to be applied + # avoid potentially indicating that update was accomplished when it would not be + if not any_update_outputs: + raise HTTPBadRequest(json={ + 
"type": "InvalidParameterValue", + "title": "Failed process output parameter update.", + "detail": "Provided parameters not applicable for update or no changed values could be detected.", + "value": repr_json(update_data, force_string=False), + }) + field = "process.outputs" + value = normalize_ordered_io(outputs_updated, order_hints=outputs_current) + process.outputs = value + + any_update_process = _apply_change(update_data, process, "process", patch_update_fields) + any_update_minor = _apply_change(update_data, process, "process", minor_update_fields) + if any_update_minor: + update_level = VersionLevel.MINOR + + if not any((any_update_process, any_update_minor, any_update_inputs, any_update_outputs)): + raise HTTPBadRequest(json={ + "type": "InvalidParameterValue", + "title": "Failed process parameter update.", + "detail": "Provided parameters not applicable for update or no changed values could be detected.", + "value": repr_json(update_data, force_string=False), + }) + except HTTPException: + raise + except Exception as exc: + raise HTTPBadRequest(json={ + "type": "InvalidParameterValue", + "title": "Failed process parameter update.", + "detail": "Process update parameters failed validation or produced an error when applying change.", + "error": fully_qualified_name(exc), + "cause": {"message": str(exc), "field": field}, + "value": repr_json(value, force_string=False), + }) + + return update_level + + +def update_process_metadata(request): + # type: (AnyRequestType) -> HTTPException + """ + Update only MINOR or PATCH level :term:`Process` metadata. + + Desired new version can be eiter specified explicitly in request payload, or will be guessed accordingly to + detected changes to be applied. 
+ """ + data = parse_content(request, content_schema=sd.PatchProcessBodySchema) + old_process = get_process(request=request) + new_process = copy.deepcopy(old_process) + update_level = _apply_process_metadata(new_process, data) + # apply the new version requested by the user, + # or make sure the old one is removed to avoid conflict + user_version = data.get("version") or None + if user_version: + new_process.version = user_version + else: + new_process.pop("version", None) + new_process.id = old_process.name # remove any version reference in ID + process_summary = _update_deploy_process_version(new_process, old_process, update_level, request) data = { - "description": sd.OkPostProcessesResponse.description, + "description": sd.OkPatchProcessResponse.description, "processSummary": process_summary, - "deploymentDone": True + "links": new_process.links(request), } - if deployment_profile_name: - data["deploymentProfileName"] = deployment_profile_name - return HTTPCreated(json=data) + return HTTPOk(json=data) def parse_wps_process_config(config_entry): diff --git a/weaver/processes/wps_testing.py b/weaver/processes/wps_testing.py index afad16af5..cb2e1c8d6 100644 --- a/weaver/processes/wps_testing.py +++ b/weaver/processes/wps_testing.py @@ -21,7 +21,7 @@ def __init__(self, **kw): # remove duplicates/unsupported keywords title = kw.pop("title", kw.get("identifier")) - version = kw.pop("version", "0.0") + version = kw.pop("version", "0.0.0") kw.pop("inputs", None) kw.pop("outputs", None) kw.pop("payload", None) diff --git a/weaver/sort.py b/weaver/sort.py index 9fd0702b4..6b08efc6f 100644 --- a/weaver/sort.py +++ b/weaver/sort.py @@ -12,6 +12,7 @@ class Sort(Constants): PRICE = "price" ID = "id" ID_LONG = "identifier" # long form employed by Processes in DB representation + VERSION = "version" class SortMethods(Constants): @@ -20,6 +21,7 @@ class SortMethods(Constants): Sort.ID_LONG, # will replace by short ID to conform with JSON representation Sort.PROCESS, # since 
listing processes, can be an alias to ID Sort.CREATED, + Sort.VERSION, ]) JOB = frozenset([ Sort.CREATED, diff --git a/weaver/store/base.py b/weaver/store/base.py index 1d3b08186..7a09f9bbb 100644 --- a/weaver/store/base.py +++ b/weaver/store/base.py @@ -1,9 +1,11 @@ import abc from typing import TYPE_CHECKING +from weaver.utils import VersionFormat + if TYPE_CHECKING: import datetime - from typing import Dict, List, Optional, Tuple, Union + from typing import Any, Dict, List, Optional, Tuple, Union from pyramid.request import Request from pywps import Process as ProcessWPS @@ -12,6 +14,7 @@ from weaver.execute import AnyExecuteResponse from weaver.typedefs import ( AnyUUID, + AnyVersion, ExecutionInputs, ExecutionOutputs, DatetimeIntervalType, @@ -90,6 +93,8 @@ def list_processes(self, limit=None, # type: Optional[int] sort=None, # type: Optional[str] total=False, # type: bool + revisions=False, # type: bool + process=None, # type: Optional[str] ): # type: (...) -> Union[List[Process], Tuple[List[Process], int]] raise NotImplementedError @@ -98,6 +103,16 @@ def fetch_by_id(self, process_id, visibility=None): # type: (str, Optional[AnyVisibility]) -> Process raise NotImplementedError + @abc.abstractmethod + def find_versions(self, process_id, version_format=VersionFormat.OBJECT): + # type: (str, VersionFormat) -> List[AnyVersion] + raise NotImplementedError + + @abc.abstractmethod + def update_version(self, process_id, version): + # type: (str, AnyVersion) -> Process + raise NotImplementedError + @abc.abstractmethod def get_visibility(self, process_id): # type: (str) -> AnyVisibility @@ -138,6 +153,11 @@ def save_job(self, ): # type: (...) 
-> Job raise NotImplementedError + @abc.abstractmethod + def batch_update_jobs(self, job_filter, job_update): + # type: (Dict[str, Any], Dict[str, Any]) -> int + raise NotImplementedError + @abc.abstractmethod def update_job(self, job): # type: (Job) -> Job diff --git a/weaver/store/mongodb.py b/weaver/store/mongodb.py index 109965ac7..9d6e8dcbe 100644 --- a/weaver/store/mongodb.py +++ b/weaver/store/mongodb.py @@ -1,14 +1,14 @@ """ Stores to read/write data to from/to `MongoDB` using pymongo. """ - +import copy import logging import uuid from typing import TYPE_CHECKING import pymongo -from pymongo import ASCENDING, DESCENDING from pymongo.collation import Collation +from pymongo.collection import ReturnDocument from pymongo.errors import DuplicateKeyError from pyramid.request import Request from pywps import Process as ProcessWPS @@ -43,7 +43,16 @@ from weaver.sort import Sort, SortMethods from weaver.status import JOB_STATUS_CATEGORIES, Status, map_status from weaver.store.base import StoreBills, StoreJobs, StoreProcesses, StoreQuotes, StoreServices, StoreVault -from weaver.utils import fully_qualified_name, get_base_url, get_sane_name, get_weaver_url, islambda, now +from weaver.utils import ( + VersionFormat, + as_version_major_minor_patch, + fully_qualified_name, + get_base_url, + get_sane_name, + get_weaver_url, + islambda, + now +) from weaver.visibility import Visibility from weaver.wps.utils import get_wps_url @@ -55,13 +64,21 @@ from weaver.execute import AnyExecuteResponse from weaver.processes.types import AnyProcessType from weaver.store.base import DatetimeIntervalType, JobGroupCategory, JobSearchResult - from weaver.typedefs import AnyProcess, AnyProcessClass, AnyUUID, AnyValueType, ExecutionInputs, ExecutionOutputs + from weaver.typedefs import ( + AnyProcess, + AnyProcessClass, + AnyUUID, + AnyValueType, + AnyVersion, + ExecutionInputs, + ExecutionOutputs + ) from weaver.visibility import AnyVisibility MongodbValue = Union[AnyValueType, 
datetime.datetime] - MongodbSearchFilter = Dict[str, Union[MongodbValue, List[MongodbValue], Dict[str, AnyValueType]]] - MongodbSearchStep = Union[MongodbValue, MongodbSearchFilter] - MongodbSearchPipeline = List[Dict[str, Union[str, Dict[str, MongodbSearchStep]]]] + MongodbAggregateExpression = Dict[str, Union[MongodbValue, List[MongodbValue], Dict[str, AnyValueType]]] + MongodbAggregateStep = Union[MongodbValue, MongodbAggregateExpression] + MongodbAggregatePipeline = List[Dict[str, Union[str, Dict[str, MongodbAggregateStep]]]] LOGGER = logging.getLogger(__name__) @@ -190,7 +207,7 @@ def clear_services(self): class ListingMixin(object): @staticmethod def _apply_paging_pipeline(page, limit): - # type: (Optional[int], Optional[int]) -> List[MongodbSearchStep] + # type: (Optional[int], Optional[int]) -> List[MongodbAggregateStep] if isinstance(page, int) and isinstance(limit, int): return [{"$skip": page * limit}, {"$limit": limit}] if page is None and isinstance(limit, int): @@ -199,7 +216,7 @@ def _apply_paging_pipeline(page, limit): @staticmethod def _apply_sort_method(sort_field, sort_default, sort_allowed): - # type: (Optional[str], str, List[str]) -> MongodbSearchFilter + # type: (Optional[str], str, List[str]) -> MongodbAggregateExpression sort = sort_field # keep original sort field in case of error if sort is None: sort = sort_default @@ -211,12 +228,12 @@ def _apply_sort_method(sort_field, sort_default, sort_allowed): "cause": "sort", "value": str(sort_field), }) - sort_order = DESCENDING if sort in (Sort.FINISHED, Sort.CREATED) else ASCENDING + sort_order = pymongo.DESCENDING if sort in (Sort.FINISHED, Sort.CREATED) else pymongo.ASCENDING return {sort: sort_order} @staticmethod def _apply_total_result(search_pipeline, extra_pipeline): - # type: (MongodbSearchPipeline, MongodbSearchPipeline) -> MongodbSearchPipeline + # type: (MongodbAggregatePipeline, MongodbAggregatePipeline) -> MongodbAggregatePipeline """ Extends the pipeline operations in order to 
obtain the grand total of matches in parallel to other filtering. @@ -442,11 +459,21 @@ def delete_process(self, process_id, visibility=None): If ``visibility=None``, the process is deleted (if existing) regardless of its visibility value. """ - sane_name = get_sane_name(process_id, **self.sane_name_config) - process = self.fetch_by_id(sane_name, visibility=visibility) + process = self.fetch_by_id(process_id, visibility=visibility) # ensure accessible before delete if not process: - raise ProcessNotFound(f"Process '{sane_name}' could not be found.") - return bool(self.collection.delete_one({"identifier": sane_name}).deleted_count) + raise ProcessNotFound(f"Process '{process_id}' could not be found.") + revisions = self.find_versions(process_id, VersionFormat.STRING) + search, _ = self._get_revision_search(process_id) + status = bool(self.collection.delete_one(search).deleted_count) + if not status or not len(revisions) > 1 or not process.version: + return status + # if process was the latest revision, fallback to previous one as new latest + version = as_version_major_minor_patch(process.version, VersionFormat.STRING) + if version == revisions[-1]: + latest = revisions[-2] # prior version + proc_latest = f"{process_id}:{latest}" + self.revert_latest(proc_latest) + return status def list_processes(self, visibility=None, # type: Optional[AnyVisibility, List[AnyVisibility]] @@ -454,20 +481,33 @@ def list_processes(self, limit=None, # type: Optional[int] sort=None, # type: Optional[str] total=False, # type: bool + revisions=False, # type: bool + process=None, # type: Optional[str] ): # type: (...) -> Union[List[Process], Tuple[List[Process], int]] """ Lists all processes in database, optionally filtered by `visibility`. :param visibility: One or many value amongst :class:`Visibility`. - :param page: page number to return when using result paging. - :param limit: number of processes per page when using result paging. 
- :param sort: field which is used for sorting results (default: process ID, descending). - :param total: request the total number of processes to be calculated (ignoring paging). + :param page: Page number to return when using result paging. + :param limit: Number of processes per page when using result paging. + :param sort: Field which is used for sorting results (default: process ID, descending). + :param total: Request the total number of processes to be calculated (ignoring paging). + :param revisions: Include all process revisions instead of only latest ones. + :param process: Limit results only to specified process ID (makes sense mostly when combined with revisions). :returns: List of sorted, and possibly page-filtered, processes matching queries. If ``total`` was requested, return a tuple of this list and the number of processes. """ - search_filters = {} + search_filters = {} # type: MongodbAggregateExpression + + if process and revisions: + search_filters["identifier"] = {"$regex": rf"^{process}(:.*)?$"} # revisions of that process + elif process and not revisions: + search_filters["identifier"] = process # not very relevant 'listing', but valid (explicit ID) + elif not process and not revisions: + search_filters["identifier"] = {"$regex": r"^[\w\-]+$"} # exclude ':' to keep only latest (default) + # otherwise, last case returns 'everything', so nothing to 'filter' + if visibility is None: visibility = Visibility.values() if not isinstance(visibility, list): @@ -477,6 +517,7 @@ def list_processes(self, if vis not in Visibility: raise ValueError(f"Invalid visibility value '{v!s}' is not one of {list(Visibility.values())!s}") search_filters["visibility"] = {"$in": list(visibility)} + insert_fields = [] # type: MongodbAggregatePipeline # processes do not have 'created', but ObjectID in '_id' has the particularity of embedding creation time if sort == Sort.CREATED: @@ -485,9 +526,25 @@ def list_processes(self, if sort in [Sort.ID, Sort.PROCESS]: sort = 
Sort.ID_LONG sort_allowed = list(SortMethods.PROCESS) + ["_id"] - sort_method = {"$sort": self._apply_sort_method(sort, Sort.ID_LONG, sort_allowed)} - - search_pipeline = [{"$match": search_filters}, sort_method] + sort_fields = self._apply_sort_method(sort, Sort.ID_LONG, sort_allowed) + if revisions and sort in [Sort.ID, Sort.ID_LONG, Sort.PROCESS]: + # If listing many revisions, sort by version on top of ID to make listing more natural. + # Because the "latest version" is saved with 'id' only while "older revisions" are saved with 'id:version', + # that more recent version would always appear first since alphabetical sort: 'id' (latest) < 'id:version'. + # Work around this by dynamically reassigning 'id' by itself. + insert_fields = [ + {"$set": {"tag": {"$cond": { + "if": {"$regexMatch": {"input": "$identifier", "regex": "^.*:.*$"}}, + "then": "$identifier", + "else": {"$concat": ["$identifier", ":", "$version"]}, + }}}}, + {"$set": {"id_version": {"$split": ["$tag", ":"]}}}, + {"$set": {"identifier": {"$arrayElemAt": ["$id_version", 0]}}}, + ] + sort_fields = {"identifier": pymongo.ASCENDING, "version": pymongo.ASCENDING} + sort_method = [{"$sort": sort_fields}] + + search_pipeline = insert_fields + [{"$match": search_filters}] + sort_method paging_pipeline = self._apply_paging_pipeline(page, limit) if total: pipeline = self._apply_total_result(search_pipeline, paging_pipeline) @@ -502,6 +559,24 @@ def list_processes, return items, total return [Process(item) for item in found] + def _get_revision_search(self, process_id): + # type: (str) -> Tuple[MongodbAggregateExpression, Optional[str]] + """ + Obtain the search criteria and version of the specified :term:`Process` ID if it specifies a revision tag. + + :return: Database search operation and the matched version as string. 
+ """ + version = None + if ":" in process_id: + process_id, version = Process.split_version(process_id) + sane_name = get_sane_name(process_id, **self.sane_name_config) + search = {"identifier": sane_name} + if version: + version = as_version_major_minor_patch(version, VersionFormat.STRING) # make sure it is padded + sane_tag = sane_name + ":" + version + search = {"$or": [{"identifier": sane_tag}, {"identifier": sane_name, "version": version}]} + return search, version + def fetch_by_id(self, process_id, visibility=None): # type: (str, Optional[AnyVisibility]) -> Process """ @@ -509,17 +584,86 @@ def fetch_by_id(self, process_id, visibility=None): If ``visibility=None``, the process is retrieved (if existing) regardless of its visibility value. - :param process_id: process identifier - :param visibility: one value amongst :py:mod:`weaver.visibility`. + :param process_id: Process identifier (optionally with version tag). + :param visibility: One value amongst :py:mod:`weaver.visibility`. :return: An instance of :class:`weaver.datatype.Process`. """ - sane_name = get_sane_name(process_id, **self.sane_name_config) - process = self.collection.find_one({"identifier": sane_name}) + search, version = self._get_revision_search(process_id) + process = self.collection.find_one(search) if not process: - raise ProcessNotFound(f"Process '{sane_name}' could not be found.") + raise ProcessNotFound(f"Process '{process_id}' could not be found.") process = Process(process) + if version: + process.version = version # ensure version was applied just in case if visibility is not None and process.visibility != visibility: - raise ProcessNotAccessible(f"Process '{sane_name}' cannot be accessed.") + raise ProcessNotAccessible(f"Process '{process_id}' cannot be accessed.") + return process + + def find_versions(self, process_id, version_format=VersionFormat.OBJECT): + # type: (str, VersionFormat) -> List[AnyVersion] + """ + Retrieves all existing versions of a given process. 
+ """ + process_id = Process.split_version(process_id)[0] # version never needed to fetch all revisions + sane_name = get_sane_name(process_id, **self.sane_name_config) + version_name = rf"^{sane_name}(:[0-9]+\.[0-9]+.[0-9]+)?$" + versions = self.collection.find( + filter={"identifier": {"$regex": version_name}}, + projection={"_id": False, "version": True}, + sort=[(Sort.VERSION, pymongo.ASCENDING)], + ) + return [as_version_major_minor_patch(ver["version"], version_format) for ver in versions] + + def update_version(self, process_id, version): + # type: (str, AnyVersion) -> Process + """ + Updates the specified (latest) process ID to become an older revision. + + .. seealso:: + Use :meth:`revert_latest` for the inverse operation. + + :returns: Updated process definition with older revision. + """ + sane_name = get_sane_name(process_id, **self.sane_name_config) + version = as_version_major_minor_patch(version, VersionFormat.STRING) + # update ID to allow direct fetch by ID using tagged version + # this also clears the unique ID index requirement + new_name = sane_name + ":" + version + process = self.collection.find_one_and_update( + filter={"identifier": sane_name}, + update={"$set": {"identifier": new_name, "version": version}}, + return_document=ReturnDocument.AFTER, + ) + if not process: + raise ProcessNotFound(f"Process '{sane_name}' could not be found for version update.") + process = Process(process) + return process + + def revert_latest(self, process_id): + # type: (str) -> Process + """ + Makes the specified (older) revision process the new latest revision. + + Assumes there are no active *latest* in storage. If one is still defined, it will generate a conflict. + The process ID must also contain a tagged revision. Failing to provide a version will fail the operation. + + .. seealso:: + Use :meth:`update_version` for the inverse operation. + + :returns: Updated process definition with older revision. 
+ """ + search, version = self._get_revision_search(process_id) + if not version: + raise ProcessNotFound(f"Process '{process_id}' missing version part to revert as latest.") + p_name = Process.split_version(process_id)[0] + process = self.collection.find_one_and_update( + filter=search, + update={"$set": {"identifier": p_name, "version": version}}, + return_document=ReturnDocument.AFTER, + ) + if not process: + raise ProcessNotFound(f"Process '{process_id}' could not be found for revert as latest.") + process = Process(process) return process def get_visibility(self, process_id): @@ -527,7 +671,7 @@ def get_visibility(self, process_id): """ Get `visibility` of a process. - :return: One value amongst `weaver.visibility`. + :returns: One value amongst `weaver.visibility`. """ process = self.fetch_by_id(process_id) return process.visibility @@ -631,6 +775,32 @@ def save_job(self, raise JobRegistrationError("Failed to retrieve registered job.") return job + def batch_update_jobs(self, job_filter, job_update): + # type: (Dict[str, Any], Dict[str, Any]) -> int + """ + Update specified fields of matched jobs against filters. + + :param job_update: Fields and values to update on matched jobs. + :param job_filter: Fields to filter jobs to be updated. + :return: Number of affected jobs. 
+ """ + filter_keys = list(Job.properties()) + job_update = copy.deepcopy(job_update) + for job_key in list(job_update): + if job_key not in filter_keys: + job_update.pop(job_key) + job_filter = copy.deepcopy(job_filter) + for job_key in list(job_filter): + if job_key not in filter_keys: + job_filter.pop(job_key) + if not job_update: + raise JobUpdateError("No job parameters specified to apply update.") + job_update = {"$set": job_update} + LOGGER.debug("Batch jobs update:\nfilter:\n%s\nupdate:\n%s", + repr_json(job_filter, indent=2), repr_json(job_update, indent=2)) + result = self.collection.update_many(filter=job_filter, update=job_update) + return result.modified_count + def update_job(self, job): # type: (Job) -> Job """ @@ -677,7 +847,7 @@ def list_jobs(self): For user-specific access to available jobs, use :meth:`MongodbJobStore.find_jobs` instead. """ jobs = [] - for job in self.collection.find().sort("id", ASCENDING): + for job in self.collection.find().sort(Sort.ID, pymongo.ASCENDING): jobs.append(Job(job)) return jobs @@ -773,7 +943,7 @@ def find_jobs(self, return results def _find_jobs_grouped(self, pipeline, group_categories): - # type: (MongodbSearchPipeline, List[str]) -> Tuple[JobGroupCategory, int] + # type: (MongodbAggregatePipeline, List[str]) -> Tuple[JobGroupCategory, int] """ Retrieves jobs regrouped by specified field categories and predefined search pipeline filters. """ @@ -811,7 +981,7 @@ def _find_jobs_grouped(self, pipeline, group_categories): return items, total def _find_jobs_paging(self, search_pipeline, page, limit): - # type: (MongodbSearchPipeline, Optional[int], Optional[int]) -> Tuple[List[Job], int] + # type: (MongodbAggregatePipeline, Optional[int], Optional[int]) -> Tuple[List[Job], int] """ Retrieves jobs limited by specified paging parameters and predefined search pipeline filters. 
""" @@ -840,7 +1010,7 @@ def _apply_tags_filter(tags): @staticmethod def _apply_access_filter(access, request): - # type: (AnyVisibility, Request) -> MongodbSearchFilter + # type: (AnyVisibility, Request) -> MongodbAggregateExpression search_filters = {} if not request: search_filters["access"] = Visibility.PUBLIC @@ -862,9 +1032,9 @@ def _apply_access_filter(access, request): @staticmethod def _apply_ref_or_type_filter(job_type, process, service): - # type: (Optional[str], Optional[str], Optional[str]) -> MongodbSearchFilter + # type: (Optional[str], Optional[str], Optional[str]) -> MongodbAggregateExpression - search_filters = {} # type: MongodbSearchFilter + search_filters = {} # type: MongodbAggregateExpression if job_type == "process": search_filters["service"] = None elif job_type == "provider": @@ -890,8 +1060,8 @@ def _apply_ref_or_type_filter(job_type, process, service): @staticmethod def _apply_status_filter(status): - # type: (Optional[str]) -> MongodbSearchFilter - search_filters = {} # type: MongodbSearchFilter + # type: (Optional[str]) -> MongodbAggregateExpression + search_filters = {} # type: MongodbAggregateExpression if status in JOB_STATUS_CATEGORIES: category_statuses = list(JOB_STATUS_CATEGORIES[status]) search_filters["status"] = {"$in": category_statuses} @@ -901,7 +1071,7 @@ def _apply_status_filter(status): @staticmethod def _apply_datetime_filter(datetime_interval): - # type: (Optional[DatetimeIntervalType]) -> MongodbSearchFilter + # type: (Optional[DatetimeIntervalType]) -> MongodbAggregateExpression search_filters = {} if datetime_interval is not None: if datetime_interval.get("after", False): @@ -918,7 +1088,7 @@ def _apply_datetime_filter(datetime_interval): @staticmethod def _apply_duration_filter(pipeline, min_duration, max_duration): - # type: (MongodbSearchPipeline, Optional[int], Optional[int]) -> MongodbSearchPipeline + # type: (MongodbAggregatePipeline, Optional[int], Optional[int]) -> MongodbAggregatePipeline """ Generate the 
filter required for comparing against :meth:`Job.duration`. @@ -1029,7 +1199,7 @@ def list_quotes(self): Lists all quotes in `MongoDB` storage. """ quotes = [] - for quote in self.collection.find().sort("id", ASCENDING): + for quote in self.collection.find().sort("id", pymongo.ASCENDING): quotes.append(Quote(quote)) return quotes @@ -1051,7 +1221,7 @@ def find_quotes(self, process_id=None, page=0, limit=10, sort=None): if sort not in SortMethods.QUOTE: raise QuoteNotFound(f"Invalid sorting method: '{sort!s}'") - sort_order = ASCENDING + sort_order = pymongo.ASCENDING sort_criteria = [(sort, sort_order)] found = self.collection.find(search_filters) count = found.count() @@ -1105,7 +1275,7 @@ def list_bills(self): Lists all bills in `MongoDB` storage. """ bills = [] - for bill in self.collection.find().sort("id", ASCENDING): + for bill in self.collection.find().sort(Sort.ID, pymongo.ASCENDING): bills.append(Bill(bill)) return bills @@ -1127,7 +1297,7 @@ def find_bills(self, quote_id=None, page=0, limit=10, sort=None): if sort not in SortMethods.BILL: raise BillNotFound(f"Invalid sorting method: '{sort!r}'") - sort_order = ASCENDING + sort_order = pymongo.ASCENDING sort_criteria = [(sort, sort_order)] found = self.collection.find(search_filters) count = found.count() diff --git a/weaver/typedefs.py b/weaver/typedefs.py index 54c5d9c03..22156c1c6 100644 --- a/weaver/typedefs.py +++ b/weaver/typedefs.py @@ -1,31 +1,20 @@ from typing import TYPE_CHECKING # pragma: no cover +# FIXME: +# replace invalid 'Optional' (type or None) used instead of 'NotRequired' (optional key) when better supported +# https://youtrack.jetbrains.com/issue/PY-53611/Support-PEP-655-typingRequiredtypingNotRequired-for-TypedDicts if TYPE_CHECKING: import os import sys import typing import uuid from datetime import datetime + from distutils.version import LooseVersion from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple, Type, Union import psutil + from typing_extensions import 
Literal, NotRequired, Protocol, TypeAlias, TypedDict - if hasattr(typing, "TypedDict"): - from typing import TypedDict # pylint: disable=E0611,no-name-in-module # Python >= 3.8 - else: - from typing_extensions import TypedDict - if hasattr(typing, "Literal"): - from typing import Literal # pylint: disable=E0611,no-name-in-module # Python >= 3.8 - else: - from typing_extensions import Literal - if hasattr(typing, "Protocol"): - from typing import Protocol # pylint: disable=E0611,no-name-in-module # Python >= 3.8 - else: - from typing_extensions import Protocol - if hasattr(typing, "TypeAlias"): - from typing import TypeAlias # pylint: disable=E0611,no-name-in-module # Python >= 3.10 - else: - from typing_extensions import TypeAlias if hasattr(os, "PathLike"): FileSystemPathType = Union[os.PathLike, str] else: @@ -84,6 +73,7 @@ AnyValueType = Optional[ValueType] # avoid naming ambiguity with PyWPS AnyValue AnyKey = Union[str, int] AnyUUID = Union[str, uuid.UUID] + AnyVersion = Union[LooseVersion, Number, str, Tuple[int, ...], List[int]] # add more levels of explicit definitions than necessary to simulate JSON recursive structure better than 'Any' # amount of repeated equivalent definition makes typing analysis 'work well enough' for most use cases _JSON: TypeAlias = "JSON" @@ -98,15 +88,15 @@ "rel": str, "title": str, "href": str, - "hreflang": Optional[str], - "type": Optional[str], # IANA Media-Type + "hreflang": NotRequired[str], + "type": NotRequired[str], # IANA Media-Type }, total=False) Metadata = TypedDict("Metadata", { "title": str, "role": str, # URL "value": str, - "lang": str, - "type": str, # FIXME: relevant? + "lang": NotRequired[str], + "type": NotRequired[str], # FIXME: relevant? 
}, total=False) LogLevelStr = Literal[ @@ -117,7 +107,11 @@ # CWL definition GlobType = TypedDict("GlobType", {"glob": Union[str, List[str]]}, total=False) - CWL_IO_FileValue = TypedDict("CWL_IO_FileValue", {"class": str, "path": str, "format": Optional[str]}, total=True) + CWL_IO_FileValue = TypedDict("CWL_IO_FileValue", { + "class": str, + "path": str, + "format": NotRequired[Optional[str]], + }, total=True) CWL_IO_Value = Union[AnyValueType, List[AnyValueType], CWL_IO_FileValue, List[CWL_IO_FileValue]] CWL_IO_NullableType = Union[str, List[str]] # "?" or ["", "null"] CWL_IO_NestedType = TypedDict("CWL_IO_NestedType", {"type": CWL_IO_NullableType}, total=True) @@ -133,21 +127,21 @@ CWL_IO_TypeItem = Union[str, CWL_IO_NestedType, CWL_IO_ArrayType, CWL_IO_EnumType] CWL_IO_DataType = Union[CWL_IO_TypeItem, List[CWL_IO_TypeItem]] CWL_Input_Type = TypedDict("CWL_Input_Type", { - "id": Optional[str], # representation used by plain CWL definition - "name": Optional[str], # representation used by parsed tool instance + "id": NotRequired[str], # representation used by plain CWL definition + "name": NotRequired[str], # representation used by parsed tool instance "type": CWL_IO_DataType, - "items": Union[str, CWL_IO_EnumType], - "symbols": Optional[CWL_IO_EnumSymbols], - "format": Optional[Union[str, List[str]]], - "inputBinding": Optional[Any], - "default": Optional[AnyValueType], + "items": NotRequired[Union[str, CWL_IO_EnumType]], + "symbols": NotRequired[CWL_IO_EnumSymbols], + "format": NotRequired[Optional[Union[str, List[str]]]], + "inputBinding": NotRequired[Any], + "default": NotRequired[Optional[AnyValueType]], }, total=False) CWL_Output_Type = TypedDict("CWL_Output_Type", { - "id": Optional[str], # representation used by plain CWL definition - "name": Optional[str], # representation used by parsed tool instance + "id": NotRequired[str], # representation used by plain CWL definition + "name": NotRequired[str], # representation used by parsed tool instance "type": 
CWL_IO_DataType, - "format": Optional[Union[str, List[str]]], - "outputBinding": Optional[GlobType] + "format": NotRequired[Optional[Union[str, List[str]]]], + "outputBinding": NotRequired[GlobType] }, total=False) CWL_Inputs = Union[List[CWL_Input_Type], Dict[str, CWL_Input_Type]] CWL_Outputs = Union[List[CWL_Output_Type], Dict[str, CWL_Output_Type]] @@ -180,20 +174,21 @@ "class": CWL_Class, "label": str, "doc": str, - "id": Optional[str], + "id": NotRequired[str], + "intent": NotRequired[str], "s:keywords": List[str], - "baseCommand": Optional[Union[str, List[str]]], - "parameters": Optional[List[str]], - "requirements": CWL_AnyRequirements, - "hints": CWL_AnyRequirements, + "baseCommand": NotRequired[Optional[Union[str, List[str]]]], + "parameters": NotRequired[List[str]], + "requirements": NotRequired[CWL_AnyRequirements], + "hints": NotRequired[CWL_AnyRequirements], "inputs": CWL_Inputs, "outputs": CWL_Outputs, - "steps": Dict[CWL_WorkflowStepID, CWL_WorkflowStep], - "stderr": str, - "stdout": str, - "$namespaces": Dict[str, str], - "$schemas": Dict[str, str], - "$graph": CWL_Graph, + "steps": NotRequired[Dict[CWL_WorkflowStepID, CWL_WorkflowStep]], + "stderr": NotRequired[str], + "stdout": NotRequired[str], + "$namespaces": NotRequired[Dict[str, str]], + "$schemas": NotRequired[Dict[str, str]], + "$graph": NotRequired[CWL_Graph], }, total=False) CWL_WorkflowStepPackage = TypedDict("CWL_WorkflowStepPackage", { "id": str, # reference ID of the package @@ -220,12 +215,12 @@ CWL_RuntimeOutputFile = TypedDict("CWL_RuntimeOutputFile", { "class": str, "location": str, - "format": Optional[str], + "format": NotRequired[Optional[str]], "basename": str, "nameroot": str, "nameext": str, - "checksum": Optional[str], - "size": Optional[str] + "checksum": NotRequired[str], + "size": NotRequired[str] }, total=False) CWL_RuntimeInput = Union[CWL_RuntimeLiteral, CWL_RuntimeInputFile] CWL_RuntimeInputsMap = Dict[str, CWL_RuntimeInput] @@ -291,39 +286,41 @@ def __call__(self, 
message: str, progress: Number, status: AnyStatusType, *args: # data source configuration DataSourceFileRef = TypedDict("DataSourceFileRef", { - "ades": str, # target ADES to dispatch - "netloc": str, # definition to match file references against - "default": Optional[bool], # default ADES when no match was possible (single one allowed in config) + "ades": str, # target ADES to dispatch + "netloc": str, # definition to match file references against + "default": NotRequired[bool], # default ADES when no match was possible (single one allowed in config) }, total=True) DataSourceOpenSearch = TypedDict("DataSourceOpenSearch", { - "ades": str, # target ADES to dispatch - "netloc": str, # where to send OpenSearch request - "collection_id": Optional[str], # OpenSearch collection ID to match against - "default": Optional[bool], # default ADES when no match was possible (single one allowed) - "accept_schemes": Optional[List[str]], # allowed URL schemes (http, https, etc.) - "mime_types": Optional[List[str]], # allowed Media-Types (text/xml, application/json, etc.) - "rootdir": str, # root position of the data to retrieve - "osdd_url": str, # global OpenSearch description document to employ + "ades": str, # target ADES to dispatch + "netloc": str, # where to send OpenSearch request + "collection_id": NotRequired[str], # OpenSearch collection ID to match against + "default": NotRequired[bool], # default ADES when no match was possible (single one allowed) + "accept_schemes": NotRequired[List[str]], # allowed URL schemes (http, https, etc.) + "mime_types": NotRequired[List[str]], # allowed Media-Types (text/xml, application/json, etc.) 
+ "rootdir": str, # root position of the data to retrieve + "osdd_url": str, # global OpenSearch description document to employ }, total=True) DataSource = Union[DataSourceFileRef, DataSourceOpenSearch] DataSourceConfig = Dict[str, DataSource] # JSON/YAML file contents JobValueFormat = TypedDict("JobValueFormat", { - "mime_type": Optional[str], - "media_type": Optional[str], - "encoding": Optional[str], - "schema": Optional[str], - "extension": Optional[str], + "mime_type": NotRequired[str], + "media_type": NotRequired[str], + "encoding": NotRequired[str], + "schema": NotRequired[str], + "extension": NotRequired[str], }, total=False) JobValueFile = TypedDict("JobValueFile", { - "href": Optional[str], - "format": Optional[JobValueFormat], + "href": str, + "format": NotRequired[JobValueFormat], }, total=False) JobValueData = TypedDict("JobValueData", { - "data": Optional[AnyValueType], - "value": Optional[AnyValueType], + "data": AnyValueType, + }, total=False) + JobValueValue = TypedDict("JobValueValue", { + "value": AnyValueType, }, total=False) - JobValueObject = Union[JobValueData, JobValueFile] + JobValueObject = Union[JobValueData, JobValueValue, JobValueFile] JobValueFileItem = TypedDict("JobValueFileItem", { "id": str, "href": Optional[str], @@ -331,8 +328,11 @@ def __call__(self, message: str, progress: Number, status: AnyStatusType, *args: }, total=False) JobValueDataItem = TypedDict("JobValueDataItem", { "id": str, - "data": Optional[AnyValueType], - "value": Optional[AnyValueType], + "data": AnyValueType, + }, total=False) + JobValueValueItem = TypedDict("JobValueValueItem", { + "id": str, + "value": AnyValueType, }, total=False) JobValueItem = Union[JobValueDataItem, JobValueFileItem] JobExpectItem = TypedDict("JobExpectItem", {"id": str}, total=True) @@ -355,13 +355,18 @@ def __call__(self, message: str, progress: Number, status: AnyStatusType, *args: ExecutionOutputsList = List[ExecutionOutputItem] ExecutionOutputsMap = Dict[str, ExecutionOutputObject] 
ExecutionOutputs = Union[ExecutionOutputsList, ExecutionOutputsMap] - ExecutionResultObject = TypedDict("ExecutionResultObject", { - "value": Optional[AnyValueType], + ExecutionResultObjectRef = TypedDict("ExecutionResultObjectRef", { "href": Optional[str], - "type": Optional[str], + "type": NotRequired[str], + }, total=False) + ExecutionResultObjectValue = TypedDict("ExecutionResultObjectValue", { + "value": Optional[AnyValueType], + "type": NotRequired[str], }, total=False) + ExecutionResultObject = Union[ExecutionResultObjectRef, ExecutionResultObjectValue] ExecutionResultArray = List[ExecutionResultObject] - ExecutionResults = Dict[str, Union[ExecutionResultObject, ExecutionResultArray]] + ExecutionResultValue = Union[ExecutionResultObject, ExecutionResultArray] + ExecutionResults = Dict[str, ExecutionResultValue] # reference employed as 'JobMonitorReference' by 'WPS1Process' JobExecution = TypedDict("JobExecution", {"execution": WPSExecution}) @@ -397,8 +402,8 @@ def __call__(self, message: str, progress: Number, status: AnyStatusType, *args: "sizeBytes": int, }, total=True) Statistics = TypedDict("Statistics", { - "application": Optional[ApplicationStatistics], - "process": Optional[ProcessStatistics], + "application": NotRequired[ApplicationStatistics], + "process": NotRequired[ProcessStatistics], "outputs": Dict[str, OutputStatistics], }, total=False) @@ -418,36 +423,36 @@ def __call__(self, message: str, progress: Number, status: AnyStatusType, *args: }, total=False) OpenAPISchemaProperty = TypedDict("OpenAPISchemaProperty", { "type": OpenAPISchemaTypes, - "format": str, - "default": Any, - "example": Any, - "title": str, - "description": str, - "enum": List[Union[str, Number]], - "items": List[_OpenAPISchema, OpenAPISchemaReference], - "required": List[str], - "nullable": bool, - "deprecated": bool, - "readOnly": bool, - "writeOnly": bool, - "multipleOf": Number, - "minimum": Number, - "maximum": Number, - "exclusiveMinimum": bool, - "exclusiveMaximum": 
bool, - "minLength": Number, - "maxLength": Number, - "pattern": str, - "minItems": Number, - "maxItems": Number, - "uniqueItems": bool, - "minProperties": Number, - "maxProperties": Number, - "contentMediaType": str, - "contentEncoding": str, - "contentSchema": str, - "properties": Dict[str, _OpenAPISchemaProperty], - "additionalProperties": Union[bool, Dict[str, Union[_OpenAPISchema, OpenAPISchemaReference]]], + "format": NotRequired[str], + "default": NotRequired[Any], + "example": NotRequired[Any], + "title": NotRequired[str], + "description": NotRequired[str], + "enum": NotRequired[List[Union[str, Number]]], + "items": NotRequired[List[Union[_OpenAPISchema, OpenAPISchemaReference]]], + "required": NotRequired[List[str]], + "nullable": NotRequired[bool], + "deprecated": NotRequired[bool], + "readOnly": NotRequired[bool], + "writeOnly": NotRequired[bool], + "multipleOf": NotRequired[Number], + "minimum": NotRequired[Number], + "maximum": NotRequired[Number], + "exclusiveMinimum": NotRequired[bool], + "exclusiveMaximum": NotRequired[bool], + "minLength": NotRequired[Number], + "maxLength": NotRequired[Number], + "pattern": NotRequired[str], + "minItems": NotRequired[Number], + "maxItems": NotRequired[Number], + "uniqueItems": NotRequired[bool], + "minProperties": NotRequired[Number], + "maxProperties": NotRequired[Number], + "contentMediaType": NotRequired[str], + "contentEncoding": NotRequired[str], + "contentSchema": NotRequired[str], + "properties": NotRequired[Dict[str, _OpenAPISchemaProperty]], + "additionalProperties": NotRequired[Union[bool, Dict[str, Union[_OpenAPISchema, OpenAPISchemaReference]]]], }, total=False) OpenAPISchemaObject = TypedDict("OpenAPISchemaObject", { "type": Literal["object"], diff --git a/weaver/utils.py b/weaver/utils.py index 33cede80d..4ce092333 100644 --- a/weaver/utils.py +++ b/weaver/utils.py @@ -15,7 +15,8 @@ import warnings from copy import deepcopy from datetime import datetime -from typing import TYPE_CHECKING +from
distutils.version import LooseVersion +from typing import TYPE_CHECKING, overload from urllib.parse import ParseResult, unquote, urlparse, urlunsplit import boto3 @@ -49,6 +50,7 @@ from werkzeug.wrappers import Request as WerkzeugRequest from yaml.scanner import ScannerError +from weaver.base import Constants from weaver.execute import ExecuteControlOption, ExecuteMode from weaver.formats import ContentType, get_content_type from weaver.status import map_status @@ -57,7 +59,20 @@ if TYPE_CHECKING: from types import FrameType - from typing import Any, Callable, Dict, List, Iterable, MutableMapping, NoReturn, Optional, Type, Tuple, Union + from typing import ( + Any, + Callable, + Dict, + List, + Iterable, + MutableMapping, + NoReturn, + Optional, + Type, + Tuple, + Union + ) + from typing_extensions import TypeGuard from weaver.execute import AnyExecuteControlOption, AnyExecuteMode from weaver.status import Status @@ -68,11 +83,14 @@ AnyRegistryContainer, AnyRequestMethod, AnyResponseType, + AnyUUID, AnyValueType, + AnyVersion, HeadersType, JSON, KVP, KVP_Item, + Literal, OpenAPISchema, Number, SettingsType @@ -559,15 +577,160 @@ def get_url_without_query(url): def is_valid_url(url): - # type: (Optional[str]) -> bool + # type: (Optional[str]) -> TypeGuard[str] try: return bool(urlparse(url).scheme) except Exception: # noqa: W0703 # nosec: B110 return False +class VersionLevel(Constants): + MAJOR = "major" + MINOR = "minor" + PATCH = "patch" + + +class VersionFormat(Constants): + OBJECT = "object" # LooseVersion + STRING = "string" # "x.y.z" + PARTS = "parts" # tuple/list + + +@overload +def as_version_major_minor_patch(version, version_format): + # type: (AnyVersion, Literal[VersionFormat.OBJECT]) -> LooseVersion + ... + + +@overload +def as_version_major_minor_patch(version, version_format): + # type: (AnyVersion, Literal[VersionFormat.STRING]) -> str + ... 
+ + +@overload +def as_version_major_minor_patch(version, version_format): + # type: (AnyVersion, Literal[VersionFormat.PARTS]) -> Tuple[int, int, int] + ... + + +@overload +def as_version_major_minor_patch(version): + # type: (AnyVersion) -> Tuple[int, int, int] + ... + + +def as_version_major_minor_patch(version, version_format=VersionFormat.PARTS): + # type: (Optional[AnyVersion], VersionFormat) -> AnyVersion + """ + Generates a ``MAJOR.MINOR.PATCH`` version padded with zeros for any missing parts. + """ + if isinstance(version, (str, float, int)): + ver_parts = list(LooseVersion(str(version)).version) + elif isinstance(version, (list, tuple)): + ver_parts = [int(part) for part in version] + else: + ver_parts = [] # default "0.0.0" for backward compatibility + ver_parts = ver_parts[:3] + ver_tuple = tuple(ver_parts + [0] * max(0, 3 - len(ver_parts))) + if version_format in [VersionFormat.STRING, VersionFormat.OBJECT]: + ver_str = ".".join(str(part) for part in ver_tuple) + if version_format == VersionFormat.STRING: + return ver_str + return LooseVersion(ver_str) + return ver_tuple + + +def is_update_version(version, taken_versions, version_level=VersionLevel.PATCH): + # type: (AnyVersion, Iterable[AnyVersion], VersionLevel) -> TypeGuard[AnyVersion] + """ + Determines if the version corresponds to an available update version of specified level compared to existing ones. + + If the specified version corresponds to an older version compared to available ones (i.e.: a taken more recent + version also exists), the specified version will have to fit within the version level range to be considered valid. + For example, requesting ``PATCH`` level will require that the specified version is greater than the last available + version against other existing versions with equivalent ``MAJOR.MINOR`` parts.
If ``1.2.0`` and ``2.0.0`` were + taken versions, and ``1.2.3`` has to be verified as the update version, it will be considered valid since its + ``PATCH`` number ``3`` is greater than all other ``1.2.x`` versions (it falls within the ``[1.2.x, 1.3.x[`` range). + Requesting instead ``MINOR`` level will require that the specified version is greater than the last available + version against existing versions of same ``MAJOR`` part only. Using again the same example values, ``1.3.0`` + would be valid since its ``MINOR`` part ``3`` is greater than any other ``1.x`` taken versions. On the other hand, + version ``1.2.4`` would not be valid as ``x = 2`` is already taken by other versions considering same ``1.x`` + portion (``PATCH`` part is ignored in this case since ``MINOR`` is requested, and ``2.0.0`` is ignored as not the + same ``MAJOR`` portion of ``1`` as the tested version). Finally, requesting a ``MAJOR`` level will require + necessarily that the specified version is greater than all other existing versions for update, since ``MAJOR`` is + the highest possible semantic part, and higher parts are not available to define an upper version bound. + + .. note:: + As long as the version level is respected, the actual number of this level and all following ones can be + anything as long as they are not taken. For example, ``PATCH`` with existing ``1.2.3`` does not require that + the update version be ``1.2.4``. It can be ``1.2.5``, ``1.2.24``, etc. as long as ``1.2.x`` is respected. + Similarly, ``MINOR`` update can provide any ``PATCH`` number, since ``1.x`` only must be respected. From + existing ``1.2.3``, ``MINOR`` update could specify ``1.4.99`` as valid version. The ``PATCH`` part does not + need to start back at ``0``. + + :param version: Version to validate as potential update revision. + :param taken_versions: Existing versions that cannot be reused. + :param version_level: Minimum level to consider availability of versions as valid revision number for update. 
+ :return: Status of availability of the version. + """ + + def _pad_incr(_parts, _index=None): # type: (Tuple[int, ...], Optional[int]) -> Tuple[int, ...] + """ + Pads versions to always have 3 parts in case some were omitted, then increment part index if requested. + """ + _parts = list(_parts) + [0] * max(0, 3 - len(_parts)) + if _index is not None: + _parts[_index] += 1 + return tuple(_parts) + + if not taken_versions: + return True + + version = as_version_major_minor_patch(version) + other_versions = sorted([as_version_major_minor_patch(ver) for ver in taken_versions]) + ver_min = other_versions[0] + for ver in other_versions: # find versions just above and below specified + if ver == version: + return False + if version < ver: + # if next versions are at the same semantic level as requested one, + # then not an update version since another higher one is already defined + # handle MAJOR separately since it can only be the most recent one + if version_level == VersionLevel.MINOR: + if _pad_incr(version[:1]) == _pad_incr(ver[:1]): + return False + elif version_level == VersionLevel.PATCH: + if _pad_incr(version[:2]) == _pad_incr(ver[:2]): + return False + break + ver_min = ver + else: + # major update must be necessarily the last version, + # so no lower version found to break out of loop + if version_level == VersionLevel.MAJOR: + return _pad_incr(version[:1]) > _pad_incr(other_versions[-1][:1]) + + # if found previous version and next version was not already taken + # the requested one must be one above previous one at same semantic level, + # and must be one below the upper semantic level to be an available version + # (eg: if PATCH requested and found min=1.3.4, then max=1.4.0, version can be anything in between) + if version_level == VersionLevel.MAJOR: + min_version = _pad_incr(ver_min[:1], 0) + max_version = (float("inf"), float("inf"), float("inf")) + elif version_level == VersionLevel.MINOR: + min_version = _pad_incr(ver_min[:2], 1) + max_version = 
_pad_incr(ver_min[:1], 0) + elif version_level == VersionLevel.PATCH: + min_version = _pad_incr(ver_min[:3], 2) + max_version = _pad_incr(ver_min[:2], 1) + else: + raise NotImplementedError(f"Unknown version level: {version_level!s}") + return min_version <= tuple(version) < max_version + + def is_uuid(maybe_uuid): - # type: (Any) -> bool + # type: (Any) -> TypeGuard[AnyUUID] """ Evaluates if the provided input is a UUID-like string. """ @@ -1627,7 +1790,7 @@ def load_file(file_path, text=False): def is_remote_file(file_location): - # type: (str) -> bool + # type: (str) -> TypeGuard[str] """ Parses to file location to figure out if it is remotely available or a local path. """ diff --git a/weaver/wps_restapi/examples/local_process_not_found.json b/weaver/wps_restapi/examples/local_process_not_found.json new file mode 100644 index 000000000..4c270f9a3 --- /dev/null +++ b/weaver/wps_restapi/examples/local_process_not_found.json @@ -0,0 +1,7 @@ +{ + "title": "NoSuchProcess", + "type": "http://www.opengis.net/def/exceptions/ogcapi-processes-1/1.0/no-such-process", + "detail": "Process with specified reference identifier does not exist.", + "status": 404, + "cause": "does-not-exist" +} diff --git a/weaver/wps_restapi/jobs/utils.py b/weaver/wps_restapi/jobs/utils.py index ca987cdfc..eee1823fd 100644 --- a/weaver/wps_restapi/jobs/utils.py +++ b/weaver/wps_restapi/jobs/utils.py @@ -18,7 +18,7 @@ from pyramid_celery import celery_app from weaver.database import get_db -from weaver.datatype import Job +from weaver.datatype import Job, Process from weaver.exceptions import ( InvalidIdentifierValue, JobGone, @@ -48,6 +48,7 @@ from weaver.wps.utils import get_wps_output_dir, get_wps_output_url, map_wps_output_location from weaver.wps_restapi import swagger_definitions as sd from weaver.wps_restapi.constants import JobInputsOutputsSchema +from weaver.wps_restapi.processes.utils import resolve_process_tag from weaver.wps_restapi.providers.utils import forbid_local_only if 
TYPE_CHECKING: @@ -76,8 +77,17 @@ def get_job(request): # type: (PyramidRequest) -> Job """ - Obtain a job from request parameters. + Obtain a :term:`Job` from request parameters. + .. versionchanged:: 4.20 + When looking for a :term:`Job` that refers to a local :term:`Process`, allow implicit resolution of the + unspecified ``version`` portion to automatically resolve the identifier. Consider that validation of + the expected :term:`Process` for this :term:`Job` is "good enough", since the specific ID is not actually + required to obtain the :term:`Job` (could be queried by ID only on the ``/jobs/{jobId}`` endpoint). + If the ``version`` is provided though (either query parameter or tagged representation), the validation + will ensure that it matches explicitly. + + :param request: Request with path and query parameters to retrieve the desired job. :returns: Job information if found. :raise HTTPNotFound: with JSON body details on missing/non-matching job, process, provider IDs. """ @@ -107,7 +117,11 @@ def get_job(request): ) provider_id = request.matchdict.get("provider_id", job.service) - process_id = request.matchdict.get("process_id", job.process) + process_tag = request.matchdict.get("process_id") + if process_tag: + process_tag = resolve_process_tag(request) # find version if available as well + else: + process_tag = job.process if provider_id: forbid_local_only(request) @@ -121,11 +135,13 @@ def get_job(request): "type": "http://www.opengis.net/def/exceptions/ogcapi-processes-1/1.0/no-such-job", "detail": desc, "status": OWSNotFound.code, - "cause": str(process_id) + "cause": str(provider_id) }, code=title, locator="provider", description=desc # old format ) - if job.process != process_id: + + process_id = Process.split_version(process_tag)[0] + if job.process not in [process_id, process_tag]: title = "NoSuchProcess" desc = "Could not find job reference corresponding to specified process reference."
raise OWSNotFound( @@ -136,7 +152,7 @@ def get_job(request): "type": "http://www.opengis.net/def/exceptions/ogcapi-processes-1/1.0/no-such-job", "detail": desc, "status": OWSNotFound.code, - "cause": str(process_id) + "cause": str(process_tag) }, code=title, locator="process", description=desc # old format ) @@ -415,7 +431,7 @@ def get_job_results_response(job, container, headers=None): is_raw = job.execution_response == ExecuteResponse.RAW results, refs = get_results(job, container, value_key="value", schema=JobInputsOutputsSchema.OGC, # not strict to provide more format details - link_references=is_raw) # type: Union[ExecutionResults, HeadersTupleType] + link_references=is_raw) headers = headers or {} if "location" not in headers: headers["Location"] = job.status_url(container) @@ -434,10 +450,11 @@ def get_job_results_response(job, container, headers=None): refs.extend(headers.items()) return HTTPNoContent(headers=refs) - # raw response can be only data value, only link or a mix of them + # raw response can be data-only value, link-only or a mix of them if results: # https://docs.ogc.org/is/18-062r2/18-062r2.html#req_core_process-execute-sync-raw-value-one - out_info = list(results.items())[0][-1] + out_vals = list(results.items()) + out_info = out_vals[0][-1] out_type = get_any_value(out_info, key=True) out_data = get_any_value(out_info) diff --git a/weaver/wps_restapi/processes/__init__.py b/weaver/wps_restapi/processes/__init__.py index aef31a735..74f7b8d60 100644 --- a/weaver/wps_restapi/processes/__init__.py +++ b/weaver/wps_restapi/processes/__init__.py @@ -33,6 +33,10 @@ def includeme(config): request_method="POST", renderer=OutputFormat.JSON) config.add_view(p.get_local_process, route_name=sd.process_service.name, request_method="GET", renderer=OutputFormat.JSON) + config.add_view(p.patch_local_process, route_name=sd.process_service.name, + request_method="PATCH", renderer=OutputFormat.JSON) + config.add_view(p.put_local_process, 
route_name=sd.process_service.name, + request_method="PUT", renderer=OutputFormat.JSON) config.add_view(p.delete_local_process, route_name=sd.process_service.name, request_method="DELETE", renderer=OutputFormat.JSON) config.add_view(p.get_local_process_package, route_name=sd.process_package_service.name, diff --git a/weaver/wps_restapi/processes/processes.py b/weaver/wps_restapi/processes/processes.py index 521e68c00..737ce5523 100644 --- a/weaver/wps_restapi/processes/processes.py +++ b/weaver/wps_restapi/processes/processes.py @@ -18,10 +18,10 @@ from weaver.formats import OutputFormat, repr_json from weaver.processes import opensearch from weaver.processes.execution import submit_job -from weaver.processes.utils import deploy_process_from_payload, get_process +from weaver.processes.utils import deploy_process_from_payload, get_process, update_process_metadata from weaver.status import Status from weaver.store.base import StoreJobs, StoreProcesses -from weaver.utils import fully_qualified_name, get_any_id +from weaver.utils import clean_json_text_body, fully_qualified_name, get_any_id from weaver.visibility import Visibility from weaver.wps_restapi import swagger_definitions as sd from weaver.wps_restapi.processes.utils import get_process_list_links, get_processes_filtered_by_valid_schemas @@ -128,9 +128,15 @@ def get_processes(request): }) except HTTPException: raise - # FIXME: handle colander invalid directly in tween (https://github.com/crim-ca/weaver/issues/112) - except colander.Invalid as ex: - raise HTTPBadRequest(f"Invalid schema: [{ex!s}]") + except colander.Invalid as exc: + raise HTTPBadRequest(json={ + "type": "InvalidParameterValue", + "title": "Invalid parameter value.", + "description": "Submitted request parameters are invalid or could not be processed.", + "cause": clean_json_text_body(f"Invalid schema: [{exc.msg or exc!s}]"), + "error": exc.__class__.__name__, + "value": repr_json(exc.value, force_string=False), + }) 
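The revision-availability rules documented for ``is_update_version`` in ``weaver/utils.py`` above can be exercised with a standalone sketch. This is a simplified reimplementation of the *documented* semantics only (the helper name, string-only input, and tuple padding are assumptions for illustration), not the project's actual function:

```python
def is_update_version(version, taken, level="patch"):
    """Check if `version` is an acceptable `level` revision given already `taken` versions."""
    def pad(ver):  # normalize to a 3-part tuple, e.g. "1.2" -> (1, 2, 0)
        parts = [int(part) for part in str(ver).split(".")]
        return tuple(parts + [0] * (3 - len(parts)))

    candidate = pad(version)
    others = {pad(ver) for ver in taken}
    if candidate in others:
        return False  # exact version already taken
    if level == "major":
        # MAJOR must exceed every existing MAJOR part
        return all(candidate[0] > other[0] for other in others)
    if level == "minor":
        # MINOR must exceed every existing MINOR within the same MAJOR
        return all(candidate[1] > other[1] for other in others if other[0] == candidate[0])
    # PATCH must exceed every existing PATCH within the same MAJOR.MINOR
    return all(candidate[2] > other[2] for other in others if other[:2] == candidate[:2])


# the examples from the docstring, with taken versions 1.2.0 and 2.0.0:
assert is_update_version("1.2.3", ["1.2.0", "2.0.0"], "patch")       # 1.2.x range free above 1.2.0
assert is_update_version("1.3.0", ["1.2.0", "2.0.0"], "minor")       # 1.x MINOR above all taken 1.x
assert not is_update_version("1.2.4", ["1.2.0", "2.0.0"], "minor")   # MINOR 2 already taken in 1.x
assert is_update_version("3.0.0", ["1.2.0", "2.0.0"], "major")       # MAJOR above all taken versions
```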
@sd.processes_service.post(tags=[sd.TAG_PROCESSES, sd.TAG_DEPLOY], renderer=OutputFormat.JSON, @@ -144,6 +150,37 @@ def add_local_process(request): return deploy_process_from_payload(request.text, request) # use text to allow parsing as JSON or YAML +@sd.process_service.put(tags=[sd.TAG_PROCESSES, sd.TAG_DEPLOY], renderer=OutputFormat.JSON, + schema=sd.PutProcessEndpoint(), response_schemas=sd.put_process_responses) +@log_unhandled_exceptions(logger=LOGGER, message=sd.InternalServerErrorResponseSchema.description) +def put_local_process(request): + # type: (PyramidRequest) -> AnyViewResponse + """ + Update a registered local process with a new definition. + + Updates the new process MAJOR semantic version from the previous one if not specified explicitly. + For MINOR or PATCH changes to metadata of the process definition, consider using the PATCH request. + """ + process = get_process(request=request, revision=False) # ignore tagged version since must always be latest + return deploy_process_from_payload(request.text, request, overwrite=process) + + +@sd.process_service.patch(tags=[sd.TAG_PROCESSES, sd.TAG_DEPLOY], renderer=OutputFormat.JSON, + schema=sd.PatchProcessEndpoint(), response_schemas=sd.patch_process_responses) +@log_unhandled_exceptions(logger=LOGGER, message=sd.InternalServerErrorResponseSchema.description) +def patch_local_process(request): + # type: (PyramidRequest) -> AnyViewResponse + """ + Update metadata of a registered local process. + + Updates the new process MINOR or PATCH semantic version if not specified explicitly, based on updated contents. + Changes that impact only metadata such as description or keywords imply a PATCH update. + Changes to properties that might impact process operation, such as supported formats, imply a MINOR update. + Changes that completely redefine the process require a MAJOR update using a PUT request.
+ """ + return update_process_metadata(request) + + @sd.process_service.get(tags=[sd.TAG_PROCESSES, sd.TAG_DESCRIBEPROCESS], renderer=OutputFormat.JSON, schema=sd.ProcessEndpoint(), response_schemas=sd.get_process_responses) @log_unhandled_exceptions(logger=LOGGER, message=sd.InternalServerErrorResponseSchema.description) @@ -241,7 +278,7 @@ def delete_local_process(request): """ db = get_db(request) proc_store = db.get_store(StoreProcesses) - process = get_process(request=request, store=proc_store) + process = get_process(request=request) process_id = process.id if not process.mutable: raise HTTPForbidden(json={ diff --git a/weaver/wps_restapi/processes/utils.py b/weaver/wps_restapi/processes/utils.py index f5dd3c658..e5fd0f7df 100644 --- a/weaver/wps_restapi/processes/utils.py +++ b/weaver/wps_restapi/processes/utils.py @@ -4,7 +4,6 @@ import colander from pyramid.httpexceptions import HTTPBadRequest -from pyramid.request import Request from pyramid.settings import asbool from weaver.config import WeaverFeature, get_weaver_configuration @@ -18,14 +17,48 @@ if TYPE_CHECKING: from typing import Dict, List, Optional, Tuple - from weaver.datatype import Service - from weaver.typedefs import JSON + from weaver.datatype import Service, Process + from weaver.typedefs import JSON, PyramidRequest LOGGER = logging.getLogger(__name__) +def resolve_process_tag(request, process_query=False): + # type: (PyramidRequest, bool) -> str + """ + Obtain the tagged :term:`Process` reference from request path and/or query according to available information. + + Whether the :term:`Process` is specified by path or query, another ``version`` query can be provided to specify + the desired revision by itself. This ``version`` query is considered only if another version indication is not + already specified in the :term:`Process` reference using the tagged semantic. 
+ + When ``process_query = False``, possible combinations are as follows: + + - ``/processes/{processID}:{version}`` + - ``/processes/{processID}?version={version}`` + + When ``process_query = True``, possible combinations are as follows: + + - ``/...?process={processID}:{version}`` + - ``/...?process={processID}&version={version}`` + + :param request: Request from which to retrieve the process reference. + :param process_query: Whether the process ID reference is located in request path or ``process={id}`` query. + """ + if process_query: + process_id = request.params.get("process") + else: + process_id = request.matchdict.get("process_id", "") + params = sd.LocalProcessQuery().deserialize(request.params) + version = params.get("version") + if version and ":" not in process_id: # priority to tagged version over query if specified + process_id = f"{process_id}:{version}" + process_id = sd.ProcessIdentifierTag(name="ProcessID").deserialize(process_id) + return process_id + + def get_processes_filtered_by_valid_schemas(request): - # type: (Request) -> Tuple[List[JSON], List[str], Dict[str, Optional[int]], bool, int] + # type: (PyramidRequest) -> Tuple[List[JSON], List[str], Dict[str, Optional[int]], bool, int] """ Validates the processes summary schemas and returns them into valid/invalid lists. 
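The precedence rule implemented by ``resolve_process_tag`` above, where a tagged ``{processID}:{version}`` reference wins over a separate ``version`` query parameter, can be sketched minimally (schema validation omitted; the function name is illustrative, not the project's API):

```python
def resolve_tag(process_id, version=None):
    # a version embedded in the "id:version" tag takes precedence
    # over one supplied separately via the 'version' query parameter
    if version and ":" not in process_id:
        return f"{process_id}:{version}"
    return process_id


assert resolve_tag("jsonarray2netcdf", "1.0.0") == "jsonarray2netcdf:1.0.0"
assert resolve_tag("jsonarray2netcdf:2.0.0", "1.0.0") == "jsonarray2netcdf:2.0.0"  # tag wins
assert resolve_tag("jsonarray2netcdf") == "jsonarray2netcdf"  # latest version assumed
```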
@@ -35,6 +68,8 @@ def get_processes_filtered_by_valid_schemas(request): with_providers = False if get_weaver_configuration(settings) in WeaverFeature.REMOTE: with_providers = asbool(request.params.get("providers", False)) + revisions_param = sd.ProcessRevisionsQuery(unknown="ignore").deserialize(request.params) + with_revisions = revisions_param.get("revisions") paging_query = sd.ProcessPagingQuery() paging_value = {param.name: param.default for param in paging_query.children} paging_names = set(paging_value) @@ -47,20 +82,26 @@ def get_processes_filtered_by_valid_schemas(request): }) store = get_db(request).get_store(StoreProcesses) - processes, total_local_processes = store.list_processes(visibility=Visibility.PUBLIC, total=True, **paging_param) + processes, total_local_processes = store.list_processes( + visibility=Visibility.PUBLIC, + total=True, + **revisions_param, + **paging_param + ) valid_processes = [] invalid_processes_ids = [] - for process in processes: + for process in processes: # type: Process try: - valid_processes.append(process.summary()) + valid_processes.append(process.summary(revision=with_revisions)) except colander.Invalid as invalid: - LOGGER.debug("Invalid process [%s] because:\n%s", process.identifier, invalid) + process_ref = process.tag if with_revisions else process.identifier + LOGGER.debug("Invalid process [%s] because:\n%s", process_ref, invalid) invalid_processes_ids.append(process.identifier) return valid_processes, invalid_processes_ids, paging_param, with_providers, total_local_processes def get_process_list_links(request, paging, total, provider=None): - # type: (Request, Dict[str, int], Optional[int], Optional[Service]) -> List[JSON] + # type: (PyramidRequest, Dict[str, int], Optional[int], Optional[Service]) -> List[JSON] """ Obtains a list of all relevant links for the corresponding :term:`Process` listing defined by query parameters. 
@@ -118,4 +159,14 @@ def get_process_list_links(request, paging, total, provider=None): "href": get_path_kvp(proc_url, page=cur_page + 1, **kvp_params), "rel": "next", "type": ContentType.APP_JSON, "title": "Next page of processes query listing." }) + process = kvp_params.get("process") + if process and ":" not in str(process): + proc_hist = f"{proc_url}?detail=false&revisions=true&process={process}" + proc_desc = f"{proc_url}/{process}" + links.extend([ + {"href": proc_desc, "rel": "latest-version", + "type": ContentType.APP_JSON, "title": "Most recent revision of this process."}, + {"href": proc_hist, "rel": "version-history", + "type": ContentType.APP_JSON, "title": "Listing of all revisions of this process."}, + ]) return links diff --git a/weaver/wps_restapi/swagger_definitions.py b/weaver/wps_restapi/swagger_definitions.py index e8aec8601..53777a1aa 100644 --- a/weaver/wps_restapi/swagger_definitions.py +++ b/weaver/wps_restapi/swagger_definitions.py @@ -281,6 +281,13 @@ class SLUG(ExtendedSchemaNode): pattern = r"^[A-Za-z0-9]+(?:(-|_)[A-Za-z0-9]+)*$" +class Tag(ExtendedSchemaNode): + schema_type = String + description = "Identifier with optional tagged version forming a unique reference." + # ranges used to remove starting/ending ^$ characters + pattern = SLUG.pattern[:-1] + rf"(:{SemanticVersion(v_prefix=False, rc_suffix=False).pattern[1:-1]})?$" + + class URL(ExtendedSchemaNode): schema_type = String description = "URL reference." @@ -392,6 +399,11 @@ class ProcessIdentifier(AnyOfKeywordSchema): ] +class ProcessIdentifierTag(AnyOfKeywordSchema): + description = "Process identifier with optional revision tag."
+ _any_of = [Tag] + ProcessIdentifier._any_of # type: ignore # noqa: W0212 + + class Version(ExtendedSchemaNode): # note: internally use LooseVersion, so don't be too strict about pattern schema_type = String @@ -487,33 +499,40 @@ class XAuthDockerHeader(ExtendedSchemaNode): missing = drop -class RequestContentTypeHeader(OneOfKeywordSchema): - _one_of = [ - JsonHeader(), - XmlHeader(), - ] +class RequestContentTypeHeader(ContentTypeHeader): + example = ContentType.APP_JSON + default = ContentType.APP_JSON + validator = OneOf([ + ContentType.APP_JSON, + # ContentType.APP_XML, + ]) -class ResponseContentTypeHeader(OneOfKeywordSchema): - _one_of = [ - JsonHeader(), - XmlHeader(), - HtmlHeader(), - ] +class ResponseContentTypeHeader(ContentTypeHeader): + example = ContentType.APP_JSON + default = ContentType.APP_JSON + validator = OneOf([ + ContentType.APP_JSON, + ContentType.APP_XML, + ContentType.TEXT_XML, + ContentType.TEXT_HTML, + ]) -class RequestHeaders(RequestContentTypeHeader): +class RequestHeaders(ExtendedMappingSchema): """ Headers that can indicate how to adjust the behavior and/or result the be provided in the response. """ accept = AcceptHeader() accept_language = AcceptLanguageHeader() + content_type = RequestContentTypeHeader() class ResponseHeaders(ResponseContentTypeHeader): """ Headers describing resulting response. 
""" + content_type = ResponseContentTypeHeader() class RedirectHeaders(ResponseHeaders): @@ -525,7 +544,7 @@ class NoContent(ExtendedMappingSchema): default = {} -class FileUploadHeaders(RequestContentTypeHeader): +class FileUploadHeaders(RequestHeaders): # MUST be multipart for upload content_type = ContentTypeHeader( example=f"{ContentType.MULTI_PART_FORM}; boundary=43003e2f205a180ace9cd34d98f911ff", @@ -562,7 +581,7 @@ class DescriptionSchema(ExtendedMappingSchema): class KeywordList(ExtendedSequenceSchema): - keyword = ExtendedSchemaNode(String()) + keyword = ExtendedSchemaNode(String(), validator=Length(min=1)) class Language(ExtendedSchemaNode): @@ -1018,8 +1037,8 @@ class PropertyOAS(PermissiveMappingSchema): read_only = ExtendedSchemaNode(Boolean(), name="readOnly", missing=drop) write_only = ExtendedSchemaNode(Boolean(), name="writeOnly", missing=drop) multiple_of = MultipleOfOAS(name="multipleOf", missing=drop, validator=BoundedRange(min=0, exclusive_min=True)) - minimum = ExtendedSchemaNode(Integer(), name="minLength", missing=drop, validator=Range(min=0)) # default=0 - maximum = ExtendedSchemaNode(Integer(), name="maxLength", missing=drop, validator=Range(min=0)) + minimum = ExtendedSchemaNode(Integer(), name="minimum", missing=drop, validator=Range(min=0)) # default=0 + maximum = ExtendedSchemaNode(Integer(), name="maximum", missing=drop, validator=Range(min=0)) exclusive_min = ExtendedSchemaNode(Boolean(), name="exclusiveMinimum", missing=drop) # default=False exclusive_max = ExtendedSchemaNode(Boolean(), name="exclusiveMaximum", missing=drop) # default=False min_length = ExtendedSchemaNode(Integer(), name="minLength", missing=drop, validator=Range(min=0)) # default=0 @@ -1132,8 +1151,8 @@ class DeployMinMaxOccurs(ExtendedMappingSchema): # does not inherit from 'DescriptionLinks' because other 'ProcessDescription<>' schema depend from this without 'links' class ProcessDescriptionType(DescriptionBase, DescriptionExtra): - id = ProcessIdentifier() - 
version = Version(missing=drop) + id = ProcessIdentifierTag() + version = Version(missing=None, default=None, example="1.2.3") mutable = ExtendedSchemaNode(Boolean(), default=True, description=( "Indicates if the process is mutable (dynamically deployed), or immutable (builtin with this instance)." )) @@ -1673,15 +1692,36 @@ class VisibilitySchema(ExtendedMappingSchema): ######################################################### -class ProcessPath(ExtendedMappingSchema): - # FIXME: support versioning with (https://github.com/crim-ca/weaver/issues/107) - process_id = AnyIdentifier(description="Process identifier.", example="jsonarray2netcdf") +class LocalProcessQuery(ExtendedMappingSchema): + version = Version(example="1.2.3", missing=drop, description=( + "Specific process version to locate. " + "If process ID was requested with tagged 'id:version' revision format, this parameter is ignored." + )) + + +class LocalProcessPath(ExtendedMappingSchema): + process_id = ProcessIdentifierTag( + example="jsonarray2netcdf[:1.0.0]", + description=( + "Process identifier with optional tag version. " + "If tag is omitted, the latest version of that process is assumed. " + "Otherwise, the specific process revision as 'id:version' must be matched. " + "Alternatively, the plain process ID can be specified in combination to 'version' query parameter." + ), + ) class ProviderPath(ExtendedMappingSchema): provider_id = AnyIdentifier(description="Remote provider identifier.", example="hummingbird") +class ProviderProcessPath(ProviderPath): + # note: Tag representation not allowed in this case + process_id = ProcessIdentifier(example="provider-process", description=( + "Identifier of a process that is offered by the remote provider." 
+ )) + + class JobPath(ExtendedMappingSchema): job_id = UUID(description="Job ID", example="14c68477-c3ed-4784-9c0f-a4c9e1344db5") @@ -2485,34 +2525,42 @@ class ProcessDescriptionQuery(ExtendedMappingSchema): ) -class ProviderProcessEndpoint(ProviderPath, ProcessPath): +class ProviderProcessEndpoint(ProviderProcessPath): header = RequestHeaders() querystring = ProcessDescriptionQuery() -class ProcessEndpoint(ProcessPath): +class LocalProcessDescriptionQuery(ProcessDescriptionQuery, LocalProcessQuery): + pass + + +class ProcessEndpoint(LocalProcessPath): header = RequestHeaders() - querystring = ProcessDescriptionQuery() + querystring = LocalProcessDescriptionQuery() -class ProcessPackageEndpoint(ProcessPath): +class ProcessPackageEndpoint(LocalProcessPath): header = RequestHeaders() + querystring = LocalProcessQuery() -class ProcessPayloadEndpoint(ProcessPath): +class ProcessPayloadEndpoint(LocalProcessPath): header = RequestHeaders() + querystring = LocalProcessQuery() -class ProcessVisibilityGetEndpoint(ProcessPath): +class ProcessVisibilityGetEndpoint(LocalProcessPath): header = RequestHeaders() + querystring = LocalProcessQuery() -class ProcessVisibilityPutEndpoint(ProcessPath): +class ProcessVisibilityPutEndpoint(LocalProcessPath): header = RequestHeaders() + querystring = LocalProcessQuery() body = VisibilitySchema() -class ProviderJobEndpoint(ProviderPath, ProcessPath, JobPath): +class ProviderJobEndpoint(ProviderProcessPath, JobPath): header = RequestHeaders() @@ -2520,11 +2568,11 @@ class JobEndpoint(JobPath): header = RequestHeaders() -class ProcessInputsEndpoint(ProcessPath, JobPath): +class ProcessInputsEndpoint(LocalProcessPath, JobPath): header = RequestHeaders() -class ProviderInputsEndpoint(ProviderPath, ProcessPath, JobPath): +class ProviderInputsEndpoint(ProviderProcessPath, JobPath): header = RequestHeaders() @@ -2551,7 +2599,7 @@ class JobInputsEndpoint(JobPath): querystring = JobInputsOutputsQuery() -class JobOutputQuery(ExtendedMappingSchema): 
+class JobResultsQuery(ExtendedMappingSchema): schema = ExtendedSchemaNode( String(), title="JobOutputResultsSchema", @@ -2560,7 +2608,8 @@ class JobOutputQuery(ExtendedMappingSchema): validator=OneOfCaseInsensitive(JobInputsOutputsSchema.values()), summary="Selects the schema employed for representation of job outputs.", description=( - "Selects the schema employed for representation of job outputs for providing file Content-Type details. " + "Selects the schema employed for representation of job results (produced outputs) " + "for providing file Content-Type details. " f"When '{JobInputsOutputsSchema.OLD}' is employed, " "'format.mimeType' is used and 'type' is reported as well. " f"When '{JobInputsOutputsSchema.OGC}' is employed, " @@ -2571,19 +2620,23 @@ class JobOutputQuery(ExtendedMappingSchema): ) -class ProcessOutputsEndpoint(ProcessPath, JobPath): +class LocalProcessJobResultsQuery(LocalProcessQuery, JobResultsQuery): + pass + + +class JobOutputsEndpoint(JobPath): header = RequestHeaders() - querystring = JobOutputQuery() + querystring = LocalProcessJobResultsQuery() -class ProviderOutputsEndpoint(ProviderPath, ProcessPath, JobPath): +class ProcessOutputsEndpoint(LocalProcessPath, JobPath): header = RequestHeaders() - querystring = JobOutputQuery() + querystring = LocalProcessJobResultsQuery() -class JobOutputsEndpoint(JobPath): +class ProviderOutputsEndpoint(ProviderProcessPath, JobPath): header = RequestHeaders() - querystring = JobOutputQuery() + querystring = JobResultsQuery() class ProcessResultEndpoint(ProcessOutputsEndpoint): @@ -2601,19 +2654,19 @@ class JobResultEndpoint(JobPath): header = RequestHeaders() -class ProcessResultsEndpoint(ProcessPath, JobPath): +class ProcessResultsEndpoint(LocalProcessPath, JobPath): header = RequestHeaders() -class ProviderResultsEndpoint(ProviderPath, ProcessPath, JobPath): +class ProviderResultsEndpoint(ProviderProcessPath, JobPath): header = RequestHeaders() -class JobResultsEndpoint(ProviderPath, ProcessPath, 
JobPath): +class JobResultsEndpoint(ProviderProcessPath, JobPath): header = RequestHeaders() -class ProviderExceptionsEndpoint(ProviderPath, ProcessPath, JobPath): +class ProviderExceptionsEndpoint(ProviderProcessPath, JobPath): header = RequestHeaders() @@ -2621,11 +2674,12 @@ class JobExceptionsEndpoint(JobPath): header = RequestHeaders() -class ProcessExceptionsEndpoint(ProcessPath, JobPath): +class ProcessExceptionsEndpoint(LocalProcessPath, JobPath): header = RequestHeaders() + querystring = LocalProcessQuery() -class ProviderLogsEndpoint(ProviderPath, ProcessPath, JobPath): +class ProviderLogsEndpoint(ProviderProcessPath, JobPath): header = RequestHeaders() @@ -2633,19 +2687,21 @@ class JobLogsEndpoint(JobPath): header = RequestHeaders() -class ProcessLogsEndpoint(ProcessPath, JobPath): +class ProcessLogsEndpoint(LocalProcessPath, JobPath): header = RequestHeaders() + querystring = LocalProcessQuery() class JobStatisticsEndpoint(JobPath): header = RequestHeaders() -class ProcessJobStatisticsEndpoint(ProcessPath, JobPath): +class ProcessJobStatisticsEndpoint(LocalProcessPath, JobPath): header = RequestHeaders() + querystring = LocalProcessQuery() -class ProviderJobStatisticsEndpoint(ProviderPath, ProcessPath, JobPath): +class ProviderJobStatisticsEndpoint(ProviderProcessPath, JobPath): header = RequestHeaders() @@ -2792,7 +2848,9 @@ class ProcessSummaryList(ExtendedSequenceSchema): class ProcessNamesList(ExtendedSequenceSchema): - process_name = ProcessIdentifier() + process_name = ProcessIdentifierTag( + description="Process identifier or tagged representation if revision was requested." 
+ ) class ProcessListing(OneOfKeywordSchema): @@ -2884,6 +2942,10 @@ class ProcessDescription(OneOfKeywordSchema): class ProcessDeployment(ProcessSummary, ProcessContext, ProcessDeployMeta): + # override ID to forbid deploy to contain a tagged version part + # if version should be applied, it must be provided with its 'Version' field + id = ProcessIdentifier() + # explicit "abstract" handling for bw-compat, new versions should use "description" # only allowed in deploy to support older servers that report abstract (or parsed from WPS-1/2) # recent OGC-API v1+ will usually provide directly "description" as per the specification @@ -2942,8 +3004,8 @@ def deserialize(self, cstruct): class JobStatusInfo(ExtendedMappingSchema): jobID = UUID(example="a9d14bf4-84e0-449a-bac8-16e598efe807", description="ID of the job.") - processID = ProcessIdentifier(missing=None, default=None, - description="Process identifier corresponding to the job execution.") + processID = ProcessIdentifierTag(missing=None, default=None, + description="Process identifier corresponding to the job execution.") providerID = ProcessIdentifier(missing=None, default=None, description="Provider identifier corresponding to the job execution.") type = JobTypeEnum(description="Type of the element associated to the creation of this job.") @@ -2993,7 +3055,7 @@ class JobCollection(ExtendedSequenceSchema): class CreatedJobStatusSchema(DescriptionSchema): jobID = UUID(description="Unique identifier of the created job for execution.") - processID = ProcessIdentifier(description="Identifier of the process that will be executed.") + processID = ProcessIdentifierTag(description="Identifier of the process that will be executed.") providerID = AnyIdentifier(description="Remote provider identifier if applicable.", missing=drop) status = ExtendedSchemaNode(String(), example=Status.ACCEPTED) location = ExtendedSchemaNode(String(), example="http://{host}/weaver/processes/{my-process-id}/jobs/{my-job-id}") @@ -3314,7 
+3376,7 @@ class QuoteStatusSchema(ExtendedSchemaNode): class PartialQuoteSchema(ExtendedMappingSchema): id = UUID(description="Quote ID.") status = QuoteStatusSchema() - processID = ProcessIdentifier(description="Process identifier corresponding to the quote definition.") + processID = ProcessIdentifierTag(description="Process identifier corresponding to the quote definition.") class Price(ExtendedSchemaNode): @@ -3795,6 +3857,17 @@ class CWLIdentifier(ProcessIdentifier): ) +class CWLIntentURL(URL): + description = ( + "Identifier URL to a concept for the type of computational operation accomplished by this Process " + "(see example operations: http://edamontology.org/operation_0004)." + ) + + +class CWLIntent(ExtendedSequenceSchema): + item = CWLIntentURL() + + class CWLBase(ExtendedMappingSchema): cwlVersion = CWLVersion() @@ -3802,6 +3875,7 @@ class CWLBase(ExtendedMappingSchema): class CWLApp(PermissiveMappingSchema): _class = CWLClass() id = CWLIdentifier(missing=drop) # can be omitted only if within a process deployment that also includes it + intent = CWLIntent(missing=drop) requirements = CWLRequirements(description="Explicit requirement to execute the application package.", missing=drop) hints = CWLHints(description="Non-failing additional hints that can help resolve extra requirements.", missing=drop) baseCommand = CWLCommand(description="Command called in the docker image or on shell according to requirements " @@ -4180,16 +4254,32 @@ class CWLGraphBase(ExtendedMappingSchema): ) -class DeployCWLGraph(CWLBase, CWLGraphBase): - _sort_first = ["cwlVersion", "$graph"] +class UpdateVersion(ExtendedMappingSchema): + version = Version(missing=drop, example="1.2.3", description=( + "Explicit version to employ for initial or updated process definition. " + "Must not already exist and must be greater than the latest available semantic version for the " + "corresponding version level according to the applied update operation. 
" + "For example, if only versions '1.2.3' and '1.3.1' exist, the submitted version can be anything before " + "version '1.2.0' excluding it (i.e.: '1.1.X', '0.1.2', etc.), between '1.2.4' and '1.3.0' exclusively, or " + "'1.3.2' and anything above. If no version is provided, the next *patch* level after the current process " + "version is applied. If the current process did not define any version, it is assumed '0.0.0' and this patch" + "will use '0.0.1'. The applicable update level (MAJOR, MINOR, PATCH) depends on the operation being applied. " + "As a rule of thumb, if changes affect only metadata, PATCH is required. If changes affect parameters or " + "execution method of the process, but not directly its entire definition, MINOR is required. If the process " + "must be completely redeployed due to application redefinition, MAJOR is required." + )) -class DeployCWL(NotKeywordSchema, CWL): - _sort_first = ["cwlVersion", "id", "class"] +class DeployCWLGraph(CWLBase, CWLGraphBase, UpdateVersion): + _sort_first = ["cwlVersion", "version", "$graph"] + + +class DeployCWL(NotKeywordSchema, CWL, UpdateVersion): + _sort_first = ["cwlVersion", "version", "id", "class"] _not = [ CWLGraphBase() ] - id = CWLIdentifier() # required in this case + id = CWLIdentifier() # required in this case, cannot have a version directly as tag, use 'version' field instead class Deploy(OneOfKeywordSchema): @@ -4238,6 +4328,104 @@ class PostProcessesEndpoint(ExtendedMappingSchema): }) +class UpdateInputOutputBase(DescriptionType, InputOutputDescriptionMeta): + pass + + +class UpdateInputOutputItem(InputIdentifierType, UpdateInputOutputBase): + pass + + +class UpdateInputOutputList(ExtendedSequenceSchema): + io_item = UpdateInputOutputItem() + + +class UpdateInputOutputMap(PermissiveMappingSchema): + io_id = UpdateInputOutputBase( + variable="{input-output-id}", + description="Input/Output definition under mapping for process update." 
+ ) + + +class UpdateInputOutputDefinition(OneOfKeywordSchema): + _one_of = [ + UpdateInputOutputMap(), + UpdateInputOutputList(), + ] + + +class PatchProcessBodySchema(UpdateVersion): + title = ExtendedSchemaNode(String(), missing=drop, description=( + "New title to override current one. " + "Minimum required change version level: PATCH." + )) + description = ExtendedSchemaNode(String(), missing=drop, description=( + "New description to override current one. " + "Minimum required change version level: PATCH." + )) + keywords = KeywordList(missing=drop, description=( + "Keywords to add (append) to existing definitions. " + "To remove all keywords, submit an empty list. " + "To replace keywords, perform two requests, one with empty list and the following one with new definitions. " + "Minimum required change version level: PATCH." + )) + metadata = MetadataList(missing=drop, description=( + "Metadata to add (append) to existing definitions. " + "To remove all metadata, submit an empty list. " + "To replace metadata, perform two requests, one with empty list and the following one with new definitions. " + "Relations must be unique across existing and new submitted metadata. " + "Minimum required change version level: PATCH." + )) + links = LinkList(missing=drop, description=( + "Links to add (append) to existing definitions. Relations must be unique. " + "To remove all (additional) links, submit an empty list. " + "To replace links, perform two requests, one with empty list and the following one with new definitions. " + "Note that modifications to links only consider custom links. Other automatically generated links such as " + "API endpoint and navigation references cannot be removed or modified. " + "Relations must be unique across existing and new submitted links. " + "Minimum required change version level: PATCH." + )) + inputs = UpdateInputOutputDefinition(missing=drop, description=( + "Update details of individual input elements. 
" + "Minimum required change version levels are the same as process-level fields of corresponding names." + )) + outputs = UpdateInputOutputDefinition(missing=drop, description=( + "Update details of individual output elements. " + "Minimum required change version levels are the same as process-level fields of corresponding names." + )) + jobControlOptions = JobControlOptionsList(missing=drop, description=( + "New job control options supported by this process for its execution. " + "All desired job control options must be provided (full override, not appending). " + "Order is important to define the default behaviour (first item) to use when unspecified during job execution. " + "Minimum required change version level: MINOR." + )) + outputTransmission = TransmissionModeList(missing=drop, description=( + "New output transmission methods supported following this process execution. " + "All desired output transmission modes must be provided (full override, not appending). " + "Minimum required change version level: MINOR." + )) + visibility = VisibilityValue(missing=drop, description=( + "New process visibility. " + "Minimum required change version level: MINOR." + )) + + +class PutProcessBodySchema(Deploy): + description = "Process re-deployment using an updated version and definition." 
+ + +class PatchProcessEndpoint(LocalProcessPath): + headers = RequestHeaders() + querystring = LocalProcessQuery() + body = PatchProcessBodySchema() + + +class PutProcessEndpoint(LocalProcessPath): + headers = RequestHeaders() + querystring = LocalProcessQuery() + body = PutProcessBodySchema() + + class WpsOutputContextHeader(ExtendedSchemaNode): # ok to use 'name' in this case because target 'key' in the mapping must # be that specific value but cannot have a field named with this format @@ -4258,8 +4446,9 @@ class ExecuteHeaders(RequestHeaders): x_wps_output_context = WpsOutputContextHeader() -class PostProcessJobsEndpoint(ProcessPath): +class PostProcessJobsEndpoint(LocalProcessPath): header = ExecuteHeaders() + querystring = LocalProcessQuery() body = Execute() @@ -4282,8 +4471,10 @@ class GetJobsQueries(ExtendedMappingSchema): description="Maximum duration (seconds) between started time and current/finished time of jobs to find.") datetime = DateTimeInterval(missing=drop, default=None) status = JobStatusEnum(missing=drop, default=None) - processID = ProcessIdentifier(missing=drop, default=null, description="Alias to 'process' for OGC-API compliance.") - process = ProcessIdentifier(missing=drop, default=None, description="Identifier of the process to filter search.") + processID = ProcessIdentifierTag(missing=drop, default=null, + description="Alias to 'process' for OGC-API compliance.") + process = ProcessIdentifierTag(missing=drop, default=None, + description="Identifier of the process to filter search.") service = AnyIdentifier(missing=drop, default=null, description="Alias to 'provider' for backward compatibility.") provider = AnyIdentifier(missing=drop, default=None, description="Identifier of service provider to filter search.") type = JobTypeEnum(missing=drop, default=null, @@ -4295,21 +4486,27 @@ class GetJobsQueries(ExtendedMappingSchema): description="Comma-separated values of tags assigned to jobs") -class GetJobsRequest(ExtendedMappingSchema): - 
header = RequestHeaders() - querystring = GetJobsQueries() +class GetProcessJobsQuery(LocalProcessQuery, GetJobsQueries): + pass -class GetJobsEndpoint(GetJobsRequest): +class GetProviderJobsQueries(GetJobsQueries): # ':version' not allowed for process ID in this case pass -class GetProcessJobsEndpoint(GetJobsRequest, ProcessPath): - pass +class GetJobsEndpoint(ExtendedMappingSchema): + header = RequestHeaders() + querystring = GetProcessJobsQuery() # allowed version in this case since can be either local or remote processes -class GetProviderJobsEndpoint(GetJobsRequest, ProviderPath, ProcessPath): - pass +class GetProcessJobsEndpoint(LocalProcessPath): + header = RequestHeaders() + querystring = GetProcessJobsQuery() + + +class GetProviderJobsEndpoint(ProviderProcessPath): + header = RequestHeaders() + querystring = GetProviderJobsQueries() class JobIdentifierList(ExtendedSequenceSchema): @@ -4325,20 +4522,22 @@ class DeleteJobsEndpoint(ExtendedMappingSchema): body = DeleteJobsBodySchema() -class DeleteProcessJobsEndpoint(DeleteJobsEndpoint, ProcessPath): - pass +class DeleteProcessJobsEndpoint(DeleteJobsEndpoint, LocalProcessPath): + querystring = LocalProcessQuery() -class DeleteProviderJobsEndpoint(DeleteJobsEndpoint, ProviderPath, ProcessPath): +class DeleteProviderJobsEndpoint(DeleteJobsEndpoint, ProviderProcessPath): pass -class GetProcessJobEndpoint(ProcessPath): +class GetProcessJobEndpoint(LocalProcessPath): header = RequestHeaders() + querystring = LocalProcessQuery() -class DeleteProcessJobEndpoint(ProcessPath): +class DeleteProcessJobEndpoint(LocalProcessPath): header = RequestHeaders() + querystring = LocalProcessQuery() class BillsEndpoint(ExtendedMappingSchema): @@ -4349,12 +4548,14 @@ class BillEndpoint(BillPath): header = RequestHeaders() -class ProcessQuotesEndpoint(ProcessPath): +class ProcessQuotesEndpoint(LocalProcessPath): header = RequestHeaders() + querystring = LocalProcessQuery() -class ProcessQuoteEndpoint(ProcessPath, QuotePath): +class 
ProcessQuoteEndpoint(LocalProcessPath, QuotePath): header = RequestHeaders() + querystring = LocalProcessQuery() class GetQuotesQueries(ExtendedMappingSchema): @@ -4373,8 +4574,9 @@ class QuoteEndpoint(QuotePath): header = RequestHeaders() -class PostProcessQuote(ProcessPath, QuotePath): +class PostProcessQuote(LocalProcessPath, QuotePath): header = RequestHeaders() + querystring = LocalProcessQuery() body = NoContent() @@ -4387,8 +4589,9 @@ class QuoteProcessParametersSchema(ExecuteInputOutputs): pass -class PostProcessQuoteRequestEndpoint(ProcessPath, QuotePath): +class PostProcessQuoteRequestEndpoint(LocalProcessPath, QuotePath): header = RequestHeaders() + querystring = LocalProcessQuery() body = QuoteProcessParametersSchema() @@ -4434,6 +4637,21 @@ class ProcessDetailQuery(ExtendedMappingSchema): ) +class ProcessRevisionsQuery(ExtendedMappingSchema): + process = ProcessIdentifier(missing=drop, description=( + "Process ID (excluding version) for which to filter results. " + "When combined with 'revisions=true', allows listing of all revisions of a given process. " + "If omitted when 'revisions=true', all revisions of every process ID will be returned. " + "If used without 'revisions' query, list should include a single process as if summary was requested directly." + )) + revisions = ExtendedSchemaNode( + QueryBoolean(), example=True, default=False, missing=drop, description=( + "Return all revisions of processes, or simply their latest version. When returning all revisions, " + "IDs will be replaced by '{processID}:{version}' tag representation to avoid duplicates."
+ ) + ) + + class ProviderProcessesQuery(ProcessPagingQuery, ProcessDetailQuery): pass @@ -4483,10 +4701,24 @@ class OWSExceptionResponse(ExtendedMappingSchema): description="Specific description of the error.") +class ErrorCause(OneOfKeywordSchema): + _one_of = [ + ExtendedSchemaNode(String(), description="Error message from exception or cause of failure."), + PermissiveMappingSchema(description="Relevant error fields with details about the cause."), + ] + + class ErrorJsonResponseBodySchema(ExtendedMappingSchema): - code = OWSErrorCode() - description = ExtendedSchemaNode(String(), description="Detail about the cause of error.") + schema_ref = f"{OGC_API_SCHEMA_URL}/{OGC_API_SCHEMA_VERSION}/core/openapi/schemas/exception.yaml" + description = "JSON schema for exceptions based on RFC 7807" + type = OWSErrorCode() + title = ExtendedSchemaNode(String(), description="Short description of the error.", missing=drop) + detail = ExtendedSchemaNode(String(), description="Detail about the error cause.", missing=drop) + status = ExtendedSchemaNode(Integer(), description="Error status code.", example=400) + cause = ErrorCause(missing=drop) + value = ErrorCause(missing=drop) error = ErrorDetail(missing=drop) + instance = ExtendedSchemaNode(String(), missing=drop) exception = OWSExceptionResponse(missing=drop) @@ -4496,12 +4728,24 @@ class BadRequestResponseSchema(ExtendedMappingSchema): body = ErrorJsonResponseBodySchema() +class ConflictRequestResponseSchema(ExtendedMappingSchema): + description = "Conflict between the affected entity and another existing definition." + header = ResponseHeaders() + body = ErrorJsonResponseBodySchema() + + class UnprocessableEntityResponseSchema(ExtendedMappingSchema): description = "Wrong format of given parameters." header = ResponseHeaders() body = ErrorJsonResponseBodySchema() +class UnsupportedMediaTypeResponseSchema(ExtendedMappingSchema): + description = "Media-Type not supported for this request."
+ header = ResponseHeaders() + body = ErrorJsonResponseBodySchema() + + class ForbiddenProcessAccessResponseSchema(ExtendedMappingSchema): description = "Referenced process is not accessible." header = ResponseHeaders() @@ -4583,7 +4827,7 @@ class OkGetProviderProcessesSchema(ExtendedMappingSchema): body = ProviderProcessesSchema() -class GetProcessesQuery(ProcessPagingQuery, ProcessDetailQuery): +class GetProcessesQuery(ProcessPagingQuery, ProcessDetailQuery, ProcessRevisionsQuery): providers = ExtendedSchemaNode( QueryBoolean(), example=True, default=False, missing=drop, description="List local processes as well as all sub-processes of all registered providers. " @@ -4637,6 +4881,7 @@ class OkGetProcessesListResponse(ExtendedMappingSchema): class OkPostProcessDeployBodySchema(ExtendedMappingSchema): + description = ExtendedSchemaNode(String(), description="Detail about the operation.") deploymentDone = ExtendedSchemaNode(Boolean(), default=False, example=True, description="Indicates if the process was successfully deployed.") processSummary = ProcessSummary(missing=drop, description="Deployed process summary if successful.") @@ -4650,11 +4895,34 @@ class OkPostProcessesResponse(ExtendedMappingSchema): body = OkPostProcessDeployBodySchema() +class OkPatchProcessUpdatedBodySchema(ExtendedMappingSchema): + description = ExtendedSchemaNode(String(), description="Detail about the operation.") + processSummary = ProcessSummary(missing=drop, description="Deployed process summary if successful.") + + +class OkPatchProcessResponse(ExtendedMappingSchema): + description = "Process successfully updated." + header = ResponseHeaders() + body = OkPatchProcessUpdatedBodySchema() + + class BadRequestGetProcessInfoResponse(ExtendedMappingSchema): description = "Missing process identifier." body = NoContent() +class NotFoundProcessResponse(ExtendedMappingSchema): + description = "Process with specified reference identifier does not exist." 
+ examples = { + "ProcessNotFound": { + "summary": "Example response when specified process reference cannot be found.", + "value": EXAMPLES["local_process_not_found.json"] + } + } + header = ResponseHeaders() + body = ErrorJsonResponseBodySchema() + + class OkGetProcessInfoResponse(ExtendedMappingSchema): header = ResponseHeaders() body = ProcessDescription() @@ -5080,6 +5348,23 @@ class GoneVaultFileDownloadResponse(ExtendedMappingSchema): "value": EXAMPLES["local_process_deploy_success.json"], } }), + "400": BadRequestResponseSchema(description="Unable to parse process definition."), + "409": ConflictRequestResponseSchema(description="Process with same ID already exists."), + "415": UnsupportedMediaTypeResponseSchema(description="Unsupported Media-Type for process deployment."), + "422": UnprocessableEntityResponseSchema(description="Invalid schema for process definition."), + "500": InternalServerErrorResponseSchema(), +} +put_process_responses = copy(post_processes_responses) +put_process_responses.update({ + "404": NotFoundProcessResponse(description="Process to update could not be found."), + "409": ConflictRequestResponseSchema(description="Process with same ID or version already exists."), +}) +patch_process_responses = { + "200": OkPatchProcessResponse(), + "400": BadRequestGetProcessInfoResponse(description="Unable to parse process definition."), + "404": NotFoundProcessResponse(description="Process to update could not be found."), + "409": ConflictRequestResponseSchema(description="Process with same ID or version already exists."), + "422": UnprocessableEntityResponseSchema(description="Invalid schema for process definition."), "500": InternalServerErrorResponseSchema(), } get_process_responses = { @@ -5096,6 +5381,7 @@ class GoneVaultFileDownloadResponse(ExtendedMappingSchema): } }), "400": BadRequestGetProcessInfoResponse(), + "404": NotFoundProcessResponse(), "500": InternalServerErrorResponseSchema(), } get_process_package_responses = { diff --git 
a/weaver/wps_restapi/utils.py b/weaver/wps_restapi/utils.py index dc4123523..78a0f2770 100644 --- a/weaver/wps_restapi/utils.py +++ b/weaver/wps_restapi/utils.py @@ -5,15 +5,25 @@ from typing import TYPE_CHECKING import colander -from pyramid.httpexceptions import HTTPBadRequest, HTTPSuccessful, status_map - +import yaml +from pyramid.httpexceptions import ( + HTTPBadRequest, + HTTPInternalServerError, + HTTPSuccessful, + HTTPUnprocessableEntity, + HTTPUnsupportedMediaType, + status_map +) + +from weaver.formats import repr_json from weaver.utils import get_header, get_settings, get_weaver_url from weaver.wps_restapi import swagger_definitions as sd if TYPE_CHECKING: - from typing import Any, Callable, Dict, Optional + from typing import Any, Callable, Dict, Optional, Union - from weaver.typedefs import AnySettingsContainer, HeadersType + from weaver.formats import ContentType + from weaver.typedefs import CWL, JSON, AnyRequestType, AnySettingsContainer, HeadersType LOGGER = logging.getLogger(__name__) @@ -133,7 +143,7 @@ def handle_schema_validation(schema=None): :param schema: If provided, document this schema as the reference of the failed schema validation. :raises HTTPBadRequest: If any schema validation error occurs when handling the decorated function. """ - def decorator(func): + def decorator(func): # type: (Callable[[Any, Any], Any]) -> Callable[[Any, Any], Any] @functools.wraps(func) def wrapped(*args, **kwargs): try: @@ -153,3 +163,61 @@ def wrapped(*args, **kwargs): raise HTTPBadRequest(json=data) return wrapped return decorator + + +def parse_content(request=None, # type: Optional[AnyRequestType] + content=None, # type: Optional[Union[JSON, str]] + content_schema=None, # type: Optional[colander.SchemaNode] + content_type=sd.RequestContentTypeHeader.default, # type: Optional[ContentType] + content_type_schema=sd.RequestContentTypeHeader, # type: Optional[colander.SchemaNode] + ): # type: (...) 
-> Union[JSON, CWL] + """ + Load the request content with validation of the expected content type and schema. + """ + if request is None and content is None: # pragma: no cover # safeguard to detect invalid implementation early + raise HTTPInternalServerError(json={ + "title": "Internal Server Error", + "type": "InternalServerError", + "detail": "Cannot parse undefined contents.", + "status": HTTPInternalServerError.code, + "cause": "Request content and content argument are undefined.", + }) + try: + if request is not None: + content = request.text + content_type = request.content_type + if content_type is not None and content_type_schema is not None: + content_type = content_type_schema().deserialize(content_type) + if isinstance(content, str): + content = yaml.safe_load(content) + if not isinstance(content, dict): + raise TypeError("Not a valid JSON body for process deployment.") + except colander.Invalid as exc: + raise HTTPUnsupportedMediaType(json={ + "title": "Unsupported Media Type", + "type": "UnsupportedMediaType", + "detail": str(exc), + "status": HTTPUnsupportedMediaType.code, + "cause": {"Content-Type": None if content_type is None else str(content_type)}, + }) + except Exception as exc: + raise HTTPBadRequest(json={ + "title": "Bad Request", + "type": "BadRequest", + "detail": "Unable to parse contents.", + "status": HTTPBadRequest.code, + "cause": str(exc), + }) + try: + if content_schema is not None: + content = content_schema().deserialize(content) + except colander.Invalid as exc: + raise HTTPUnprocessableEntity(json={ + "type": "InvalidParameterValue", + "title": "Failed schema validation.", + "status": HTTPUnprocessableEntity.code, + "error": colander.Invalid.__name__, + "cause": exc.msg, + "value": repr_json(exc.value, force_string=False), + }) + return content
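The control flow of ``parse_content`` can be illustrated with a simplified, stdlib-only sketch: decode the body, then validate it, mapping each failure mode to the corresponding HTTP status. This illustration uses ``json.loads`` instead of ``yaml.safe_load`` and a plain callable instead of a ``colander`` schema, so it is an approximation of the helper above, not its actual behaviour:

```python
import json


def parse_body(text, validate=None):
    """Return (status, payload): 200 on success, 400/422 with an error message otherwise."""
    try:
        body = json.loads(text)
        if not isinstance(body, dict):
            # mirrors the TypeError raised for non-mapping bodies
            raise TypeError("Not a valid JSON body for process deployment.")
    except Exception as exc:
        return 400, str(exc)  # maps to HTTPBadRequest
    if validate is not None:
        try:
            body = validate(body)
        except ValueError as exc:
            return 422, str(exc)  # maps to HTTPUnprocessableEntity
    return 200, body
```
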