Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sdk): fixes trusted host is the same as index url. Fixes #10743 #10720

Closed
wants to merge 1 commit into from

Conversation

diegolovison
Copy link
Contributor

@diegolovison diegolovison commented Apr 22, 2024

Description of your changes:

Create a pipeline with the following

@dsl.component(base_image=common_base_image, packages_to_install=['pandas'], pip_index_urls=['https://my-bastion.node.com:9443/root/pypi/simple'])

If you compile the code, the output will be

pip install --quiet --no-warn-script-location --index-url https://my-bastion.node.com:9443/root/pypi/simple --trusted-host https://my-bastion.node.com:9443/root/pypi/simple 'pandas'
{noformat}
This will lead to an error
{noformat}
Traceback (most recent call last):
  File "/home/dlovison/miniconda3/envs/pipe-presentation/bin/pip", line 11, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/cli/main.py", line 79, in main
    return command.main(cmd_args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/cli/base_command.py", line 101, in main
    return self._main(args)
           ^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/cli/base_command.py", line 236, in _main
    self.handle_pip_version_check(options)
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/cli/req_command.py", line 177, in handle_pip_version_check
    session = self._build_session(
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/cli/req_command.py", line 122, in _build_session
    session = PipSession(
              ^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/network/session.py", line 398, in __init__
    self.add_trusted_host(host, suppress_logging=True)
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/network/session.py", line 422, in add_trusted_host
    parsed_host, parsed_port = parse_netloc(host)
                               ^^^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/utils/misc.py", line 475, in parse_netloc
    parsed = urllib.parse.urlparse(url)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/urllib/parse.py", line 395, in urlparse
    splitresult = urlsplit(url, scheme, allow_fragments)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/urllib/parse.py", line 497, in urlsplit
    raise ValueError("Invalid IPv6 URL")
ValueError: Invalid IPv6 URL

 

We believe that the problem is that the --trusted-host parameter value should not start with [https://|https://,/] , as it is supposed to be just hostname. So, the output provided by the SDK should be instead:

pip install --quiet --no-warn-script-location --index-url https://my-bastion.node.com:9443/root/pypi/simple --trusted-host my-bastion.node.com 'pandas'

The workaround is to have a pip server only with protocol:URL ( without any port and path ). Example {{pip_index_urls=['https://my-bastion.node.com']}}

Checklist:

Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign connor-mccarthy for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rimolive
Copy link
Member

@diegolovison This looks like that tests need to be updated with that change. Can you take a look?

@google-oss-prow google-oss-prow bot added size/S and removed size/XS labels Apr 24, 2024
@diegolovison diegolovison changed the title Output trusted host as hostname fix(sdk): fixes trusted host is the same as index url. Fixes #10743 Apr 24, 2024
@diegolovison
Copy link
Contributor Author

Two errors from the test suite. I don't see the relationship with my PR

====================================================================================================================== short test summary info =======================================================================================================================
FAILED sdk/python/kfp/cli/cli_test.py::TestDslCompile::test_deprecation_warning - FileNotFoundError: [Errno 2] No such file or directory: 'dsl-compile'
FAILED sdk/python/kfp/compiler/compiler_test.py::TestReadWriteEquality::test_deprecation_warning - FileNotFoundError: [Errno 2] No such file or directory: 'dsl-compile'
====================================================================================================== 2 failed, 1250 passed, 348 warnings in 297.49s (0:04:57) ======================================================================================================

@diegolovison
Copy link
Contributor Author

@connor-mccarthy Hi, could you please help us review the PR? Thanks

Copy link

google-oss-prow bot commented May 9, 2024

@diegolovison: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
kubeflow-pipelines-sdk-yapf d0c2597 link true /test kubeflow-pipelines-sdk-yapf
kfp-kubernetes-execution-tests d0c2597 link false /test kfp-kubernetes-execution-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@github-actions github-actions bot added the Stale label Jul 9, 2024
Copy link
Member

@chensun chensun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please describe why the change is needed? Is it due to changes from pip side? What kind of testing has been done? (IIRC, I don't think we have automated test covering this)

@github-actions github-actions bot removed the Stale label Jul 31, 2024
@diegolovison
Copy link
Contributor Author

Hi @chensun, here is the description of the problem.

Create a pipeline with the following

@dsl.component(base_image=common_base_image, packages_to_install=['pandas'], pip_index_urls=['https://my-bastion.node.com:9443/root/pypi/simple'])

If you compile the code, the output will be

pip install --quiet --no-warn-script-location --index-url https://my-bastion.node.com:9443/root/pypi/simple --trusted-host https://my-bastion.node.com:9443/root/pypi/simple 'pandas'
{noformat}
This will lead to an error
{noformat}
Traceback (most recent call last):
  File "/home/dlovison/miniconda3/envs/pipe-presentation/bin/pip", line 11, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/cli/main.py", line 79, in main
    return command.main(cmd_args)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/cli/base_command.py", line 101, in main
    return self._main(args)
           ^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/cli/base_command.py", line 236, in _main
    self.handle_pip_version_check(options)
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/cli/req_command.py", line 177, in handle_pip_version_check
    session = self._build_session(
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/cli/req_command.py", line 122, in _build_session
    session = PipSession(
              ^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/network/session.py", line 398, in __init__
    self.add_trusted_host(host, suppress_logging=True)
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/network/session.py", line 422, in add_trusted_host
    parsed_host, parsed_port = parse_netloc(host)
                               ^^^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/site-packages/pip/_internal/utils/misc.py", line 475, in parse_netloc
    parsed = urllib.parse.urlparse(url)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/urllib/parse.py", line 395, in urlparse
    splitresult = urlsplit(url, scheme, allow_fragments)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dlovison/miniconda3/envs/pipe-presentation/lib/python3.11/urllib/parse.py", line 497, in urlsplit
    raise ValueError("Invalid IPv6 URL")
ValueError: Invalid IPv6 URL

 

We believe that the problem is that the --trusted-host parameter value should not start with [https://|https://,/] , as it is supposed to be just hostname. So, the output provided by the SDK should be instead:

pip install --quiet --no-warn-script-location --index-url https://my-bastion.node.com:9443/root/pypi/simple --trusted-host my-bastion.node.com 'pandas'

The workaround is to have a pip server only with protocol:URL ( without any port and path ). Example {{pip_index_urls=['https://my-bastion.node.com']}}

It has been tracked internally this is why I forgot to mention the description here.
I also updated the description of the PR.

For having a testing like that we need to have a pip server configured with https. Do you believe we should start having this kind of testing?

@gregsheremeta
Copy link
Contributor

We believe that the problem is that the --trusted-host parameter value should not start with [https://|https://,/] , as it is supposed to be just hostname.

Is this documented somewhere?

@diegolovison
Copy link
Contributor Author

Hi @gregsheremeta, thanks for asking.

If you type: pip install --help you will see the following output

--trusted-host <hostname>   Mark this host or host:port pair as trusted, even though it does not have valid or any HTTPS.

Signed-off-by: Diego Lovison <[email protected]>
@HumairAK
Copy link
Collaborator

@chensun does the above satisfy your concerns? could we get another look?

options.extend(
f'--extra-index-url {extra_index_url} --trusted-host {extra_index_url}'
f'--extra-index-url {extra_index_url} --trusted-host {urllib.parse.urlparse(extra_index_url).hostname}'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider the hostname as good enough? No need to specify also the port in case it's present in the original URL? In other words - we trust all apps on that domain in such case, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider the hostname as good enough?

Yes, it is sufficient.

No need to specify also the port in case it's present in the original URL?

Correct. The fix doesn't involve the port number.

In other words - we trust all apps on that domain in such case, right?

Could you please rephrase your question?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That all was basically one question only - whether we feel confident and secure enough if we put just the hostname without the port here. IIUC the way it is done now means we trust all the services running on that hostname/domain. If we put also port number we would give our approval to only that one and only service that is listening on that one and only port.

Determining the port may be a bit tricky, though, since it may not be present in the URL explicitly but may need to be determined based on the default port of appropriate protocol in use in that URL. Though, we expect just HTTP(S) here, so it should be easy in the end, I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jstourac please double-check #11151

@diegolovison
Copy link
Contributor Author

Closed in favor of #11151

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants