Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add ability to choose ingress creation for Kserve RawDeployment #1230

Open
wants to merge 2 commits into
base: incubation
Choose a base branch
from

Conversation

VedantMahabaleshwarkar
Copy link
Contributor

@VedantMahabaleshwarkar VedantMahabaleshwarkar commented Sep 12, 2024

Description

Fixes : https://issues.redhat.com/browse/RHOAIENG-10291

  • Remove the logic to set the disableIngressCreation flag in inferenceservice-config
    • This is now done in the Kserve Manifests itself, the flag will always be true
  • Added a field in the kserve spec for disableIngressCreation. Default is true, if user has set a value then that value will be used.
  • Set the ingressDomain in inferenceservice-config to the clusterDomain

Merge after opendatahub-io/odh-model-controller#274 and opendatahub-io/kserve#419

How Has This Been Tested?

  • Deploy custom operator+kserve+odh-model-controller:

  • Using custom fork+branch to be able to apply the CRD changes along with the custom operator image
    -- git clone [email protected]:VedantMahabaleshwarkar/opendatahub-operator.git
    -- cd opendatahub-operator
    -- git checkout origin/j-10291
    -- make deploy -e IMG=quay.io/vedantm/opendatahub-operator:latest -e OPERATOR_NAMESPACE=opendatahub-operator-system

  • To test Inference with auth:
    -- Create any kserve isvc+SR
    -- isvc should have annotation : "serving.kserve.io/deploymentMode" : RawDeployment
    -- isvc should have label: "networking.kserve.io/odh-auth": "true"

  • To test Inference without auth:
    -- remove isvc label "networking.kserve.io/odh-auth": "true"

  • To test disabling route for a particular ISVC
    -- Create any kserve isvc+SR
    -- isvc should have annotation : "serving.kserve.io/deploymentMode" : RawDeployment
    -- isvc should have label : "networking.kserve.io/visibility": "cluster-local"

Screenshot or short clip

Merge criteria

  • You have read the contributors guide.
  • Commit messages are meaningful - have a clear and concise summary and detailed explanation of what was changed and why.
  • Pull Request contains a description of the solution, a link to the JIRA issue, and to any dependent or related Pull Request.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Copy link

openshift-ci bot commented Sep 12, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link

openshift-ci bot commented Sep 13, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zdtsw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zdtsw
Copy link
Member

zdtsw commented Sep 13, 2024

i put this on hold because of the dependent PRs are not merged yet.

Copy link

openshift-ci bot commented Oct 9, 2024

New changes are detected. LGTM label has been removed.

@VedantMahabaleshwarkar VedantMahabaleshwarkar changed the title set clusterDomain as ingressDomain for Kserve RawDeployment Routes add ability to choose ingress creation for Kserve RawDeployment Oct 9, 2024
@codecov-commenter
Copy link

codecov-commenter commented Oct 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (incubation@294cc1e). Learn more about missing BASE report.

Additional details and impacted files
@@              Coverage Diff              @@
##             incubation    #1230   +/-   ##
=============================================
  Coverage              ?   18.91%           
=============================================
  Files                 ?       30           
  Lines                 ?     2670           
  Branches              ?        0           
=============================================
  Hits                  ?      505           
  Misses                ?     2103           
  Partials              ?       62           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment. Otherwise LGTM.

components/kserve/kserve_config_handler.go Show resolved Hide resolved
components/kserve/kserve_config_handler.go Show resolved Hide resolved
@@ -25,18 +25,22 @@ const (
func (k *Kserve) setupKserveConfig(ctx context.Context, cli client.Client, logger logr.Logger, dscispec *dsciv1.DSCInitializationSpec) error {
// as long as Kserve.Serving is not 'Removed', we will setup the dependencies

var defaultDeploymentMode DefaultDeploymentMode
var disableIngressCreation bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is misleading since it states disable but can mean both yes and no. It should be isDisable or something which indicates it's a flag (only variable name in the code).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the name is confusing but since this variable is setting the field ingress["disableIngressCreation"] that is why I named it to match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it should be something like disableIngressCreationValue

@@ -25,18 +25,22 @@ const (
func (k *Kserve) setupKserveConfig(ctx context.Context, cli client.Client, logger logr.Logger, dscispec *dsciv1.DSCInitializationSpec) error {
// as long as Kserve.Serving is not 'Removed', we will setup the dependencies

var defaultDeploymentMode DefaultDeploymentMode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it selects mode from default to Serverless, the name may be misleading. Probably, it is developmentMode now? I do not remember details where the default mode can be changed, so ignore if I'm wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the field is called deploymentMode in Kserve so we are keeping the terminology consistent. The options for deploymentMode are RawDeployment or Serverless

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, you confirmed that variable defaultDeploymentMode should be called deploymentMode.

@@ -54,6 +54,8 @@ type Kserve struct {
// This field is optional. If no default deployment mode is specified, Kserve will use Serverless mode.
// +kubebuilder:validation:Enum=Serverless;RawDeployment
DefaultDeploymentMode DefaultDeploymentMode `json:"defaultDeploymentMode,omitempty"`
// +kubebuilder:validation:Type:bool
DisableIngressCreation bool `json:"disableIngressCreation,omitempty"`
Copy link
Contributor

@lburgazzoli lburgazzoli Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the Open Shift API guidelines, boolean should be avoided if not strictly necessary.

To follow other conventions, eventually this option could be defined as an enum, like ingress: Managed|Unmanaged with its default value set of Managed, which to me seems to be more clear and avoid using negative names

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it to be a enum

…et cluster domain

Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
// Removed: Route creation is disabled globally. In this configuration it is not possible for an individual
// InferenceService to opt in to having a Route created automatically.
// +kubebuilder:validation:Enum=Managed;Removed
RawRouteCreation operatorv1.ManagementState `json:"rawRouteCreation,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, expose the configuration with the same name as in our ConfigMap? I'm not seeing beneficial renaming it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggestion was made here,

disclaimer: I don't have any strong opinion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IngressCreationState ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well... we already started by using the same name with defaultDeploymentMode one. So, I'd simply lean towards using the same names as in the ConfigMap.

Additionally, sometimes we recommend manually modifying our ConfigMap. So, having a different name than in the ConfigMap will be confusing.

But yes, I think it is quite unfortunate that KServe project is using booleans, while OpenShift recommendation is to not use them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, sometimes we recommend manually modifying our ConfigMap. So, having a different name than in the ConfigMap will be confusing.

What would be the expected behavior in this case ? I think that if we make such option visible through the platform APIs, then the config map should not be touched by the end user otherwise it would become very confusing (IMHO)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically, the ConfigMap is going to be rolled back by the operator. So, in the default state, it is true that the user cannot touch the ConfigMap.

If the users needs to do manual changes, a special annotation is required: opendatahub.io/managed. With the annotation, the operator will no longer touch the ConfigMap and these fields in the DSC will be useless. Even upgrades won't receive the updated manifests. Currently, docs do not mention the annotation, so IMO this is only for special cases. I do agree that having two ways of configuration is confusing.

I do not know what are the future plans around the annotation, TBH. I simply know that if we ever document it, we will need to maintain in our docs what's our supported ConfigMap, and what we support users from modifying.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielezonca @terrytangyuan any thoughts here? my initial preference was to match the upstream configmap but then we'd be violating openshift API guidelines. I don't have a strong preference either way right now

}
ingressData["ingressDomain"] = clusterDomain
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd think that we should always ensure that the domain is correctly configured. So, probably this should be outside the if.

@@ -25,18 +25,18 @@ const (
func (k *Kserve) setupKserveConfig(ctx context.Context, cli client.Client, logger logr.Logger, dscispec *dsciv1.DSCInitializationSpec) error {
// as long as Kserve.Serving is not 'Removed', we will setup the dependencies

defaultDeploymentMode := k.DefaultDeploymentMode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In reality, this is changing the outcome and doesn't look right.
Previously

  • When Knative is available:
    • If the user does not specify a default deployment mode:
      • We configure Serverless as the default mode.
      • We enable creation of the ingress.
    • If the user specifies a default mode:
      • We respect the user specified default deployment mode.
      • We disable creation of the ingress if the user specified Raw. Otherwise, we enable it.
  • When Knative is NOT available:
    • If the user does not specify a default deployment mode:
      • We configure Raw as the default mode.
      • We disable creation of the ingress.
    • If the user specifies a default mode:
      • If the user specified Serverless as default mode, we stop reconciliation with an error.
      • Because of the previous, the user can only specify Raw. So, we use Raw.
      • We disable creation of the ingress.

Now

  • When Knative is available:
    • If the user does not specify a default deployment mode:
      • We configure Serverless as the default mode.
      • We disable the ingress if the user didn't specify a config. Otherwise, we use the user config.
        • The default is different from the previous state, as there is no conditional.
    • If the user specifies a default mode:
      • We respect the user specified default deployment mode.
      • We disable the ingress if the user didn't specify a config. Otherwise, we use the user config.
        • The default is different from the previous state, as there is no conditional.
  • When Knative is NOT available:
    • If the user does not specify a default deployment mode:
      • We take Serverless as the default mode. Then, we return an error with our own default.
        • This is quite wrong, IMO.
    • If the user specifies a default mode:
      • If the user specified Serverless as default mode, we stop reconciliation with an error.
      • Because of the previous, the user can only specify Raw. So, we use Raw.
      • We disable the ingress if the user didn't specify a config. Otherwise, we use the user config.
        • This one looks OK.

Generally speaking, any upgrade of a previous install will receive an updated config which, IMO, shouldn't happen. We should be backwards compatible here.

defaultDeploymentMode = Serverless
}
disableIngressCreation := true
if k.RawRouteCreation == operatorv1.Managed {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would happen(assume the new field is called "rawRouteCreation" for now) , when user set

kserve:
      managementState: Managed
      defaultDeploymentMode: RawDeployment
      rawRouteCreation: Removed
      serving:
        ingressGateway:
          certificate:
            type: OpenshiftDefaultIngress
        managementState: Managed
        name: knative-serving

@zdtsw zdtsw removed the approved label Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

7 participants