Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WOR-1781][WOR-1782] APIs to manage workspace settings #2983

Merged
merged 25 commits into from
Aug 16, 2024

Conversation

marctalbott
Copy link
Contributor

@marctalbott marctalbott commented Aug 8, 2024

Ticket: WOR-1781, WOR-1782

  • Add workspace settings APIs with support for setting lifecycle rules on GCP workspace buckets

PR checklist

  • Include the JIRA issue number in the PR description and title
  • Make sure Swagger is updated if API changes
    • ...and Orchestration's Swagger too!
  • If you changed anything in model/, then you should publish a new official rawls-model and update rawls-model in Orchestration's dependencies.
  • Get two thumbsworth of PR review
  • Verify all tests go green, including CI tests
  • Squash commits and merge to develop (branches are automatically deleted after merging)
  • Inform other teams of any substantial changes via Slack and/or email

@marctalbott marctalbott requested review from a team, blakery and aherbst-broad and removed request for a team August 8, 2024 21:30
Copy link
Contributor

@aednichols aednichols left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this looks awesome.

I'm currently working on making Rawls instruct Cromwell to segregate workflow files into submissions/intermediates and submissions/final-outputs.

Do you think it makes sense to enable that behavior with a setting instead of forcing it on everyone?

{
    "type": "SegregateFinalOutputs",
    "config": {
        "enable_segregation": true
    }
}

@marctalbott
Copy link
Contributor Author

@aednichols You certainly could do that with settings! You would know better than me whether it should be forced on everyone, but if you expect that users may prefer different segregation behavior then adding a setting type may be the best way to go about that.

Copy link
Contributor

@aherbst-broad aherbst-broad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Can you describe your thinking around handling concurrent requests? I know we talked a bit about a constraint etc. but I don't see evidence of that here -- it'd be good to go in knowing how this will behave in the face of simultaneous requests to mutate settings.

@@ -99,4 +102,55 @@ class WorkspaceRepository(dataSource: SlickDataSource) {
} yield newWorkspace
}

def getWorkspaceSettings(workspaceId: UUID): Future[List[WorkspaceSetting]] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these methods could use comments, esp. to explain the semantics around only returning Applied, the expected calling pattern etc.

@@ -2001,6 +2005,115 @@ class WorkspaceService(
}
} yield {}

def getWorkspaceSettings(workspaceName: WorkspaceName): Future[List[WorkspaceSetting]] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about placing these methods in a more focused WorkspaceSettingsService class or some such thing -- this service is already very large.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered it, but I ultimately didn't as I felt that managing settings belonged in the WorkspaceService as much as anything else. Still, I'm not opposed to it and it would certainly make changing it easier given how much Intellij struggles with the WorkspaceService

} else {
val requestedSettingsMap = requestedSettings.map(s => s.`type` -> s.config).toMap
existingSettings.map { case WorkspaceSetting(settingType, _) =>
WorkspaceSetting(settingType, requestedSettingsMap.getOrElse(settingType, settingType.defaultConfig()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean a previously configured setting will assume the default on any new settings request, regardless if it's included or not? I think it should retain the existing setting, if that's the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it does. My thinking was that this would be a simple overwrite -- any settings provided become the new settings on the workspace and old settings are reverted to the default if not respecified. It's pretty heavy-handed and requires the caller to do more work, so I'm definitely not attached to it.

case _ => throw new RawlsException("unsupported workspace setting")
}).recoverWith { case e =>
workspaceRepository
.removePendingSetting(workspace.workspaceIdAsUUID, setting.`type`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can e be logged somewhere? It looks like it's being swallowed which will make diagnosing issues here difficult. Agreed it should not be surfaced in the error report.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, I'll log it


for {
_ <- gcsDAO.setBucketLifecycle(workspace.bucketName, googleRules)
_ <- workspaceRepository.markWorkspaceSettingApplied(workspace.workspaceIdAsUUID, setting.`type`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making sure I understand: not only will this shift the newly inserted setting record from PENDING -> APPLIED, it will mark any previous settings of this type for the workspace from APPLIED -> DELETED. Is that right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct

case WorkspaceSetting(WorkspaceSettingTypes.GcpBucketLifecycle, GcpBucketLifecycleConfig(rules)) =>
val googleRules = rules.map { rule =>
val conditionBuilder =
LifecycleCondition.newBuilder().setMatchesPrefix(rule.conditions.matchesPrefix.toList.asJava)
Copy link
Contributor

@aherbst-broad aherbst-broad Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible rule, or rule.conditions or rule.conditions.matchesPrefix are null here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we constructed setting here by parsing the json and none of those fields are optional. We'd throw a NoSuchElementException if any of them were null. I'll add some tests that assert that similar to this test that checks for the presence of config

case actionType => Some(validationErrorReport(settingType, s"unsupported lifecycle action $actionType"))
}
val ageValidation = rule.conditions.age.collect {
case age if age < 0 =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be <= 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you can set age to 0 to delete immediately. We actually do it when deleting a bucket - deleteBucket


case class GcpBucketLifecycleRule(action: GcpBucketLifecycleAction, conditions: GcpBucketLifecycleCondition)
case class GcpBucketLifecycleAction(`type`: String)
case class GcpBucketLifecycleCondition(matchesPrefix: Set[String], age: Option[Int])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can age be more strongly typed instead of an Int? I worry about mixing up seconds/hours/days etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could add type Days = Int to the object so this would read as:

case class GcpBucketLifecycleCondition(matchesPrefix: Set[String], age: Option[Days])

Everything else would remain the same as it's functionally unchanged, but it would make it clearer to future devs

case class GcpBucketLifecycleConfig(rules: List[GcpBucketLifecycleRule]) extends WorkspaceSettingConfig

case class GcpBucketLifecycleRule(action: GcpBucketLifecycleAction, conditions: GcpBucketLifecycleCondition)
case class GcpBucketLifecycleAction(`type`: String)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I would prefer not to have to backtick every usage of this param, but I am lazy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. I avoided it initially with WorkspaceSetting because I'm also not a fan, but then I wanted this to mimic the Google lifecycle rules format as closely as possible so I ended up caving and using backticks for both since they use type.

enum:
- Delete
GcpBucketLifecycleCondition:
type: object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are any of these properties required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either can be omitted, but we do need at least one now that I think about it which isn't accounted for. I'll change that.

@marctalbott
Copy link
Contributor Author

Looking good! Can you describe your thinking around handling concurrent requests? I know we talked a bit about a constraint etc. but I don't see evidence of that here -- it'd be good to go in knowing how this will behave in the face of simultaneous requests to mutate settings.

@aherbst-broad Certainly. WorkspaceRepository checks for existing pending settings before saving any new settings at the beginning of the request (source). If there are any pending settings, it will throw an exception. Your question did get me thinking that we could still be at risk of phantom reads and end up with multiple Pending settings, so perhaps we should change the isolation level to Serializable. I'm not super familiar with isolation levels, but does that sound reasonable to you?

@marctalbott marctalbott marked this pull request as ready for review August 9, 2024 15:34
Copy link
Contributor

@blakery blakery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good with the repo split out

import org.broadinstitute.dsde.rawls.billing.BillingRepository
import org.broadinstitute.dsde.rawls.model.WorkspaceSettingConfig._
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these imports actually needed?
I don't see any code changes, so I'm wondering of maybe they're left over from changes that were moved elsewhere?

@@ -3,18 +3,23 @@ package org.broadinstitute.dsde.rawls.workspace
import akka.http.scaladsl.model.StatusCodes
import org.broadinstitute.dsde.rawls.RawlsExceptionWithErrorReport
import org.broadinstitute.dsde.rawls.dataaccess.SlickDataSource
import org.broadinstitute.dsde.rawls.dataaccess.slick.WorkspaceSettingRecord
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: unused imports from when the settings were a part of this class

@marctalbott marctalbott merged commit bd2e5eb into develop Aug 16, 2024
30 checks passed
@marctalbott marctalbott deleted the mtalbott-workspace-settings branch August 16, 2024 18:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants