Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Internal] Update Jobs GetRun API to support paginated responses for jobs and ForEach tasks with >100 runs #324

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion .codegen/workspace.java.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ import com.databricks.sdk.core.DatabricksConfig;
defined in an outer scope (https://github.com/golang/go/issues/17454). */ -}}
import com.databricks.sdk.mixin.ClustersExt;
import com.databricks.sdk.mixin.DbfsExt;
import com.databricks.sdk.mixin.JobsExt;
import com.databricks.sdk.mixin.SecretsExt;
{{range .Services}}{{if and (not .IsAccounts) (not .IsDataPlane)}}
import com.databricks.sdk.service.{{.Package.Name}}.{{.PascalName}}API;
Expand All @@ -18,7 +19,7 @@ import com.databricks.sdk.service.{{.Package.Name}}.{{.PascalName}}Service;
import com.databricks.sdk.support.Generated;

{{- define "api" -}}
{{- $mixins := dict "ClustersAPI" "ClustersExt" "DbfsAPI" "DbfsExt" "SecretsAPI" "SecretsExt" -}}
{{- $mixins := dict "ClustersAPI" "ClustersExt" "DbfsAPI" "DbfsExt" "SecretsAPI" "SecretsExt" "JobsAPI" "JobsExt" -}}
{{- $genApi := concat .PascalName "API" -}}
{{- getOrDefault $mixins $genApi $genApi -}}
{{- end -}}
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
package com.databricks.sdk.mixin;

import com.databricks.sdk.core.ApiClient;
import com.databricks.sdk.service.jobs.*;
import java.util.Collection;

public class JobsExt extends JobsAPI {

public JobsExt(ApiClient apiClient) {
super(apiClient);
}

public JobsExt(JobsService mock) {
super(mock);
}

/**
* Wrap the {@code JobsApi.getRun} operation to retrieve paginated content without breaking the
* response contract.
*
* <p>Depending on the Jobs API version used under the hood, tasks or iteration runs retrieved by
* the initial request may be truncated due to high cardinalities. Truncation can happen for job
* runs over 100 task runs, as well as ForEach task runs with over 100 iteration runs. To avoid
* returning an incomplete {@code Run} object to the user, this method performs all the requests
* required to collect all task/iteration runs into a single {@code Run} object.
*/
@Override
public Run getRun(GetRunRequest request) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, we may want to have a separate doccomment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here

I may have missed another comment of yours.
I added a docstring to this method in this commit: e9eec92 Is this what you were looking for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you were referring to this? databricks/databricks-sdk-py#725 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved further to explicitly mention the 100 task/iteration limit: 057f140

Run run = super.getRun(request);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this calling API 2.2 or 2.1? I can't find where the version is defined

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently 2.1. The implementation is supposed to work with both (as long as the types are there).

Copy link
Contributor

@gkiko10 gkiko10 Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly you will override the String path in JobsImpl.java variable later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JobsImpl.java is generated. It follows that the String path there is also generated. My understanding is that when the version is updated on the proto, the next generation will automatically update those paths.


/*
* fetch all additional pages (if any) and accumulate the result in a single response
*/

Collection<RunTask> iterations = run.getIterations();
boolean paginatingIterations = iterations != null && !iterations.isEmpty();

Run currRun = run;
while (currRun.getNextPageToken() != null) {
request.setPageToken(currRun.getNextPageToken());
currRun = super.getRun(request);
if (paginatingIterations) {
Collection<RunTask> newIterations = currRun.getIterations();
if (newIterations != null) {
run.getIterations().addAll(newIterations);
}
} else {
Collection<RunTask> newTasks = currRun.getTasks();
if (newTasks != null) {
run.getTasks().addAll(newTasks);
}
}
}

// now that we've added all pages to the Run, the tokens are useless
run.setNextPageToken(null);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previous page token too I suppose

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run.setPrevPageToken(null);

return run;
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
package com.databricks.sdk.mixin;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.*;

import com.databricks.sdk.service.jobs.GetRunRequest;
import com.databricks.sdk.service.jobs.JobsService;
import com.databricks.sdk.service.jobs.Run;
import com.databricks.sdk.service.jobs.RunTask;
import java.util.ArrayList;
import java.util.Collection;
import org.junit.jupiter.api.Test;
import org.mockito.Mockito;

public class JobsExtTest {

@Test
public void testGetRunPaginationWithTasks() {
JobsService service = Mockito.mock(JobsService.class);

Run firstPage = new Run().setNextPageToken("tokenToSecondPage");
addTasks(firstPage, 0L, 1L);
Run secondPage = new Run().setNextPageToken("tokenToThirdPage");
addTasks(secondPage, 2L, 3L);
Run thirdPage = new Run();
addTasks(thirdPage, 4L);

when(service.getRun(any())).thenReturn(firstPage).thenReturn(secondPage).thenReturn(thirdPage);

JobsExt jobsExt = new JobsExt(service);

GetRunRequest request = new GetRunRequest();

Run run = jobsExt.getRun(request);

Run expectedRun = new Run();
addTasks(expectedRun, 0L, 1L, 2L, 3L, 4L);

assertEquals(expectedRun, run);
verify(service, times(3)).getRun(any());
}

@Test
public void testGetRunPaginationWithIterations() {
JobsService service = Mockito.mock(JobsService.class);

Run firstPage = new Run().setNextPageToken("tokenToSecondPage");
addIterations(firstPage, 0L, 1L);
Run secondPage = new Run().setNextPageToken("tokenToThirdPage");
addIterations(secondPage, 2L, 3L);
Run thirdPage = new Run();
addIterations(thirdPage, 4L);

when(service.getRun(any())).thenReturn(firstPage).thenReturn(secondPage).thenReturn(thirdPage);

JobsExt jobsExt = new JobsExt(service);

GetRunRequest request = new GetRunRequest();

Run run = jobsExt.getRun(request);

Run expectedRun = new Run();
addIterations(expectedRun, 0L, 1L, 2L, 3L, 4L);

assertEquals(expectedRun, run);
verify(service, times(3)).getRun(any());
}

private void addTasks(Run run, long... taskRunIds) {
Collection<RunTask> tasks = new ArrayList<>();
for (long runId : taskRunIds) {
tasks.add(new RunTask().setRunId(runId));
}
run.setTasks(tasks);
}

private void addIterations(Run run, long... iterationRunIds) {
Collection<RunTask> iterations = new ArrayList<>();
for (long runId : iterationRunIds) {
iterations.add(new RunTask().setRunId(runId));
}
run.setIterations(iterations);
}
}
Loading