Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 3602 pod network bandwidth usage action #3621

Merged
merged 11 commits into from
Aug 21, 2024

Conversation

okolesn
Copy link
Collaborator

@okolesn okolesn commented Jul 30, 2024

@okolesn okolesn requested a review from SilinPavel July 30, 2024 07:00
okolesn added 2 commits July 30, 2024 10:04
…tion

# Conflicts:
#	api/src/main/java/com/epam/pipeline/manager/cluster/performancemonitoring/ResourceMonitoringManager.java
@@ -595,4 +595,18 @@ public Result<List<RunInfo>> loadRunsByParentId(@PathVariable(RUN_ID) final Long
public Result<RunChartInfo> loadActiveRunsCharts(@RequestBody final RunChartFilterVO filter) {
return Result.success(runApiService.loadActiveRunsCharts(filter));
}

@PostMapping("/run/{runId}/network")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets rename it to /run/{runId}/network/limit to get instant sense what this method does

produces = MediaType.APPLICATION_JSON_VALUE)
@ApiResponses(value = {@ApiResponse(code = HTTP_STATUS_OK, message = API_STATUS_DESCRIPTION)})
public Result<Boolean> setLimitBoundary(@PathVariable(value = RUN_ID) final Long runId,
@RequestParam(defaultValue = "true") final Boolean limit,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'limit' -> 'enable' since we will have limit in the name of the method. Don't forget to change @ApiOperation notes

@@ -489,7 +491,9 @@ private void processHighNetworkConsumingRuns(Map<String, PipelineRun> running,
if (bandwidth >= bandwidthLimit) {
processHighNetworkConsumingRun(run, actionTimeout, action, runsToNotify,
runsToUpdateNotificationTime, bandwidth, runsToUpdateTags);
} else if (run.getLastNetworkConsumptionNotificationTime() != null) {
} else if (run.getLastNetworkConsumptionNotificationTime() != null &&
!run.getTags().containsKey(getNetworkLimitDateTag()) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we even need to change this code? it seems that logic for limiting and defining if run under network pressure should be separate

Comment on lines 52 to 63
if (shouldLimit(tags)) {
boundary = Integer.parseInt(tags.get(NETWORK_LIMIT));
dockerContainerOperationManager.limitNetworkBandwidth(run, boundary, true);
run.addTag(networkLimitDateTag, DateUtils.nowUTCStr());
} else if (shouldCleanLimit(tags)) {
dockerContainerOperationManager.limitNetworkBandwidth(run, 0, false);
run.removeTag(NETWORK_CONSUMING_LEVEL_HIGH);
run.removeTag(NETWORK_LIMIT);
run.removeTag(networkLimitDateTag);
} else {
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems that there is a problem. If you define shouldCleanLimit as !shouldLimit it means that else is unreachable. But we actually want to:

  • set limit if there is NETWORK_LIMIT tag but not networkLimitDateTag
  • unset limit if there is networkLimitDateTag but not NETWORK_LIMIT
  • do not do anything as default action

Comment on lines 215 to 218
cd /bin
git clone https://github.com/magnific0/wondershaper.git
cd wondershaper
sudo make install
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rewrite to

cd /bin && \
git clone https://github.com/magnific0/wondershaper.git && \
cd wondershaper && \
sudo make install

at the end also check if installation was successful and print log about the result

if [ "$i" != "lo" ]; then
link_num=$(sudo docker exec -it $container_id cat /sys/class/net/$i/iflink | tr -d '\r')
interface_id=$(sudo ip ad | grep "^$link_num:" | awk -F ': ' '{print $2}' | awk -F '@' '{print $1}')
if [ ! -z $interface_id ]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be good to warn if interface_id can't be found for i

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also please wrap $interface_id with ""


set -x

interfaces=$(sudo docker exec -it $container_id ls /sys/class/net | tr -d '\r')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's check result of the command and if it's != 0 warn to the run and exit

# download_rate and upload_rate in kilobits per second
result=$($command -a $interface_id -d $download_rate -u $upload_rate 2>&1)
if [ ! -z "$result" ]; then
pipe_log_warn "[WARN] Failed to apply bandwidth limiting"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to write to the run log you need to have its RUN_ID and TASK_NAME in the env, but you doesn't. Please see how it is done in commit run. Basically we pass RUN_ID wher launch a command

Assert.notNull(containerId,
messageHelper.getMessage(MessageConstants.ERROR_CONTAINER_ID_FOR_RUN_NOT_FOUND, run.getId()));
final int boundaryKBitsPerSec = limit ? boundary * 8 / BYTES_IN_KB : 0;
final String getSizeCommand = LimitBandwidthCommand.builder()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also should pass runId to the script, because script will need it

okolesn and others added 7 commits August 1, 2024 12:43
…work_bandwidth_usage_action

# Conflicts:
#	api/src/main/java/com/epam/pipeline/acl/run/RunApiService.java
#	api/src/main/java/com/epam/pipeline/controller/pipeline/PipelineRunController.java
@SilinPavel SilinPavel force-pushed the Issue_3602_Pod_network_bandwidth_usage_action branch from 23f3f9d to 699f564 Compare August 21, 2024 12:46
@SilinPavel SilinPavel merged commit 9d77b53 into develop Aug 21, 2024
4 checks passed
SilinPavel pushed a commit that referenced this pull request Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants