-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 3602 pod network bandwidth usage action #3621
Issue 3602 pod network bandwidth usage action #3621
Conversation
…tion # Conflicts: # api/src/main/java/com/epam/pipeline/manager/cluster/performancemonitoring/ResourceMonitoringManager.java
@@ -595,4 +595,18 @@ public Result<List<RunInfo>> loadRunsByParentId(@PathVariable(RUN_ID) final Long | |||
public Result<RunChartInfo> loadActiveRunsCharts(@RequestBody final RunChartFilterVO filter) { | |||
return Result.success(runApiService.loadActiveRunsCharts(filter)); | |||
} | |||
|
|||
@PostMapping("/run/{runId}/network") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets rename it to /run/{runId}/network/limit
to get instant sense what this method does
produces = MediaType.APPLICATION_JSON_VALUE) | ||
@ApiResponses(value = {@ApiResponse(code = HTTP_STATUS_OK, message = API_STATUS_DESCRIPTION)}) | ||
public Result<Boolean> setLimitBoundary(@PathVariable(value = RUN_ID) final Long runId, | ||
@RequestParam(defaultValue = "true") final Boolean limit, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'limit' -> 'enable' since we will have limit
in the name of the method. Don't forget to change @ApiOperation notes
@@ -489,7 +491,9 @@ private void processHighNetworkConsumingRuns(Map<String, PipelineRun> running, | |||
if (bandwidth >= bandwidthLimit) { | |||
processHighNetworkConsumingRun(run, actionTimeout, action, runsToNotify, | |||
runsToUpdateNotificationTime, bandwidth, runsToUpdateTags); | |||
} else if (run.getLastNetworkConsumptionNotificationTime() != null) { | |||
} else if (run.getLastNetworkConsumptionNotificationTime() != null && | |||
!run.getTags().containsKey(getNetworkLimitDateTag()) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we even need to change this code? it seems that logic for limiting and defining if run under network pressure should be separate
if (shouldLimit(tags)) { | ||
boundary = Integer.parseInt(tags.get(NETWORK_LIMIT)); | ||
dockerContainerOperationManager.limitNetworkBandwidth(run, boundary, true); | ||
run.addTag(networkLimitDateTag, DateUtils.nowUTCStr()); | ||
} else if (shouldCleanLimit(tags)) { | ||
dockerContainerOperationManager.limitNetworkBandwidth(run, 0, false); | ||
run.removeTag(NETWORK_CONSUMING_LEVEL_HIGH); | ||
run.removeTag(NETWORK_LIMIT); | ||
run.removeTag(networkLimitDateTag); | ||
} else { | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems that there is a problem. If you define shouldCleanLimit
as !shouldLimit
it means that else
is unreachable. But we actually want to:
- set limit if there is
NETWORK_LIMIT
tag but notnetworkLimitDateTag
- unset limit if there is networkLimitDateTag but not NETWORK_LIMIT
- do not do anything as default action
cd /bin | ||
git clone https://github.com/magnific0/wondershaper.git | ||
cd wondershaper | ||
sudo make install |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rewrite to
cd /bin && \
git clone https://github.com/magnific0/wondershaper.git && \
cd wondershaper && \
sudo make install
at the end also check if installation was successful and print log about the result
if [ "$i" != "lo" ]; then | ||
link_num=$(sudo docker exec -it $container_id cat /sys/class/net/$i/iflink | tr -d '\r') | ||
interface_id=$(sudo ip ad | grep "^$link_num:" | awk -F ': ' '{print $2}' | awk -F '@' '{print $1}') | ||
if [ ! -z $interface_id ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be good to warn if interface_id
can't be found for i
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also please wrap $interface_id
with ""
|
||
set -x | ||
|
||
interfaces=$(sudo docker exec -it $container_id ls /sys/class/net | tr -d '\r') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's check result of the command and if it's != 0
warn to the run and exit
# download_rate and upload_rate in kilobits per second | ||
result=$($command -a $interface_id -d $download_rate -u $upload_rate 2>&1) | ||
if [ ! -z "$result" ]; then | ||
pipe_log_warn "[WARN] Failed to apply bandwidth limiting" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to write to the run log you need to have its RUN_ID
and TASK_NAME
in the env, but you doesn't. Please see how it is done in commit run
. Basically we pass RUN_ID
wher launch a command
Assert.notNull(containerId, | ||
messageHelper.getMessage(MessageConstants.ERROR_CONTAINER_ID_FOR_RUN_NOT_FOUND, run.getId())); | ||
final int boundaryKBitsPerSec = limit ? boundary * 8 / BYTES_IN_KB : 0; | ||
final String getSizeCommand = LimitBandwidthCommand.builder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also should pass runId
to the script, because script will need it
…le to write logs into run
…etwork.bandwidth.command.timeout
…script to be more resilient and atomic
…work_bandwidth_usage_action # Conflicts: # api/src/main/java/com/epam/pipeline/acl/run/RunApiService.java # api/src/main/java/com/epam/pipeline/controller/pipeline/PipelineRunController.java
23f3f9d
to
699f564
Compare
(cherry picked from commit 9d77b53)
Pod nework bandwidth usage #3602