Nearest dc balancer #206

Open · bakalover wants to merge 42 commits into master
Conversation

@bakalover commented Aug 4, 2024

No description provided.

@bakalover bakalover marked this pull request as draft August 4, 2024 14:32
@bakalover (Author) commented Aug 8, 2024

  • squash
  • tests

@rekby (Member) left a comment

Hello, thanks for the draft :)

I suggest a few ideas for this state.

(several outdated review threads on ydb/src/load_balancer.rs, all resolved)
@bakalover bakalover marked this pull request as ready for review August 9, 2024 06:50
@rekby rekby linked an issue Aug 23, 2024 that may be closed by this pull request
ydb/src/load_balancer.rs:
}

const NODES_PER_DC: usize = 5;
const PING_TIMEOUT_SECS: u64 = 60;
@rekby (Member) commented:

What about using the discovery/ping interval for the ping timeout?

@bakalover (Author) replied:

60 sec is the interval for TimerDiscovery renewal (there are also no predefined constants for it):
https://github.com/ydb-platform/ydb-rs-sdk/blob/master/ydb/src/client_builder.rs#L283
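For what it's worth, a minimal sketch of the reviewer's idea: reuse the interval the client was built with instead of introducing a second hardcoded constant. The `BalancerConfig` field and constructor below are assumptions for illustration, not the PR's actual API:

```rust
use std::time::Duration;

// Hypothetical config shape: accept the interval the client was configured
// with (see client_builder.rs above) instead of a new PING_TIMEOUT_SECS const.
pub struct BalancerConfig {
    ping_timeout: Duration,
}

impl BalancerConfig {
    pub fn new(discovery_interval: Duration) -> Self {
        BalancerConfig {
            // The ping timeout simply follows the discovery interval.
            ping_timeout: discovery_interval,
        }
    }
}
```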

(several more outdated review threads on ydb/src/load_balancer.rs, all resolved)
@rekby (Member) left a comment

Thanks for the great fork!

I have only a few small comments.


impl NearestDCBalancer {
    fn get_endpoint(&self, service: Service) -> YdbResult<Uri> {
        match self.balancer_state.try_lock() {
@rekby (Member) commented:

Use lock instead of try_lock: we don't need fast-fail on this path. If the code doesn't see an endpoint, it will return to the retrier, and the retrier will see the custom error and fail the transaction.

Better to wait until the state update finishes: it is a fast operation.

The method should also return an error if the balancer is empty.
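A sketch of the suggested shape, assuming a `std::sync::Mutex`, the SDK's `YdbError::Custom` variant, and an illustrative `preferred_endpoints` field on `BalancerState` (the field name and error texts are assumptions):

```rust
fn get_endpoint(&self, _service: Service) -> YdbResult<Uri> {
    // lock() instead of try_lock(): a state update is fast, so a short
    // wait is cheaper than failing the whole transaction in the retrier.
    let state = self
        .balancer_state
        .lock()
        .map_err(|_| YdbError::Custom("balancer state lock poisoned".to_string()))?;
    // Return an explicit error when the balancer has no endpoints yet.
    state
        .preferred_endpoints
        .first()
        .cloned()
        .ok_or_else(|| YdbError::Custom("no endpoints in balancer".to_string()))
}
```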

});
match Self::find_local_dc(&to_check).await {
    Ok(dc) => {
        info!("found new local dc:{}", dc);
@rekby (Member) commented:
It is a frequent message and not interesting when everything is OK. Better to use debug level for it.
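I.e., presumably something like (same macro family the file already uses):

```rust
debug!("found new local dc: {}", dc);
```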

    ping_token: CancellationToken,
    waiter: Arc<WaiterImpl>,
    config: BalancerConfig,
    balancer_state: Arc<Mutex<BalancerState>>,
@rekby (Member) commented:
What about using RwLock instead of Mutex? It would allow multiple reads in parallel and block only while the state is being updated.
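A sketch of that change with `std::sync::RwLock` (the tokio variant would be needed instead if a guard is ever held across an `.await`; `preferred_endpoints` is an illustrative field name):

```rust
use std::sync::{Arc, RwLock};

// Illustrative stand-in for the PR's BalancerState.
struct BalancerState {
    preferred_endpoints: Vec<String>,
}

struct NearestDCBalancer {
    // RwLock: many concurrent readers on the get_endpoint() hot path,
    // an exclusive lock only while the ping task swaps the state in.
    balancer_state: Arc<RwLock<BalancerState>>,
}

impl NearestDCBalancer {
    fn first_endpoint(&self) -> Option<String> {
        // Shared read lock: does not block other readers.
        let state = self.balancer_state.read().ok()?;
        state.preferred_endpoints.first().cloned()
    }

    fn replace_state(&self, new_state: BalancerState) {
        // Exclusive write lock, held only for the duration of the swap.
        if let Ok(mut guard) = self.balancer_state.write() {
            *guard = new_state;
        }
    }
}
```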

    new_nodes: &Vec<NodeInfo>,
    local_dc: String,
) {
    info!("adjusting endpoints");
@rekby (Member) commented:
This should be at the debug or trace log level.

tokio::select! {
    biased; // check timeout first
    _ = interrupt_collector_future.cancelled() => {
        Self::join_all(&mut nursery).await; // Children will be cancelled due to tokens chaining, see (*)
@rekby (Member) commented:
What does the comment "see (*)" mean?
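For context, "tokens chaining" with `tokio_util::sync::CancellationToken` usually means child tokens that are cancelled together with their parent, roughly as in this standalone sketch (the "(*)" marker itself points at some other comment in the PR and is not reproduced here):

```rust
use tokio_util::sync::CancellationToken;

#[tokio::main]
async fn main() {
    let parent = CancellationToken::new();
    // A child token is cancelled automatically when its parent is cancelled,
    // which is presumably what the "tokens chaining" comment refers to.
    let child = parent.child_token();

    let worker = tokio::spawn(async move {
        child.cancelled().await;
        println!("child task observed cancellation");
    });

    parent.cancel(); // cancelling the parent propagates to every child token
    worker.await.unwrap();
}
```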

Development

Successfully merging this pull request may close these issues:

  • Rust Detect and select nearest DB by tcp-pings