Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aggregator_core: Allow dumping query plans to stdout #3347

Merged
merged 6 commits into from
Aug 8, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions aggregator_core/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# `aggregator_core`

`aggregator_core` contains helper code for the `aggregator` crate. It mainly consists of data
structures that are only pertinent to a DAP aggregator, and subroutines for talking to a PostgreSQL
database.

It is not published to crates.io and should not be depended on directly by users of Janus.

## PostgreSQL Logs

During tests, you can have PostgreSQL dump its logs to stdout by setting
`JANUS_TEST_DUMP_POSTGRESQL_LOGS`. The test database is set to log all query plans.

Example:
```
JANUS_TEST_DUMP_POSTGRESQL_LOGS= RUST_LOG=debug cargo test datastore::tests::roundtrip_task::case_1 -- --exact --nocapture
```
116 changes: 96 additions & 20 deletions aggregator_core/src/datastore/test_util.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,20 +17,30 @@ use sqlx::{
Connection, PgConnection,
};
use std::{
env,
path::PathBuf,
str::FromStr,
sync::{Arc, Weak},
thread::JoinHandle,
time::Duration,
};
use testcontainers::{runners::AsyncRunner, ContainerAsync, ContainerRequest, ImageExt};
use tokio::sync::Mutex;
use testcontainers::{runners::AsyncRunner, ContainerRequest, ImageExt};
use tokio::{
io::{AsyncBufRead, AsyncBufReadExt},
join,
sync::{
oneshot::{self, Sender},
Mutex,
},
};
use tokio_postgres::{connect, Config, NoTls};
use tracing::trace;

use super::SUPPORTED_SCHEMA_VERSIONS;

struct EphemeralDatabase {
_db_container: ContainerAsync<Postgres>,
db_thread: Option<JoinHandle<()>>,
db_thread_shutdown: Option<Sender<()>>,
divergentdave marked this conversation as resolved.
Show resolved Hide resolved
port_number: u16,
}

Expand All @@ -49,24 +59,69 @@ impl EphemeralDatabase {
}

async fn start() -> Self {
// Start an instance of Postgres running in a container.
let db_container = ContainerRequest::from(Postgres::default())
.with_cmd(Vec::from([
"-c".to_string(),
"max_connections=200".to_string(),
]))
.start()
.await
.unwrap();
const POSTGRES_DEFAULT_PORT: u16 = 5432;
let port_number = db_container
.get_host_port_ipv4(POSTGRES_DEFAULT_PORT)
.await
.unwrap();
trace!("Postgres container is up with port {port_number}");

let (shutdown_tx, shutdown_rx) = oneshot::channel();
let (port_tx, port_rx) = oneshot::channel();

// Hack: run testcontainer logic under its own thread with its own tokio runtime, to avoid
// deadlocking the main runtime when waiting for logs to finish in the Drop implementation.
let db_thread = std::thread::spawn(move || {
tokio::runtime::Builder::new_current_thread()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach to resolving the deadlock LGTM. I suspect another way to solve it would be with [tokio::test(flavor = "multi_threaded")] everywhere, but that would be very intrusive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right, if all tests run multithreaded, then maybe we can use block_in_place(). But yeah having to decorate every test would be irritating--we would probably be better off writing our own macro wrapper in that case. (I wonder why tokio tests default to the current thread runtime anyways.)

.enable_all()
.build()
.unwrap()
.block_on(async move {
let should_log = env::var_os("JANUS_TEST_DUMP_POSTGRESQL_LOGS").is_some();
let mut configuration = Vec::from(["max_connections=200"]);
if should_log {
configuration.append(&mut Vec::from([
// Enable logging of query plans.
"shared_preload_libraries=auto_explain",
"log_min_messages=LOG",
"auto_explain.log_min_duration=0",
"auto_explain.log_analyze=true",
Comment on lines +77 to +81
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to conditionally enable these settings when the JANUS_TEST_DUMP_POSTGRESQL_LOGS environment variable is set. Both enabling auto-explain and raising the logging level could add overhead to tests.

]))
}

// Start an instance of Postgres running in a container.
let db_container = ContainerRequest::from(Postgres::default())
.with_cmd(Self::postgres_configuration(&configuration))
.start()
.await
.unwrap();

let stdout = db_container.stdout(true);
let stderr = db_container.stderr(true);
let log_consumer_handle = should_log.then(|| {
tokio::spawn(async move {
join!(buffer_printer(stdout), buffer_printer(stderr));
})
});

const POSTGRES_DEFAULT_PORT: u16 = 5432;
let port_number = db_container
.get_host_port_ipv4(POSTGRES_DEFAULT_PORT)
.await
.unwrap();
trace!("Postgres container is up with port {port_number}");

// Send port information, which frees this function to continue.
port_tx.send(port_number).unwrap();

// Wait for shutdown to be signalled.
shutdown_rx.await.unwrap();

// Shutdown container and Wait for log consumers to stop.
db_container.stop().await.unwrap();
if let Some(log_consumer_handle) = log_consumer_handle {
log_consumer_handle.await.unwrap();
}
})
});

let port_number = port_rx.await.unwrap();
Self {
_db_container: db_container,
db_thread: Some(db_thread),
db_thread_shutdown: Some(shutdown_tx),
port_number,
}
}
Expand All @@ -77,6 +132,27 @@ impl EphemeralDatabase {
self.port_number
)
}

fn postgres_configuration<'a>(settings: &[&'a str]) -> Vec<&'a str> {
settings
.iter()
.flat_map(|setting| ["-c", setting])
.collect::<Vec<_>>()
}
}

async fn buffer_printer(buffer: std::pin::Pin<Box<dyn AsyncBufRead + Send>>) {
let mut lines = buffer.lines();
while let Ok(Some(line)) = lines.next_line().await {
println!("{}", line);
}
}

impl Drop for EphemeralDatabase {
fn drop(&mut self) {
self.db_thread_shutdown.take().map(|tx| tx.send(()));
self.db_thread.take().map(|thread| thread.join());
}
}

/// EphemeralDatastore represents an ephemeral datastore instance. It has methods allowing
Expand Down
Loading