Large percentage of cpu time in memset when reading with a larger buffer

Reproduced using a slightly modified version of `examples/client.rs` to allow specifying a path, and reading into a user-provided buffer. The problem was originally noticed via `rusoto` s3 client, which has a configuration for a read buffer size, which gets mapped to `hyper` `http1_read_buf_exact_size`. Since aws s3 recommends fetching large ranges, using a larger buffer seemed like a good idea and was not expected to cause any cpu overhead.

![flamegraph](https://github.com/rustls/tokio-rustls/assets/689138/7c563b02-46ed-4878-9c23-9f08a414c022)

```rust
use std::fs::File;
use std::io;
use std::io::BufReader;
use std::net::ToSocketAddrs;
use std::path::PathBuf;
use std::sync::Arc;

use argh::FromArgs;
use tokio::io::{split, AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;
use tokio_rustls::TlsConnector;

/// Tokio Rustls client example
#[derive(FromArgs)]
struct Options {
    /// host
    #[argh(positional)]
    host: String,

    /// path
    #[argh(positional)]
    path: Option<String>,

    /// port
    #[argh(option, short = 'p', default = "443")]
    port: u16,

    /// domain
    #[argh(option, short = 'd')]
    domain: Option<String>,

    /// cafile
    #[argh(option, short = 'c')]
    cafile: Option<PathBuf>,
}

#[tokio::main]
async fn main() -> io::Result<()> {
    let options: Options = argh::from_env();

    let addr = (options.host.as_str(), options.port)
        .to_socket_addrs()?
        .next()
        .ok_or_else(|| io::Error::from(io::ErrorKind::NotFound))?;
    let domain = options.domain.unwrap_or(options.host);
    let path = options.path.as_ref().map(|p| p.as_str()).unwrap_or("/");
    let content = format!(
        "GET {} HTTP/1.0\r\nConnection: close\r\nHost: {}\r\n\r\n",
        path, domain
    );

    let mut root_cert_store = rustls::RootCertStore::empty();
    if let Some(cafile) = &options.cafile {
        let mut pem = BufReader::new(File::open(cafile)?);
        for cert in rustls_pemfile::certs(&mut pem) {
            root_cert_store.add(cert?).unwrap();
        }
    } else {
        root_cert_store.extend(webpki_roots::TLS_SERVER_ROOTS.iter().cloned());
    }

    let config = rustls::ClientConfig::builder()
        .with_root_certificates(root_cert_store)
        .with_no_client_auth(); // i guess this was previously the default?
    let connector = TlsConnector::from(Arc::new(config));

    let stream = TcpStream::connect(&addr).await?;

    let domain = pki_types::ServerName::try_from(domain.as_str())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "invalid dnsname"))?
        .to_owned();

    let mut stream = connector.connect(domain, stream).await?;
    stream.write_all(content.as_bytes()).await?;

    let (mut reader, _writer) = split(stream);

    let mut buffer = Vec::with_capacity(4 * 1024 * 1024);
    let mut total_len = 0_usize;

    loop {
        match reader.read_buf(&mut buffer).await {
            Ok(len) => {
                total_len += len;
                buffer.clear();
                if len == 0 {
                    break;
                }
            }
            Err(e) => {
                eprintln!("{:?}", e);
                break;
            }
        }
    }

    println!("Size: {}", total_len);

    Ok(())
}
```



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Large percentage of cpu time in memset when reading with a larger buffer #55

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Large percentage of cpu time in memset when reading with a larger buffer #55

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions