
Slow throughput when reading many small files #953

Open
pkasravi opened this issue Jul 24, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@pkasravi

pkasravi commented Jul 24, 2024

Mountpoint for Amazon S3 version

mount-s3 1.7.2

AWS Region

us-west-2

Describe the running environment

Running in p4de EC2 instance on Amazon Linux 2 using instance profile credentials against an S3 Bucket in the same account.

Mountpoint options

mount-s3 \
  --transfer-acceleration \
  --read-only \
  --max-threads 2048 \
  --cache /media/ramdisk/cache \
  --prefix <PREFIX> \
  <BUCKET> /media/ramdisk/data

What happened?

I am trying to read 16,000 small (8 MB each) files in parallel from S3, and I have been comparing Mountpoint-S3 and goofys. I am seeing a large difference in performance when using Mountpoint: with goofys I am able to read all the files in 44s, while with Mountpoint it takes 478s. These timings are averaged over 5 test runs. Both goofys and Mountpoint are mounted to tmpfs file systems.
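For scale, the reported timings imply very different aggregate throughput. A quick back-of-the-envelope (assuming decimal megabytes, which the issue does not specify):

```rust
fn main() {
    // 16,000 files at 8 MB each (decimal megabytes assumed).
    let total_gb = 16_000.0 * 8.0 / 1000.0; // 128 GB moved in total
    println!("goofys:     {:.2} GB/s", total_gb / 44.0); // ~2.91 GB/s
    println!("Mountpoint: {:.2} GB/s", total_gb / 478.0); // ~0.27 GB/s
}
```

So the gap being reported is roughly an order of magnitude in aggregate throughput, not a marginal difference.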

Relevant log output

No response

@pkasravi added the bug label Jul 24, 2024
@jamesbornholt
Member

I think Mountpoint may not be correctly configuring the default network throughput on p4de instances. Could you try the argument --maximum-throughput-gbps 100 and see if that helps? For small-ish files like yours, setting it even higher (say to 400) might help more, but 100 would be a good place to start (it's what we configure for p4d).

Also, since you're going from EC2 to S3, --transfer-acceleration probably isn't needed — Transfer Acceleration is S3's feature for clients outside EC2 to route their traffic onto the Amazon network from as close to the customer as possible.

@pkasravi
Author

Thanks for the reply. I tried your suggestions; with max throughput set to 400 I am seeing mixed results, shown below.

(screenshot of benchmark results)

Here is the modified mount command, with some extra context on my environment:

sudo mount -t tmpfs -o size=140G tmpfs /media/ramdisk
sudo mkdir -p /media/ramdisk/input
sudo mkdir -p /media/ramdisk/cache

echo "Starting Mountpoint-S3..."
mount-s3 \
  --maximum-throughput-gbps 400 \
  --read-only \
  --max-threads 2048 \
  --cache /media/ramdisk/cache \
  --prefix $S3_PREFIX \
  $S3_BUCKET /media/ramdisk/input

Also, here is a minimal reproducible example I used to generate the performance numbers:

use std::env;
use std::fs;
use std::io::Read;
use std::path::Path;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Instant;

fn main() {
    let args: Vec<String> = env::args().collect();
    if args.len() != 2 {
        println!("Usage: main <INPUT_DIR>");
        return;
    }
    println!("Reading .bin files from {}", &args[1]);
    let dir_path = Path::new(&args[1]);
    let files = fs::read_dir(dir_path).unwrap();

    let mut handles = vec![];
    let total_bytes = Arc::new(Mutex::new(0));

    let start_time = Instant::now();

    for file in files {
        let file = file.unwrap();
        let file_path = file.path();
        if file_path.extension().unwrap_or_default() == "bin" {
            let total_bytes_clone = total_bytes.clone();
            let handle = thread::spawn(move || {
                let mut file = fs::File::open(file_path).unwrap();
                let mut contents = vec![];
                let bytes_read = file.read_to_end(&mut contents).unwrap();
                *total_bytes_clone.lock().unwrap() += bytes_read;
            });
            handles.push(handle);
        }
    }
    println!("Spawned {} threads", handles.len());

    for handle in handles {
        handle.join().unwrap();
    }

    let elapsed_time = start_time.elapsed();
    // note: this divides by 1024^3 (GiB), although the label below says GB
    let total_gb = *total_bytes.lock().unwrap() as f64 / (1024_f64 * 1024_f64 * 1024_f64);
    let throughput_gb_s = total_gb / elapsed_time.as_secs_f64();
    println!("Overall throughput: {:.2} GB/s", throughput_gb_s);
}
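One detail of the reproduction above: it spawns one OS thread per file (16,000 threads here), which can itself distort timings. A bounded worker pool keeps the concurrency fixed regardless of file count; a minimal sketch (not from the thread, names and the demo directory are illustrative):

```rust
use std::fs;
use std::io::Read;
use std::path::PathBuf;
use std::sync::{Arc, Mutex};
use std::thread;

// Read every file in `paths` using a fixed number of worker threads,
// instead of one thread per file. Returns the total bytes read.
fn read_all(paths: Vec<PathBuf>, workers: usize) -> usize {
    let queue = Arc::new(Mutex::new(paths));
    let total = Arc::new(Mutex::new(0usize));
    let mut handles = vec![];
    for _ in 0..workers {
        let queue = Arc::clone(&queue);
        let total = Arc::clone(&total);
        handles.push(thread::spawn(move || loop {
            // Pop the next path; the lock is released before doing I/O.
            let path = match queue.lock().unwrap().pop() {
                Some(p) => p,
                None => break,
            };
            let mut buf = Vec::new();
            let n = fs::File::open(&path).unwrap().read_to_end(&mut buf).unwrap();
            *total.lock().unwrap() += n;
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    let n = *total.lock().unwrap();
    n
}

fn main() {
    // Demo on a handful of scratch files; point `dir` at the mount to benchmark.
    let dir = std::env::temp_dir().join("pool_demo");
    fs::create_dir_all(&dir).unwrap();
    let mut paths = Vec::new();
    for i in 0..8 {
        let p = dir.join(format!("f{}.bin", i));
        fs::write(&p, vec![0u8; 1024]).unwrap();
        paths.push(p);
    }
    println!("read {} bytes", read_all(paths, 4)); // prints "read 8192 bytes"
}
```

With a pool like this, thread-spawn and scheduler overhead stays constant as the file count grows, so the measurement tracks the file system rather than the harness.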

@arsh
Contributor

arsh commented Aug 2, 2024

@pkasravi I noticed that you configured Mountpoint to cache the file's content locally. In my tests, I observed an improvement when caching was disabled, which seems more in line with how Goofys operates. It might be good to compare the performance by disabling caching, to ensure we're comparing apples to apples.
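Applied to the command from earlier in the thread, disabling caching would simply mean dropping the --cache flag (a sketch; placeholders as in the original):

```shell
mount-s3 \
  --maximum-throughput-gbps 400 \
  --read-only \
  --max-threads 2048 \
  --prefix $S3_PREFIX \
  $S3_BUCKET /media/ramdisk/input
```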

@pkasravi
Author

pkasravi commented Aug 6, 2024

Hi @arsh, I tried your suggestion but I'm not seeing much of a difference. Here are the results using the same code I shared above. I've also included the results from goofys (again, same code) for comparison.

(screenshot of benchmark results)

Goofys command:

goofys \
  $S3_BUCKET:$S3_PREFIX \
  /media/ramdisk/input

@arsh
Contributor

arsh commented Aug 6, 2024

I'm going to read 16,000 8 MB files to observe the performance I get and will report back. Previously, I tested with 5,000 files and noticed some improvement by disabling caching.

Could you run your test with caching disabled and logging enabled in Mountpoint? Please share the log file afterward.

You can do this by running Mountpoint as follows:

MOUNTPOINT_LOG=trace,awscrt=error \
mount-s3 \
  --read-only \
  --max-threads 2048 \
  --maximum-throughput-gbps 400 \
  --log-directory <a local directory> \
  --prefix <PREFIX> \
  <BUCKET> /media/ramdisk/data

More details on logging are here: https://github.com/awslabs/mountpoint-s3/blob/main/doc/LOGGING.md#logging-to-a-file

@pkasravi
Author

pkasravi commented Aug 26, 2024

@arsh were you able to reproduce similar or different results?

I ran my test with logging enabled; the log file was 1 GB. I've attached as much as GitHub will allow me; let me know if it's useful to add the rest.

log-parts.zip
