
Poor append-only performance with large files #1054

andyp1per opened this issue Dec 16, 2024 · 4 comments


andyp1per commented Dec 16, 2024

My setup:

W25N01GV, 2Gbit chip, with 2k pages and 128k blocks

    fs_cfg.read_size = page_size;      // 2 KiB page
    fs_cfg.prog_size = page_size;      // program a full page at a time
    fs_cfg.block_size = block_size;    // 128 KiB erase block
    fs_cfg.block_count = block_count;
    fs_cfg.metadata_max = page_size;   // cap metadata compaction at one page
    fs_cfg.lookahead_size = 128;       // 128 bytes = 1024 blocks of lookahead
    fs_cfg.cache_size = page_size;

I am trying to write an append-only log file at about 318kB/s, which is about 158 pages/s. The chip will easily do writes at 1MB/s.
I am only syncing the file per-block, using the algorithm in #564 (comment), roughly as sketched below.
The subsystem is writing data a page (2k) at a time.
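
A minimal sketch of that write path (simplified, and assuming the #564 approach boils down to calling lfs_file_sync only when the append position crosses a block boundary; the helper name is illustrative, not the exact production code):

    #include "lfs.h"

    // Sketch of per-block syncing: data is appended a page at a time, but
    // lfs_file_sync() is only called once a block boundary has been crossed.
    static int log_append(lfs_t *lfs, lfs_file_t *file,
                          const void *buf, lfs_size_t len, lfs_size_t block_size) {
        lfs_soff_t pos = lfs_file_tell(lfs, file);
        if (pos < 0) {
            return (int)pos;
        }

        lfs_ssize_t written = lfs_file_write(lfs, file, buf, len);
        if (written < 0) {
            return (int)written;
        }

        // Sync only when this write crossed into a new block.
        if (((lfs_size_t)pos / block_size) !=
            (((lfs_size_t)pos + (lfs_size_t)written) / block_size)) {
            return lfs_file_sync(lfs, file);
        }
        return 0;
    }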

My logging subsystem is doing no reads (I instrumented read() to check), but I see pages being read at about 274/s.
Worse, the write speed slows down as the file gets bigger, going down to 90kB/s. Start a new file and the write speed bounces back up.
A sync takes about 11ms, which is slow but not awful.

So my questions:

  • What in littlefs is reading all the pages? It's going to severely limit the amount of writing I can do.
  • Will making the cache bigger help? If so, at what cost?
  • Why is the slowdown over time related to file size?

I have read DESIGN.md, numerous issues, and the code, but am no closer to understanding what is actually going on here and why the performance is so poor.

andyp1per (Author) commented

I added this, which helps: #1056
The other reason is that checking the fs size is very expensive, so I have stopped doing that.

geky (Member) commented Dec 19, 2024

Hi @andyp1per, thanks for creating an issue.

Honestly, if your logging speed is mission critical, and you're putting in the effort to make the hack in #564 (comment) work, I would consider not storing the log in a file and instead reserving a fixed amount of raw flash to hold the log. The speed in the chip's datasheet is the maximum speed, and any filesystem will necessarily be slower.

You could still store the log size/offset in a file to benefit from power-loss resilience.
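
A rough sketch of that split (hedged: raw_log_prog is a hypothetical driver for the reserved flash region and "log.off" is an illustrative filename; the only littlefs calls are the ones persisting the offset):

    #include <stdint.h>
    #include "lfs.h"

    // Hypothetical driver call that programs `len` bytes into the reserved
    // raw-flash log region at `addr` (outside the littlefs partition).
    extern int raw_log_prog(uint32_t addr, const void *buf, uint32_t len);

    // Persist the current log offset in a tiny littlefs file so the log can be
    // resumed after power loss; the bulk log data itself bypasses littlefs.
    static int log_save_offset(lfs_t *lfs, uint32_t offset) {
        lfs_file_t f;
        int err = lfs_file_open(lfs, &f, "log.off",
                                LFS_O_WRONLY | LFS_O_CREAT | LFS_O_TRUNC);
        if (err) {
            return err;
        }
        lfs_ssize_t res = lfs_file_write(lfs, &f, &offset, sizeof(offset));
        int cerr = lfs_file_close(lfs, &f);  // close also syncs the file
        return (res < 0) ? (int)res : cerr;
    }

    static int log_append_raw(lfs_t *lfs, uint32_t *offset,
                              const void *buf, uint32_t len) {
        int err = raw_log_prog(*offset, buf, len);  // full raw-flash write speed
        if (err) {
            return err;
        }
        *offset += len;
        // In practice the offset would be persisted far less often than every write.
        return log_save_offset(lfs, *offset);
    }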


> Worse, the write speed slows down as the file gets bigger, going down to 90kB/s. Start a new file and the write speed bounces back up.

It sounds like you're running into #75, the issue being block allocation/gc ultimately scales $O(n^2)$ where $n$ is the number of blocks in the filesystem.

> The other reason is that checking the fs size is very expensive, so I have stopped doing that.

This also makes it sound like a gc bottleneck. The gc scan and lfs_fs_size both take roughly the same amount of time, in that they both traverse all blocks in the filesystem.
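
For illustration, a size check along these lines pays a full filesystem traversal on every call, much like the allocator's gc scan (a hypothetical snippet, reusing the fs_cfg from above):

    // lfs_fs_size() returns the number of blocks currently in use, but it
    // discovers that by walking every file and metadata block on disk.
    lfs_ssize_t used = lfs_fs_size(&lfs);
    if (used >= 0) {
        uint64_t used_bytes = (uint64_t)used * fs_cfg.block_size;
        // ... decide whether there is room to keep logging ...
    }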

Some possible workarounds:

  • Increasing the size of the lookahead buffer (lookahead_size) will make gc run less often, though it's not possible to completely avoid gc.

  • Increasing the block_size means fewer blocks for the same amount of storage, which can soften the gc bottleneck (see the config sketch below).
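
In config terms, the two workarounds might look something like this (illustrative values only, not recommendations; phys_erase_size and flash_size are hypothetical variables):

    // Illustrative only: tune by benchmarking on the actual device.
    fs_cfg.lookahead_size = 256;                      // bigger allocator bitmap -> fewer gc scans
    fs_cfg.block_size     = 4 * phys_erase_size;      // e.g. 4 x 128 KiB = 512 KiB logical blocks
    fs_cfg.block_count    = flash_size / fs_cfg.block_size;  // fewer blocks for the same storage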

> What in littlefs is reading all the pages? It's going to severely limit the amount of writing I can do.

The gc scan, as part of block allocation. When the lookahead buffer is exhausted, littlefs traverses the filesystem to figure out which blocks are still in use (or more accurately, which blocks are not in use). This grows $O(n^2)$, so it ultimately dominates in filesystems with a large number of blocks.
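
To put rough numbers on that scaling (a back-of-the-envelope estimate): if the filesystem has $n$ blocks in use and the lookahead window covers $l$ blocks, each refill of the window costs a traversal of roughly $n$ blocks, so writing enough data to consume the whole disk costs on the order of $\frac{n}{l} \cdot n = \frac{n^2}{l}$ block reads on top of the actual programs. A larger lookahead divides the cost but never removes the quadratic term.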

There are plans in the works to add an optional block map to avoid this, but it's a part of a large piece of work. To make a block map work littlefs needs to understand when blocks are no longer in use, which it currently doesn't.

> Will making the cache bigger help? If so, at what cost?

Increasing the lookahead buffer may help, but the prog/read/file caches only prevent multiple reads to the same block.

littlefs currently doesn't have multi-block caching. It's low priority vs things that require disk changes, but in theory multi-block caching could help here.

Though you would need to be careful to make sure gc doesn't just thrash the multi-block cache every scan...

> Why is the slowdown over time related to file size?

It's technically related to total filesystem size, which is arguably worse.


This is one piece of a number of performance issues in littlefs that are being worked on. Unfortunately there's not much to show at this stage. With disk compatibility being the way it is, it's difficult to improve things incrementally.

andyp1per (Author) commented

>   • Increasing the size of the lookahead buffer (lookahead_size) will make gc run less often, though it's not possible to completely avoid gc.
>
>   • Increasing the block_size means fewer blocks for the same amount of storage, which can soften the gc bottleneck.

Thanks for the reply:

  • What is a good size for lookahead_size? I am currently at 128.
  • The block size can be bigger than the physical erase size? What are the downsides of setting it bigger? If it's bigger, how does littlefs know what the erase size is?

geky (Member) commented Dec 19, 2024

> What is a good size for lookahead_size? I am currently at 128.

I don't think there's an easy answer without benchmarking on the device. It's a tradeoff of RAM against the frequency of garbage collection, though there is no benefit to a lookahead larger than block_count/8 bytes (the lookahead buffer is a bitmap with one bit per block).
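
Plugging in the numbers from this issue (a rough calculation from the figures given above): a 2Gbit chip is 256MB, which at 128KB per block is 2048 blocks, so the useful ceiling is 2048/8 = 256 bytes, only double the current 128-byte lookahead.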

> The block size can be bigger than the physical erase size? What are the downsides of setting it bigger?

The downsides of bigger blocks are 1. less block granularity, so things like small non-inlined or unaligned files can end up wasting more space, and 2. more expensive in-block operations, specifically metadata logs.

This is another littlefs performance bottleneck in that metadata compaction also grows $O(n^2)$: #214. At least here, metadata_max offers a workaround.

> If it's bigger, how does littlefs know what the erase size is?

The neat part is littlefs doesn't really care about the physical erase size. It's up to the read/prog/erase callbacks to map correctly from logical block to physical block.
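
For example (a hedged sketch: w25n_erase_block is a hypothetical driver call, and the callback signature is the one from lfs.h), a logical block spanning several physical erase blocks simply erases all of them in the erase callback:

    #include "lfs.h"

    #define PHYS_ERASE_SIZE    (128 * 1024)            // the chip's physical erase block
    #define LOGICAL_BLOCK_SIZE (4 * PHYS_ERASE_SIZE)   // the block_size littlefs is told about

    // Hypothetical driver call that erases one physical 128 KiB block at `addr`.
    extern int w25n_erase_block(uint32_t addr);

    // littlefs erase callback: one logical block maps onto four physical erases.
    static int flash_erase(const struct lfs_config *c, lfs_block_t block) {
        (void)c;
        uint32_t addr = block * LOGICAL_BLOCK_SIZE;
        for (uint32_t off = 0; off < LOGICAL_BLOCK_SIZE; off += PHYS_ERASE_SIZE) {
            if (w25n_erase_block(addr + off) != 0) {
                return LFS_ERR_IO;
            }
        }
        return 0;
    }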

Though at some point I think it would be a good idea to add erase_size to the config. If anything, just to assert on alignment issues and make it clearer to users that the logical block size is not tied to the flash. But this is low priority since it would probably mean API breakage.
