Automatic volume recognition #349
Comments
Hi @davidefer, this has been on my mind for a while. littlefs does have a "superblock entry" that contains static info about the filesystem (and you can add your own with Sidenote: Most of the configuration isn't needed for portability.
littlefs only needs to know two things: the block_size and the block_count. The block_count is easy, we can read that from the superblock entry. The block_size, however, presents a chicken-and-egg problem I haven't found a good answer for yet.

The root/superblock in littlefs is a small two-block log in blocks 0 and 1. This means that at any given time one of the two blocks may be in the middle of an erase operation. For most flash, this means temporarily setting the flash to 1s, losing any data on the block. Handling this is a hard requirement for power-resilience.

So we have a superblock in one of the two blocks at the start of storage. If we can get to one of the two blocks, we can fetch the configuration. But this is where we run into a problem: how do we load the second block if we don't know our block size? We only need the offset (not the size) to fetch a block, but without knowing the size, we don't know where the second block could be.

I'm very interested in solving this problem, but I have no idea how to. Some possible solutions:
|
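To make the chicken-and-egg problem above concrete, here is a minimal sketch of how a block index maps to a byte offset. `block_offset` is a hypothetical helper for illustration, not part of the littlefs API, and it assumes blocks are laid out back to back from the start of storage.

```c
#include <stdint.h>

// Hypothetical helper (not a littlefs function): byte offset of a block,
// assuming blocks are stored contiguously from the start of the device.
static uint64_t block_offset(uint32_t block, uint32_t block_size) {
    return (uint64_t)block * block_size;
}
```

Block 0 sits at offset 0 whatever the block size is, so it can always be read. Block 1 sits at an offset equal to the block size itself, which is exactly the value we are trying to discover.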
Hi @geky |
Afraid so. Not everyone is using flash. For example, I'm using FRAM with a block size of 128, and small EEPROMs would be similar. Other people here appear to be using SD cards with 1M block sizes and so on. In any case, it could cause future problems if assumptions were made about the size of erasable blocks. |
Further thought. Option 3 looks to be the best (least bad?) choice for auto-determination. Firstly, I suggest this should be a compile-time option, since not all users will need it. Secondly, with auto-determination enabled, how about treating the value in the blocksize field as a hint? If it's zero, try different block sizes according to some algorithm; otherwise, start from the hint value and try bigger and smaller values from there. It may be possible to speed up the process by defining a search order for block sizes; perhaps 128, 256, 512, 4K, 1M early in the list, then fill in the gaps? |
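The hint idea could be sketched roughly as follows. This is only an illustration under assumptions: `candidate_sizes` is a hypothetical helper, it only generates power-of-two steps away from the hint, and the caller would still have to try mounting with each candidate.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: fills `out` with block sizes to try, in search order.
// hint == 0 means "no hint": try min_size, 2*min_size, 4*min_size, ...
// Otherwise start at the hint and alternate between larger and smaller sizes.
// Returns the number of candidates written (at most `cap`).
static size_t candidate_sizes(uint32_t hint, uint32_t min_size,
        uint32_t max_size, uint32_t *out, size_t cap) {
    size_t n = 0;
    if (min_size == 0) {
        min_size = 1; // a zero block size is never valid
    }
    if (hint == 0) {
        for (uint64_t s = min_size; s <= max_size && n < cap; s *= 2) {
            out[n++] = (uint32_t)s;
        }
        return n;
    }
    if (hint >= min_size && hint <= max_size && n < cap) {
        out[n++] = hint;
    }
    uint64_t up = (uint64_t)hint * 2; // next larger candidate
    uint32_t down = hint / 2;         // next smaller candidate
    while (n < cap && (up <= max_size || down >= min_size)) {
        if (up <= max_size) {
            if (up >= min_size) {
                out[n++] = (uint32_t)up;
            }
            up *= 2;
        }
        if (down >= min_size) {
            if (down <= max_size && n < cap) {
                out[n++] = down;
            }
            down /= 2;
        }
    }
    return n;
}
```

With a hint of 512 and limits of 128 to 4096, this yields the order 512, 1024, 256, 2048, 128, 4096.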
I was meaning that for every chip you don't have so many possibilities for block sizes. It was just an example, lfs doesn't have to hardcode those numbers. They should be proposed to lfs while mounting or similar. |
If there is a known set of block sizes, you could iterate and try mounting each. This is one way to tell the filesystem which block sizes you support.

    // candidate block sizes supported by this device
    lfs_size_t block_sizes[] = {4096, 65536};
    for (size_t i = 0; i < sizeof(block_sizes)/sizeof(block_sizes[0]); i++) {
        cfg.block_size = block_sizes[i];
        int err = lfs_mount(&lfs, &cfg);
        if (!err) {
            // mounted successfully with this block size
            break;
        }
    }

Of course right now littlefs is still missing the option of loading the block_count from disk. #279 and #238 are similar issues around this. Keep in mind the block_size config option is a "virtual block size". It can be any multiple of the physical block, which is useful if you're working around performance issues.
That's a good idea. It wouldn't be unreasonable to search all powers-of-two (O(log n)) before searching other multiples (O(n)). It wouldn't change the amount of time needed to reject an unformatted disk though. I'm currently waiting on this to build up better benchmarks, which is needed for a number of other scalability issues. My concern is that the time taken to scan makes this option unreasonable. Disks can be surprisingly large. |
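A rough sketch of the two-phase order described above, under stated assumptions: `search_block_size`, `attempt_fn`, and `probe_match` are hypothetical names, and the callback stands in for "try to mount with this block size". The upper limit of half the disk reflects that the superblock pair needs at least two blocks.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical callback standing in for a mount attempt at a given block size.
typedef bool (*attempt_fn)(uint32_t block_size, void *ctx);

// Tries power-of-two multiples of the physical block size first (O(log n)
// attempts), then every remaining multiple (O(n) attempts).
// Returns the block size that succeeded, or 0 if none did.
static uint32_t search_block_size(uint32_t phys, uint32_t disk_size,
        attempt_fn attempt, void *ctx) {
    if (phys == 0) {
        return 0;
    }
    uint64_t limit = disk_size / 2; // need room for at least two blocks
    // Phase 1: phys, 2*phys, 4*phys, ...
    for (uint64_t s = phys; s <= limit; s *= 2) {
        if (attempt((uint32_t)s, ctx)) {
            return (uint32_t)s;
        }
    }
    // Phase 2: the multiples that phase 1 skipped (3*phys, 5*phys, 6*phys, ...)
    for (uint64_t m = 3; m * phys <= limit; m++) {
        if ((m & (m - 1)) == 0) {
            continue; // power-of-two multiplier, already tried
        }
        if (attempt((uint32_t)(m * phys), ctx)) {
            return (uint32_t)(m * phys);
        }
    }
    return 0;
}

// Example probe for experimenting: succeeds only at `actual`, counts attempts.
struct probe {
    uint32_t actual;
    unsigned attempts;
};

static bool probe_match(uint32_t block_size, void *ctx) {
    struct probe *p = ctx;
    p->attempts++;
    return block_size == p->actual;
}
```

Note that an unformatted disk still exhausts both phases before the search can give up, which matches the concern about rejection time.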
It sounds like we should have a special value for lfs->cfg->block_size that is "LFS_BLOCK_SIZE_UNKNOWN" or similar. There have been some ideas around improving how the lfs_config struct works so that it can be optimized out at compile time: #158 (comment) |
Is there actually any merit in integrating this within littleFs? That basic loop to try opening with different block sizes is so simple, and easily added by those that want it. I was concerned that code size would be increased with a feature that some (many? most?) don't want. Just needs a way of reading the block_count from disc. (And that possibly has to be configurable, since the config structure could be in ROM.) |
If lfs could deal with something like cfg.block_count = LFS_BLOCK_COUNT_UNKNOWN (0xFFFFFFFFUL) and search for the real block count by itself while mounting (providing it back as info), then the loop over the block sizes would be acceptable. |
We wanted this feature (automatic block size and block count detection) in MicroPython and implemented it using option 1 (best effort, just look at the first part of the first block). But indeed it turns out to be unreliable: there have been cases where the first block was partially erased and did not contain a valid superblock (but the second block did).

Option 2 seems interesting if there's another place the data can be stored so it doesn't waste too much flash (external to the filesystem data).

Another idea is to store a copy of the info at the very end of the block device, so it's either at the start or end and there are only 2 places to search for it (only one of these would be erased at any one time, at the most).

But in the short term I think option 3 is the best way to solve it. Instead of making

    // Will fill in block_size and block_count in the given config struct if a valid FS is found.
    // Returns 0 on success, or non-zero if no filesystem found.
    int lfs_detect(lfs2_t *lfs2, const struct lfs2_config *config);

|
This is a clever idea, and almost works, except right now the metadata blocks can only be fetched from the beginning. So ironically we would need to know the block size in order to read a metadata block at the end of disk. But maybe you could search for the magic string to try to guess where the block starts? Unfortunately that ends up again needing to search half of the entire block device in case the block size = 1/2 the disk size. Hmm, but I suppose nothing says we couldn't just always invert the order we write one of the blocks in our metadata pair... |
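The magic-string guess mentioned above might look something like this sketch. It assumes the magic is the 8-byte ASCII string "littlefs" and that the caller has already read a chunk of the device into memory; `find_magic` is a hypothetical helper, not a littlefs function.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: returns the offset of the first occurrence of the
// magic string in buf, or -1 if it does not appear.
// Assumption: the magic is the 8-byte ASCII string "littlefs".
static long find_magic(const unsigned char *buf, size_t len) {
    static const char magic[] = "littlefs";
    const size_t mlen = sizeof(magic) - 1; // exclude the trailing NUL
    if (len < mlen) {
        return -1;
    }
    for (size_t i = 0; i + mlen <= len; i++) {
        if (memcmp(buf + i, magic, mlen) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

A hit only suggests where a metadata block might start. As noted above, in the worst case this still means scanning up to half of the device.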
Ah, though storing the superblock at both the front+end means you need to know the block size. A sort of Heisenberg superblock. Though maybe that's an acceptable tradeoff? The main use case you would lose is being able to store extra data after the filesystem without needing a partition table. |
I'd actually prefer the opposite of autodetection: for |
Circling back around to this issue, had a chance to explore it a bit more and found some interesting observations:
I did explore other options for superblock locations, but unfortunately each comes with its own set of problems:
|
@e107steved, this is a valid concern, deciding when to include features vs code size is hard. Unfortunately additional configuration options bring their own problems with fragmenting the code base and making testing more difficult.

There are a couple of benefits to putting the block size search in littlefs:
1. we can handle tricky corner cases a bit better, such as a larger superblock being picked up by a search for a smaller superblock,
2. we can avoid the expensive search when we find a valid, but incompatible superblock,
3. there may be opportunities to optimize the search better, such as not fetching superblock 0 every search, reusing caches, memory allocations, etc.

I'm currently seeing a ~818 byte (~4.5%) increase in code size and ~40 byte (~2.9%) increase in stack usage (branch). Though most of this comes from moving block_size/block_count into

Code: +818 (+4.5%)
Stack: +40 (+2.9%)
Struct: +40 (+5.2%)
|
Added a benchmark. It shows the expected sub-linear runtime when block_count is known, but linear runtime when both block_size and block_count are unknown. If we know the block_size the Choosing a random device, mt25q, shows a max of ~90MiB/s reads in the datasheet. So for a very rough estimate:
I'm thinking this is quite reasonable for an exceptional operation, seeing as this performance hit only happens when the disk is unformatted. |
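For a feel of the numbers, here is a back-of-the-envelope sketch that assumes the worst case reads every byte of the disk exactly once at the ~90MiB/s peak rate quoted above. The disk sizes are hypothetical examples, and `scan_seconds` is just an illustrative helper.

```c
// Rough time to read an entire disk once, in seconds.
// Assumes a constant read rate, which real devices will not sustain exactly.
static double scan_seconds(double disk_mib, double read_mib_per_s) {
    return disk_mib / read_mib_per_s;
}
```

Under these assumptions, `scan_seconds(1024.0, 90.0)` comes to a little over 11 seconds for a 1GiB disk, and a 90MiB disk takes 1 second.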
From the comments it's clear that there would be mixed responses to adding this feature - for example jimparis definitely doesn't want it (and I don't have any need for it). So IMO it should be optional. Having said all that, I can see the logic in storing the block size and count somewhere within the file system (then you have a self-contained binary blob to move around). The extra code and RAM requirements for this feature are more than I would like to incur. I'd prefer to see it as a wrapper round the littleFS mount function. Maybe with a few more options than would be sensible in an integral implementation (for example, the "guided" list of block sizes to try). What if the block count and size were simply stored in the superblock, but not otherwise used? Presumably that would require an absolutely minimal code increase, but make some of the things discussed here more feasible. |
I've gone ahead and opened a PR for tracking purposes: #753.
Keep in mind the main benefit is not needing to know the block_size during mount. This is an uncommon requirement from a filesystem. Two things this would give us that I personally like:
This is the current state of things. Unfortunately you can't read the superblock without knowing the block_size, so this is only useful as a redundant check after finding the block_size. Such a check was added in #584 after some of these comments.
Most of these costs are from moving the block_size/block_count into RAM, and the translation between block_sizes in the low-level read/prog/erase operations. I don't think it would be possible to avoid these costs by adding an additional mount function. They are currently just being hidden in the block device implementations. The block_size search itself is at most 240 bytes (+1.3%), as it is all contained in It sounds like this will also be helped by #491, which should allow you to avoid the code cost if the block_size is known at compile-time. Though this wouldn't help if you also have multiple littlefs disks or only know the block_size at runtime. Long-term (probably in tandem with #491), I think we should move all of the |
I'm not sure I understand about the "move lfs-config to RAM" suggestion. At present, code which has to operate with a single fixed configuration can be in ROM; code which has to support a number of configurations (e.g. different memory sizes) can be kept in RAM, provided the data is persistent. Seems very simple to me! |
My belief is that the first use case, a single fixed configuration, is better implemented by using defines at compile-time. This would strip out runtime checks and allow more compiler optimizations. This isn't currently possible with littlefs, but is the goal of #491. With #491 and lfs_config in RAM, both the fixed configuration and OS-like use cases should be improved, though there may be a cost for 2+ fixed configurations.
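As a purely illustrative sketch of the compile-time idea (this is not the design in #491, and every name here is hypothetical): a macro can resolve the block size either from a build-time define or from the runtime config struct, so a fixed configuration lets the compiler fold the value into a constant.

```c
#include <stdint.h>

// Hypothetical config struct, standing in for a runtime configuration.
struct example_cfg {
    uint32_t block_size;
};

// If EXAMPLE_BLOCK_SIZE is defined at build time (e.g. -DEXAMPLE_BLOCK_SIZE=4096),
// the runtime field is ignored and the value becomes a compile-time constant.
#ifdef EXAMPLE_BLOCK_SIZE
#define CFG_BLOCK_SIZE(cfg) ((void)(cfg), (uint32_t)EXAMPLE_BLOCK_SIZE)
#else
#define CFG_BLOCK_SIZE(cfg) ((cfg)->block_size)
#endif

// Converts a block count to bytes using whichever source is active.
static uint32_t blocks_to_bytes(const struct example_cfg *cfg, uint32_t blocks) {
    return blocks * CFG_BLOCK_SIZE(cfg);
}
```

Without the define, the same code keeps working with a config struct in RAM or ROM, which is the multi-configuration case discussed above.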
This is an interesting consideration, but littlefs already has important state in RAM. I'm not sure it could continue to function without remounting, otherwise it won't know where the root directory starts for example.
This was the original idea, but it turns out configuration is a bit more dynamic than expected during littlefs's lifetime.
Those are just some examples, it will be interesting to see exactly how this changes code size. |
On the RAM corruption issue, I was thinking about corruption of small areas - maybe 1-16 bytes. IIRC externally-derived corruption is mostly a few bytes. And software bugs such as an incorrect pointer value can lead to corruption of just a few bytes. If lfs_config is in ROM, a write will not affect it (and may on some architectures lead to an exception which should quickly lead to the root cause). If lfs_config is in RAM then sure, all bets are off and you may well have a difficult to find bug. |
Just an FYI for anyone watching this issue, at the moment I'm considering #753 defunct. More info in #753 (comment), but at the moment it looks like block-size will always be a required configuration option. There's just not a tractable way to find the block-size faster than a naive search. |
Hi everybody
I was wondering whether lfs could automatically detect the volume setup when mounting, without specifying the volume configuration in the lfs_config structure.
Would it be possible to store somewhere in the volume this information while formatting? Or at least all the information needed to detect the volume topology and let it be mounted.
The only information to be hardcoded in the fw would be the start address of lfs in the flash.
Thanks in advance.