Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race Condition Leading to Panic on call to entry() #122

Open
posborne opened this issue Oct 1, 2019 · 0 comments
Open

Race Condition Leading to Panic on call to entry() #122

posborne opened this issue Oct 1, 2019 · 0 comments

Comments

@posborne
Copy link
Contributor

posborne commented Oct 1, 2019

I am using lru_time_cache on an embedded Linux system with somewhat limited CPU resources which may exacerbate the problem, but reviewing the code the problem seems to be there regardless.

Here's the code for entry() at present:

    /// Gets the given key's corresponding entry in the map for in-place manipulation.
    pub fn entry(&mut self, key: Key) -> Entry<'_, Key, Value> {
        // We need to do it the ugly way below due to this issue:
        // https://github.com/rust-lang/rfcs/issues/811
        // match self.get_mut(&key) {
        //     Some(value) => Entry::Occupied(OccupiedEntry{value: value}),
        //     None => Entry::Vacant(VacantEntry{key: key, cache: self}),
        // }
        if self.contains_key(&key) {
            Entry::Occupied(OccupiedEntry {
                value: self.get_mut(&key).expect("key not found"),
            })
        } else {
            Entry::Vacant(VacantEntry { key, cache: self })
        }
    }

On a device running code making use of entry() I got a crash report of a crash due to a panic with message "key not found" from a call to entry() on the lru_time_cache for this code.

It seems possible that it could very well be the case that the contains_key could see an unexpired key when it is executed but that when we execute get_mut the new Instant would see that key as expired leading to a crash. The code is incorrect and has a race condition not involving concurrency.

I will try to get the core dump from this crash and update with more insights if applicable. The logic here could be valid if the operations were made into pure functions taking the Instant in rather than fetching it from the system at many different execution locations.

posborne added a commit to posborne/lru_time_cache that referenced this issue Oct 3, 2019
These changes aim to make it so that calls to the public API
operate over the cache using a single Instant of time rather than
calling Instant::now() at several unique points in time during the
synchronous execution of the function.

The former execution model lead to potential bugs, notably in functions
like entry() as brought up in maidsafe#122, especially on targets with less processing
power or under heavy load.

This also reduces the number of system calls that need to be made which
may improve performance for some use cases.

Signed-off-by: Paul Osborne <[email protected]>
calavera pushed a commit to calavera/lru_time_cache that referenced this issue Mar 19, 2020
These changes aim to make it so that calls to the public API
operate over the cache using a single Instant of time rather than
calling Instant::now() at several unique points in time during the
synchronous execution of the function.

The former execution model lead to potential bugs, notably in functions
like entry() as brought up in maidsafe#122, especially on targets with less processing
power or under heavy load.

This also reduces the number of system calls that need to be made which
may improve performance for some use cases.

Signed-off-by: Paul Osborne <[email protected]>
S-Coyle pushed a commit that referenced this issue Mar 20, 2020
These changes aim to make it so that calls to the public API
operate over the cache using a single Instant of time rather than
calling Instant::now() at several unique points in time during the
synchronous execution of the function.

The former execution model lead to potential bugs, notably in functions
like entry() as brought up in #122, especially on targets with less processing
power or under heavy load.

This also reduces the number of system calls that need to be made which
may improve performance for some use cases.

Signed-off-by: Paul Osborne <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant