Add a cache for samples #7497

sakertooth · 2024-09-14T05:21:27Z

Adds an in-memory cache for samples. Callers can fetch data from the cache using two overloads: one for audio files, and another for Base64 strings. The cache stores weak pointers to the samples and returns shared pointers to the callers. In the future, the cache will probably need to store sample thumbnails and possibly other kinds of metadata alongside each weak pointer. That should be fairly easy to do: all that is needed is to store a struct/class holding the data we need instead of just the weak pointer to the buffer.
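A minimal sketch of the weak-pointer idea described above. `Buffer`, `WeakCache`, and the loader callback are hypothetical names for illustration, not the PR's actual classes: the map holds `std::weak_ptr`s, callers receive `std::shared_ptr`s, and an entry is reloaded once the last caller has released it.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real sample buffer type.
struct Buffer
{
	std::vector<float> frames;
};

// The cache never keeps a buffer alive by itself: it only stores weak pointers.
class WeakCache
{
public:
	using Loader = std::function<std::shared_ptr<Buffer>(const std::string&)>;

	explicit WeakCache(Loader loader)
		: m_loader(std::move(loader))
	{
		assert(m_loader && "a loader callback is required");
	}

	std::shared_ptr<Buffer> fetch(const std::string& key)
	{
		if (auto it = m_entries.find(key); it != m_entries.end())
		{
			// lock() returns a null pointer if every owner already released the buffer.
			if (auto alive = it->second.lock()) { return alive; }
		}

		auto fresh = m_loader(key);
		m_entries.insert_or_assign(key, fresh); // stored as a weak_ptr
		return fresh;
	}

private:
	Loader m_loader;
	std::unordered_map<std::string, std::weak_ptr<Buffer>> m_entries;
};
```

Two fetches of the same key share one buffer for as long as at least one caller still holds it, which is where the memory savings for duplicate samples would come from.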

For audio files, we check the last write time for the file to see if it needs updating each time it is fetched, and update it if necessary. For Base64 strings, we just compare the contents.
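The timestamp check for audio files could look roughly like the sketch below, which uses `std::filesystem::last_write_time`. `FileStamp`, `stampOf`, `isStale`, and `demoDetectsRewrite` are hypothetical helpers, not names from the PR.

```cpp
#include <cassert>
#include <chrono>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical record of what the cache saw when it last loaded a file.
struct FileStamp
{
	fs::path path;
	fs::file_time_type lastWriteTime;
};

// Captures the current modification time of a file (throws if it is missing).
inline FileStamp stampOf(const fs::path& path)
{
	return FileStamp{path, fs::last_write_time(path)};
}

// True when the file changed on disk, or can no longer be queried at all.
inline bool isStale(const FileStamp& stamp)
{
	assert(!stamp.path.empty());
	std::error_code ec;
	const auto current = fs::last_write_time(stamp.path, ec);
	return static_cast<bool>(ec) || current != stamp.lastWriteTime;
}

// Usage sketch: writes a file, stamps it, then changes its timestamp.
inline bool demoDetectsRewrite(const fs::path& path)
{
	{
		std::ofstream out(path);
		out << "first version";
	}
	const auto stamp = stampOf(path);
	const bool freshAtFirst = !isStale(stamp);

	// Bump the timestamp explicitly so the demo does not depend on clock resolution.
	fs::last_write_time(path, stamp.lastWriteTime + std::chrono::seconds(5));
	const bool staleAfterChange = isStale(stamp);

	fs::remove(path);
	return freshAtFirst && staleAfterChange;
}
```

Treating an unreadable file as stale is a design choice of this sketch; the PR may handle missing files differently.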

SampleLoader now loads audio data by querying the cache for it, rather than creating it manually. As such, the SampleLoader::create* functions were renamed to SampleLoader::load*.

Memory usage has dropped significantly when using duplicate samples (htop readings went from 28.5% to 3.6% for me with a project that had a number of sample clips, each ~3 minutes long), and projects load faster. There is still some delay because of the waveform drawing, but this should be addressed soon; as a result, the speedup is more noticeable when you zoom all the way out before loading a project.

Should supersede #7058 I believe.

messmerd

Gave it a quick look-over.

Also:

- Sample's move constructor and move assignment operator should be marked noexcept
- Sample::s_interpolationMargins should be renamed to Sample::InterpolationMargins since it is public
- sample_rate_t could be used instead of int for the sample rate (See this commit for what needed to be changed: 10162ec)
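To illustrate the first point: `std::vector` only moves elements during reallocation when the move constructor is `noexcept` (otherwise it falls back to copying for copyable types). The sketch below uses a simplified, hypothetical `Sample` with made-up members; it is not the real LMMS class.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the shared sample data.
struct Buffer
{
	std::vector<float> frames;
};

// Simplified Sample: shares a buffer and carries one per-instance setting.
class Sample
{
public:
	explicit Sample(std::shared_ptr<const Buffer> buffer)
		: m_buffer(std::move(buffer))
	{
		assert(m_buffer && "a Sample is expected to wrap a buffer");
	}

	Sample(const Sample&) = default;
	Sample& operator=(const Sample&) = default;

	// noexcept lets containers move instead of copy when they reallocate.
	Sample(Sample&& other) noexcept
		: m_buffer(std::move(other.m_buffer))
		, m_amplification(std::exchange(other.m_amplification, 1.0f))
	{
	}

	Sample& operator=(Sample&& other) noexcept
	{
		if (this != &other)
		{
			m_buffer = std::move(other.m_buffer);
			m_amplification = std::exchange(other.m_amplification, 1.0f);
		}
		return *this;
	}

	const std::shared_ptr<const Buffer>& buffer() const { return m_buffer; }

private:
	std::shared_ptr<const Buffer> m_buffer;
	float m_amplification = 1.0f;
};

// Compile-time checks that the guarantees are actually in place.
static_assert(std::is_nothrow_move_constructible_v<Sample>);
static_assert(std::is_nothrow_move_assignable_v<Sample>);
```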

sakertooth · 2024-11-13T00:08:59Z

> - Sample's move constructor and move assignment operator should be marked noexcept
> - Sample::s_interpolationMargins should be renamed to Sample::InterpolationMargins since it is public
> - sample_rate_t could be used instead of int for the sample rate (See this commit for what needed to be changed: 10162ec)

Thanks, I will make another PR to address these issues separately. However, about the int -> sample_rate_t change: I never really understood why we have those types when basic types like int work just as well. We already have to convert the sample rate to an int when processing the project file anyway. I can't really see the benefit that sample_rate_t brings, and it adds more complexity than int does (not so much in terms of how long the type name is, though maybe that too, but more in having to know what the actual underlying type is, on top of adding code that we don't really need in the codebase).

sakertooth · 2024-11-19T12:32:39Z

Any updates @messmerd?

messmerd · 2024-11-20T22:46:49Z

This PR looks pretty good to me code-wise. I'll test it next.

Spacemagehq · 2025-01-15T00:34:06Z

I'm testing and have found no issues or bugs so far on Windows 11. I had 20 samples placed randomly throughout LMMS, and memory and storage usage seem fine and unaffected. The CPU meter inside LMMS does not go up by much, so it is not heavy on the CPU.

sakertooth · 2025-02-08T14:19:16Z

@messmerd, I believe I understand the benefit of using UUIDs better now. If the file path changes, then we only would need to update the mapping from UUIDs to file paths at one place, while all the other clients don't have to worry about anything and can continue using the UUID, which hasn't changed.

As of right now, my PR handles it using the "last modified time", which has two problems. One, this can create a lot of dead entries if they are not being actively "garbage collected". Two, since clients still store the file path, if it changes, they still have to manage the sample themselves, possibly everywhere. This is a problem in #7366, where we have to check whether the Thumbnail needs to be updated or a new one created, because we store the file paths directly, and those can become invalidated at any point on the file system.

It seems like the overall idea here is to map UUIDs to assets/resources, which can be a sample file, sample Base64 string, project file, preset file, etc. Assets can then specify whether they are loaded from disk or from something else, such as a string in the case of Base64 samples. Each asset can be updated whenever the asset manager/cache decides it should be (changes on the file system, or possibly something else). This will not only bolster the sample caching implementation, since it truly centralizes everything, but will also help simplify and improve asset management (which may be needed for features like showing a popup to the user to load missing assets when loading the project, among other things).
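A rough sketch of the indirection being discussed, under loud assumptions: `AssetRegistry` and `AssetId` are hypothetical, and a plain counter stands in for a real UUID to keep the example dependency-free. Clients would hold only the ID, so a moved file is fixed up in exactly one place.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <optional>
#include <unordered_map>
#include <utility>

namespace fs = std::filesystem;

// Stand-in for a real UUID; a counter keeps the sketch self-contained.
using AssetId = std::uint64_t;

class AssetRegistry
{
public:
	// Registers a path and hands back the ID that clients should store.
	AssetId add(fs::path path)
	{
		assert(!path.empty());
		const AssetId id = m_nextId++;
		m_paths.emplace(id, std::move(path));
		return id;
	}

	// The only place that has to change when a file moves on disk.
	bool relocate(AssetId id, fs::path newPath)
	{
		const auto it = m_paths.find(id);
		if (it == m_paths.end()) { return false; }
		it->second = std::move(newPath);
		return true;
	}

	// Resolves an ID to its current path, or nothing if the ID is unknown.
	std::optional<fs::path> resolve(AssetId id) const
	{
		const auto it = m_paths.find(id);
		if (it == m_paths.end()) { return std::nullopt; }
		return it->second;
	}

private:
	AssetId m_nextId = 1;
	std::unordered_map<AssetId, fs::path> m_paths;
};
```

Base64 samples and other non-file assets would need a richer value type than `fs::path`; that part is left out here.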

~~I am going to try to implement some of these ideas in a new PR~~ (actually I might just do it here instead).

messmerd · 2025-02-08T22:18:05Z

@sakertooth Yep, that's exactly it.

This PR doesn't make any changes to the project file as far as I'm aware, so it won't introduce any backwards incompatible changes if we merge it now. And since this PR is very useful as is, I think we should merge it then explore the UUID / resource manager idea in a follow-up PR.

sakertooth · 2025-02-08T23:37:49Z

I'll at least move to using QFileSystemWatcher in the current implementation to fix some of the problems mentioned by actively keeping the table in sync with the file system. Other than that I think this is safe to merge. I agree that the UUID asset idea might need to be explored in a different PR since its implementation is far more lengthy than what I have here, and I remember you were planning to do this already to some extent.

sakertooth · 2025-02-09T04:57:20Z

I read up on the docs for QFileSystemWatcher:

> The act of monitoring files and directories for modifications consumes system resources. This implies there is a limit to the number of files and directories your process can monitor simultaneously. On all BSD variants, for example, an open file descriptor is required for each monitored file. Some system limits the number of open file descriptors to 256 by default. This means that addPath() and addPaths() will fail if your process tries to add more than 256 files or directories to the file system monitor. Also note that your process may have other file descriptors open in addition to the ones for files being monitored, and these other open descriptors also count in the total. macOS uses a different backend and does not suffer from this issue.

I was a bit scared off making the switch because of this and had to weigh the pros and cons, so I'll probably keep the timestamp checking to be safe, as it works well enough. A counterargument to worrying about dead paths in the cache is that they are rare and have static duration, so it's really not as big a problem as I made it out to be. The main issue is exposing file path information that has to be kept in sync everywhere else in the codebase, but we will deal with this later, as already discussed.

One change I should make, though, is to update preexisting entries if they are fetched a second time and are still valid on the file system, instead of making completely new ones. This should keep the number of invalid entries in the cache down by a fair amount.
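One way to get that behaviour, sketched under assumptions: key the table by path alone and keep the timestamp in the value, so a changed file overwrites its old entry rather than adding a second one. `PathKeyedCache`, `Entry`, and `Buffer` are hypothetical names, and the timestamp is passed in by the caller to keep the example independent of the disk.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <map>
#include <memory>
#include <string>

namespace fs = std::filesystem;

// Hypothetical buffer that just remembers where it was loaded from.
struct Buffer
{
	std::string source;
};

// Value stored per path: the timestamp seen at load time plus a weak reference.
struct Entry
{
	fs::file_time_type lastWriteTime;
	std::weak_ptr<Buffer> buffer;
};

class PathKeyedCache
{
public:
	std::shared_ptr<Buffer> fetch(const fs::path& path, fs::file_time_type lastWriteTime)
	{
		assert(!path.empty());

		const auto it = m_entries.find(path);
		if (it != m_entries.end() && it->second.lastWriteTime == lastWriteTime)
		{
			if (auto alive = it->second.buffer.lock()) { return alive; }
		}

		// Missing, expired, or outdated: reload and overwrite the entry in place.
		auto fresh = std::make_shared<Buffer>(Buffer{path.string()});
		m_entries.insert_or_assign(path, Entry{lastWriteTime, fresh});
		return fresh;
	}

	std::size_t size() const { return m_entries.size(); }

private:
	std::map<fs::path, Entry> m_entries;
};
```

With the timestamp inside the key, as in the snippet under review further down, a modified file would instead leave its old key behind as a dead entry.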

messmerd · 2025-02-09T18:30:02Z

src/core/SampleCache.cpp

+	const auto it = std::find_if(s_audioFileMap.begin(), s_audioFileMap.end(),
+		[&](const auto& entry) { return entry.first.path == PathUtil::pathFromQString(path); });
+
+	auto lastWriteTime = fs::last_write_time(PathUtil::pathFromQString(path));
+
+	if (it == s_audioFileMap.end() || it->first.lastWriteTime != lastWriteTime)
+	{
+		const auto buffer = std::make_shared<SampleBuffer>(path);
+		const auto key = AudioFileEntry{PathUtil::pathFromQString(path), lastWriteTime};
+		s_audioFileMap[std::move(key)] = buffer;
+		return buffer;
+	}


Suggested change

-	const auto it = std::find_if(s_audioFileMap.begin(), s_audioFileMap.end(),
-		[&](const auto& entry) { return entry.first.path == PathUtil::pathFromQString(path); });
-
-	auto lastWriteTime = fs::last_write_time(PathUtil::pathFromQString(path));
-
-	if (it == s_audioFileMap.end() || it->first.lastWriteTime != lastWriteTime)
-	{
-		const auto buffer = std::make_shared<SampleBuffer>(path);
-		const auto key = AudioFileEntry{PathUtil::pathFromQString(path), lastWriteTime};
-		s_audioFileMap[std::move(key)] = buffer;
-		return buffer;
-	}
+	const auto qPath = PathUtil::pathFromQString(path);
+	const auto it = std::find_if(s_audioFileMap.begin(), s_audioFileMap.end(),
+		[&](const auto& entry) { return entry.first.path == qPath; });
+
+	auto lastWriteTime = fs::last_write_time(qPath);
+
+	if (it == s_audioFileMap.end() || it->first.lastWriteTime != lastWriteTime)
+	{
+		const auto buffer = std::make_shared<SampleBuffer>(path);
+		const auto key = AudioFileEntry{qPath, lastWriteTime};
+		s_audioFileMap[std::move(key)] = buffer;
+		return buffer;
+	}

Moved the PathUtil::pathFromQString(path) outside the lambda so it doesn't call it every time the lambda is called.

sakertooth · 2025-02-09

I would use fsPath instead personally, but yeah, I'll move it out.

sakertooth added 8 commits September 13, 2024 21:41

Add interface for SampleDatabase

a76d8c2

Add implementation for SampleDatabase

2ca2f25

Rename SampleLoader::create* to SampleLoader::load*

badde84

Use SampleDatabase in SampleLoader

6a6ecfa

Fix segmentation fault on null entries

247955e

Fix duplication

121b327

Fix CI attempt 1

99f29a9

CI fix attempt 2

35026dd

messmerd reviewed Nov 12, 2024

include/FileSystemHelpers.h (outdated, resolved)

include/SampleLoader.h (outdated, resolved)

messmerd reviewed Nov 12, 2024

include/SampleDatabase.h (outdated, resolved)

sakertooth added 4 commits November 12, 2024 19:35

Move file system helper functions into PathUtil

777e9c8

Move SampleLoader into the core namespace and add SampleFilePicker class

9d30457

Add asserts that ensure only the main thread is requesting samples from the database

053ba61

Rebrand SampleDatabase to SampleCache

b1c1a9a

sakertooth changed the title ~~Add a database for samples~~ Add a cache for samples Nov 13, 2024

sakertooth added 2 commits November 12, 2024 20:40

Merge remote-tracking branch 'upstream/master' into add-sample-database

ff07622

Remove ::gui suffix

5810d3d

messmerd reviewed Nov 20, 2024

include/SampleCache.h (outdated, resolved)

messmerd reviewed Nov 20, 2024

include/SampleCache.h (outdated, resolved)

Improve get function

f9d7968

messmerd reviewed Nov 21, 2024

include/SampleCache.h (outdated, resolved)

sakertooth added 3 commits November 21, 2024 00:57

Use insert_or_assign instead of emplace and make args an rvalue reference

826cbf7

Do not move buffer into map

e9a5914

We need to return it back to the caller as well

Are we there yet

8cd2472

messmerd reviewed Nov 21, 2024

src/core/SampleCache.cpp (outdated, resolved)

src/core/SampleCache.cpp (outdated, resolved)

sakertooth and others added 2 commits November 21, 2024 12:48

Update SampleCache.cpp

ffdf92c

Co-authored-by: Dalton Messmer

Update SampleCache.cpp

85f03e5

Co-authored-by: Dalton Messmer

sakertooth mentioned this pull request Dec 28, 2024

Improve waveform rendering performance #7366

Merged

Merge remote-tracking branch 'upstream/master' into add-sample-database

7e34dad

messmerd reviewed Jan 14, 2025

src/core/SampleCache.cpp (outdated, resolved)

Use absolute path

efc5165

Co-authored-by: Dalton Messmer

messmerd mentioned this pull request Feb 9, 2025

Add sample caching #7058

Closed

Include QCoreApplication header

4ea292a

sakertooth marked this pull request as draft February 9, 2025 05:36

sakertooth added 4 commits February 9, 2025 03:52

Update existing audio file entires instead of adding a completely new one

fd9861c

Fix crash when fetching audio files and remove get function

216de3d

Merge remote-tracking branch 'upstream' into add-sample-database

2d2d9fe

Update copyright

c1c6898

sakertooth marked this pull request as ready for review February 9, 2025 09:27

messmerd reviewed Feb 9, 2025

sakertooth marked this pull request as draft February 12, 2025 13:21
