Extend admission policies, statistics, add multi thread eviction and promotion #2

Open
wants to merge 62 commits into
base: develop
36eae66
Run centos and debian workflows on push and PR
igchor Nov 2, 2021
790c09f
Introduce FileShmSegment for file-backed shared memory
igchor Oct 20, 2021
aed38aa
Adjust and enable tests for ShmFileSegment
igchor Oct 16, 2021
442b6f4
Add support for shm opts serialization
guptask Oct 27, 2021
6dea7a8
Initial version of config API extension to support multiple memory tiers
victoria-mcgrath Oct 28, 2021
3186b94
Integrate Memory Tier config API with CacheAllocator.
igchor Oct 30, 2021
38515ac
Add MemoryTierCacheConfig::fromShm()
igchor Nov 6, 2021
48b68a3
Fix test_shm_manager.cpp test
igchor Nov 9, 2021
6fe4971
Run tests on CI
igchor Nov 5, 2021
dff2296
Run long tests (navy/bench) every day on CI
igchor Nov 16, 2021
3af7643
Moved common segment code for posix and file shm segments into ShmCommon
guptask Nov 7, 2021
c93b440
Enabled memory tier config API for cachebench.
victoria-mcgrath Nov 18, 2021
ab752e8
Enabled shared memory tier in cachebench.
victoria-mcgrath Nov 23, 2021
2cacc99
Converted nvmCacheState_ to std::optional to simplify NVM cache state…
victoria-mcgrath Nov 29, 2021
9ace595
Run CI on prebuild docker image
igchor Dec 15, 2021
9812286
Run only centos build on CI
igchor Dec 15, 2021
e111395
Initial multi-tier support implementation
igchor Sep 28, 2021
c8576f5
Extend CompressedPtr to work with multiple tiers
igchor Dec 11, 2021
9ae9b2e
Implemented async Item movement between tiers
vinser52 Dec 18, 2021
3b68053
Adding example for multitiered cache
vinser52 Dec 9, 2021
bef878e
Enable workarounds in tests
igchor Dec 24, 2021
0e8af04
Add basic multi-tier test
igchor Dec 30, 2021
4477fec
Set correct size for each memory tier
igchor Dec 30, 2021
53ca174
Extend cachbench with value validation
igchor Jan 19, 2022
8b83aab
Aadding new configs to hit_ratio/graph_cache_leader_fobj
vinser52 Jan 27, 2022
7b36c51
Move validateValue call to make sure it is measured by latency tracker
igchor Jan 28, 2022
d817018
Fix eviction flow and removeCb calls
vinser52 Feb 3, 2022
385128d
Remove failing build-cachelib workflow (#42)
igchor Feb 7, 2022
d13568e
Disabled test suite allocator-test-AllocatorTypeTest (#41)
victoria-mcgrath Feb 7, 2022
02a3bfb
Do not compensate for rounding error when calculating tier sizes (#43)
igchor Feb 8, 2022
172caf1
Fixed total cache size in CacheMemoryStats (#38)
victoria-mcgrath Feb 8, 2022
2046eea
Fix tests and benchmarks compilation
igchor Feb 9, 2022
8f08009
Update docker file used in CI
igchor Feb 14, 2022
9c0aca8
Disable failing clang-format-check
igchor Feb 14, 2022
d1f26ab
Add one more navy test to BLACKLIST
igchor Feb 15, 2022
043df5f
Merge pull request #47 from igchor/update_docker2
vinser52 Feb 15, 2022
c95b2b3
Fix issue with "Destorying an unresolved handle"
vinser52 Feb 17, 2022
019b2a5
Merge pull request #50 from vinser52/fix_unresolved_handle
vinser52 Feb 17, 2022
2561f45
Add extra param to build-package.sh
igchor Apr 8, 2022
47a978c
Add scripts for rebuilding/pushing docker images
igchor Apr 8, 2022
880f7dc
Extend CI to rebuild docker automatically
igchor Apr 8, 2022
ad59d20
Merge pull request #57 from igchor/auto_docker_build
igchor Apr 11, 2022
87db5fc
Added required packages to install Intel ittapi
mcengija Apr 26, 2022
3ffe808
Update build-cachelib-docker.yml
igchor Apr 27, 2022
c805e69
Merge pull request #67 from mcengija/add_packages_to_docker_image
igchor Apr 27, 2022
0f2fe81
Shorten critical section in findEviction
igchor Apr 12, 2022
7a56883
Fix slab release code
igchor Jun 10, 2022
ed2af50
critical section inside combined_lock
igchor Jun 13, 2022
dcf4290
Merge pull request #79 from igchor/fix_slab_release
igchor Jun 13, 2022
681bcbc
Merge pull request #73 from igchor/optimize_mmcontainer_locking_hetero
igchor Jun 13, 2022
98a2fde
Extend cachbench with touch value
igchor May 4, 2022
21c2e31
Enable touchValue by default
igchor Jun 15, 2022
1d16e1a
Merge pull request #85 from igchor/touch_value_develop
igchor Jun 20, 2022
3c34254
Issue75 rebased (#88)
igchor Jul 5, 2022
407806a
Add memory usage statistics for slabs and allocation classes
igchor Jul 6, 2022
57a3d1c
Merge pull request #91 from igchor/more_stats
igchor Jul 11, 2022
34f9f8e
Add option to print memory stats in bytes only
igchor Jul 12, 2022
cab87d4
Merge pull request #93 from igchor/stats_no_gb
igchor Jul 27, 2022
2434693
added per tier pool class rolling average latency
guptask Jul 21, 2022
11fadaa
Merge pull request #96 from guptask/rolling_stats
guptask Aug 4, 2022
acdfa0b
MM2Q promotion iterators (#1)
byrnedj Aug 9, 2022
a2721d1
Implement background promotion and eviction
igchor Jul 6, 2022
@@ -1,7 +1,8 @@
name: build-cachelib-centos-latest
on:
schedule:
- cron: '30 5 * * 1,4'
- cron: '0 7 * * *'

jobs:
build-cachelib-centos8-latest:
name: "CentOS/latest - Build CacheLib with all dependencies"
@@ -33,3 +34,6 @@ jobs:
uses: actions/checkout@v2
- name: "build CacheLib using build script"
run: ./contrib/build.sh -j -v -T
- name: "run tests"
timeout-minutes: 60
run: cd opt/cachelib/tests && ../../../run_tests.sh long
6 changes: 5 additions & 1 deletion .github/workflows/build-cachelib-debian.yml
@@ -1,7 +1,8 @@
name: build-cachelib-debian-10
on:
schedule:
- cron: '30 5 * * 2,6'
- cron: '30 5 * * 0,3'

jobs:
build-cachelib-debian-10:
name: "Debian/Buster - Build CacheLib with all dependencies"
@@ -37,3 +38,6 @@ jobs:
uses: actions/checkout@v2
- name: "build CacheLib using build script"
run: ./contrib/build.sh -j -v -T
- name: "run tests"
timeout-minutes: 60
run: cd opt/cachelib/tests && ../../../run_tests.sh
49 changes: 49 additions & 0 deletions .github/workflows/build-cachelib-docker.yml
@@ -0,0 +1,49 @@
name: build-cachelib-docker
on:
push:
pull_request:

jobs:
build-cachelib-docker:
name: "CentOS/latest - Build CacheLib with all dependencies"
runs-on: ubuntu-latest
env:
REPO: cachelib
GITHUB_REPO: pmem/CacheLib
CONTAINER_REG: ghcr.io/pmem/cachelib
CONTAINER_REG_USER: ${{ secrets.GH_CR_USER }}
CONTAINER_REG_PASS: ${{ secrets.GH_CR_PAT }}
FORCE_IMAGE_ACTION: ${{ secrets.FORCE_IMAGE_ACTION }}
HOST_WORKDIR: ${{ github.workspace }}
WORKDIR: docker
IMG_VER: devel
strategy:
matrix:
CONFIG: ["OS=centos OS_VER=8streams PUSH_IMAGE=1"]
steps:
- name: "System Information"
run: |
echo === uname ===
uname -a
echo === /etc/os-release ===
cat /etc/os-release
echo === df -hl ===
df -hl
echo === free -h ===
free -h
echo === top ===
top -b -n1 -1 -Eg || timeout 1 top -b -n1
echo === env ===
env
echo === gcc -v ===
gcc -v
- name: "checkout sources"
uses: actions/checkout@v2
with:
fetch-depth: 0

- name: Pull the image or rebuild and push it
run: cd $WORKDIR && ${{ matrix.CONFIG }} ./pull-or-rebuild-image.sh $FORCE_IMAGE_ACTION

- name: Run the build
run: cd $WORKDIR && ${{ matrix.CONFIG }} ./build.sh
147 changes: 0 additions & 147 deletions .github/workflows/build-cachelib.yml

This file was deleted.

2 changes: 1 addition & 1 deletion .github/workflows/clang-format-check.yml
@@ -1,6 +1,6 @@
# From: https://github.com/marketplace/actions/clang-format-check#multiple-paths
name: clang-format Check
on: [pull_request]
on: []
jobs:
formatting-check:
name: Formatting Check
117 changes: 117 additions & 0 deletions MultiTierDataMovement.md
@@ -0,0 +1,117 @@
# Background Data Movement

To reduce the number of online evictions and to support asynchronous
promotion, we have added two periodic workers that handle eviction and promotion.

The diagram below shows a simplified version of how the background evictor
thread (green) is integrated into the CacheLib architecture.

<p align="center">
<img width="640" height="360" alt="BackgroundEvictor" src="cachelib-background-evictor.png">
</p>

## Synchronous Eviction and Promotion

- `disableEvictionToMemory`: Disables eviction to memory (item is always evicted to NVMe or removed
on eviction)

## Background Evictors

The background evictors scan each class to see if there are objects to move to the next (lower)
tier using a given strategy. Here we document the general parameters and the parameters
for the different strategies.

- `backgroundEvictorIntervalMilSec`: The interval at which this thread runs. By default,
the background evictor threads wake up every 10 ms to scan the AllocationClasses. In addition,
a background evictor thread is woken up every time an allocation (from
a request handling thread) fails and the current percentage of free memory for the
AllocationClass is lower than `lowEvictionAcWatermark`. This may make the interval parameter
less important when request handling threads perform many allocations.

- `evictorThreads`: The number of background evictors to run. Each thread is assigned
a set of AllocationClasses to scan and evict objects from. Currently, each thread gets
an equal number of classes to scan, but because the object size distribution may be unequal, future
versions will attempt to balance the classes among threads. The range is `1` to the number of AllocationClasses.
The default is `1`.

- `maxEvictionBatch`: The number of objects to remove in a given eviction call. The
default is 40. The valid range is 10 to 1000. If the value is too low, we might not
remove objects at a reasonable rate; if it is too high, it might increase contention with user threads.

- `minEvictionBatch`: Minimum number of items to evict at any time (if there are any
candidates)

- `maxEvictionPromotionHotness`: Maximum number of candidates to consider for eviction. This is similar to `maxEvictionBatch`,
but it specifies how many candidates are taken into consideration, not the actual number of items to evict.
This option can be used to limit the duration of the critical section on the LRU lock.
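
The wake-up rule and the per-thread class assignment described above can be modelled in a few lines. This is only an illustrative sketch, not CacheLib code: the function names are hypothetical, and the round-robin split is an assumption (the text only says that each thread gets an equal number of classes).

```python
from __future__ import annotations


def assign_classes(num_classes: int, evictor_threads: int = 1) -> list[list[int]]:
    """Split AllocationClass ids across evictor threads.

    Assumption: a simple round-robin split, so every thread gets an equal
    (or near-equal) number of classes.
    """
    if not 1 <= evictor_threads <= num_classes:
        raise ValueError(
            "evictorThreads must be between 1 and the number of AllocationClasses"
        )
    buckets: list[list[int]] = [[] for _ in range(evictor_threads)]
    for class_id in range(num_classes):
        buckets[class_id % evictor_threads].append(class_id)
    return buckets


def should_wake_early(
    allocation_failed: bool,
    free_percent: float,
    low_eviction_ac_watermark: float = 2.0,
) -> bool:
    """Wake the evictor before its interval expires: an allocation failed and
    the class has less free memory than lowEvictionAcWatermark."""
    return allocation_failed and free_percent < low_eviction_ac_watermark
```

For example, `assign_classes(5, 2)` gives the first thread classes 0, 2 and 4 and the second thread classes 1 and 3.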


### FreeThresholdStrategy (default)

- `lowEvictionAcWatermark`: Triggers the background eviction thread to run
when the percentage of free memory in the AllocationClass drops to this value.
The default is `2.0`. To avoid wasting capacity, we don't set this above `10.0`.

- `highEvictionAcWatermark`: Stops evictions from an AllocationClass once this
percentage of the AllocationClass is free. The default is `5.0`. To avoid wasting capacity, we
don't set this above `10.0`.
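
One way to read the two watermarks is as a start/stop pair: eviction starts once free memory falls to the low watermark and each pass evicts toward the high watermark, bounded by the batch limits. The sketch below models that reading only. It is not the strategy's actual code; the function names are hypothetical and `min_batch=1` is an assumed value, since no default for `minEvictionBatch` is given here.

```python
import math


def should_start_eviction(free_percent: float, low_wm: float = 2.0) -> bool:
    """Eviction starts once free memory has dropped to lowEvictionAcWatermark."""
    return free_percent <= low_wm


def eviction_batch(
    free_items: int,
    total_items: int,
    high_wm: float = 5.0,
    min_batch: int = 1,   # assumed value, not a documented default
    max_batch: int = 40,  # documented default of maxEvictionBatch
) -> int:
    """Number of items to evict in one pass to move toward high_wm percent free."""
    target_free = math.ceil(total_items * high_wm / 100.0)
    deficit = target_free - free_items
    if deficit <= 0:
        return 0  # highEvictionAcWatermark reached, stop evicting
    return max(min_batch, min(max_batch, deficit))
```

With 1000 slots and the default 5% target, a class with 10 free slots needs 40 more, which fits in a single default-sized batch; a class with no free slots is capped at 40 per pass and needs a second pass.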


## Background Promoters

The background promoters scan each class to see if there are objects to move to a higher
tier using a given strategy. Here we document the general parameters and the parameters
for the different strategies.

- `backgroundPromoterIntervalMilSec`: The interval at which this thread runs. By default,
the background promoter threads wake up every 10 ms to scan the AllocationClasses for
objects to promote.

- `promoterThreads`: The number of background promoters to run. Each thread is assigned
a set of AllocationClasses to scan and promote objects from. Currently, each thread gets
an equal number of classes to scan, but because the object size distribution may be unequal, future
versions will attempt to balance the classes among threads. The range is `1` to the number of AllocationClasses. The default is `1`.

- `maxPromotionBatch`: The number of objects to promote in a given promotion call. The
default is 40. The valid range is 10 to 1000. If the value is too low, we might not
promote objects at a reasonable rate; if it is too high, it might increase contention with user threads.

- `minPromotionBatch`: Minimum number of items to promote at any time (if there are any
candidates)

- `numDuplicateElements`: This allows us to promote items that have existing handles (read-only) since
we won't need to modify the data when a user is done with the data. Therefore, for a short time
the data could reside in both tiers until it is evicted from its current tier. The default is to
not allow this (0). Setting the value to 100 will enable duplicate elements in tiers.

### Background Promotion Strategy (only one currently)

- `promotionAcWatermark`: Promote items if at least this
percentage of the AllocationClass is free. The promotion thread will attempt to move `maxPromotionBatch` objects
to that tier. The objects are chosen from the head of the LRU. The default is `4.0`.
This value should correlate with `lowEvictionAcWatermark`, `highEvictionAcWatermark`, `minAcAllocationWatermark`, `maxAcAllocationWatermark`.
- `maxPromotionBatch`: The number of objects to promote in a batch during background promotion. Analogous to
`maxEvictionBatch`. Its value should be lower to decrease contention on hot items.
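
The promotion rule can be sketched as a guard on free space followed by taking a bounded batch from the head of the LRU. This is a simplified model with a hypothetical function name. It assumes the LRU is given as a list ordered from head to tail and that the free percentage refers to the tier being promoted into.

```python
from __future__ import annotations


def promotion_candidates(
    lru_head_first: list[str],
    free_percent: float,
    promotion_ac_watermark: float = 4.0,  # documented default
    max_promotion_batch: int = 40,        # documented default
) -> list[str]:
    """Pick items to promote in one pass of the background promoter.

    Assumption: `free_percent` is the free share of the AllocationClass in
    the destination tier, and `lru_head_first[0]` is the head of the LRU.
    """
    if free_percent < promotion_ac_watermark:
        return []  # not enough free space, skip this pass
    return lru_head_first[:max_promotion_batch]
```

Keeping `max_promotion_batch` small shortens the time the promoter spends on the hottest items, which is the contention concern mentioned above.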

## Allocation policies

- `maxAcAllocationWatermark`: An item is always allocated in the topmost tier if at least this
percentage of the AllocationClass is free.
- `minAcAllocationWatermark`: An item is always allocated in the bottom tier if only this percentage
of the AllocationClass is free. If the percentage of free memory in the AllocationClass is between `maxAcAllocationWatermark`
and `minAcAllocationWatermark`, extra checks (described below) are performed to decide where to put the element.

By default, allocation will always be performed from the upper tier.

- `acTopTierEvictionWatermark`: If there is less than this percentage of free memory in the topmost tier, CacheLib will attempt to evict from the top tier. This option takes precedence over the allocation watermarks.

### Extra policies (used only when the percentage of free memory in the AllocationClass is between `maxAcAllocationWatermark` and `minAcAllocationWatermark`)
- `sizeThresholdPolicy`: If an item is smaller than this value, always allocate it in the upper tier.
- `defaultTierChancePercentage`: Chance (0-100%) of allocating an item in the top tier.
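
Taken together, the allocation watermarks and the extra policies form a small decision procedure. The sketch below models it for two tiers (0 = top, 1 = bottom). It is an interpretation of the descriptions above rather than the allocator's code: the function name is hypothetical, the free percentage is assumed to be measured on the top tier's AllocationClass, and `acTopTierEvictionWatermark` is not modelled.

```python
import random
from typing import Callable

TOP_TIER = 0
BOTTOM_TIER = 1


def choose_tier(
    free_percent: float,
    item_size: int,
    *,
    max_ac_allocation_watermark: float,
    min_ac_allocation_watermark: float,
    size_threshold: int,
    default_tier_chance_percentage: float,
    rng: Callable[[], float] = random.random,
) -> int:
    """Return the tier in which a new item would be allocated."""
    if free_percent >= max_ac_allocation_watermark:
        return TOP_TIER      # plenty of free space: always the topmost tier
    if free_percent <= min_ac_allocation_watermark:
        return BOTTOM_TIER   # almost full: always the bottom tier
    # Between the two watermarks the extra policies decide.
    if item_size < size_threshold:
        return TOP_TIER      # sizeThresholdPolicy: small items stay on top
    # defaultTierChancePercentage: probabilistic placement in the top tier.
    return TOP_TIER if rng() * 100.0 < default_tier_chance_percentage else BOTTOM_TIER
```

Passing `rng` explicitly makes the probabilistic branch deterministic in tests, for example `rng=lambda: 0.2` with a 30% chance always selects the top tier.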

## MMContainer options

- `lruInsertionPointSpec`: Can be set per tier when LRU2Q is used. Determines where new items are
inserted: 0 = insert into the hot queue, 1 = insert into the warm queue, 2 = insert into the cold queue.
- `markUsefulChance`: Set per tier. Determines the chance of moving an item to the head of the LRU on access.
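
As a rough illustration of these two options, the sketch below maps `lruInsertionPointSpec` values to queue names and models `markUsefulChance` as a per-access coin flip. The helper names are hypothetical, and treating `markUsefulChance` as a percentage in the range 0 to 100 is an assumption; the text above does not state its unit.

```python
import random
from typing import Callable

# Mapping taken from the description of lruInsertionPointSpec.
INSERTION_POINT_QUEUE = {0: "hot", 1: "warm", 2: "cold"}


def insertion_queue(lru_insertion_point_spec: int) -> str:
    """Queue that receives newly inserted items for a given spec value."""
    return INSERTION_POINT_QUEUE[lru_insertion_point_spec]


def moves_to_head(
    mark_useful_chance: float,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Decide whether an accessed item is moved to the head of the LRU.

    Assumption: mark_useful_chance is a percentage between 0 and 100.
    """
    return rng() * 100.0 < mark_useful_chance
```

A lower chance reduces how often accesses reorder the LRU, at the cost of a less precise recency order.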
Binary file added cachelib-background-evictor.png