headerssync: Add headers cache #3

Draft: @hodlinator wants to merge 8 commits into master from 2024/10/headers_presync_cache

Conversation

@hodlinator commented Apr 4, 2025

While we do want to avoid OOM attacks through CVE-2024-52916, we can still serve the happy path by caching headers up to a reasonable chain height.

This cuts out the time/bandwidth spent re-downloading the same headers from the same peer - good for both the local and the remote node, especially over slow/flaky connections, and even more helpful when the local node has a flaky connection to all other nodes.

Queries the machine for free RAM and takes up to 10% of it for the cache (the memory is released when syncing ends).
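For illustration, a minimal sketch of what such a free-RAM query could look like on Linux/glibc (the PR's actual GetFreeRAM() is platform-specific; this single-platform version is only an assumption):

#include <unistd.h>

#include <cstddef>
#include <optional>

// Sketch only: available physical memory via sysconf (Linux/glibc).
std::optional<size_t> GetFreeRAM()
{
    const long pages{sysconf(_SC_AVPHYS_PAGES)};
    const long page_size{sysconf(_SC_PAGESIZE)};
    if (pages < 0 || page_size < 0) return std::nullopt;
    return static_cast<size_t>(pages) * static_cast<size_t>(page_size);
}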

@hodlinator hodlinator marked this pull request as draft April 4, 2025 14:06
@l0rinc (Owner) left a comment:

Great initiative on optimizing the headers sync – definitely a worthwhile area. I'm very happy somebody's working on this, it was also on my list of things to tackle!
Welcome to the team, Benchoors :D/


The idea of caching headers to skip network redownload is solid; however, the current approach using an in-memory RAM cache with automatic sizing heuristics raises a few key concerns:

  • Disk caching (raw file or even LevelDB) seems much more scalable, avoids significant RAM pressure during PRESYNC (which could be high), and might offer comparable performance gains given that network latency is the main bottleneck. While RAM is faster, disk avoids complex sizing logic and scales better for the future. Have you considered caching in batches to disk instead?
  • If we keep RAM caching, the automatic sizing (GetFreeRAM, time estimates, various bounds) is complex and potentially fragile. Would a simpler approach, like a user-configurable -headerscachesize=MB (with a conservative default), be more predictable and robust? (A possible shape is sketched right after this list.)
  • Some of the logic seems ripe for extraction into a helper method, and some explanatory comments might be better suited for commit messages.
  • Probably irrelevant, but I was wondering: if we're not redownloading, do we still care about "honest nodes" (as long as it's not obvious spam)?
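For concreteness, the suggested knob could be registered roughly like this (the option name and default are placeholders, not part of the PR; AddArg is the existing ArgsManager API):

// Hypothetical registration of the suggested option.
argsman.AddArg("-headerscachesize=<MB>",
               "Memory to reserve for the headers-sync cache during initial sync (default: 16)",
               ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);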

I will review this later in more detail, only had this much energy at the end of the weekend :)

@@ -110,3 +120,52 @@ int64_t GetStartupTime()
{
return nStartupTime;
}

std::optional<size_t> GetFreeRAM()
@l0rinc (Owner):

I'm not sure we can get away with this; in other cases the user can provide the size of the desired cache.
If we keep it, I am very much against introducing dead code: the commit that adds functionality should also show its use - at least in a test, though I prefer real usage, to be able to decide whether it's the best solution to a given problem or not.

Each commit should be self-contained and its utility immediately demonstrable; the problem should come before the solution, but currently I see the solution in a commit without a description of the problem.

@hodlinator (Author):

Yep, it is a bit of a departure (an interesting potential direction for other settings as well, IMO). We mostly do headers-sync on initial startup; after that the memory is freed. Maybe there is some condition where we've been running offline for a while and then reconnect that would re-trigger it too. It might be good to also provide an -arg to override the otherwise dynamic cache size though, will look into that.

I think having an independent commit "lays the breadcrumbs of the story", showing there are only dependencies in one direction (HeadersSyncState -> GetFreeRAM()). But I can see your point too. Merged it into where it is used.

@l0rinc (Owner):

Still not sure why we need this, now that there's a dedicated arg for it. We're not trying to guess the dbcache size either - or whether we're on an HDD or SSD (though that would actually be a good idea :p).
We've mostly relied on expert users instead of heuristics; it seems simpler to try that here as well - we can estimate it automatically in a follow-up. I would go for the simpler solution first.

@hodlinator (Author) commented Apr 17, 2025:

I think we probably should be "guessing" dbcache size as well. A minor part of this PR is bringing that up for discussion. But I've moved it to a later commit for now, so it's easy to drop if there is more push-back.

@l0rinc (Owner):

I'd do both of them in a separate PR (basing memory defaults on available memory); I don't think you want to spend your time defending this part of the code in this PR, it seems like a different concern. You'd have to test this on a gazillion operating systems, compilers and versions...

Comment on lines 48 to 49
// Don't cache more than 30 years of ~10-minute blocks for now
6 * 24 * 365 * 30,
@l0rinc (Owner):

hmmm, why?

@hodlinator (Author):

Was somehow worried about consuming too much RAM... but we already cap it at 10% below, so I removed this, which simplifies things, thanks!

6 * 24 * 365 * 30,
// We try to make sure not to use more than 10% of
// available RAM for the headers cache.
(GetFreeRAM().value_or(0) / 10) / sizeof(decltype(m_headers_cache)::value_type)))))
@l0rinc (Owner):

This may be a bit harsh, but why are we caching to memory - why not cache to disk directly? (Via a small buffer, of course, like we do with blocks; see https://github.com/bitcoin/bitcoin/blob/master/src/net_processing.cpp#L114)

We can expect SSDs to be a lot faster by the time this is merged and used - but modern ones are already very close to RAM speed, basically same order of magnitude - and still several orders of magnitude faster than redownloading them:

NVMe has hit almost 4 GB/s in transfer speeds, which is really incredible. This is no where near modern DDR4 speeds… DDR4 is generally around 20 GB/s, with higher speeds hitting 25 GB/s.

RAM latency is lower, but we don't really need that here as far as I understood; we have a predictable read/write order (given some buffering) - very similar to the actual block download, which also saves and rereads the blocks when the window is full - and we could likely read/write header batches of 10,000 or whatever; we don't have to store them separately, if my understanding is correct.
Saving and prefetching could even be done on a separate thread if need be (if indeed the iteration order is mostly deterministic).
Predictable reads could also be fast on a lousy HDD.

While (de)serialization would add some overhead, it's actually quite trivial, and header sync would mimic the pattern used by block downloads (maybe even reuse some of the setup). It could be solved automatically if we saved these to LevelDB instead of separate files - might be slow, dunno, but I think it's worth considering, since I think we can assume that the headers should always fit on disk, but not always into memory.

It may result in simpler code (since we can just assume that everything is cached - no need for 30-year and 10%-of-memory limits or redownload fallbacks), better scaling (we expect more headers in 30 years) and basically the same speed gain compared to redownloading?

@hodlinator (Author):

My initial goal is using resources that are available on slightly beefier nodes.

Since adding logic for caching to disk adds extra complexity, I'd prefer either doing it in a follow-up PR (saving review-energy for the other aspects in this PR), or possibly in a separate commit at the end of the PR.

I would probably just use one contiguous file without LevelDB, since we don't need to do lookups - just write and read in order, from genesis/start to sufficient chainwork.
Might be good to have one file for every 100'000 headers, in order to be able to delete them after they have been consumed (read), as I think we start downloading and storing blocks to disk in parallel with that process.
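A rough sketch of that per-file batching (hypothetical code, not from this PR; CompressedHeader is the PR's fixed 48-byte struct, so raw sequential writes suffice):

#include <fstream>
#include <string>
#include <vector>

constexpr size_t HEADERS_PER_FILE{100'000};

// Append a batch to the cache file numbered file_no; fixed-size elements let us
// write raw bytes and read them back strictly in order during REDOWNLOAD. A file
// can be deleted as soon as all of its headers have been consumed.
void AppendHeaders(const std::vector<CompressedHeader>& batch, size_t file_no)
{
    std::ofstream f{"headers-cache-" + std::to_string(file_no) + ".dat",
                    std::ios::binary | std::ios::app};
    f.write(reinterpret_cast<const char*>(batch.data()),
            static_cast<std::streamsize>(batch.size() * sizeof(CompressedHeader)));
}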

Is there some other reason for doing something closer to block storage?

@l0rinc (Owner):

Is there some other reason for doing something closer to block storage?

We don't have to, of course, but I think we need a good reason to do it differently, since it's a very similar problem.

Since adding logic for caching to disk adds extra complexity, I'd prefer either doing it in a follow-up PR (saving review-energy for the other aspects in this PR), or possibly in a separate commit at the end of the PR.

I'm not sure that's the case - quite the opposite, I'd expect it to be simpler. But you've spent more time on it than I have...
Memory is finite, and I'm not sure it scales with the blockchain. But the disk kinda has to, and I'm not sure it would be measurably slower to do it on disk - whereas with memory we have to worry about OOM and other attacks all the time.

hss.reset(new HeadersSyncState(0, Params().GetConsensus(), chain_start, chain_work));
(void)hss->ProcessNextHeaders(first_chain, true);
BOOST_CHECK(hss->GetState() == HeadersSyncState::State::REDOWNLOAD);
static void HappyPath(const std::vector<CBlockHeader>& first_chain, const CBlockIndex* chain_start)
@l0rinc (Owner):

Is the happy path even likely? What's the smallest thing that can ruin a happy path - a single invalid header that isn't part of the longest chain?
Won't that become a new attack vector - i.e. the lightest attack that makes the happy path unfeasible, derailing caching (which isn't problematic now, but may be in a few decades)? It's likely a lot better than previous DOS threats, but with disk caching we could likely even handle invalid attempts (e.g. 10% invalid or whatever), which would likely be more robust.

@hodlinator (Author):

Already on master: during PRESYNC, just as in REDOWNLOAD, we check both hashPrevBlock and PermittedDifficultyTransition (nBits). We must be given headers that fulfill this, in order, from our selected headers-sync peer.

The happy path is the most likely. Edge-cases:

  • Malicious node (SneakyRedownload-test).
  • Selecting other node that hasn't fully synced the chain either, giving us insufficient work (TooLittleWork-test).
  • Connection lost.

Don't see potential for a new attack, but maybe I'm missing something?

@l0rinc (Owner):

I still need to understand what the worst-case scenario is before and after this change - will try to find out myself (e.g. what kinds of attacks are possible before and after)

// buffers.
Assume(m_process_all_remaining_headers);
} else {
// If we just switched to REDOWNLOAD and m_headers_cache was
@l0rinc (Owner):

seems to me saving to disk could solve this as well

@hodlinator (Author):

Not in this case, I think: even if we were to permit more headers being stored to disk (5-minute block times?), we would still have a cap on the cache size eventually - so this would still need to be handled.

@hodlinator hodlinator force-pushed the 2024/10/headers_presync_cache branch from ef1b61b to 2de1ae5 Compare April 11, 2025 21:19
@hodlinator (Author) left a comment:

Thank you for your initial feedback!

I'm very happy somebody's working on this, it was also on my list of things to tackle!
Welcome to the team, Benchoors :D/

Nice!

the current approach using an in-memory RAM cache with automatic sizing heuristics raises a few key concerns

  • Disk caching
  • ... automatic sizing (GetFreeRAM, time estimates, various bounds) ... Would a simpler approach, like a user-configurable -headerscachesize=MB

Please see inline comments.

  • Some of the logic seems ripe for extraction into a helper method, and some explanatory comments might be better suited for commit messages.

Extracted ComputeHeadersCacheMax(); any other specific suggestions?

Probably irrelevant, but was wondering that if we're not redownloading, do we still care about "honest nodes" (as long as it's not obvious spam)?

We have a minimum proof-of-work threshold which is hardcoded for a given release. This threshold triggers the PRESYNC -> REDOWNLOAD state change. A malicious node could attempt to feed us an ultra-long but low-work chain. If our node has the cache but is running an old release far in the future (with an old threshold), it will fill the cache and then fall back to the behavior on master of verifying that headers are connected and conform to difficulty adjustments, but proceed to throw them away. So the old node is still not going to go into some pathological case unless we remove the cap on the headers cache.


A mid-level aspect I've been considering is adding an extra state: PRESYNC -> +CONSUMECACHE+ -> REDOWNLOAD. But HeadersSyncState exposes state-specific methods publicly (GetState(), GetPresyncHeight(), GetPresyncTime(), GetPresyncWork(), NextHeadersRequestLocator()), so I've avoided it so far. Any ideas on that level?


Changes in latest push:

  • Extracted ComputeHeadersCacheMax().
  • Changed HeadersSyncState-ctor to take bytes instead of header count.
  • Renamed HeadersSyncState::m_download_state -> m_state as we aren't tracking any other state enums within that class.
  • Extracted ProcessPresync() and ProcessRedownload() methods to reduce indentation.
  • Added -headerssynccache=MB arg.
  • Added reference to recent BitMEX blog entry in commit message.
  • Adjusted comments and other minutiae.

@hodlinator hodlinator force-pushed the 2024/10/headers_presync_cache branch from 2de1ae5 to c0c990f Compare April 11, 2025 22:17
@hodlinator (Author):

Latest push:

@@ -23,15 +27,34 @@ constexpr size_t REDOWNLOAD_BUFFER_SIZE{14827}; // 14827/624 = ~23.8 commitments
// re-calculate parameters if we compress further)
static_assert(sizeof(CompressedHeader) == 48);

static size_t ComputeHeadersCacheMax(const CBlockIndex* chain_start, const std::optional<size_t> cache_bytes, size_t element_size)
{
if (cache_bytes) {
@l0rinc (Owner):

this short-circuiting is a bit counter-intuitive

@hodlinator (Author):

Specifying the size means we override the dynamic calculation, but we still need to convert bytes -> header count. Tried to clarify in latest push.
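In other words, the shape is roughly this (a simplified sketch, not the exact code from the PR):

static size_t ComputeHeadersCacheMax(const CBlockIndex* chain_start,
                                     const std::optional<size_t> cache_bytes,
                                     size_t element_size)
{
    if (cache_bytes) {
        // An explicit -headerssynccache=MB overrides the dynamic estimate,
        // but still needs converting from bytes to a header count.
        return *cache_bytes / element_size;
    }
    // Dynamic path: cap at 10% of free RAM (the real code also bounds this
    // by the plausible chain height estimated from chain_start's age).
    return GetFreeRAM().value_or(0) / 10 / element_size;
}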

break;
}
switch (m_state) {
case State::PRESYNC:
@l0rinc (Owner):

I will have to debug this properly, I don't understand any of this yet :/

@l0rinc (Owner) left a comment:

I'll debug it properly this week, only did a lightweight sweep again to get you some early feedback. Sorry for the lack of depth, will do that shortly.

@@ -50,13 +50,11 @@ FUZZ_TARGET(headers_sync_state, .init = initialize_headers_sync_state_fuzz)
{
SeedRandomStateForTest(SeedRand::ZEROS);
FuzzedDataProvider fuzzed_data_provider(buffer.data(), buffer.size());
auto mock_time{ConsumeTime(fuzzed_data_provider)};
@l0rinc (Owner):

749f17c

Needs more explanation, I don't have enough context.
"lower" than what? What happens when we're "returning"?

@hodlinator (Author):

Attempted to improve commit message:

- fuzz: Specify lower limit of mock-time instead of returning
-
- Should reduce wasted runs, through less code complexity.
+ fuzz(headerssync): Specify minimum for mock-time
+
+ That way we can always proceed instead of just exiting the fuzz target, wasting cycles. Also reduces code complexity.

@l0rinc (Owner):

Any idea why it was done like that before?

auto age = rand();
if age < 120 do shit
else return false

seems indeed simpler to do

auto age = rand(0, 120)

(you could likely commit this in a separate PR as well)

@hodlinator (Author):

My guess is either the author didn't know of the min parameter or it was added afterwards. I think it's small and non-controversial enough to sneak into this PR for now.

@l0rinc (Owner):

You can do both

@@ -272,7 +272,7 @@ class HeadersSyncState {
bool m_process_all_remaining_headers{false};

/** Current state of our headers sync. */
State m_download_state{State::PRESYNC};
State m_state{State::PRESYNC};
@l0rinc (Owner):

yeah, this is what I mean: X x = X::x; // sets X x to x (X)

@hodlinator (Author):

Yeah, it seems kind of naked if we remove the comment completely, though. Could change it to "Central tracker of state machine's state", but not sure it's enough of an improvement.

@l0rinc (Owner):

It's not just the comment; state is too general, everything is a state.
Can we find a better name - or if not, can we refactor? Not being able to find a proper name is a strong code smell.

@hodlinator (Author):

Having a State m_state field in a C++ state machine is certainly not unheard of. Would m_current_state be nicer in your eyes?

@l0rinc (Owner):

No, it doesn't help me understand the behavior better. Again, it's like a variable called "thing"... it indicates a bigger problem.

ret = ProcessRedownload(received_headers, full_headers_message);
break;
case State::FINAL:
Assume(false); // Should never be called again after entering FINAL.
@l0rinc (Owner):

can we rather avoid this instead? Or at least print something instead of a comment? I hate comments; they're a sign that we've given up :(

@hodlinator (Author):

Changed assume condition to: Assume(m_state != State::FINAL).

Would love proper support for error strings in asserts/assume.

@l0rinc (Owner):

This is worse than before, m_state != State::FINAL is always true

@hodlinator (Author):

Inside case State::FINAL, m_state != State::FINAL is always false, triggering the failure. This is a common idiom for documenting which condition has failed.
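For illustration, the idiom in context (a sketch mirroring the switch from the diff):

switch (m_state) {
// ... real states handled above ...
case State::FINAL:
    // Necessarily false here, so Assume fires and its text names the violated expectation.
    Assume(m_state != State::FINAL);
    break;
}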

@l0rinc (Owner):

Yeah, but ideally we shouldn't be able to get into an illegal state in the first place - though it's not your code that is at fault here; maybe we can reduce the invalid possibilities.

std::vector<CompressedHeader> m_headers_cache;

/** Maximum cache height / entry count for m_headers_cache. */
const size_t m_headers_cache_max;
@l0rinc (Owner):

Do we really need this? We should be able to pre-size this quite precisely, maybe even make it an array and use its bounds.

@hodlinator (Author):

Changed to use vector::capacity() for now. Would be nice to have a terse way of preventing the vector from accidentally re-allocating.

@l0rinc (Owner):

Or an array, making sure it's heap allocated

@hodlinator (Author):

I take it you mean a C array? We would end up with 3 fields in HeadersSyncState instead of 1:

const size_t m_headers_cache_size; // Equivalent to vector::capacity()
CompressedHeader* const m_headers_cache; // Fixed memory blob, nice!
size_t m_headers_cache_count; // Equivalent to vector::size()

@l0rinc (Owner):

Yeah, but passing it around as a std::span.
I don't really have enough experience (as discussed in the other thread) with std::array for big values, so I'm not sure if we can force it to heap allocation or not. Let me know if you do.
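For what it's worth, a heap-allocated fixed buffer handed around as a span could look something like this (hypothetical names, only sketching the shape under discussion):

#include <memory>
#include <span>

class HeadersCacheBuf
{
    const size_t m_size;                       // capacity, fixed at construction (a runtime value)
    std::unique_ptr<CompressedHeader[]> m_buf; // one heap allocation, can never reallocate
    size_t m_count{0};                         // headers cached so far

public:
    explicit HeadersCacheBuf(size_t size)
        : m_size{size}, m_buf{std::make_unique<CompressedHeader[]>(size)} {}

    bool Push(const CompressedHeader& h)
    {
        if (m_count >= m_size) return false;
        m_buf[m_count++] = h;
        return true;
    }
    std::span<const CompressedHeader> Headers() const { return {m_buf.get(), m_count}; }
};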

@hodlinator (Author):

std::array is off the table since its size is fixed at compile time, and we want to determine the size at runtime.

Maybe m_presync_height can be used in place of m_headers_cache_count, as long as it is checked against m_headers_cache_size, which it may exceed.

@l0rinc (Owner):

You're right, we have to make it dynamic, for a second I thought that if it depends on dbcache size, it's not runtime anymore, lol

@l0rinc (Owner):

And there's also the upper bound of 10% mem currently

@l0rinc (Owner):

40e9538
This is a huge commit, can you break it up meaningfully?

@hodlinator (Author):

Broke out the RAM-querying into a later commit.

I thought about making the cache_bytes parameter of the HeadersSyncState ctor default to std::nullopt or 0 and then introducing all the setting of that value in subsequent commits, but it felt too contrived, not leaving that commit in an acceptable state. More than a handful of files have call-sites passing in that value, but they are quite simple.

@l0rinc (Owner):

Let's discuss in person why you think that's an important part of this change

@@ -64,7 +64,8 @@ HeadersSyncState::HeadersSyncState(NodeId id, const Consensus::Params& consensus
// exceeds this bound, because it's not possible for a consensus-valid
// chain to be longer than this (at the current time -- in the future we
// could try again, if necessary, to sync a longer chain).
m_max_commitments = 6*(Ticks<std::chrono::seconds>(NodeClock::now() - NodeSeconds{std::chrono::seconds{chain_start->GetMedianTimePast()}}) + MAX_FUTURE_BLOCK_TIME) / HEADER_COMMITMENT_PERIOD;
// Subtract cached headers as we don't redownload and verify commitments for them.
@l0rinc (Owner):

aren't we subtracting the maximum possible headers (for some reason I don't yet understand)?

@hodlinator (Author):

No, this is just the cache, hopefully dropping "max" from the variable name helps.

@l0rinc (Owner):

Q: I don't yet understand how that can be negative (that's why we need the std::max now, right)? But I'll debug it, you don't have to explain it; just documenting my surprises. Breaking it up into smaller named symbols could help.
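One possible reading of the "smaller named symbols" suggestion (a guess at the diff, not a quote of it - it assumes the new code subtracts the cache size and clamps at zero, with cached_headers standing in for whatever the commit actually subtracts):

const int64_t chain_age_secs{Ticks<std::chrono::seconds>(
    NodeClock::now() - NodeSeconds{std::chrono::seconds{chain_start->GetMedianTimePast()}})};
const int64_t max_chain_headers{6 * (chain_age_secs + MAX_FUTURE_BLOCK_TIME)};
const int64_t uncached{std::max<int64_t>(0, max_chain_headers - cached_headers)};
m_max_commitments = uncached / HEADER_COMMITMENT_PERIOD;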

@@ -180,7 +181,7 @@ HeadersSyncState::ProcessingResult HeadersSyncState::ProcessRedownload(const
// we'll return a batch of headers to the caller for processing.
ret.success = true;
for (const auto& hdr : received_headers) {
if (!ValidateAndStoreRedownloadedHeader(hdr)) {
if (!ValidateAndStoreRedownloadedHeader(hdr, /*from_cache=*/false)) {
@l0rinc (Owner):

"redownload" from "cache"?

That way we can always proceed instead of just exiting the fuzz target, wasting cycles. Also reduces code complexity.

* Makes tests logically separated by extracting them into functions. Also enables using function-scope instances of HeadersSyncState instead of reset()-ing a unique_ptr.
* Also adds a missing comment for part 4.
* Use BOOST_REQUIRE_EQUAL for HeadersSyncState::State - nicer failure output, and prevents continuing the test.
* Verify HeadersSyncState::State directly after ProcessNextHeaders().
* CHECK for more, like Locator and result.pow_validated_headers.
* Extract and capitalize constants.
* Extract genesis block variable.

* m_current_chain_work -> m_presync_chain_work: see GetPresyncWork()
* m_last_header_received -> m_presync_last_header_received: see GetPresyncTime()
* m_current_height -> m_presync_height: see GetPresyncHeight()
* m_download_state -> m_state: see GetState(). We don't have other state enums in the class, so drop the "download" qualifier, which could mislead one to associate it with the REDOWNLOAD state.
State processing will increase in complexity in an upcoming commit.
While we do want to avoid OOM attacks through [CVE-2024-52916](https://bitcoincore.org/en/2024/07/03/disclose-header-spam/), we can still serve the happy path by caching headers up to a reasonable chain height.
The reasonable chain height is calculated by taking the difference between the starting block's time and our current time, and assuming slightly faster than 10-minute average block times[^1].

Cuts out the time/bandwidth spent re-downloading the same headers from the same peer - good for both the local and the remote node, especially over slow/flaky connections. Even more helpful when the local node has a flaky connection to all other nodes.

The cache is entirely RAM-based since it is ephemeral by nature, taking less than a minute to fill on a really good connection (at contemporary block heights), and since we throw it away upon completion, or if we end up with a peer that has too-low PoW or feeds us a malicious chain.

The headers cache size is set by -headerssynccache=<MB>. When unset, we default to storing REDOWNLOAD_BUFFER_SIZE headers (≈ 0.68 MiB; smaller for shorter chains). This lets us return a chunk of headers earlier, after having switched into the REDOWNLOAD phase.

[^1]: "the Bitcoin target interval time isn’t really 10 minutes, it’s actually 10 minutes and 0.3 seconds. Not that this bug is significant, actually since Bitcoin launched, the average interval has been 9 minutes and 36 seconds, significantly less than 10 minutes. This is because on average since 2009 the hashrate has been increasing." https://blog.bitmex.com/the-timewarp-attack/ (31 Mar 2025)
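As a sanity check on the default-size figure above, using only numbers already shown in this PR (REDOWNLOAD_BUFFER_SIZE = 14827, sizeof(CompressedHeader) = 48):

static_assert(sizeof(CompressedHeader) == 48);
static_assert(14827 * 48 == 711'696); // REDOWNLOAD_BUFFER_SIZE headers, in bytes
// 711'696 B / (1024 * 1024) ~= 0.68 MiB, matching the figure quoted above.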
As we cache rather than redownload them, it doesn't make sense to generate commitments for those headers and then verify them against the same data.

Also skip the duplicate PermittedDifficultyTransition check during REDOWNLOAD, as we already did it during PRESYNC for cached headers.

Query the machine for free RAM and take up to 10% of it for the temporary cache.

This really enables Bitcoin Core to make use of the hardware it is given when operated by users who don't investigate every single setting.
@hodlinator hodlinator force-pushed the 2024/10/headers_presync_cache branch from c0c990f to f1fdc84 Compare April 17, 2025 11:37
@l0rinc (Owner) left a comment:

Pushing my first impressions while I'm reviewing

@@ -56,6 +56,7 @@ FUZZ_TARGET(p2p_handshake, .init = ::initialize)
PeerManager::Options{
.reconcile_txs = true,
.deterministic_rng = true,
.headerssync_cache_bytes = fuzzed_data_provider.ConsumeIntegralInRange(0, 20 << 20), // Up to 20MiB
@l0rinc (Owner):

Can we use 20_MiB here?

@@ -104,7 +104,7 @@ static void SneakyRedownload(const std::vector<CBlockHeader>& first_chain, const
// initially and then the rest.
headers_batch.insert(headers_batch.end(), std::next(first_chain.begin()), first_chain.end());

HeadersSyncState hss{0, Params().GetConsensus(), chain_start, CHAIN_WORK};
HeadersSyncState hss{0, Params().GetConsensus(), chain_start, CHAIN_WORK, /*cache_bytes=*/1 * sizeof(CompressedHeader)};
@l0rinc (Owner):

nit: I personally don't find 1 * more readable than sizeof(CompressedHeader)

@@ -215,7 +215,7 @@ class HeadersSyncState {

/** In REDOWNLOAD, check a header's commitment (if applicable) and add to
* buffer for later processing */
bool ValidateAndStoreRedownloadedHeader(const CBlockHeader& header);
bool ValidateAndStoreRedownloadedHeader(const CBlockHeader& header, bool from_cache);
@l0rinc (Owner):

as stated before, we likely need to rename the method now, since we're not "downloading" from a cache anymore

Comment on lines 94 to 96
SneakyRedownload(first_chain, second_chain, chain_start, chain_work);
HappyPath(first_chain, chain_start, chain_work);
TooLittleWork(second_chain, chain_start, chain_work);
@l0rinc (Owner):

This is a lot better than commenting all over the place, love it!

TooLittleWork(second_chain, chain_start, chain_work);
}

static void SneakyRedownload(const std::vector<CBlockHeader>& first_chain, const std::vector<CBlockHeader>& second_chain, const CBlockIndex* chain_start, const arith_uint256& chain_work)
@l0rinc (Owner):

nit: it would be great to avoid all these const vector& references and use std::span instead, but that would require prod code changes as well, so maybe next time

@@ -63,15 +63,19 @@ BOOST_FIXTURE_TEST_SUITE(headers_sync_chainwork_tests, HeadersGeneratorSetup)
// updates to the REDOWNLOAD phase successfully.
@l0rinc (Owner):

nit: HeadersGeneratorSetup::GenerateHeaders seems to have a leftover return at the end (and a ;;)


headers_batch.clear();
std::vector<CBlockHeader> headers_batch;
headers_batch.insert(headers_batch.end(), std::next(second_chain.begin(), 1), second_chain.end());
// Tell the sync logic that the headers message was not full, implying no
// more headers can be requested. For a low-work-chain, this should causes
@l0rinc (Owner):

Suggested change
// more headers can be requested. For a low-work-chain, this should causes
// more headers can be requested. For a low-work-chain, this should cause


/** Height of m_last_header_received */
int64_t m_current_height{0};
/** During phase 1 (PRESYNC), we cache a reasonable amount of received headers so
@l0rinc (Owner):

Q: what's the role of the lookahead buffer after this change?

@@ -183,6 +183,11 @@ class HeadersSyncState {
const unsigned m_commit_offset;

private:
ProcessingResult ProcessPresync(const std::vector<CBlockHeader>&
@l0rinc (Owner):

maybe this could come after the test breakup (or maybe even in the same commit); it seems to follow the same logic

@@ -23,6 +23,10 @@ void ApplyArgsManOptions(const ArgsManager& argsman, PeerManager::Options& optio
if (auto value{argsman.GetBoolArg("-capturemessages")}) options.capture_messages = *value;

if (auto value{argsman.GetBoolArg("-blocksonly")}) options.ignore_incoming_txs = *value;

if (auto value{argsman.GetIntArg("-headerssynccache")}) {
options.headerssync_cache_bytes = *value << 20; // MiB -> byte conversion
@l0rinc (Owner):

this could be problematic on a 32-bit system; we should likely error if this is too big:

Suggested change
options.headerssync_cache_bytes = *value << 20; // MiB -> byte conversion
options.headerssync_cache_bytes = *value * 1_MiB;

@@ -43,7 +67,7 @@ HeadersSyncState::HeadersSyncState(NodeId id, const Consensus::Params& consensus
// could try again, if necessary, to sync a longer chain).
m_max_commitments = 6*(Ticks<std::chrono::seconds>(NodeClock::now() - NodeSeconds{std::chrono::seconds{chain_start->GetMedianTimePast()}}) + MAX_FUTURE_BLOCK_TIME) / HEADER_COMMITMENT_PERIOD;

LogDebug(BCLog::NET, "Initial headers sync started with peer=%d: height=%i, max_commitments=%i, min_work=%s\n", m_id, m_presync_height, m_max_commitments, m_minimum_required_work.ToString());
LogDebug(BCLog::NET, "Initial headers sync started with peer=%d: height=%i, max_commitments=%i, min_work=%s, cache=%d headers (%.1f MiB)", m_id, m_presync_height, m_max_commitments, m_minimum_required_work.ToString(), m_headers_cache.capacity(), (m_headers_cache.capacity() * sizeof(decltype(m_headers_cache)::value_type)) / (1024.0f * 1024.0f));
@l0rinc (Owner):

to avoid guarding m_headers_cache for additional pushes and querying .capacity in this awkward way, we could hide it behind a struct, something like:

template<typename T>
class FixedCache {
    std::vector<T> data;
    std::size_t const cap;
public:
    explicit FixedCache(std::size_t c) : cap{c} { data.reserve(cap); }
    bool Add(T v) { return data.size() < cap && (data.push_back(std::move(v)), true); } // completes the truncated sketch
};

@@ -54,6 +78,7 @@ void HeadersSyncState::Finalize()
Assume(m_state != State::FINAL);
ClearShrink(m_header_commitments);
m_presync_last_header_received.SetNull();
std::vector<CompressedHeader>{}.swap(m_headers_cache);
@l0rinc (Owner):

this may bear a comment stating that when we detect any problem, the cache is completely wiped.
And can this also be a ClearShrink instead?
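If memory serves, ClearShrink (src/util/vector.h) is implemented as exactly this swap-with-a-temporary, so the two spellings should be interchangeable here:

// Equivalent ways to drop both contents and capacity:
std::vector<CompressedHeader>{}.swap(m_headers_cache); // current form
ClearShrink(m_headers_cache);                          // suggested form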
