subsystem-summary-of-catchup
read this skill for a token-efficient summary of the catchup subsystem
| name | subsystem-summary-of-catchup |
| description | read this skill for a token-efficient summary of the catchup subsystem |
The catchup subsystem in stellar-core is responsible for synchronizing a node's local ledger state with the rest of the network when it falls behind. It downloads historical data (ledger headers, transactions, and bucket snapshots) from history archives, verifies integrity, and applies the data to bring the node up to date.
All files reside in src/catchup/.
File: CatchupConfiguration.h/.cpp
Immutable configuration describing a catchup request. Parameterized by:
- toLedger: destination ledger number (or CURRENT = 0 to resolve at runtime from the archive).
- count: number of ledgers to replay before the destination. 0 = minimal (buckets only), UINT32_MAX = complete history.
- Mode: one of OFFLINE_BASIC, OFFLINE_COMPLETE, or ONLINE.

Key methods:

- resolve(uint32_t remoteCheckpoint): substitutes CURRENT with an actual checkpoint ledger number.
- offline() / online(): predicates for mode.

Helper free functions parseLedger() and parseLedgerCount() parse CLI strings.
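The behaviour of resolve() and the mode predicates can be illustrated with a minimal sketch. CatchupConfigSketch, COMPLETE_HISTORY, and the accessor names are hypothetical stand-ins chosen for this example; the real class layout and signatures in CatchupConfiguration.h may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for CatchupConfiguration; layout is an assumption.
class CatchupConfigSketch
{
  public:
    enum class Mode { OFFLINE_BASIC, OFFLINE_COMPLETE, ONLINE };

    // Sentinel meaning "resolve the target from the archive at runtime".
    static constexpr uint32_t CURRENT = 0;
    // Hypothetical alias for the "complete history" count (UINT32_MAX).
    static constexpr uint32_t COMPLETE_HISTORY =
        std::numeric_limits<uint32_t>::max();

    CatchupConfigSketch(uint32_t toLedger, uint32_t count, Mode mode)
        : mToLedger(toLedger), mCount(count), mMode(mode)
    {
    }

    // Returns a new configuration with CURRENT replaced by the given
    // checkpoint ledger; a concrete target is returned unchanged.
    CatchupConfigSketch resolve(uint32_t remoteCheckpoint) const
    {
        uint32_t target =
            (mToLedger == CURRENT) ? remoteCheckpoint : mToLedger;
        return CatchupConfigSketch(target, mCount, mMode);
    }

    bool online() const { return mMode == Mode::ONLINE; }
    bool offline() const { return !online(); }
    uint32_t toLedger() const { return mToLedger; }
    uint32_t count() const { return mCount; }

  private:
    // No setters: the object is immutable after construction.
    uint32_t mToLedger;
    uint32_t mCount;
    Mode mMode;
};
```

Because resolve() returns a new object instead of mutating the receiver, a configuration built with CURRENT can be reused against different archives.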
File: CatchupRange.h/.cpp
Computed from CatchupConfiguration + the current LCL + HistoryManager. Decides what the catchup must do:
| Field | Meaning |
|---|---|
| mApplyBuckets | Whether a bucket-apply phase is needed. |
| mApplyBucketsAtLedger | Checkpoint ledger at which to apply buckets (0 if !mApplyBuckets). |
| mReplayRange (LedgerRange) | Half-open range of ledgers to replay after bucket-apply. |
Five logical cases based on LCL position, requested count, and checkpoint boundaries (see comments in header). Invariants enforced by checkInvariants().
Key accessors: applyBuckets(), replayLedgers(), getBucketApplyLedger(), getReplayRange(), getFullRangeIncludingBucketApply().
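To make the field semantics concrete, here is a rough sketch of the kind of checks checkInvariants() might perform. The struct names are hypothetical, and the specific sequencing rule used here (replay starts at the ledger immediately after the bucket-apply ledger) is an assumption for illustration; the authoritative rules are in the header comments.

```cpp
#include <cassert>
#include <cstdint>

// Half-open range [first, first + count). Hypothetical stand-in for
// LedgerRange.
struct LedgerRangeSketch
{
    uint32_t first;
    uint32_t count;
    uint32_t limit() const { return first + count; }
};

// Hypothetical stand-in for CatchupRange.
struct CatchupRangeSketch
{
    bool applyBuckets;
    uint32_t applyBucketsAtLedger; // 0 when applyBuckets is false
    LedgerRangeSketch replayRange;

    bool replayLedgers() const { return replayRange.count > 0; }

    bool invariantsHold() const
    {
        // At least one of bucket-apply or replay must be active.
        if (!applyBuckets && !replayLedgers())
            return false;
        // The bucket-apply ledger is set exactly when buckets are applied.
        if (applyBuckets != (applyBucketsAtLedger != 0))
            return false;
        // Assumed sequencing rule: replay picks up right after the
        // bucket-apply ledger.
        if (applyBuckets && replayLedgers() &&
            replayRange.first != applyBucketsAtLedger + 1)
            return false;
        return true;
    }
};
```

Under these assumptions, applying buckets at ledger 127 and replaying 10 ledgers from 128 is valid, while a range that neither applies buckets nor replays anything is rejected.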
Files: LedgerApplyManager.h, LedgerApplyManagerImpl.h/.cpp
Abstract interface + concrete implementation. This is the top-level coordinator between the consensus layer (Herder) and the catchup/apply machinery. Owned by Application.
Key members:

- mCatchupWork: shared_ptr<CatchupWork>, the running catchup work item (null when not catching up).
- mSyncingLedgers: map<uint32_t, LedgerCloseData>, buffer of ledgers received from the network that cannot be applied yet. Has strict invariants: either empty, starts at LCL+1, or contains at most 65 ledgers within a checkpoint boundary.
- mLastQueuedToApply: tracks the highest ledger sequence queued for application.
- mLargestLedgerSeqHeard: the highest ledger seq ever received.
- mMetrics (CatchupMetrics): counters for archive states downloaded, checkpoints, ledgers verified, buckets downloaded/applied, tx sets downloaded/applied.
- mCatchupFatalFailure: set when catchup fails unrecoverably (e.g., incompatible core version).
- MAX_EXTERNALIZE_LEDGER_APPLY_DRIFT = 12: maximum ledger drift allowed before entering catchup in parallel-close mode.

Key methods:

- processLedger(LedgerCloseData, isLatestSlot): main entry point called by Herder/LedgerManager when a new consensus ledger arrives. Logic:
  - … mCatchupWork.
  - … tryApplySyncingLedgers().
  - … startOnlineCatchup().
- startCatchup(CatchupConfiguration, archive): schedules a CatchupWork on the WorkScheduler.
- startOnlineCatchup(): constructs a CatchupConfiguration targeting firstBuffered - 1 in ONLINE mode.
- trimSyncingLedgers(): garbage-collects old entries from mSyncingLedgers, keeping at most one checkpoint's worth plus one.
- tryApplySyncingLedgers(): iterates sequential ledgers in mSyncingLedgers and applies them via LedgerManager::applyLedger(). In parallel-close mode, posts work to the ledger-close thread.
- maybeGetNextBufferedLedgerToApply(): returns the next buffered ledger (LCL+1) if available; used by ApplyBufferedLedgersWork.

File: CatchupWork.h/.cpp
The central Work subclass orchestrating all catchup steps. Extends Work (composite work pattern).
Key members:

- mLocalState (HistoryArchiveState): local BucketList state at catchup start.
- mDownloadDir (unique_ptr<TmpDir>): temporary directory for downloaded files.
- mLiveBuckets, mHotBuckets: maps from hash → downloaded Bucket objects.
- mCatchupConfiguration: the resolved configuration.
- mGetHistoryArchiveStateWork, mGetBucketStateWork: work to fetch HAS from archive.
- mDownloadVerifyLedgersSeq: work sequence for downloading + verifying ledger headers.
- mVerifyLedgers (VerifyLedgerChainWork): verifies ledger chain integrity.
- mBucketVerifyApplySeq: work sequence for downloading, verifying, and applying buckets.
- mTransactionsVerifyApplySeq (DownloadApplyTxsWork): work for downloading and applying transactions.
- mApplyBufferedLedgersWork: applies buffered network ledgers after catchup replay.
- mCatchupSeq: final composite work sequence.
- mVerifiedLedgerRangeStart (LedgerHeaderHistoryEntry): the verified ledger at the start of the catchup range (used for bucket-apply).
- mFatalFailureFuture: shared_future indicating unrecoverable failure.

runCatchupStep() / doWork():

- getAndMaybeSetHistoryArchiveState() fetches the remote history archive state, validates the network passphrase, and checks that target > LCL.
- getAndMaybeSetBucketHistoryArchiveState() if bucket-apply is needed and the bucket HAS differs from the main HAS.
- downloadVerifyLedgerChain() spawns BatchDownloadWork + VerifyLedgerChainWork in a WorkSequence.
- If applyBuckets(): downloadApplyBuckets() → DownloadBucketsWork + ApplyBucketsWork.
- If replayLedgers(): downloadApplyTransactions() → DownloadApplyTxsWork.
- … LedgerManager::setLastClosedLedger() with the verified state, clears rebuild flags.
- ApplyBufferedLedgersWork drains mSyncingLedgers.

Constants: PUBLISH_QUEUE_UNBLOCK_APPLICATION = 8, PUBLISH_QUEUE_MAX_SIZE = 16. These flow-control the publish queue during catchup.
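One plausible reading of the two publish-queue constants is a hysteresis gate: application blocks once the queue exceeds the maximum and resumes only after it drains to the unblock threshold. The sketch below models that reading. PublishQueueGateSketch and mayApply() are hypothetical names, and the hysteresis behaviour itself is an assumption; the summary only states that the constants flow-control the queue.

```cpp
#include <cassert>
#include <cstddef>

// Values taken from the summary above.
constexpr std::size_t PUBLISH_QUEUE_UNBLOCK_APPLICATION = 8;
constexpr std::size_t PUBLISH_QUEUE_MAX_SIZE = 16;

static_assert(PUBLISH_QUEUE_UNBLOCK_APPLICATION < PUBLISH_QUEUE_MAX_SIZE,
              "unblock threshold must be below the maximum");

// Hypothetical gate, polled before each checkpoint is applied.
class PublishQueueGateSketch
{
    bool mBlocked = false;

  public:
    // Returns true when applying the next checkpoint may proceed.
    bool mayApply(std::size_t publishQueueLength)
    {
        if (publishQueueLength > PUBLISH_QUEUE_MAX_SIZE)
        {
            // Queue grew too large: stop applying until it drains.
            mBlocked = true;
        }
        else if (publishQueueLength <= PUBLISH_QUEUE_UNBLOCK_APPLICATION)
        {
            // Queue drained far enough: resume applying.
            mBlocked = false;
        }
        // Between the two thresholds the previous decision is kept.
        return !mBlocked;
    }
};
```

Using two thresholds rather than one avoids rapid toggling when the queue length hovers around a single limit.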
File: VerifyLedgerChainWork.h/.cpp
BasicWork subclass that verifies a range of downloaded ledger header files. Processes checkpoints from highest to lowest, linking each checkpoint's hash chain to the next.
Key members:

- mDownloadDir, mRange, mCurrCheckpoint: the files to verify and current position.
- mLastClosed (LedgerNumHashPair): local LCL for consistency checks.
- mTrustedMaxLedger (shared_future): trusted hash from SCP consensus for the range end.
- mVerifiedAhead (LedgerNumHashPair): hash-link propagation between checkpoint verifications.
- mVerifiedMinLedgerPrev (promise): outgoing; the hash just before the verified range, so bucket-apply can validate.
- mMaxVerifiedLedgerOfMinCheckpoint: the max ledger of the lowest checkpoint; used by CatchupWork as mVerifiedLedgerRangeStart.
- mFatalFailurePromise: set when a mismatch against trusted hash is detected.
- mChainDisagreesWithLocalState: records local-state disagreements (e.g., bad LCL hash, incompatible version).

verifyHistoryOfSingleCheckpoint():

- … mTrustedMaxLedger.
- … mVerifiedAhead.
- … mVerifiedMinLedgerPrev and records mMaxVerifiedLedgerOfMinCheckpoint.

onRun():

Calls verifyHistoryOfSingleCheckpoint() once per crank. On success, decrements mCurrCheckpoint and returns WORK_RUNNING until all checkpoints are verified. Maps various error statuses to WORK_FAILURE with appropriate log messages.
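The highest-to-lowest verification order can be sketched as a backward walk over a hash chain. This is a simplified illustration, not the actual implementation: HeaderSketch, VerifyResult, and verifyChainBackwards are hypothetical names, hashes are plain strings stored alongside each header (the real code hashes the header contents), and checkpoint file handling is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical simplified header: stores its own hash and its parent's hash.
struct HeaderSketch
{
    uint32_t seq;
    std::string hash;
    std::string prevHash;
};

enum class VerifyResult { OK, EMPTY, OUT_OF_ORDER, BAD_HASH };

// Walks headers (sorted ascending by seq) from highest to lowest. The
// highest header must match trustedHash; every lower header must match the
// prevHash of the header above it. On success, verifiedMinPrev receives the
// hash expected just before the verified range.
VerifyResult
verifyChainBackwards(std::vector<HeaderSketch> const& headers,
                     std::string const& trustedHash,
                     std::string& verifiedMinPrev)
{
    if (headers.empty())
        return VerifyResult::EMPTY;

    std::string expectedHash = trustedHash;
    uint32_t expectedSeq = headers.back().seq;
    for (auto it = headers.rbegin(); it != headers.rend(); ++it)
    {
        if (it->seq != expectedSeq)
            return VerifyResult::OUT_OF_ORDER;
        if (it->hash != expectedHash)
            return VerifyResult::BAD_HASH;
        // Trust propagates downward through the previous-ledger link.
        expectedHash = it->prevHash;
        --expectedSeq;
    }
    verifiedMinPrev = expectedHash;
    return VerifyResult::OK;
}
```

Starting from the trusted end means a single externally trusted hash is enough to vouch for every lower header, which is why the walk runs backwards.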
File: DownloadApplyTxsWork.h/.cpp
BatchWork subclass that iterates over checkpoints in a replay range, yielding a work sequence per checkpoint: download → unzip → apply.
Key members:

- mRange (LedgerRange): the half-open replay range.
- mDownloadDir: shared temp directory.
- mLastApplied (LedgerHeaderHistoryEntry&): reference to the last applied header (updated on success).
- mCheckpointToQueue: next checkpoint to schedule.
- mLastYieldedWork: the previous checkpoint's work, used for sequencing.
- mWaitForPublish: if true, gates application on publish queue size.

yieldMoreWork():

For each checkpoint:

- GetAndUnzipRemoteFileWork for the transaction file.
- ApplyCheckpointWork for the ledger range within that checkpoint.
- ConditionalWork that:
  - … PUBLISH_QUEUE_MAX_SIZE.
- … WorkSequence.

File: ApplyCheckpointWork.h/.cpp
BasicWork subclass that applies transactions from a single checkpoint (at most one checkpoint worth of ledgers).
Key members:

- mDownloadDir: temp dir with ledger + tx files.
- mLedgerRange: the aligned ledger range to apply.
- mCheckpoint: the checkpoint number.
- mHdrIn, mTxIn: XDR input streams for ledger headers and transactions.
- mConditionalWork: wraps ApplyLedgerWork in a conditional that waits for BucketList merge futures to resolve.

onRun():

- … getNextLedgerCloseData() which reads the next header from file, performs knitting checks (skip old, verify LCL hash continuity, verify tx set hash), and constructs a LedgerCloseData.
- … ApplyLedgerWork wrapped in a ConditionalWork that waits for BucketList merge futures.

File: ApplyLedgerWork.h/.cpp
Minimal BasicWork subclass. onRun() calls LedgerManager::applyLedger(lcd, false) to close a single ledger. No retry.
File: ApplyBucketsWork.h/.cpp
Work subclass that applies bucket snapshot state to the database.
Key members:

- mBuckets: map of hash → LiveBucket (downloaded buckets).
- mApplyState (HistoryArchiveState): the archive state to apply.
- mBucketsToApply: ordered vector of buckets (L0 curr, L0 snap, L1 curr, ...).
- mBucketApplicator: the active BucketApplicator instance.
- mSeenKeys, mSeenKeysBeforeApply: deduplication sets to ensure only the newest version of each entry is written.
- mIndexBucketsWork: child work to index bucket files (runs first).
- mAssumeStateWork: child work to assume BucketList state (runs after all buckets applied).

doWork():

- … IndexBucketsWork<LiveBucket> on first call.
- … mBucketsToApply in order, using BucketApplicator to incrementally write entries to the database. Entries already in mSeenKeys are skipped (ensures newest-version-wins). After each bucket, runs invariant checks.
- … AssumeStateWork which indexes both live and hot archive buckets, then calls BucketManager::assumeState() to set the BucketList to the target state and restart merges.

File: AssumeStateWork.h/.cpp
Work subclass spawned at the end of ApplyBucketsWork. Holds strong references to all buckets in the HAS (including future buckets from pending merges) to prevent garbage collection during indexing.
doWork():

- … IndexBucketsWork<LiveBucket> and IndexBucketsWork<HotArchiveBucket>.
- … BucketManager::assumeState() and InvariantManager::checkAfterAssumeState().
- … checkChildrenStatus().

File: IndexBucketsWork.h/.cpp
Template Work subclass that indexes bucket files in parallel. For each non-empty, non-indexed bucket, spawns an IndexWork child.
- … postOnBackgroundThread.
- … createIndex<BucketT>().
- … BucketManager::maybeSetIndex().

File: ApplyBufferedLedgersWork.h/.cpp
BasicWork subclass used at the end of catchup to drain mSyncingLedgers. On each onRun():
- … ConditionalWork is done.
- … LedgerApplyManager::maybeGetNextBufferedLedgerToApply() for the next ledger.
- … ApplyLedgerWork in a ConditionalWork that waits for BucketList merge futures.
- … WORK_SUCCESS when no more buffered ledgers available.

File: ReplayDebugMetaWork.h/.cpp
Work subclass for offline replay of debug meta files (used in diagnostic scenarios). Iterates sorted debug meta files, optionally gunzips them, and spawns ApplyLedgersFromMetaWork (inner helper class) to read LedgerCloseMeta entries and apply them via ApplyLedgerWork. Can also apply a final StoredDebugTransactionSet for the latest ledger.
Herder (consensus)
│
▼
LedgerApplyManagerImpl::processLedger()
│
├─ If sequential with LCL → tryApplySyncingLedgers() → LedgerManager::applyLedger()
│
└─ If behind → buffer in mSyncingLedgers
│
└─ When checkpoint boundary reached → startOnlineCatchup()
│
▼
CatchupWork (scheduled on WorkScheduler)
│
├─ 1. GetHistoryArchiveStateWork → fetch remote HAS
├─ 2. Compute CatchupRange
├─ 3. downloadVerifyLedgerChain()
│ ├─ BatchDownloadWork (ledger header files)
│ └─ VerifyLedgerChainWork (hash-chain verification, highest→lowest)
├─ 4a. downloadApplyBuckets() [if applyBuckets()]
│ ├─ DownloadBucketsWork
│ ├─ verify HAS
│ └─ ApplyBucketsWork
│ ├─ IndexBucketsWork
│ ├─ BucketApplicator (per bucket, level by level)
│ └─ AssumeStateWork
├─ 4b. downloadApplyTransactions() [if replayLedgers()]
│ └─ DownloadApplyTxsWork (per checkpoint)
│ ├─ GetAndUnzipRemoteFileWork
│ └─ ApplyCheckpointWork
│ └─ ApplyLedgerWork (per ledger)
└─ 5. ApplyBufferedLedgersWork → drain mSyncingLedgers
Offline catchup follows the same flow as online catchup, with these differences: it is triggered directly by startCatchup() rather than by buffered ledgers; the mode is OFFLINE_BASIC or OFFLINE_COMPLETE; no ApplyBufferedLedgersWork runs; and in OFFLINE_COMPLETE mode, DownloadVerifyTxResultsWork also runs for full validation.
- LedgerApplyManagerImpl methods assert threadIsMain(): the catchup coordinator runs entirely on the main thread.
- The Work / BasicWork framework is cranked on the main thread's event loop.
- IndexBucketsWork::IndexWork posts indexing tasks to a background thread pool via postOnBackgroundThread(), and posts results back to the main thread via postOnMainThread().
- In parallel-close mode (parallelLedgerClose()), tryApplySyncingLedgers() posts applyLedger calls to the ledger-close thread.
- ApplyCheckpointWork and ApplyBufferedLedgersWork use ConditionalWork to poll for BucketList merge future resolution before applying ledgers, preventing application while background merges are pending.
- VerifyLedgerChainWork uses std::promise / std::shared_future for inter-work communication: the trusted max-ledger hash is passed in via shared_future, and the verified min-ledger-prev hash is passed out via promise.

Application
└─ LedgerApplyManagerImpl (unique_ptr, via LedgerApplyManager::create)
└─ mCatchupWork: shared_ptr<CatchupWork> (owned while catchup active)
├─ mDownloadDir: unique_ptr<TmpDir>
├─ mLiveBuckets / mHotBuckets: map<string, shared_ptr<Bucket>>
├─ mGetHistoryArchiveStateWork: shared_ptr
├─ mDownloadVerifyLedgersSeq: shared_ptr<WorkSequence>
│ └─ mVerifyLedgers: shared_ptr<VerifyLedgerChainWork>
├─ mBucketVerifyApplySeq: shared_ptr<WorkSequence>
│ └─ ApplyBucketsWork
│ ├─ mIndexBucketsWork: shared_ptr<IndexBucketsWork>
│ │ └─ IndexWork children (per bucket, background thread)
│ ├─ mBucketApplicator: unique_ptr<BucketApplicator>
│ └─ mAssumeStateWork: shared_ptr<AssumeStateWork>
├─ mTransactionsVerifyApplySeq: shared_ptr<DownloadApplyTxsWork>
│ └─ per-checkpoint WorkSequence children
│ ├─ GetAndUnzipRemoteFileWork
│ └─ ApplyCheckpointWork
│ └─ mConditionalWork → ApplyLedgerWork
├─ mApplyBufferedLedgersWork: shared_ptr<ApplyBufferedLedgersWork>
└─ mCatchupSeq: shared_ptr<WorkSequence> (final composite)
LedgerApplyManagerImpl also owns mSyncingLedgers (the ledger buffer) independently of CatchupWork.
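The way the ledger buffer is drained in sequence can be sketched with a small single-threaded model. BufferSketch and its fields are hypothetical, the string payload stands in for LedgerCloseData, and the real implementation additionally handles checkpoint-boundary trimming and parallel-close mode.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of the ledger buffer owned by the apply manager.
struct BufferSketch
{
    uint32_t lcl;                            // last closed ledger
    std::map<uint32_t, std::string> syncing; // seq -> buffered ledger data
    std::vector<uint32_t> applied;           // record of applied sequences

    // Applies buffered ledgers for as long as the next one is LCL + 1.
    void tryApplySyncingLedgers()
    {
        // Drop entries at or below the LCL; they can never be applied.
        syncing.erase(syncing.begin(), syncing.upper_bound(lcl));

        auto it = syncing.begin();
        while (it != syncing.end() && it->first == lcl + 1)
        {
            applied.push_back(it->first); // stand-in for applyLedger()
            lcl = it->first;
            it = syncing.erase(it);
        }
        // Anything left starts beyond LCL + 1 and stays buffered.
        assert(syncing.empty() || syncing.begin()->first > lcl + 1);
    }
};
```

An ordered map keeps the buffer sorted by sequence number, so the next candidate is always the first element and a gap is detected with a single comparison.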
- CatchupRange::checkInvariants() ensures at least one of bucket-apply or replay is active, and validates sequencing between them.
- Verification in VerifyLedgerChainWork is done backwards (highest checkpoint first) to propagate trust from the SCP-consensus hash downward.
- If VerifyLedgerChainWork detects a mismatch with a trusted SCP hash, it sets mFatalFailurePromise to true, causing CatchupWork::fatalFailure() to return true and LedgerApplyManagerImpl to set mCatchupFatalFailure, permanently blocking further catchup attempts.
- ApplyCheckpointWork validates that the resulting LCL hash matches the expected ledger header after each ledger application.
- DownloadApplyTxsWork prevents the publish queue from growing beyond PUBLISH_QUEUE_MAX_SIZE by gating ApplyCheckpointWork behind a ConditionalWork.
- BucketList merge futures must resolve (polled via ConditionalWork) before applying any ledger, both during checkpoint replay and buffered ledger application.
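The polling pattern behind the merge-future gate can be illustrated with a small condition-then-action wrapper. ConditionalSketch and StateSketch are hypothetical names for this example; the real ConditionalWork belongs to stellar-core's work framework and its interface may differ.

```cpp
#include <cassert>
#include <functional>
#include <utility>

enum class StateSketch { WAITING, SUCCESS };

// Hypothetical wrapper: runs an action once, but only after a condition
// polled on every crank becomes true.
class ConditionalSketch
{
    std::function<bool()> mCondition;
    std::function<void()> mAction;
    bool mDone = false;

  public:
    ConditionalSketch(std::function<bool()> condition,
                      std::function<void()> action)
        : mCondition(std::move(condition)), mAction(std::move(action))
    {
        assert(mCondition && mAction);
    }

    // One scheduler crank: poll the condition, run the action at most once.
    StateSketch crank()
    {
        if (mDone)
            return StateSketch::SUCCESS;
        if (!mCondition())
            return StateSketch::WAITING; // e.g. merges still pending
        mAction();                       // e.g. apply the ledger
        mDone = true;
        return StateSketch::SUCCESS;
    }
};
```

Polling on each crank keeps the wait non-blocking, so the main thread's event loop stays responsive while background merges finish.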