IPIP: On-Demand Pinning Based on DHT Provider Counts #532
---
title: "IPIP-0000: On-Demand Pinning Based on DHT Provider Counts"
date: 2026-03-25
ipip: proposal
editors:
  - name: Cornelius Ihle
    github: ihlec
relatedIssues: []
order: 0000
tags: ['ipips']
---
## Summary

Defines a mechanism for IPFS nodes to automatically pin content when DHT provider counts fall below a configurable replication target, and to unpin once replication has recovered above the target for a grace period.
## Motivation

IPFS content availability depends on at least one provider remaining online. Content hosted by a small number of nodes is fragile.

Node operators who want to help keep content alive must manually monitor provider counts and re-pin content when replication drops. This is tedious, error-prone, and does not scale. Conversely, nodes may continue pinning content that already has abundant providers elsewhere, wasting local storage that could serve under-replicated content instead.

A standardized on-demand pinning mechanism would let community nodes act as an automatic safety net: pinning content that is at risk of disappearing, and releasing it once enough other providers exist.
## Detailed design

This IPIP defines an on-demand pinning mechanism with three components:

- a registry of monitored CIDs
- a background checker
- pin/unpin behavior based on DHT provider counts
### Registry

Implementations maintain a persistent registry of CIDs to monitor. Each entry tracks:

- the CID being monitored
- whether the implementation currently holds a pin for it
- a timestamp for when providers were last observed above target (used for grace-period tracking)
- a creation timestamp (purely informational; no logic depends on it)

Users add and remove CIDs from the registry explicitly. Adding a CID does not immediately pin it; it registers it for monitoring.
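A registry entry of this shape could be sketched as follows. This is only an illustration of the fields listed above; the class and field names are hypothetical, not normative:

```python
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class RegistryEntry:
    """One monitored CID, mirroring the fields described above (names are illustrative)."""
    cid: str                                    # the CID being monitored
    pinned: bool = False                        # does this node currently hold an on-demand pin?
    above_target_since: Optional[float] = None  # unix time providers went above target; None = grace timer not running
    created_at: float = field(default_factory=time.time)  # purely informational

# adding a CID registers it for monitoring; it does NOT pin it yet
entry = RegistryEntry(cid="bafy-example")
```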
### Background checker

A periodic loop evaluates every registered CID:

1. Query the DHT for providers of the CID (excluding self).
2. If providers < replication target and the CID is not currently pinned: recursively pin the content.
3. If providers >= replication target and the CID is currently pinned: start the grace-period timer (if not already running).
4. If the grace period has elapsed: unpin the content.
5. If providers drop below target again while the grace period is running: reset the timer.

The checker skips CIDs that have a user-created pin, to avoid interfering with manual pin management. The periodic query replaces the periodic re-provide and only adds overhead for actively pinned CIDs (below the replication target).
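The five steps above can be sketched as a single evaluation function. This is a minimal model, not an implementation: `pin` and `unpin` stand in for the implementation's recursive pin operations, `provider_count` is assumed to already exclude self, and all names are hypothetical:

```python
from dataclasses import dataclass
from typing import Callable, Optional

REPLICATION_TARGET = 5      # desired providers, excluding self (recommended default)
GRACE_PERIOD_S = 24 * 3600  # providers must stay above target this long before unpinning

@dataclass
class Entry:
    cid: str
    pinned: bool = False
    above_target_since: Optional[float] = None  # None = grace timer not running

def evaluate(entry: Entry, provider_count: int, now: float,
             pin: Callable[[str], None], unpin: Callable[[str], None]) -> None:
    """One check-cycle pass over a single registered CID."""
    if provider_count < REPLICATION_TARGET:
        entry.above_target_since = None        # step 5: below target resets the grace timer
        if not entry.pinned:
            pin(entry.cid)                     # step 2: recursively pin the content
            entry.pinned = True
    elif entry.pinned:
        if entry.above_target_since is None:
            entry.above_target_since = now     # step 3: start the grace-period timer
        elif now - entry.above_target_since >= GRACE_PERIOD_S:
            unpin(entry.cid)                   # step 4: grace period elapsed, unpin
            entry.pinned = False
            entry.above_target_since = None
```

A real checker would run `evaluate` for every registry entry on each check interval, after first skipping entries that carry a user-created pin.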
### Configuration parameters

Implementations MUST support the following parameters (exact values are TBD):

- **Replication target**: the desired number of providers (excluding self). Recommended default: 5
- **Check interval**: time between full evaluation cycles. Recommended default: 10 minutes
- **Unpin grace period**: how long providers must stay above target before unpinning. Recommended default: 24 hours
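Grouped together, the three parameters might look like the following sketch. The field names are hypothetical; only the default values come from the recommendations above:

```python
from dataclasses import dataclass

@dataclass
class OnDemandPinningConfig:
    # field names are illustrative; defaults mirror the recommended values
    replication_target: int = 5      # desired providers, excluding self
    check_interval_s: int = 10 * 60  # 10 minutes between full evaluation cycles
    unpin_grace_s: int = 24 * 3600   # 24 hours above target before unpinning

cfg = OnDemandPinningConfig()
```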
### Pin naming

When the checker creates a pin, it SHOULD use a well-known name (e.g., `"on-demand"`) to distinguish it from user-created pins. The checker MUST NOT unpin content whose pin name does not match, to avoid removing user pins.
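A guard of this kind could be as simple as the following sketch; the `"on-demand"` name is the example from the text, and the helper function itself is hypothetical:

```python
ON_DEMAND_PIN_NAME = "on-demand"  # well-known name attached to checker-created pins

def may_unpin(pin_name: str) -> bool:
    """The checker only removes pins it created itself, never user pins."""
    return pin_name == ON_DEMAND_PIN_NAME
```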
Comment on lines +83 to +87 (Member): This is too specific - implementations like Helia allow attaching arbitrary metadata to pins, so it's easy to mark a pin as "on-demand" without dictating what the pin name should be. Better to just say that implementations should partition their pin store in some way to prevent accidental deletion of user pins, or words to that effect.
### Pinning scope

The checker MUST use recursive pins. Direct pins do not preserve content availability, since they do not protect linked blocks.
## Design rationale

The design favors simplicity: it relies entirely on existing DHT infrastructure for provider discovery and existing pin semantics for storage. No new wire protocols or peer coordination are introduced.

The grace-period mechanism prevents thrashing. Without it, a CID hovering near the replication target would be pinned and unpinned on every check cycle. A 24-hour default gives enough time to confirm that new providers are stable.

The pin-naming convention lets the checker coexist with user pins. If a user manually pins a CID that is also registered for on-demand pinning, the checker will not interfere.
### User benefit

Nodes can contribute storage where it matters most. Instead of pinning content indefinitely regardless of how many other providers exist, on-demand pinning frees storage automatically when content is well-replicated, making room for content that actually needs help.

On-demand pinning can be integrated easily into existing UI flows.

### Compatibility

This feature is purely additive. Nodes that do not implement on-demand pinning are unaffected. On-demand pinning nodes interact with the network using only existing DHT queries and standard pin operations -- no protocol changes are required.
### Security

DHT provider counts can be gamed. A Sybil attack could inflate provider counts by announcing many fake provider records, tricking nodes into unpinning content that is not actually well-replicated. Implementations SHOULD document this limitation. The grace period provides partial mitigation: an attacker would need to sustain fake provider records for the full grace duration.
### Alternatives

A dedicated replication protocol between cooperating nodes was considered. This would allow nodes to explicitly coordinate who pins what, avoiding redundant work. However, it would require an overlay network protocol and peer discovery for replication partners, dramatically increasing [complexity](https://github.com/gipplab/D-LOCKSS) or introducing [centralized components](https://github.com/gipplab/ipfs-archive-tracker). The DHT-based approach reuses existing infrastructure and requires no coordination between nodes.
## Test fixtures

Not applicable. This IPIP defines node behavior, not content-addressed data formats.
### Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
Presumably the pinning node needs to execute a provide for the CID too? Otherwise peers trying to fetch the content will not know that the pinning node has it.
True. It would wait for the next re-provide interval (worst case 22h). An immediate explicit provide call after pinning is better.
Ah, right - I think I misread things - does this spec use the term "pinning" when it means "providing"?
It's certainly possible to pin something without providing it, and vice versa.
Let's check. My perception is:

- Pinning = store content permanently on the node; do not garbage-collect it.
- Providing = write a provider record to the DHT (advertise content), pointing to my node.

It is possible to pin something without providing it, until re-provide picks it up. It is possible to provide something without pinning it, until the garbage collector hits it. The spec assumes providing to be implicit after at most 22h, because the re-provide system provides all pinned data by default.

I agree the spec should explicitly state the provide operation and not assume implicit providing. An explicit provide right after pinning makes the content discoverable early and prevents others from pinning unnecessarily. Please check whether my perception is accurate. Thank you for your detailed review so far!
This may be implementation-specific - js-libp2p's kad-dht implementation only re-provides previously provided CIDs, it doesn't look at what's pinned or not, or even if the blocks are present.
One day this might let a delegated endpoint publish provider records on behalf of less powerful/well connected nodes, for example.