Document OHTTP relay random selection guidance#1380
Document OHTTP relay random selection guidance#1380bc1cindy wants to merge 1 commit intopayjoin:masterfrom
Conversation
Fixed relay order creates a fingerprintable pattern at the network layer even though OHTTP protects the user's IP. Document that callers SHOULD select relays randomly and track failed ones for resilience, following the existing RelayManager implementation in payjoin-cli. Closes payjoin#1328
Pull Request Test Coverage Report for Build 22671363022Details
💛 - Coveralls |
DanGould
left a comment
There was a problem hiding this comment.
This is the right direction but raised some questions I don't have the answers for yet about whether the pattern this is documenting is actually a desirable one sufficient for reference.
| /// Relays are selected randomly from the configured list to avoid creating | ||
| /// a fingerprintable pattern at the network layer. Even though OHTTP protects | ||
| /// the user's IP address, always contacting the same relay in the same order | ||
| /// reveals metadata about the client's behavior. |
There was a problem hiding this comment.
- The random selection is done in
fetch_ohttp_keysas you've documented, technically, not in the manager itself which just holds state. This comment'd make sense at the module level (referencing both) but makes less sense on the Struct which isn't doing randomization (though as you'll find in my second point perhaps it should) - If the config
ohttp_keys is setno randomization happens at all,ohttp_relays[0]is selected, which also makes this doc wrong and tbqh feels like a bug. Doesn't make sense to me to only randomize theohttp_relayyou use the bootstrap fetch. and imho it doesn't make any sense for the fetch function to determine what relay gets used in the future. @benalleng could you take a look at this? - Sending and receiving then use the same selected relay over and over, which also seems like it may be suboptimal.
There was a problem hiding this comment.
I believe the fix can be to add pick_relay to RelayManager to centralize random selection, exclude failed relays, remove selected_relay from struct state, and also fix the ohttp_relays[0] bug as a consequence
ohttp.rs
/// Selects OHTTP relays randomly from a configured list, excluding those that have failed.
///
/// Random selection prevents a fixed contact pattern at the network layer.
/// Even though OHTTP protects the client's IP, always using the same relay
/// allows a passive observer monitoring relay traffic to correlate sessions
/// belonging to the same client.
///
/// Note: When only one relay is configured, selection is deterministic.
/// Configure multiple relays to benefit from random selection.
///
/// Failed relays are tracked and excluded for the lifetime of the [`RelayManager`].
#[derive(Debug, Clone)]
pub struct RelayManager {
failed_relays: Vec<url::Url>,
}
impl RelayManager {
pub fn new() -> Self { RelayManager { failed_relays: Vec::new() } }
pub fn pick_relay(&self, relays: &[url::Url]) -> Option<url::Url> {
use payjoin::bitcoin::secp256k1::rand::prelude::SliceRandom;
let available: Vec<_> =
relays.iter().filter(|r| !self.failed_relays.contains(r)).cloned().collect();
available.choose(&mut payjoin::bitcoin::key::rand::thread_rng()).cloned()
}
pub fn add_failed_relay(&mut self, relay: url::Url) { self.failed_relays.push(relay); }
}
mod.rs
async fn select_relay(&self) -> Result<url::Url> {
let relays = self.config.v2()?.ohttp_relays.clone();
self.relay_manager
.lock()
.expect("Lock should not be poisoned")
.pick_relay(&relays)
.ok_or(anyhow!("No valid relays available"))
}
renamed unwrap_relay_or_else_fetch to select_relay since relay selection is now a simple config lookup with no fetch involved. OHTTP key fetching still happens separately via unwrap_ohttp_keys_or_else_fetch
switching relays per-request is safe because the relay is a stateless proxy, session state lives in the directory
I've tested locally and all tests are passing. does this look like the right path to you?
There was a problem hiding this comment.
hey guys,
I was digging whether relay selection should be per-session rather than per-request, and I found that in the current code, selected_relay is stored in RelayManager state, and in resume_payjoins, self.clone() shares the same Arc<Mutex<RelayManager>> across all spawned tasks. the first session to call unwrap_relay_or_else_fetch randomly selects a relay and writes it to selected_relay, either randomly via fetch_ohttp_keys, or as @DanGould said, deterministically as ohttp_relays[0] if ohttp_keys is config. Every subsequent session finds Some(relay) and skips the random entirely
let selected_relay =
self.relay_manager.lock().expect("Lock should not be poisoned").get_selected_relay();
let ohttp_relay = match selected_relay {
Some(relay) => relay, // sessions 2, 3... never randomize
None =>
unwrap_ohttp_keys_or_else_fetch(&self.config, directory, self.relay_manager.clone())
.await?
.relay_url,
};
Ok(ohttp_relay)
}
so resume with 3 interrupted sessions always ends up using the same relay for all of them, which also comes back to the fingerprinting concern. for send and receive this doesn't manifest, since each command processes only a single session and returns
the pick_relay approach fixes this naturally since no state is stored between calls, each call it's independently randomized, benefiting not only privacy, but design too. what do you think about it?
as a separate design question: should the random happen once per session, at the top of process_sender_session and process_receiver_session, passed down as a parameter?
in payjoin the natural boundary is the session, it has a well-defined start and end, the relay holds no state between sessions by design, and sessions are already isolated. rotating relays across requests within the same session it's safe, but would be an unusual pattern that could itself become a fingerprint
There was a problem hiding this comment.
If the config ohttp_keys is set no randomization happens at all
The intention here was that the key fetching is simply skipped as it would be hard-coded by the config, but I think you're correct in thinking this is a bug as any followup connection would just use the 0 index. We should probably still randomize the relay selection still in that case.
so resume with 3 interrupted sessions always ends up using the same relay for all of them, which also comes back to the fingerprinting concern. for send and receive this doesn't manifest, since each command processes only a single session and returns
We had decided that once a relay was selected it should be held onto just to improve reliability instead of always retrying a new set of relays but maybe we should reassess this.
| /// Failed relays are tracked and excluded from future selections until | ||
| /// the session is reset, providing resilience without sacrificing privacy. |
There was a problem hiding this comment.
"until the session is reset" is doing a bit of assumption, I think it'd be more precise to say "for the lifetime of the [RelayManager]."
"resilience without sacrificing privacy" is also pretty vague for in-code docs. It might make sense in a blog post but I think here it's important to be specific about what the resilience protects against exactly and what privacy sacrifices are otherwise made if this is to be mentioned at all. It's the right direction but I could see it being misread as a formal guarantee.
| /// When multiple relays are available, callers SHOULD select one at random rather than | ||
| /// always using the same relay. A fixed selection order creates a fingerprintable pattern |
There was a problem hiding this comment.
- what's the fingerprint, precisely? (maybe not to document here but important to articulate)
- This comment seems applicable to every function that takes an ohttp_relay. Seems overkill to document it with a type so it's there in every function but maybe a
RelayManagerthat actually randomized could document the right pattern.
There was a problem hiding this comment.
I understand that the ISP sees which relay the client connects to. randomizing across relays fragments this pattern, but if all configured relays are publicly known as payjoin relays, the ISP can still infer that the client is using payjoin
in my view, an interesting private scenario is private relay + randomization among private relays or tor
additionally, the TLS handshake exposes a fingerprint of the client's software stack before any request is made, so the relay can compute a JA3/JA4 hash from the ClientHello and maybe identify which wallet or software is being used
this does not identify the individual user, but combined with IP, timestamps(frequency), and encrypted payload size, it adds another dimension to the behavioral profile, that's why randomizing across independent relays(operated by different entities) prevents any single relay from accumulating enough data to reconstruct the pattern
Fixed relay order creates a fingerprintable pattern at the network layer, even though OHTTP protects the user's IP. Document that callers should select relays randomly and track failed ones for resilience, following the existing
RelayManagerimplementation inpayjoin-cli.Claude sonnet 4.6 helped me doing the research of the issue.
Closes #1328
Please confirm the following before requesting review:
AI
in the body of this PR.